## CALIFORNIA INSTITUTE OF TECHNOLOGY VLBA Correlator Memorandum

To: VLBA Correlator Memo Series

From: M. Ewing

2 November 1984

Subj: Register-less readout and per-station phase

A considerable saving (\$300K - based on effective gate cost of \$0.02) will occur if a simplified VLBA correlator readout scheme is implemented (no output latches on the VLSI chip). A further saving (\$50-\$100K) can be realized in the VLBA Correlator if the interferometer phase and delay calculations (including "vernier delay") can be combined and performed on a per-station basis. Beyond "raw" gate costs, there should be savings in interconnections, computer interfacing, etc.

This memo outlines a scheme that would provide these features. We conclude that register-less readout is very worthwhile, but that per-station phase introduces substantial complexity, even if it saves hardware. B. Rayhrer, D. Fort, and others have contributed to this work.

## ACCUMULATOR REGISTERS

Until recently, we have assumed that each correlator data accumulator would have to have a data output register, so that the correlation outputs could be read out instantly on a single clock cycle. This is required if we are to correlate every input bit ("transparency") and if the correlator clock rate equals the data playback rate. The silicon real estate for these registers is substantial, about equal to that for the counters themselves. There are two alternative approaches to eliminating the registers. In both, there has to be sufficient correlation "dead-time" to permit an output processor to access each counter sequentially for some block of correlator channels. An eight-lag correlator chip might require about 2 us for readout. System dead time would be some multiple of this figure, perhaps 64 us.

a. Abandon transparency. If the correlator were allowed to ignore incoming data for about 1% of the time, a 64 us readout cycle could be allowed every 6.4 ms, comparable to the fastest required correlator dump cycle. This correlator dead-time has to be synchronized among all correlators; in particular, the "framing" has to be imposed after the interferometer delay. It is not sufficient to rely on the framing imposed by a data-replacement tape format such as Mk III. The loss of 0.5% in SNR may be acceptable, but there could conceivably be a calibration problem, e.g., at fringe rates that are integrally related.

b. Reclock data. Full data transparency may be maintained by buffering and reclocking data from the DPS. The correlator logic may not care if its clock rate is up to (say) 15% higher than the DPS playback rate. The correlator will correlate a data block several milliseconds long and hold while the counters are read out; then it resumes without losing any information.
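The figures quoted for the two approaches can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that SNR scales as the square root of the integrated time and that the 6.4 ms figure is the full period including the readout pause; the function names are illustrative only.

```python
import math

def blanking_loss(readout_s, period_s):
    """Return (dead-time fraction, fractional SNR loss) for blanked readout."""
    dead = readout_s / period_s
    # Assumption: SNR is proportional to sqrt(integration time).
    snr_loss = 1.0 - math.sqrt(1.0 - dead)
    return dead, snr_loss

def reclock_speedup(readout_s, period_s):
    """Minimum correlator/playback clock ratio that hides the readout pause."""
    return period_s / (period_s - readout_s)

dead, loss = blanking_loss(64e-6, 6.4e-3)
speedup = reclock_speedup(64e-6, 6.4e-3)
print(f"dead time {dead:.2%}, SNR loss {loss:.2%}, minimum speedup {speedup:.4f}")
```

Under these assumptions a 1% dead time costs about 0.5% in SNR, and the readout pause alone calls for a clock only about 1% faster than playback, well inside the (say) 15% margin mentioned above.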

## PER-STATION PHASE

We have always realized that computing phase on a per-station basis involves much less computation than the "standard" per-baseline method. The problem has been how to apply the phase to the data. It is possible to "rotate" the data at each station to a standard point, such as the center of the earth. Unfortunately, if we correlate two such rotated data streams, we suffer the ~4% SNR loss twice. Intermodulation products can be a problem, and complex x complex correlators and a doubling of interconnection wiring will be necessary. These minuses more than offset the reduction in lobe rotator silicon.

If the phase is calculated per-station, but communicated to each corresponding baseline, there need be no extra signal loss. Each correlator baseline (or other module) can difference the two phases corresponding to its input data. There is no extra SNR loss and no added data wiring. The phases do have to be transmitted and switched in the same manner as their associated data streams, however.
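The differencing step can be illustrated as follows. This is a sketch only: the station names, the phase values (in turns), and the sign convention (first station minus second) are assumptions made for the example, not part of the proposed design.

```python
from itertools import combinations

def baseline_phases(station_phase):
    """Difference per-station phases (in turns) to get one phase per baseline."""
    return {
        (a, b): (station_phase[a] - station_phase[b]) % 1.0
        for a, b in combinations(sorted(station_phase), 2)
    }

# Hypothetical per-station phases, in turns.
station_phase = {"A": 0.125, "B": 0.75, "C": 0.5}
phases = baseline_phases(station_phase)

n = len(station_phase)
print(f"{n} station models serve {n * (n - 1) // 2} baselines")
for pair, phi in phases.items():
    print(pair, phi)
```

The model is evaluated once per station, while the number of baselines that consume it grows as n(n-1)/2, which is where the reduction in computation comes from.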

## SCENARIO WITH REGISTER-LESS CORRELATORS AND BLANKED READOUT

In this case, the correlators are blanked "off" during each readout period (of order 100 us every 10 ms) and the counters are gated onto the readout bus sequentially. Data arriving in this period is not correlated, although the shift registers will continue normally. There is little if any extra cost incurred by this readout method, and the full "\$300K" savings are realized.

## SCENARIO WITH PER-STATION PHASE

If, on further evaluation, the cost savings seem large, or if data blanking is not allowed for correlator readout, we would suggest the following scheme as a way to incorporate per-station phase and register-less readout. A diagram of signal flow for one channel will help explain (Figure 1).



Figure 1. Simplified Block Diagram, per station and channel.

Several aspects of this schematic differ from previous concepts. First, the station delay and phase calculations are all performed in one processor, which can (should) be considered part of the correlator. The DPS unit accepts precise delay commands from this processor; the DPS does not need to calculate a polynomial model as before. In this way, an important interferometer calculation can be centralized.

Second, the station phase and part of the delay are combined with the data by time multiplexing. This is possible without data loss since the data is being reclocked anyway. Enough delay information must be passed to allow the "vernier" delay to be calculated baseline-by-baseline. Since phase and delay are slowly changing and nearly linear functions of data sample time, we need only to transmit "deltas", i.e., commands in the form "increment", "hold", or "decrement". Furthermore, these commands need only be sent every N data samples, where N is determined by the maximum fringe and delay rates. (N=8 is a reasonable choice.) Phase and delay can be initialized during the correlator readout pause, by sending a suitable code in place of data.
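The "delta" commands can be sketched as below. This is a minimal illustration, assuming phase is already quantized to integer steps and changes by at most one step per N samples; the function names and the test ramp are hypothetical.

```python
def encode_deltas(phase_steps, n=8):
    """Emit one command per n samples: +1 increment, 0 hold, -1 decrement.

    phase_steps holds the quantized phase (integer steps) at each data sample.
    Returns the initial value (sent during the readout pause) and the commands.
    """
    initial = phase_steps[0]
    current = initial
    commands = []
    for i in range(n, len(phase_steps), n):
        target = phase_steps[i]
        delta = (target > current) - (target < current)  # sign of the error
        commands.append(delta)
        current += delta
    return initial, commands

def decode_deltas(initial, commands):
    """Rebuild the phase at each command instant from the initial value."""
    values = [initial]
    for c in commands:
        values.append(values[-1] + c)
    return values

# Hypothetical slow linear phase ramp: one step every 10 samples.
phase = [i // 10 for i in range(64)]
initial, commands = encode_deltas(phase, n=8)
decoded = decode_deltas(initial, commands)
print(commands)
print(decoded)
```

Because the ramp never moves more than one step in eight samples, the decoded values match the original phase at every command instant. A faster rate would require a smaller N or a larger step, which is the sense in which N is set by the maximum fringe and delay rates.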

At the correlator baseline module, the data are combined as shown in Figure 2. The incoming streams from stations A and B are stripped of their phase and delay deltas. The IF samples are passed on along with their (uneven) clock. Phase and delay commands are decoded in small up/down registers (perhaps a single PAL IC). Current phase and delay are sent to the correlator for conventional processing. Clocking for stripping, decoding, and correlation is no burden since it is the same for the entire correlator.
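A software model of the baseline module might look like the following. This is a sketch under stated assumptions: the frame layout `(phase_cmd, delay_cmd, samples)`, the 16-step phase resolution, and all names are hypothetical, chosen only to show the stripping, up/down decoding, and differencing steps.

```python
PHASE_STEPS = 16  # assumed phase resolution, steps per turn

class UpDownRegister:
    """Small up/down counter that tracks one quantity from delta commands."""

    def __init__(self, initial, modulus=None):
        self.value = initial
        self.modulus = modulus  # wrap for phase; None (no wrap) for delay

    def apply(self, command):
        self.value += command
        if self.modulus is not None:
            self.value %= self.modulus
        return self.value

def process_baseline(frames_a, frames_b, init_a, init_b):
    """Strip commands from both streams and difference phase and delay.

    init_* is (phase, delay) as loaded during the readout pause;
    frames_* is a list of hypothetical (phase_cmd, delay_cmd, samples) frames.
    """
    phase_a = UpDownRegister(init_a[0], PHASE_STEPS)
    delay_a = UpDownRegister(init_a[1])
    phase_b = UpDownRegister(init_b[0], PHASE_STEPS)
    delay_b = UpDownRegister(init_b[1])
    out = []
    for (pa, da, sa), (pb, db, sb) in zip(frames_a, frames_b):
        phi = (phase_a.apply(pa) - phase_b.apply(pb)) % PHASE_STEPS
        tau = delay_a.apply(da) - delay_b.apply(db)
        out.append((phi, tau, sa, sb))  # samples pass through unchanged
    return out

frames_a = [(+1, 0, [1] * 8), (+1, +1, [0] * 8)]
frames_b = [(0, 0, [1] * 8), (-1, 0, [0] * 8)]
result = process_baseline(frames_a, frames_b, init_a=(15, 0), init_b=(2, 0))
for phi, tau, _, _ in result:
    print(phi, tau)
```

Each register holds only a few bits of state and responds to a three-valued command, which is consistent with the suggestion that the decoding could fit in a single small programmable device.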



Figure 2. Simplified Block Diagram, per baseline and channel.

## CONCLUSIONS

Substantial cost savings for the VLBA correlator seem to result from eliminating correlator output holding registers. The simplest way to do without these registers is to blank the correlator during readout, although this means ignoring some data.

If all data must be correlated, the DPS data may be reclocked at a higher speed to provide an "artificial" readout pause. If we must reclock, it is appropriate to consider per-station delay and phase calculations. Phase and vernier delay can be communicated to the correlator with no additional cabling if there is sufficient speedup (~13-15%).
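One way to see how a speedup of this size could arise is the rough budget below. It is an illustration only, assuming one command slot the width of a data sample for every N = 8 samples and a 1% readout pause; the memo does not specify this breakdown, and the function name and parameters are hypothetical.

```python
def required_speedup(samples_per_command, slots_per_command=1, dead_fraction=0.01):
    """Clock ratio needed to carry command slots and still leave a readout pause."""
    overhead = slots_per_command / samples_per_command  # multiplexing overhead
    return (1.0 + overhead) / (1.0 - dead_fraction)

ratio = required_speedup(8)
print(f"required speedup about {ratio - 1.0:.1%}")
```

With these assumed numbers the required speedup is a little under 14%, in the same range as the ~13-15% figure; a wider command slot or a longer pause would push it toward the upper end.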

If the reclocking scheme is adopted, the partitioning of functions between DPS and the correlator must be re-examined and the delay command interface (at least) must be reconsidered.

Unless a compelling astronomical reason is found, I would prefer to blank but not reclock.