# NATIONAL RADIO ASTRONOMY OBSERVATORY <br> SOCORRO, NEW MEXICO <br> VERY LARGE ARRAY PROGRAM 

VLA ELECTRONICS MEMORANDUM NO. 207

> THE CORRELATOR SYSTEM FOR THE VERY LARGE ARRAY RADIO TELESCOPE

R. P. Escoffier and C. M. Broadwell

July 1982
ABSTRACT
The digital correlator of the Very Large Array radio astronomy telescope is described. This correlator system is capable of supporting both continuum and spectrometer observations of astronomical radio sources. Data recirculation techniques are used for circuit economy yielding a correlator design requiring only 0.11 IC's/product. High density RAM memory is used to provide long term integration in a design of only 0.03 IC's/24 bit product.

The system includes digital delay lines for array phasing.

### 1.0 INTRODUCTION

The Very Large Array (VLA) radio telescope array is an aperture synthesis instrument constructed by the National Radio Astronomy Observatory (NRAO) in New Mexico [1]. The VLA consists of an array of 27 25-meter radio telescopes deployed in a wye configuration. The antennas are transportable and the 27 elements may be configured into any of four standard arrays, with the largest having wye arm lengths of 19, 21 and 21 km. The instrument was built to produce high-resolution, high-sensitivity maps of astronomical radio sources at radio frequencies. Observations in the 1.5, 4.75, 15 or 23 GHz bands may be performed using the VLA. Received signals are transmitted via buried circular waveguide [2], [3] to a central control building, where conversion of the received astronomical signals to a 0.2- to 50-MHz baseband level takes place.

Each antenna provides four wideband outputs which are derived from two orthogonal polarizations for each of two 50-MHz bands. Thus the 27-element array yields 108 broadband signals for processing in the VLA correlator subsystem. The correlator system includes electronics to sample all of the 108 array wideband outputs at a 100-MHz sample rate for analog-to-digital conversion with two bits per sample. This coarse quantization causes little degradation in accuracy because of the Gaussian statistics of both signal and noise [4], [5]. Sample time phasing and digital delay lines are used to phase the signals received in the dispersed array to a 0.625-nsec resolution. A total delay range of $163.84 \mu \mathrm{sec}$ is provided for each of the signal paths.

Digital multipliers are used to measure the cross-correlation coefficients of the various pairs of antenna signals. In continuum observations 11,772 cross- and auto-products are formed and results integrated for programmed time intervals. In spectral line observations, the signals are stored in digital memory and recirculated multiple times through the 11,772 multiplier circuits producing up to 373,248 independent cross- and auto-products for integration.

Continuum observations yield brightness maps of astronomical objects at radio frequencies within the observing bandwidth whereas spectral line observations yield multiple maps, each over a small frequency interval of the observed bandwidth.

### 2.0 THE CORRELATOR SYSTEM

The VLA correlator system is a large digital system operating at a basic clock frequency of 100 MHz and includes 13 racks of NRAO-developed hardware built with approximately 85,000 integrated circuits including two custom-developed IC types. The system dissipates about 50 kW.

Two basic modes of operation are possible and are referred to as continuum and spectral line operation. In continuum operation, the system measures the complex correlation coefficients of the 351 pairings of antennas (351 baselines) possible with the 27-element array. Measurement of complex coefficients is made possible by developing, from each wideband antenna signal, two components with a quadrature phase relationship to each other. Various sets of 351 parallel- and cross-quadrature products are then developed and from these integrated products the complex correlation coefficients are obtained. In spectral line operation, products are formed as a function of time delay of one of the multiplied signals. Up to 512 lead and 512 lag products for each of the 351 baselines are produced. The Fourier transform of each cross-correlation function is the cross-power spectrum of each antenna pair. The correlator system can produce, in spectral line mode, 351 sets of 512-point cross-power spectra from one set of 27 antenna outputs with a selected bandwidth of 1.56 MHz. Correspondingly fewer points per baseline for wider bandwidths and/or multiple antenna outputs can be developed.
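The relation between a measured cross-correlation function and the cross-power spectrum can be illustrated with a small discrete Fourier transform. The following is only a sketch: the function name `cross_power_spectrum` and the four-point test inputs are illustrative choices, and a plain DFT stands in for whatever transform hardware the system actually uses.

```python
import cmath

def cross_power_spectrum(ccf):
    """Discrete Fourier transform of a sampled cross-correlation function.

    ccf[n] is the correlation at lag n (illustrative convention).
    Returns one complex spectral point per input lag.
    """
    m = len(ccf)
    spectrum = []
    for k in range(m):
        acc = 0j
        for n, c in enumerate(ccf):
            acc += c * cmath.exp(-2j * cmath.pi * k * n / m)
        spectrum.append(acc)
    return spectrum

# Correlation only at zero lag: a flat spectrum with zero phase.
flat = cross_power_spectrum([1.0, 0.0, 0.0, 0.0])

# Correlation only at lag 1: flat amplitude, phase that rotates with frequency.
shifted = cross_power_spectrum([0.0, 1.0, 0.0, 0.0])
```

A correlation peak displaced from zero lag appears in the spectrum as a phase slope across frequency, while the amplitude stays flat.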

Figure 1 gives a block diagram of the VLA correlator system. This block diagram illustrates the 4-way symmetry in the correlator where each of the major signal handling subsystems is broken into 4 identical parts. Some subsystems process one of the 4 output channels of the array and are identified with the channel A, B, C, or D array output serviced. Other subsystem breakdowns are given the more arbitrary quadrant 1, 2, 3, or 4 nomenclature.

### 3.0 THE SAMPLER SUBSYSTEM

The sampler subsystem is the analog-to-digital conversion point in the VLA signal path. The samplers are driven by the 108 baseband outputs of the array. Each baseband output is a white Gaussian signal with frequency bandwidths selectable by switched analog filters in binary steps from about 100 kHz to almost 50 MHz .

Figure 2 is a block diagram of one sampler module. Input to the sampler is held at a 16.5 dBm level by an automatic level control (ALC).

The quadrature network is an L-C network that divides the wideband input into two outputs in phase quadrature. This network is aligned to yield sin and cos outputs that have a $90^{\circ} \pm 1^{\circ}$ phase relation over a 1- to 50-MHz input frequency range. Sampler unit-to-unit phase matching is kept to within small tolerances over the entire 0.2- to 50-MHz input range of a sampler.


Digitization is accomplished in the sampler by sampling the amplitude of the sin and cos quadrature network outputs each 10 nsec. A 2-bit, 3-level quantization format is used as illustrated in Figure 2. Quantization thresholds of $\pm 0.612$ of the rms voltage being sampled are used to obtain maximum correlator sensitivity with a 2-bit, 3-level digital correlator [5]. The result of sampling is thus two 100-MHz logic signals, labeled the + and - sampler outputs. Since two signals must be sampled, a sampler module yields four 100-MHz digital outputs from a single analog input: the +sin, -sin, +cos, and -cos outputs. The output logic of the sampler module is implemented with ECL 10,000 logic elements and thus the 100-MHz digital outputs are at ECL logic levels.

Quantization is accomplished by driving the analog signals into wideband digital comparators for comparison against the $\pm 0.612 \mathrm{~V}_{\text {rms }}$ reference voltages. The offset clipped outputs of these comparators are asynchronously captured in ECL flip-flops being clocked with a $100-\mathrm{MHz}$ clock.
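The 3-level quantization described above can be sketched in a few lines. This is an illustration only, assuming a unit-rms Gaussian input from a seeded software generator; the names `quantize` and `assigned_value` are hypothetical and do not come from the memorandum.

```python
import random

THRESHOLD = 0.612  # comparator threshold in units of the rms input voltage

def quantize(v, v_rms=1.0):
    """Return the (+ bit, - bit) pair for one analog sample."""
    v_ref = THRESHOLD * v_rms
    plus_bit = 1 if v > v_ref else 0
    minus_bit = 1 if v < -v_ref else 0
    return plus_bit, minus_bit

def assigned_value(plus_bit, minus_bit):
    """Map the two output bits onto the +1, 0, -1 weights."""
    return plus_bit - minus_bit

# Quantize a stretch of simulated Gaussian noise (assumed test input).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10000)]
levels = [assigned_value(*quantize(s)) for s in samples]
zero_fraction = levels.count(0) / len(levels)
```

With these thresholds a little under half of the Gaussian samples fall in the middle (zero) level, and the remainder split evenly between +1 and -1.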

The ECL sample clock is derived from the $100-\mathrm{MHz}$ system clock via a programmable phase shifter. This phase shifter has 4-bit phase programming resolution and can, via a 4-bit command from the computer system, adjust the precise time of sampling to any of 16 discrete times within a $10-\mathrm{nsec}$ system clock interval. This phase adjustment is equivalent to providing programmable path delay for the analog input and is required for array phasing. An array phasing resolution of 0.625 nsec over a 10 -nsec range is thus provided. Earth rotation is accommodated by updating the phasing code at a $19.2-\mathrm{Hz}$ rate. Digital retiming logic is provided to allow the variable-phase sampled outputs to drive logic in the rest of the correlator system operating on a constant-phase clock.

The + and - bits of each sampled analog signal can be interchanged at the sampler output by application of a control signal. This interchange effects an equivalent $180^{\circ}$ inversion of the analog input. Offsets in the electronics of the VLA are canceled by a technique called Walsh switching, where the antenna signals are periodically inverted at a different rate in each antenna's local oscillator [6].

| Level of signal | Digital output: + bit | Digital output: - bit | Assigned value |
| :---: | :---: | :---: | :---: |
| Above $V_{\text{ref}}$ | 1 | 0 | +1 |
| Between $V_{\text{ref}}$ and $-V_{\text{ref}}$ | 0 | 0 | 0 |
| Below $-V_{\text{ref}}$ | 0 | 1 | -1 |

SAMPLER

Figure 2

The known inversion rate at each correlator is then removed by this digital inversion in the sampler.
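The equivalence between interchanging the + and - bits and inverting the analog input can be checked directly on the three quantizer states. The helper names below are illustrative, not taken from the memorandum.

```python
def assigned_value(plus_bit, minus_bit):
    """Weight (+1, 0 or -1) represented by a (+ bit, - bit) pair."""
    return plus_bit - minus_bit

def interchange(plus_bit, minus_bit, invert):
    """Swap the two sampler output bits when the control signal is set."""
    return (minus_bit, plus_bit) if invert else (plus_bit, minus_bit)

# The three legal quantizer states: above +Vref, between, below -Vref.
STATES = [(1, 0), (0, 0), (0, 1)]

original = [assigned_value(p, m) for p, m in STATES]
negated = [assigned_value(*interchange(p, m, True)) for p, m in STATES]
restored = [
    assigned_value(*interchange(*interchange(p, m, True), True))
    for p, m in STATES
]
```

Swapping once negates every assigned value; swapping twice restores it, which is why an inversion applied upstream can be undone digitally.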

The output of the sampler system is thus 432 100-MHz ECL digital quantizations of 108 wideband inputs.

### 4.0 THE DELAY SUBSYSTEM

Variable digital delay lines must be provided to equalize the total time delay of each signal from the astronomical source to each multiplier. This time delay varies with earth rotation and is also a function of antenna distance from the central control building. Wye arm lengths of up to 21 km will yield differentials in total travel time from source to control building of over $100 \mu \mathrm{sec}$. To obtain array phasing, independently programmable digital delay lines are provided for the sampler outputs. A delay resolution of 1 bit, or 10 nsec, is obtainable with a 100-MHz clock rate and a total delay range of $163.83 \mu \mathrm{sec}$ is provided for each signal. The delay resolution and range are further refined by the sampler phase shifters to give a 0.625-nsec resolution. Sets of 108 18-bit delay values are produced by the computer system with a 19.2-Hz update rate to program the antenna signal paths. The samplers receive the 4 less significant bits of these delay values and the 14 more significant bits go to program the delay lines.
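The split of an 18-bit delay value between sampler and delay line can be sketched as below. The function names are hypothetical, and the sketch assumes that the two contributions simply add and that a larger phase code means a later sample; the memorandum does not spell out either convention.

```python
SAMPLE_PERIOD_NS = 10.0   # one bit at the 100-MHz clock
PHASE_STEPS = 16          # 4-bit sampler phase shifter

def split_delay_word(word18):
    """Split an 18-bit delay value into (delay-line code, sampler phase code)."""
    if not 0 <= word18 < (1 << 18):
        raise ValueError("delay value must fit in 18 bits")
    sampler_code = word18 & 0xF   # 4 less significant bits
    delay_line_code = word18 >> 4  # 14 more significant bits
    return delay_line_code, sampler_code

def delay_ns(word18):
    """Total programmed delay in nanoseconds (assumes simple addition)."""
    delay_line_code, sampler_code = split_delay_word(word18)
    coarse = delay_line_code * SAMPLE_PERIOD_NS
    fine = sampler_code * SAMPLE_PERIOD_NS / PHASE_STEPS
    return coarse + fine

largest = delay_ns((1 << 18) - 1)
```

Under these assumptions the largest code gives 16,383 bits of delay-line delay plus 15 phase steps of 0.625 nsec, consistent with the ranges quoted in the text.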

To provide such a large delay range at a $100-\mathrm{MHz}$ clock rate requires a minimum of 16,383 flip-flops for each digital signal. To keep circuitry size to a minimum it is desirable to use as dense an IC technology as possible, thus the individual $100-\mathrm{MHz}$ signals must be broken down into parallel path signals of lower data rate. Figure 3 gives a block diagram of the delay path for one bit of a delay line. The original $100-\mathrm{MHz}$ signal is split into four $25-\mathrm{MHz}$ clock rate lines by clocking each $4^{\text {th }}$ bit into a given path. Conversion to TTL logic elements takes place at this point. A second stage of clock rate reduction yields 16 signals, each with a $6.25-\mathrm{MHz}$ data rate.

The actual delay logic was optimized for low cost, low PC board area and low power requirements. Low cost MOS shift registers were used instead of RAM storage for this reason. The two-stage bulk delay shown in Figure 3 yielded both low cost and low IC count. The first stage is the start of a binary progression delay line, a discrete 512 or 0 (times 16 paths) bit shift register. The second stage is a dual path shift register which can be utilized in a cyclic manner by varying the clock rate and path usage to obtain thru-put delays of between 512 and 1024 bits. If, for example, only the top shift register were used, being clocked at the full 6.25-MHz data rate, a stage delay of 512 bits will result. If, however, both registers were clocked alternately at 3.125 MHz with odd input bits being shifted into the top register and even bits into the bottom, a 1024-bit shift register would result with proper action of the output multiplexer.

DELAY LINE

Figure 3

Delays intermediate between the extremes of 512 and 1024 bits are implemented by combinations of the two signal paths. Consider the following examples assuming that the clock signals and the multiplexer control signal are controlled to accomplish the specified action (in a cyclic fashion where for each example one cycle is shown):

| Delay | Bits into top register at 6.25 MHz | Bits into top register at 3.125 MHz (interleaved) | Bits into bottom register at 3.125 MHz (interleaved) | Total delay |
| :---: | :---: | :---: | :---: | :---: |
| 513-bit | 511 | 1 | 1 | 513 bits |
| 514-bit | 510 | 2 | 2 | 514 bits |
| 1022-bit | 2 | 510 | 510 | 1022 bits |

This process must be cyclic, for a given thru-put (or delay), with the cycle repeating every time 512 bits are shifted into the top shift register. The bottom shift register must get exactly 512 clock pulses during this cycle so that data bits shifted into it appear at its output at the same time that adjacent bits of the original data appear at the output of the top shift register. Thus data-in is equal to data-out with any delay of 512 to 1024 bits obtainable. In actual operation the delay range is 512 to 1023 and for 16 parallel paths a total delay of 8,192 to 16,368 bits, or a programmable delay range of 0 to 8,176 bits, results.
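The pattern in the examples above generalizes: for a stage delay of 512 + k bits, 512 - k bits go into the top register at the full rate and k bits go into each register at the half rate. The sketch below encodes that reading; `dual_register_schedule` and the dictionary keys are illustrative names, not terms from the memorandum.

```python
REGISTER_LENGTH = 512   # bits per shift register in the dual-path stage
PARALLEL_PATHS = 16     # 6.25-MHz paths per 100-MHz signal

def dual_register_schedule(delay_bits):
    """Bits routed to each register per cycle for a given stage delay."""
    if not REGISTER_LENGTH <= delay_bits <= 2 * REGISTER_LENGTH:
        raise ValueError("stage delay must lie between 512 and 1024 bits")
    k = delay_bits - REGISTER_LENGTH
    return {
        "top_full_rate": REGISTER_LENGTH - k,  # top register at 6.25 MHz
        "top_half_rate": k,                    # top register at 3.125 MHz
        "bottom_half_rate": k,                 # bottom register at 3.125 MHz
    }

def total_delay(schedule):
    """Stage delay implied by a schedule: the sum of all routed bits."""
    return sum(schedule.values())

# Range for the 16 parallel paths with the 512..1023 operating range.
path_range_bits = (512 * PARALLEL_PATHS, 1023 * PARALLEL_PATHS)
```

In every schedule the top register receives exactly 512 bits per cycle, which matches the stated requirement that the cycle repeat each time 512 bits are shifted into it.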

The MSB of the 14-bit delay program word commands the operation of the first bulk stage which provides 0 or 8,192 bits of delay or just $\frac{1}{2}$ of the total range desired. The next 9 bits of the delay program word control the action of the dual register stage by controlling the operation of a relatively simple control logic section which develops all of the clock and multiplexer control signals necessary. These two stages yield a 16,368 -bit delay range and a $160-\mathrm{nsec}$ delay resolution.

The two stages of parallel-to-serial data conversion, which reconstitute the original $100-\mathrm{MHz}$ data input, also provide programmable delay. Each stage decodes two of the delay program word bits to time the parallel-to-serial conversion operation to occur on a specific serial data clock cycle of the four available in the duration of a parallel bit clock interval. Thus the serial data emerges from such a stage early or late in the parallel data clock period as programmed by two bits of the delay program word.
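One way to read the 14-bit delay program word is as four fields: 1 bit for the bulk stage, 9 bits for the dual register stage, and 2 bits for each parallel-to-serial stage. The sketch below assumes a plain binary weighting, so that the programmable delay in bits equals the word value; the weights of 4 and 1 for the two parallel-to-serial stages are an inference and are not stated in the text.

```python
def decode_delay_word(word14):
    """Split a 14-bit delay program word into its four assumed fields."""
    if not 0 <= word14 < (1 << 14):
        raise ValueError("delay program word must fit in 14 bits")
    bulk = (word14 >> 13) & 0x1        # MSB: first bulk stage, 0 or 8,192 bits
    dual = (word14 >> 4) & 0x1FF       # next 9 bits: dual register stage
    first_ps = (word14 >> 2) & 0x3     # assumed: 6.25-MHz to 25-MHz stage
    second_ps = word14 & 0x3           # assumed: 25-MHz to 100-MHz stage
    return bulk, dual, first_ps, second_ps

def programmable_delay_bits(word14):
    """Programmable delay in 100-MHz bits under the binary-weight assumption."""
    bulk, dual, first_ps, second_ps = decode_delay_word(word14)
    return bulk * 8192 + dual * 16 + first_ps * 4 + second_ps

# Largest contribution of the two shift-register stages alone.
bulk_stage_max = 1 * 8192 + 511 * 16
```

Under this reading the two shift-register stages alone span 16,368 bits in 16-bit (160-nsec) steps, as stated above, and the two parallel-to-serial stages fill in the remaining 10-nsec resolution.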

Each delay line can be driven by two sources. Astronomical information from the samplers drives the delay lines during observational periods. Every 52.083 msec, however, a 1.6-msec period exists in which astronomical data is not available. This "data invalid" period is due to the design of the waveguide transmission system from antennas to the control room; during this time local oscillator and control signals are transmitted from the control room to each antenna. During these data invalid periods a precisely known pseudo-random data pattern is made to drive all delay line inputs for testing purposes.

This automatic self-test capability of the system will be described later.

### 5.0 THE RECIRCULATOR SUBSYSTEM

As will be seen when the multipliers are described, the multiplier subsystem consists of 32 sets of 351 digital multiplier circuits. Continuum observations are supported by the multiplier system producing 32 sets of 351 baseline products where each set is of the form;

$$
A_{i}\left(C_{i}, Q_{i}\right) * A_{j}\left(C_{j}, Q_{j}\right)
$$

where for antennas $A_{i}$ and $A_{j}$, $i<j$; $C_{i}$ is one of the $A, B, C$, or $D$ channel outputs; and $Q_{i}$ is one of the $\sin$ or $\cos$ quadrature components.

In spectral line observations, however, up to 1024 sets of 351 baseline products must be supplied of the form;

$$
\begin{equation*}
A_{i}\left(\tau_{0}\right) * A_{j}\left(\tau_{m}\right) \tag{1}
\end{equation*}
$$

where for antennas $A_{i}$ and $A_{j}$, $i<j$; the $A_{j}$ data is displaced $m$ bits from the time reference of $A_{i}$, i.e. $\tau_{m}=\tau_{0}+m \Delta \tau$ where $-512<m<512$.

The recirculator subsystem stores astronomical data in semiconductor memory and can drive the multipliers with the same data but with a different delay index, $m$. The recirculator's function is accomplished in real time by storing only lower clock rate astronomical signals for higher-speed play-back thru the multipliers. Consider, for example, two memories of equal size: in the time it takes to fill one memory with astronomical data at a $100/N$-MHz rate, $N=2,4,8,16, \cdots, 256$, a previously filled memory can supply $N$ recirculations of data to the multipliers at a 100-MHz clock rate. If the multiplier hardware is capable of producing 32 sets of products of the form given in Equation (1), and if the recirculator steps the value of $m$ (the lag) by 32 each recirculation, $32N$ sets of products can be produced by the 32 sets of multipliers.

In actual practice a two-memory recirculator, as described above, was not used because of the inefficiency of memory usage. Also, if a memory of $B$ bits were used, and if lags of up to 1024 must be produced, only $B-1024$ of the memory's stored bits can be used per recirculation, thus introducing observational inefficiencies which become worse as $B$ gets smaller. Instead a single-memory recirculator was designed with a storage capacity of 10,240 bits per digital input. This single memory is time-shared between read and write operations as will be described next.

The 10,240-bit capacity can produce 9216-bit serial data scans to drive the multipliers and lags of up to 1024 can be generated. Small integrations (9216 bits) were used in the VLA correlator both to reduce the recirculator's memory size requirements and because data at the lowest clock rate, 100/256 MHz or 390 kHz, could fill only a small memory in the approximately 50-msec data valid periods used in the VLA.

Figure 4 gives a block diagram of a recirculator. Digital data from the delay lines interface this card with a 100-MHz clock rate. Every $N^{\text{th}}$ bit, $N=2,4,8,16, \cdots, 256$, of this data is clocked onto the card, lowering the effective sample rate to $100/N$ MHz. This data is broken into 40 parallel paths to accommodate the speed of the RAM's used and written into the RAM. Data storage is a continuous process of writing through the 10,240 bits of memory and starting over at the top of memory and overwriting old data. On the output side two sets of stored data bits are supplied to the multipliers for each digital signal to be correlated. These two outputs are labeled in Figure 4 the $\tau_{0}$ and $\tau_{m}$ outputs and are identical except that the $\tau_{m}$ output is exactly $m$ bits delayed from the $\tau_{0}$ output. The $\tau_{0}$ and $\tau_{m}$ outputs have a 100-MHz clock rate and each consists of scans of 9216 bits of contiguous astronomical data previously stored at the $100/N$ MHz input rate.

The recirculator memory has a basic 400-nsec memory cycle. In each such cycle the RAM's are addressed and read twice and potentially written into once (at the rate of one write each $N$ memory cycles). Values of $m$ of up to 1024 bits must be generated and the small scans of 9216 bits require almost instantaneous generation of this delay to avoid producing observational inefficiencies. Large values of delay (lag) are generated quickly by performing multiple reads of the recirculator's RAM's. Thus by obtaining $\tau_{m}$ data from RAM addresses displaced from those from which $\tau_{0}$ data is read, for a given memory cycle, large values of lag can be generated, to a 40-bit resolution, requiring no lag generation time. Small ECL RAM's and timed parallel-to-serial conversion (as in the delay line final stages) yield the final 1-bit resolution in the $\tau_{m}$ output, obtaining a lag of exactly $m$ bits for any integer $0 \leq m \leq 1024$. Both of these last stages of lag generation occur in the logic that recombines the 40 output lines of a recirculating memory first to 5 and finally to a single output line at a 100-MHz clock rate. Figure 4 defines the three stages of lag generation: the $\Delta T$ RAM address factor, the $\Delta E$ ECL RAM factor and the $\Delta S$ timed strobe factor.

RECIRCULATOR RAM ADDRESS EXAMPLE: write and read sequences for one full set of lags

Figure 5 gives a diagrammatic example of the generation of a set of lag products generated by the recirculator system. In actual usage the multiplier system will, in its maximum resolution spectral line mode, develop for all 351 baselines 16 lead and 16 lag products each recirculation. Each such recirculation sees the multiplier system driven with two sets of serial bits per antenna, all sets being 9216 bits in length. The two sets correspond to the $\tau_{0}$ and $\tau_{m}$ versions for each antenna signal. Sixteen sets of 729 products of the form;

$$
A_{i}\left(\tau_{0}\right) * A_{j}\left(\tau_{m+k}\right)
$$

are developed, where for antennas $A_{i}$ and $A_{j}$, $m$ is the recirculator's generated lag and $k$ takes all integer values $0,1,2, \cdots, 15$. Note that lag products are:

$$
A_{i}\left(\tau_{0}\right) * A_{j}\left(\tau_{m}\right) \text { for } i<j
$$

and that lead products are

$$
A_{i}\left(\tau_{0}\right) * A_{j}\left(\tau_{m}\right) \text { for } i>j
$$

The example of Figure 5 assumes a sample rate of 100/8 MHz. For a 10,240-bit recirculator in a $256 \times 40$ bit configuration, the RAM addresses extend from 0 to 255 as seen at the top of Figure 5. The first set of 9216 bits read from the RAM will support, in the multiplier system, the generation of all lead and all lag multiplications 0 thru 15. The second set of serial bits, with a recirculator card generated lag of 16, will support generation of the 16 thru 31 lead and lag multiplications. With the sample factor of 8 in this example, a total of $16 \times 8$ or 128 lead and 128 lag products can thus be generated in eight 9216-bit read sequences as shown in Figure 5. After a complete set of products is generated the process is repeated. The short "write" arrows of Figure 5 indicate the RAM addresses overwritten each read sequence. The dotted lines connecting terminal points of these arrows to the initial points of subsequent write sequences are an attempt to portray the continuous nature of the write operation. Note that each read sequence that produces the two 9216-bit serial outputs need never cross the write interface, even for lag generations of up to 1024 bits. Thus each 9216-bit serial data output is of contiguous astronomical data previously written into the RAM's.
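The bookkeeping of this example can be sketched as follows: each read sequence covers 16 consecutive lags, the recirculator steps its generated lag by 16, and N read sequences play back in the time the input needs to deliver one scan's worth of new data. The function names are illustrative only.

```python
SCAN_BITS = 9216        # bits per serial read sequence
CLOCK_NS = 10           # one bit period at 100 MHz
LAGS_PER_SCAN = 16      # lags (and leads) formed per read sequence

def read_sequences(n):
    """Recirculator lag and the lag indices covered in each of n read sequences."""
    sequences = []
    for r in range(n):
        start = r * LAGS_PER_SCAN
        sequences.append((start, list(range(start, start + LAGS_PER_SCAN))))
    return sequences

def timing_ns(n):
    """(time to write one scan of new data, time to play back n scans)."""
    write_time = SCAN_BITS * n * CLOCK_NS   # input arrives at 100/n MHz
    read_time = n * SCAN_BITS * CLOCK_NS    # output runs at 100 MHz
    return write_time, read_time

figure5 = read_sequences(8)
lags_covered = [lag for _, lags in figure5 for lag in lags]
```

For the sample factor of 8 this gives eight read sequences with recirculator lags 0, 16, ..., 112, covering lag indices 0 thru 127.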

A recirculator serves no function in continuum modes or in spectral line modes where the sample factor $N = 1$.

### 6.0 THE MULTIPLIER SUBSYSTEM

The multiplier system is built using two integrated circuits of custom design. These two custom IC's include a dual ECL correlator circuit and a 12-bit integrator-shift register IC designed using low-power Schottky technology. Figure 6 illustrates the application of these two circuits in implementing a 2-bit by 2-bit digital multiplier with a 14-bit integrator. The multiplication table of Figure 6 is implemented in the logic of the VLA-1 correlator IC which also contains 2 bits of the overall 14-bit integrator. Note that this table is a standard multiplication table expected for the combination of multiplying two 2-bit, 3-level, digital signals with the assigned +1, 0, -1 weights except that a count of one has been added to each entry in the table. This count offset eliminates the need for reversible integration counters and can be corrected for in the final result by subtracting the number of samples integrated from each integration result.

VLA-1 CORRELATOR MULTIPLICATION TABLE
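The offset multiplication scheme can be illustrated with a short sketch. The table entries are computed as the ordinary product plus one, as the text describes; the function names and the sample sequences are made up for the illustration.

```python
def offset_product(a, b):
    """3-level product with a count of one added, so the result is 0, 1 or 2."""
    return a * b + 1

def integrate(xs, ys):
    """Up-only accumulation of offset products, as a non-reversible counter would do."""
    count = 0
    for a, b in zip(xs, ys):
        count += offset_product(a, b)
    return count

def remove_offset(count, n_samples):
    """Correct the final result by subtracting the number of samples integrated."""
    return count - n_samples

# Arbitrary illustrative 3-level sequences.
xs = [1, 0, -1, 1, 1, -1, 0, -1]
ys = [1, -1, -1, 0, -1, 1, 0, -1]

raw = integrate(xs, ys)
corrected = remove_offset(raw, len(xs))
```

Because every table entry is non-negative, the counter only ever counts up, and the corrected result equals the sum of the true signed products.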

The VLA-2 integrator contains a 12-bit ripple-thru counter and a 12-bit pipeline-shift register stage for result readout. The two LSB's of the total 14 -bit integration result are discarded and only the 12 more significant bits retained. The shift register allows readout of previous integration results while current integrations are being performed.

This custom IC approach was a large factor in making the VLA correlator system feasible, reducing IC count in the multiplier system by about 55,000, reducing the number of $8.5 \times 11$ inch multilayer PC cards from 864 to 156, reducing the number of 100-MHz ECL interfaces by a factor of over 3 (to 2160 ECL interfaces) and reducing the power consumption by about 8 kW. Total system cost was also reduced by about \$100,000 by use of the custom IC's instead of implementing the multiplier-integrator function with discrete IC's. Both custom IC's were supplied by Silicon Systems, Inc. of Tustin, CA.

In concept, the VLA multiplier system is very simple, being matrices of the multiplier-integrator circuits described above. In all, 16 sets of $27 \times 27$ multiplier-integrator circuits produce all of the products required for either continuum or spectral line observations. Each matrix will produce two of the sets of 351 multiplier products already described. Figure 7 illustrates examples of a $27 \times 27$ matrix of multipliers producing sets of 351 baseline results for both continuum and spectral line observations. Note from this figure that circuits along the diagonal of the matrix produce auto-correlation products. These auto-correlation products are required as normalization factors in the mathematical process of producing maps from the multiplier results.
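The product counts follow from the layout of one 27 x 27 matrix: two triangles of 351 baseline products plus 27 diagonal auto-correlations. The sketch below labels the triangles "lag" and "lead" after the spectral line convention given earlier (i < j and i > j); the function name `classify` is illustrative.

```python
N_ANTENNAS = 27
N_MATRICES = 16

def classify(i, j):
    """Kind of product formed at row i, column j of one multiplier matrix."""
    if i == j:
        return "auto"
    return "lag" if i < j else "lead"

counts = {"auto": 0, "lag": 0, "lead": 0}
for i in range(N_ANTENNAS):
    for j in range(N_ANTENNAS):
        counts[classify(i, j)] += 1

products_per_matrix = sum(counts.values())
total_products = N_MATRICES * products_per_matrix
```

Each triangle holds 27 x 26 / 2 = 351 products, one per baseline, so a matrix supplies two sets of 351 plus the 27 auto-correlations, and the 16 matrices together supply 11,664 products.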

The multiplier system, and indeed the entire correlator, was designed to be versatile enough to support a wide variety of observations under the general continuum and spectral line categories. In spectral line mode the system can use all of its resources to process a single set of antenna outputs to maximize the resolution of the spectra developed, or can process two or all four sets of antenna outputs simultaneously, producing correspondingly lower spectral resolution. Observations of polarized astronomical sources can be supported by developing integrations of parallel- and cross-products from two sets of orthogonally polarized antenna outputs in either continuum or spectral line. In addition, the 27 antennas of the VLA may be broken into 2 or more subarrays that can then simultaneously observe separate objects in the same or different observational modes.

EXAMPLES OF MULTIPLIER MATRICES

Figure 7

In spectral line observations the sampling parameter, $N$, where the effective sample frequency is $100 / \mathrm{N} \mathrm{MHz}$, can take on values of N $=1,2,4,8,16,32,64,128$, or 256 which, in conjunction with selectable bandwidth filters in front of the sampler modules, allows the astronomer to select appropriate bandwidth-spectral resolution trade-offs for a given observational program.

The driver section of Figure 1 has digital multiplexers for mode selection and provides the complex digital fan-out of $100-\mathrm{MHz}$ clock rate ECL signals into the multiplier system.

In continuum operation, a total of 11,664 12-bit integration products are developed each $92.16 \mu \mathrm{sec}$ in the hardware formed by 27,000 IC's of the driver and correlator subsystems; thus there are 2.3 IC's/product as a figure-of-merit parameter for circuit economy. In spectral line operation, up to 373,248 products can result from 39,500 IC's of the recirculator, driver, and correlator subsystems; this gives 0.11 IC's/product.
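The two figures of merit quoted above follow directly from the stated IC and product counts, as this short check shows. The function name is illustrative.

```python
def ics_per_product(ic_count, product_count):
    """Figure of merit for circuit economy: IC's needed per product formed."""
    return ic_count / product_count

CONTINUUM_PRODUCTS = 11_664
SPECTRAL_PRODUCTS = 373_248

continuum = ics_per_product(27_000, CONTINUUM_PRODUCTS)
spectral_line = ics_per_product(39_500, SPECTRAL_PRODUCTS)

# The spectral line product count is 32 recirculated sets of the continuum count.
recirculation_factor = SPECTRAL_PRODUCTS // CONTINUUM_PRODUCTS
```

The roughly twentyfold improvement in spectral line mode comes from reusing the same multipliers over 32 recirculations at the cost of the added recirculator IC's.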

### 7.0 THE INTEGRATOR SUBSYSTEM

The integrator subsystem accepts the 12-bit $92.16$-$\mu \mathrm{sec}$ integration products of the multipliers and sums corresponding results for periods of up to 10 seconds. Each $92.16 \mu \mathrm{sec}$, 11,664 products are developed by the multipliers in spectral line observations. The recirculation capacity of the system can produce up to 32 consecutive sets of these 11,664 results that must be summed in independent storage locations. Thus the integrator system must allow storage/integration space for up to 373,248 independent products.

This large storage/integration capacity is built using high-speed $4096 \times 1$ RAM's. Twenty-five such RAM's provide memory space for 32 sets of 108 products that can integrate to a 25-bit word length (only $108 \times 32$ or 3456 of the 4096-bit storage capacity of each RAM is used). RAM's are used as integrators by addressing a given partial sum, adding a new 12-bit incoming result to it and storing the new partial sum back into the RAM. Figure 8 illustrates this process.

RAMS USED AS INTEGRATORS

Figure 8
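The read, add and write-back cycle can be modeled with a small class. This is a sketch under stated assumptions: the address layout (set index times 108 plus product index) and the wrap-around at 25 bits are illustrative choices, and the class and method names are hypothetical.

```python
WORD_BITS = 25          # one bit of each word held in each of the 25 RAM's
RAM_DEPTH = 4096        # locations per 4096 x 1 RAM
PRODUCTS_PER_SET = 108
SETS = 32

class RamIntegrator:
    """Software model of a bank of 25 RAM's used as 4096 accumulators."""

    def __init__(self):
        self.words = [0] * RAM_DEPTH

    @staticmethod
    def address(set_index, product_index):
        # Assumed layout: consecutive products within a set, sets back to back.
        if not (0 <= set_index < SETS and 0 <= product_index < PRODUCTS_PER_SET):
            raise ValueError("index out of range")
        return set_index * PRODUCTS_PER_SET + product_index

    def accumulate(self, set_index, product_index, result12):
        """Read the partial sum, add a 12-bit result, store the new sum."""
        if not 0 <= result12 < (1 << 12):
            raise ValueError("incoming result must fit in 12 bits")
        addr = self.address(set_index, product_index)
        partial = self.words[addr]                       # read
        new_sum = (partial + result12) & ((1 << WORD_BITS) - 1)  # add
        self.words[addr] = new_sum                       # write back
        return new_sum

integ = RamIntegrator()
integ.accumulate(3, 7, 2304)
total = integ.accumulate(3, 7, 2310)
used_locations = SETS * PRODUCTS_PER_SET
```

Only 3456 of the 4096 addresses are ever reached with this layout, matching the utilization noted in the text.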

The integrator memory system is pipelined by providing a second storage memory of identical structure into which final integration results may be quickly stored. This storage memory is then available to the FFT processor for most of a subsequent integration period and its results can be accessed and processed by the FFT processor while the integration memory devotes its full energy to the new integration.

The total integration period is software selectable in $52.083-\mathrm{msec}$ steps to 10 seconds.

The integrator subsystem provides up to 373,248 integration products in only 11,300 IC's or 0.03 IC's/product.

### 8.0 SELF-TEST

Since the VLA correlator system is such a large system (85,000 IC's), much thought was given to making it self-testing and to some extent self-healing. Two separate schemes of self-test were provided. The delay, recirculator, driver, and multiplier subsystems are tested by using small intervals of time, which occur every 52.083 msec, during which astronomical data are not available, to inject precisely known pseudo-random digital signals into all delay line inputs, and to perform a $92.16$-$\mu \mathrm{sec}$ test integration on this data. The 11,664 12-bit results of this test integration can then be compared against predicted results and error conditions flagged. Only 729 of the 11,664 results are tested each 52.083 msec and therefore 16 such cycles are required for a full system test. In addition a small subset of 4 delay program values and 4 lag generation values, of the 16,384 and 1024 respectively possible, are used to program the system during these test cycles, thus bringing to 64 the number of 52.083-msec cycles required for a complete self-test.
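The cycle count for a complete self-test can be reproduced with simple arithmetic. The sketch assumes that the 4 delay program values and 4 lag generation values are exercised together as 4 program settings, which is what the stated total of 64 cycles implies; the constant names are illustrative.

```python
TOTAL_RESULTS = 11_664      # test integration results per full set
RESULTS_PER_CYCLE = 729     # results checked in each data-invalid period
CYCLE_MS = 52.083           # spacing of data-invalid periods
PROGRAM_SETTINGS = 4        # assumed: delay and lag test values used in pairs

cycles_per_pass = TOTAL_RESULTS // RESULTS_PER_CYCLE
total_cycles = cycles_per_pass * PROGRAM_SETTINGS
total_seconds = total_cycles * CYCLE_MS / 1000.0
```

On these numbers a complete self-test pass takes a little over three seconds of elapsed time while using none of the observing time.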

Four undedicated delay-recirculator paths exist thru the system and if the self-test results yield a pattern of errors that indicate a defective delay line or recirculator card, a self-healing action will be attempted in which one of these undedicated paths will assume the defective path's function. Substitution of this new path for the defective path will ensure valid data reaching the multipliers.

The integrator system also has 4 undedicated paths thru the integrating and storage memory. This undedicated hardware is made to sequentially parallel the dedicated paths such that a given dedicated path is paralleled with a one-in-27 duty cycle. Each path is tested for proper operation by comparing the results of the paralleled path with those of the test path. Again malfunctions are corrected by substitution of the test path for a defective dedicated path.

Both types of self-test and self-healing described above are performed on an automatic and noninterfering basis. On the basis of IC count, about 95% of the electronics is automatically self-tested and about 65% of the system is covered by one of the self-heal mechanisms.

ACKNOWLEDGMENTS
The authors wish to acknowledge the following persons at NRAO who contributed to the conceptual and actual design of the VLA correlator system. Dr. S. Weinreb, Dr. B. Clark and A. Shalloway laid the theoretical groundwork for the system and contributed to its development. A. Shalloway and Dr. G. Patton had design responsibility for the integrator subsystem and R. Mauzy designed the sampler module.

## REFERENCES

[1] D.S. Heeschen, "The Very Large Array", Sky and Telescope, 49, pp334-351, 1975.
[2] S. Weinreb, R. Predmore, M. Ogai, and A. Parrish, "Waveguide System for a Very Large Antenna Array", Microwave J., 20, No.3, pp49-52, 1977.
[3] J. Archer, E. Caloccia, and R. Serna, "An Evaluation of the Performance of the VLA Circular Waveguide System", IEEE Trans., Microwave Theory Tech., MTT-28, pp786-791, 1980.
[4] S. Weinreb, MIT Research Laboratory of Electronics, Report No. 412, p119.
[5] B. Cooper, "Correlators With Two-Bit Quantization", Aust. J. of Physics, 23, pp521-527.
[6] H. Harmuth, "Applications of Walsh Functions in Communications", IEEE Spectrum, 6, pp82-91, 1969.

