## CALIFORNIA INSTITUTE OF TECHNOLOGY VLBA Correlator Project

## VLSI TECHNOLOGY REPORT M. S. Ewing December 28, 1984

### I. INTRODUCTION.

The VLBA correlator must provide approximately 50,000 "lags", or 100,000 physical multiplier-accumulator units. Such a large system can be achieved most economically through use of modern Very Large Scale Integrated circuits (VLSI). Since our chip will be rather complex and specialized, we would expect to develop an Application Specific Integrated Circuit (ASIC). ASICs are becoming widely employed in medium-to-large scale projects, such as the VLBA.

The VLBA specifications which most directly affect the VLSI design are listed in the following table.

### PARTIAL VLBA CORRELATOR SPECIFICATION

| Parameter                | Specification                                                                                   |
|--------------------------|-------------------------------------------------------------------------------------------------|
| Sample Quantization:     | 2- or 3-level, selectable                                                                       |
| Sample Rate:             | 16 Msamples/sec                                                                                 |
| Oversampling Option:     | none, or 2X oversampling, selectable                                                            |
| Lobe Rotation:           | 3-level sin/cos, 3-level complex output                                                         |
| Vernier Delay:           | -1, 0, or +1 bit extra delay                                                                    |
| Input Multiplexing:      | 4-way (to select 10-, 14-, or 19/20-station modes, etc.)                                        |
| Data Quality Control:    | Invalid data to be flagged and not correlated                                                   |
| Max. required dump rate: | <10 Hz (not yet determined). VLSI chips may dump faster if intermediate processor is provided. |

This report summarizes a number of alternate VLSI approaches and discusses our choice of a specific technology and management plan to provide VLSI chips for the VLBA.

#### II. SUMMARY OF CONCLUSIONS.

We have decided to develop a gate array chip for the VLBA correlator project. This chip will be based on 3 micron CMOS technology, which is a "standard" process that does not press the state of the art.

The chip will implement 3x3 and 2x2 level multiplication with internal lobe rotation. Accumulators may be provided offchip by a chip now being developed at JPL, or they may be implemented on-chip with the multipliers. A fringe-rate generator chip also being developed at JPL would be used in either case.
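As an illustration of the multiplication and data-validity handling described above, the following Python sketch models a single lag of a 3x3 level correlator. The 2-bit sample encoding, the names `DECODE`, `INVALID`, and `correlate_lag`, and the separate count of valid products are assumptions made for this example only; the report itself states just that invalid data is to be flagged (as the 4th state of the 2-bit data) and not correlated.

```python
# Hypothetical 2-bit encoding of 3-level samples; the report does not
# specify the actual bit assignment.
DECODE = {0b00: 0, 0b01: +1, 0b10: -1}
INVALID = 0b11  # 4th state of the 2-bit data marks an invalid sample


def correlate_lag(xs, ys):
    """Accumulate 3x3 level products for one lag, skipping invalid samples.

    xs, ys: sequences of 2-bit codes, already aligned for this lag.
    Returns (sum of products, number of valid products). The valid count
    is an assumption: it is what a downstream processor would need in
    order to normalize the accumulated sum.
    """
    total = 0
    valid = 0
    for cx, cy in zip(xs, ys):
        if cx == INVALID or cy == INVALID:
            continue  # flagged data is not correlated
        total += DECODE[cx] * DECODE[cy]  # product is again -1, 0, or +1
        valid += 1
    return total, valid


if __name__ == "__main__":
    x = [0b01, 0b10, 0b11, 0b00, 0b01]
    y = [0b01, 0b01, 0b01, 0b10, 0b10]
    print(correlate_lag(x, y))
```

Because the product of two 3-level values is itself a 3-level value, the same multiplier structure can in principle follow a 3-level lobe rotator without widening the data path.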

VLBA schedules require us to begin the gate array design process soon. However, the semicustom chip being developed for the Australia Telescope (AT) is to be delivered in prototype quantities during the coming month. This chip may be useable for the VLBA if it operates at a sufficiently high clock rate. If the AT chip offers adequate performance for the VLBA with some overall cost savings, we would be prepared to halt the gate array development.

#### III. GENERAL OPTIONS.

In our VLSI evaluation, we have considered four general methods of providing the chips needed by the VLBA correlator. There are 3 major ASIC approaches available in the industry: gate array, standard cell, and full custom. In addition, of course, there is the possibility that some other group may have developed an ASIC that might fulfill our requirements. We have found potential candidates in each of these categories, excepting full custom.

<u>III.1</u> <u>Gate Array</u>. The most widely accepted ASIC development method for small-to-moderate quantities (<10,000 ICs, approximately) is the gate array. A gate array vendor produces a large number of silicon wafers containing a standard pattern of basic transistor functions. The customer provides his requirements which the vendor uses to add the final one or two layers of metalization to hook up the gates. The resulting wafers are diced, packaged, tested, and finally delivered to the customer.

The customer's input may be at one of several levels. At the most basic, he may provide specifications and a block diagram and rely on the vendor to do all detailed circuit design, layout, and simulation. Alternatively, the customer may do a preliminary design, "schematic capture" (using in-house CAD systems), and rough timing or simulation analysis. The vendor accepts the design, lays out the chip, and performs detailed simulation before manufacturing. There is also the less-often practiced method of the customer doing his own layout and detailed simulation, bypassing much of the vendor's engineering services.

In general, the vendor's price increases with the degree of his participation in the design. At the same time the customer loses more of his flexibility to modify his design the more he relies on outside engineering. If a customer plans to make a number of gate array designs (perhaps as few as several per year), he may find it attractive to purchase CAD equipment (\$50-100K) which will allow him to design and partially simulate his chips in-house.

Commercial gate arrays are available from many vendors covering a wide range of performance. At the high end, 2 micron CMOS chips are available with up to 11,000 gates and clock rates of over 50 MHz. More standard (less expensive) 3 micron CMOS technology offers chips in various die sizes up to about 6,000 gates with clock rates of 20-30 MHz. ("Clock rate" is ill-defined. The more relevant number is gate delay; a higher clock rate can often be achieved by using more gates in a pipelined mode.)

<u>III.2</u> <u>Semicustom</u>. In principle it should be possible to improve a gate array design by rearranging the transistors into a more optimal configuration for the specific subfunctions required, e.g., adders or flip-flops. Some of the capacity of the silicon is wasted by forcing designs onto the standard grid of a gate array.

In the semicustom, "standard cell" approach to ASIC development, the designer implements his circuit by assembling cells from a library of standard functions (gates, multiplexers, registers, etc.) onto blank silicon. Individual functions are preoptimized and characterized much like standard small- or medium-scale ICs. As with gate arrays, the standard-cell customer can enter the design process at various levels.

Semicustom designs will be somewhat more expensive and slower to develop compared with gate arrays, since wafers cannot be pre-manufactured. Some industrial surveys state that standard cell designs are cost-effective in the moderate quantity range, perhaps 10,000 - 50,000 units.

In practice, it appears that the standard-cell "industry" is less well developed than the gate array industry, for projects of our size. Advanced gate arrays seem to offer speed superior to available standard cell chips, and densities that are at least comparable.

<u>III.3</u> <u>Full Custom</u>. Mass produced VLSI chips (e.g., microprocessors) have usually been designed at the transistor level. The greatest performance and density are obviously available with this method. This is clearly advantageous in large quantities, perhaps >100,000 units. Increasingly, this method is becoming available to smaller projects as the "silicon compilation" technique is developed.

#### IV. SPECIFIC OPTIONS.

We have examined each of the potential methods of acquiring a correlator ASIC, and have discussed the possibility of developing a chip on our own or associating our project with several ongoing efforts at JPL. In addition, we are evaluating an ASIC design that is being developed for the Australia Telescope (AT).

IV.1 Gate Array. The normal mode for customers of gate array firms is to carry on some level of design in-house and then to transfer the main load to the vendor's design and verification tools. The VLBA project, for example, could develop a detailed functional specification and block diagram for the proposed chip. We could then make a logic design using a Futurenet or other small-scale CAD system and a module library supplied by the gate array vendor. We could perform some timing checks based on worst-case parameters, but without knowledge of the detailed chip layout. The logic design could then be taken to the vendor, who would check it and attempt to lay it out with his automatic route and place programs. He could then simulate the design and generate test vectors, and, after a joint review, could make prototype chips.

JPL Section 335 (Tracking Systems and Applications Section, contact Larry Young) is funded to develop a set of ASICs for a digital receiver for the Global Positioning System (GPS). The chips of special interest to VLBA are (1) a Fringe Phase Generator (FPG) chip, which includes 4 independent 24-bit phase adder circuits operating at 2 MHz effective clock rate, and (2) an Accumulator (ACC) chip, containing sixteen 16-bit accumulators, operating at 16 MHz. The FPG and ACC chips are described in Appendix B of the VLBA correlator bi-monthly report of 12/4/84. These chips are being developed on JPL's Mentor workstations by JPL personnel. Layout, partial simulation, fabrication, and testing will be by the vendor, LSI Logic, Inc., using their 3 micron VLSI process and 6,000 gate chips. FPG prototypes are to be delivered during the Spring of 1985; the other chips are scheduled to be prototyped in May or June.
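To make the role of a 24-bit phase adder concrete, here is a minimal Python sketch of one phase accumulator driving a 3-level sin/cos lobe rotator. It is not the JPL FPG design: the function names, the use of the top 4 phase bits, and the thresholds chosen for the 3-level approximation are all assumptions for illustration.

```python
PHASE_BITS = 24                      # width of the phase adder, per the FPG description
PHASE_MASK = (1 << PHASE_BITS) - 1   # phase wraps modulo 2**24 (one full turn)


def three_level_sin(sector):
    """3-level approximation to sin for a phase sector 0..15 (22.5 deg each).

    Hypothetical thresholds: +1 over 22.5-157.5 deg, -1 over 202.5-337.5 deg,
    and 0 in the two 45-degree windows around the zero crossings.
    """
    if 1 <= sector <= 6:
        return 1
    if 9 <= sector <= 14:
        return -1
    return 0


def fringe_sequence(rate_word, n, phase=0):
    """Return n (cos, sin) pairs, adding rate_word to the phase each step."""
    out = []
    for _ in range(n):
        sector = phase >> (PHASE_BITS - 4)               # top 4 bits of the phase
        cos3 = three_level_sin((sector + 4) % 16)        # cos is sin advanced by 90 deg
        out.append((cos3, three_level_sin(sector)))
        phase = (phase + rate_word) & PHASE_MASK         # 24-bit wraparound add
    return out


if __name__ == "__main__":
    # A rate word of 2**20 advances the phase by one sector per step.
    for pair in fringe_sequence(1 << 20, 16):
        print(pair)
```

The rate word sets the fringe frequency as a fraction of the update clock: a word of `r` completes one turn every `2**24 / r` updates.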

JPL has selected LSI Logic for its work after evaluation of a number of alternative vendors. LSI appears interested in and well-suited for a project of our size. They have extensive experience with their 3 micron CMOS product.

Section 335 has proposed to work together with the VLBA Correlator project to make a detailed design for a VLBA correlator chip (multipliers and shift registers, but no accumulators). They would also make available their FPG and ACC designs for our correlator.

Although division of the correlator function into separate multiplier and accumulator chips seems feasible, it may be more desirable to combine these functions, because we would be fabricating larger numbers of a single chip type, because pin-outs and packaging are less troublesome with the functions combined on one chip, and because power dissipation problems are reduced. We will continue to evaluate these alternative approaches.

IV.2 Semicustom - JPL Design. JPL's Section 331 (Communications Systems Research Section, contact Les Deutsch) has developed an in-house capability for semicustom VLSI design based on public-domain tools running on a VAX UNIX system. They have developed several successful chips using the DARPA-funded MOSIS service. Their tools include simulators that are sufficient to assure good performance prior to fabrication. This is necessary because MOSIS provides only minimal error checking before manufacturing of prototypes.

The design process is capable of full-custom layouts; however, it would normally be used with "macros" which are comparable to, but perhaps not so well characterized as, the standard cell libraries available commercially.

Section 331 has also proposed to develop a correlator chip for the VLBA. Since their process is limited to roughly a 14 MHz clock rate, they must design their chip to run at 1/2 the rate of our clock, i.e., 8 MHz. They must include 4 times the number of multipliers compared with a 16 MHz implementation, but the products can be combined on-chip so that no extra accumulators are required. They would use the MOSIS service for prototyping, and they believe MOSIS will facilitate quantity production. However, the Section has not yet attempted quantity production.

IV.3 Semicustom - AT Chip. The AT project has designed a chip to support their interferometer (cf. Correlator Memo VC025). This is an $8 \times 8$ array of 2-level correlators, 4-bit prescalers, and 16-bit accumulators. It is specified to operate at 12 MHz in 2-level mode or 6 MHz in 4-level mode. Prototypes are to be delivered in mid-January, 1985.

The AT chip has several drawbacks for VLBI use:

1. Its array structure is most appropriate for handling the output of a serial-to-parallel converter which divides down the sample rate of a wide passband. The array structure might be employed in the baseline sense, i.e., 8 stations by 8 stations, but this does not appear to be a natural way to organize the VLBA correlator.

2. The specified clock rate is too low for our 16 MHz channels. The AT group has indications that the chip may run at up to 24 MHz. In this case, the chip is still too slow for 2-bit (4-level) multiplication. It could, however, be combined in a bandwidth doubling mode to achieve 16 MHz at 4 levels.

3. There are no provisions for carrying data validity information through the correlator. External provisions would be required.

4. There is no provision for fringe rotation. One will probably suffer a 7% SNR loss because a 2-level fringe rotation must be used. The normal VLBI 3-level fringe rotation requires at least a 2-level x 3-level multiplication.
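The clock-rate argument in item 2 above can be checked with a few lines of arithmetic. The Python sketch below assumes that the 4-level sample rate is always half the chip clock (inferred from the 12 MHz / 6 MHz specification) and that the bandwidth-doubling mode exactly doubles the effective sample rate; the function names are hypothetical.

```python
REQUIRED_MHZ = 16.0  # VLBA sample rate from the partial specification


def at_chip_rate(clock_mhz, levels):
    """Sample rate in MHz supported by the AT chip at a given clock.

    Assumption: 2-level mode runs at the clock rate and 4-level mode at
    half of it, matching the specified 12 MHz / 6 MHz figures.
    """
    return clock_mhz if levels == 2 else clock_mhz / 2.0


def meets_vlba(clock_mhz, levels, doubled=False):
    """True if the mode supports 16 MHz channels, optionally with doubling."""
    rate = at_chip_rate(clock_mhz, levels)
    if doubled:
        rate *= 2.0  # assumed effect of the bandwidth-doubling mode
    return rate >= REQUIRED_MHZ


if __name__ == "__main__":
    for clock in (12.0, 24.0):
        for levels in (2, 4):
            print(clock, levels, meets_vlba(clock, levels),
                  meets_vlba(clock, levels, doubled=True))
```

At the specified 12 MHz clock neither mode reaches 16 MHz directly. At 24 MHz the 2-level mode does, while the 4-level mode reaches only 12 MHz and needs the doubling arrangement.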

Despite these drawbacks, the AT chip offers certain advantages:

1. The design is (probably) complete. Prototypes will be available quite early.

2. Cost is estimated at \$50 per 64-lag chip. If an operating clock rate of 20 MHz is achieved for the AT chip and if the 7% loss factor can be accepted, the VLBA project would probably realize cost savings by using the chip. The savings could be a few hundred thousand dollars if there are no further design problems uncovered in the architectural design process.

3. Some new flexibility is added since the number of elementary correlators is increased. One can trade off sample resolution (1 or 2 bits) for frequency resolution, for example.

More information on the AT chip is being sought to help us further evaluate its potential for the VLBA. Its operating speed is crucial; this should be determined in January. It is clear that adoption of the AT chip would force substantial changes in the overall correlator architecture.

## V. SELECTION OF VLSI APPROACH.

After evaluation of the alternatives presented above, we have chosen to pursue a new gate array design. We will rely on a combination of JPL and vendor engineering support. In this way, we should minimize the load on VLBA engineering personnel, while taking maximum advantage of JPL's experience and of the vendor's expertise with his own product. We also intend to use the Section 335 FPG chip. We would participate in the development and evaluation of that chip at a low budget level to ensure its suitability for our project.

We realize that, once further information becomes available on the AT correlator chip and we have completed an architectural study of how that chip might be used for the VLBA, the AT chip may prove to offer a cheaper and faster route toward building the VLBA correlator. If this situation comes about, we will consider halting the design of the gate array. However, we must embark on the gate array work immediately to meet our target of June, 1985 for prototype delivery.

## VI. TECHNICAL GOAL.

We believe the following specifications can be achieved with the VLSI approach we have chosen. They constitute our design target. The functions indicated may be implemented on two separate chips (multiplier and ACC) or on a single correlator chip.

## VLBA DESIGN TARGET FOR VLSI CHIP

| Process:               | 3 micron CMOS                      |
|------------------------|------------------------------------|
| Min. clock rate:       | 20 MHz                             |
| Lags:                  | 8, "complex"                       |
| Prescaler:             | 2 bits (part of multiplier)        |
| Counter:               | 16 bits (possibly fewer)           |
| Registers for readout: | none (correlation must be blanked) |
| Quantization:          | 2- or 3-level, selectable          |
| Oversampling:          | none, or 2X, selectable            |
| Clocks:                | Independent for shift and multiply |
| Package:               | 68 pin LCC                         |
| Power:                 | < 1 watt                           |
| Lobe Rotator:          | sin/cos and multipliers on chip    |
| Vernier delay:         | -1, 0, +1 on chip                  |
| Input Multiplexor:     | 4-way                              |
| Invalid Data Flag:     | 4th state of 2-bit data.           |
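Combining this design target with the system size quoted in the Introduction gives a rough chip count. The short Python calculation below is only a back-of-envelope sketch: it assumes every chip contributes all 8 complex lags, ignores spares and yield, and treats "approximately 50,000" as exact.

```python
import math

TOTAL_LAGS = 50_000     # "approximately 50,000 lags" (Introduction)
TOTAL_MACS = 100_000    # physical multiplier-accumulator units (Introduction)
LAGS_PER_CHIP = 8       # "8, complex" lags per chip (design target)

# Each complex lag corresponds to two physical multiplier-accumulators.
macs_per_lag = TOTAL_MACS // TOTAL_LAGS
macs_per_chip = macs_per_lag * LAGS_PER_CHIP

# Assumption: all lags on every chip are usable; no spares are included.
chips = math.ceil(TOTAL_LAGS / LAGS_PER_CHIP)

if __name__ == "__main__":
    print("multiplier-accumulators per chip:", macs_per_chip)
    print("chips required:", chips)
```

Under these assumptions the correlator would need on the order of 6,250 chips, each holding 16 multiplier-accumulators.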

## VII. PRELIMINARY DEVELOPMENT PLAN.

VII.1 Vendor Relationship. We intend to negotiate with LSI Logic, Inc., for the development of the correlator chip. In this we will be following JPL's market investigation and we will make maximum use of JPL's experience in working with this firm. LSI Logic's arrays are multi-sourced, adding confidence to our ability to produce the chip.

If these negotiations should fail for any reason, there are alternative suppliers with whom we could design a gate array comparable to our target. These include NEC and RCA.

VII.2 Estimated Schedule and Budget. We have estimated the required manpower levels, costs, and schedule to develop the chip. These are necessarily preliminary in the absence of a firm quote, but they are based on JPL's experience with LSI Logic.

# PRELIMINARY VLSI WORK PLAN

| Milestones              | Work Description                        | Who?          | CIT<br>MWK | Cost<br>\$K   | Elapsed<br>Weeks |
|-------------------------|-----------------------------------------|---------------|------------|---------------|------------------|
| Engineering<br>Purchase | Preliminary Design                      | CIT           | 8          |               | 4                |
| Agreement               | Final Design                            | Both          | 1          |               | 1                |
| Design                  | Schematic Capture<br>Test Patterns      | CIT           | 8          |               | 10               |
| Acceptance<br>Checklist | Logic Simulation                        | Both          | 1          |               | 1                |
| Engineering             | Timing Simulation<br>Test Pattern Check | Vendor        | 4          | 40            | 6                |
| Completion<br>Report    | Place and Route                         | Both          | 1          |               | 1                |
|                         | Final Timing Check<br>One Iteration     | Vendor        | 4          | 65            | 6                |
| Performance<br>Approval | Tooling<br>Fabricate                    | Both          | 1          |               | 1                |
|                         | Test Prototypes<br>Tester and Tests     | Vendor<br>CIT | 2<br>16    | 20            | 8<br>4           |
| Prototype<br>Approval   |                                         | Both          | 1          |               | 1                |
| PROTOTYPE TOTAL         |                                         |               | 47         | 125           | 43               |
| Production<br>Purchase  |                                         |               |            |               |                  |
| Agreement*              | Fabrication                             | Both          | 1          |               | 1                |
|                         | Testing<br>Testing                      | Vendor<br>CIT | 2<br>4     | 500           | 12<br>4          |
| Production<br>Approval  |                                         | Both          | 1          |               | 1                |
| PRODUCTION TOTAL        |                                         |               | 8          | 500           | 18               |

\* Prototype and production contracts can be combined or separate. If separate, any interval of time between them is possible.
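The totals in the work plan can be cross-checked by summing the individual rows. The Python sketch below transcribes the table entries as (CIT man-weeks, cost in \$K, elapsed weeks); reading the MWK column as man-weeks and adding the elapsed weeks as if every step ran strictly in sequence are assumptions, although the sums do reproduce the printed totals.

```python
# (CIT man-weeks, cost in $K, elapsed weeks), in table order.
PROTOTYPE_ROWS = [
    (8, 0, 4), (1, 0, 1), (8, 0, 10), (1, 0, 1),
    (4, 40, 6), (1, 0, 1), (4, 65, 6), (1, 0, 1),
    (2, 20, 8), (16, 0, 4), (1, 0, 1),
]
PRODUCTION_ROWS = [
    (1, 0, 1), (2, 500, 12), (4, 0, 4), (1, 0, 1),
]


def totals(rows):
    """Column-wise sums: (man-weeks, cost in $K, elapsed weeks)."""
    return tuple(sum(col) for col in zip(*rows))


if __name__ == "__main__":
    print("prototype total: ", totals(PROTOTYPE_ROWS))
    print("production total:", totals(PRODUCTION_ROWS))
```

Both phases agree with the table: 47 man-weeks, \$125K, and 43 weeks for the prototype phase, and 8 man-weeks, \$500K, and 18 weeks for production.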