# Wire transfer of charge packets for on-chip CCD signal processing

Eric R. Fossum

Department of Electrical Engineering, Columbia University, New York, New York 10027

#### ABSTRACT

A structure for the virtual transfer of charge packets across metal wires is described theoretically and is experimentally verified. The structure is a hybrid of charge-coupled device (CCD) and bucket-brigade device (BBD) elements and permits the topological crossing of charge-domain signals in low-power signal processing circuits. A test vehicle consisting of 8-, 32- and 96-stage delay lines of various geometries implemented in a double-poly, double-metal foundry process was used to characterize the wire-transfer operation. Transfer efficiency ranging between 0.998 and 0.999 was obtained for surface n-channel devices with clock cycle times in the range from 40 nsec to 0.3 msec. Transfer efficiency as high as 0.9999 was obtained for buried n-channel devices. Good agreement is found between experiment and simulation.

## **1. INTRODUCTION**

Recently there has been an emerging interest in placing signal processing circuitry on the same chip as the imager to reduce noise and relieve the burden on downstream digital electronics.<sup>1,2</sup> In most image processing applications, the accuracy required by the algorithm does not exceed 6-8 bits so that analog representation and computation of the signal quantities is adequate. Charge-domain circuits, such as CCDs, have the added advantages of compatibility with the imager output, low power, and low real-estate consumption.<sup>3</sup>

A major disadvantage of CCD circuits is that it is difficult to design circuits in which signal paths physically cross - an important consideration in circuit design. Bucket-brigade devices (BBDs) also operate in the charge domain but have generally lower performance than CCDs.<sup>4</sup> This is primarily due to their incomplete carrier transport mode of operation and the high capacitance associated with each storage node. However, BBDs store the signal charge as majority carrier charge in a heavily doped region. It is therefore possible to envision connection of these heavily doped regions by metallic wires and thereby achieve topological crossing of signal paths.

In this paper, a hybrid structure combining the features of the BBD and the CCD is described and characterized. Termed a wire-transfer structure, its main purpose is to permit rapid, virtual transfer of charge across metallic wires. Although the charge transfer efficiency (CTE) of the wire-transfer structure is less than that of the CCD, it is sufficient to allow the realization of focal-plane image processing circuits in the charge domain.

## **2. WIRE TRANSFER STRUCTURE**

A schematic illustration of the wire transfer process is shown below in Fig. 1. Charge initially confined in a CCD potential well is to be virtually transferred to a receiving well via a metallic wire. A floating diffusion is used in the structure and is designed to have the lowest capacitance possible. The voltage on the floating diffusion is controlled by a barrier gate such that when charge is transferred into the floating diffusion node an equal amount is spilled out the other end, keeping the voltage constant. Performance is enhanced if charge is transferred at a constant rate (i.e. as a current source) by ramping the voltage on the output transfer gate.



Fig. 1. Schematic illustration of wire transfer process.

This simple picture suggests that the CTE can be close to perfect. However, there are three major sources of non-ideal behavior. First, Si-SiO<sub>2</sub> interface traps in the vicinity of the floating diffusion can result in transfer inefficiency. In fact, in the surface-channel devices investigated experimentally, this appears to dominate the observed non-ideal behavior. Second, if an empty bucket (i.e. no charge) is transferred into the node, charge still transfers out of the floating diffusion by a subthreshold conduction mechanism, creating a deficit of charge which, in turn, is collected from a succeeding charge packet. This leads to a signal-dependent "fixed-loss" transfer inefficiency which can be corrected only through the use of a small fat-zero signal. Third, immediately following the transfer of the packet into the node, the floating diffusion is slightly forward-biased relative to the surface potential induced by the CCD barrier gate. The discharge of the node to a "flat" surface potential is a noisy process leading to kTC noise in the transferred packet.

The transfer inefficiency due to surface traps can be written as simply a proportional-loss constant with a frequency dependence characteristic of the trap energy. Experimentally, no strong frequency dependence was observed.

If m empty charge packets follow a non-empty bucket, the loss due to subthreshold leakage can be written as:

$$Q_{loss} = kTC_n/q \cdot \ln[m]$$

where $C_n$ is the floating diffusion node capacitance. The quantity $Q_{loss}$ is extracted from the next non-zero charge packet passing through the node. This loss can be avoided if a fat-zero charge packet is used. The minimum fat-zero packet which will prevent subthreshold loss is given by:

$$Q_{fz} = kTC_n/q \cdot \ln[qI_{cs}T_c/kTC_n]$$

where $I_{cs}$ is the magnitude of the current source injected into the node due to the ramping of the output transfer gate, and $T_c$ is the clock period during which transfer takes place. It should be noted that the fat-zero charge depends only weakly on these quantities and, in general, is quite small.

The kTC noise introduced by the diffusion process in the discharge part of the transfer is also small if the node capacitance is kept small, and can be written as:

$$\langle q_n \rangle \approx (qQ_{fz})^{1/2}$$

For the experimental devices described below, Table 1 lists the approximate magnitude of each of these quantities.

| Parameter                                        | Symbol               | Value                |
|--------------------------------------------------|----------------------|----------------------|
| Node capacitance                                 | C<sub>n</sub>        | 50 fF                |
| Clock period                                     | T<sub>c</sub>        | 40 nsec              |
| Current source (Q<sub>max</sub>/T<sub>c</sub>)   | I<sub>cs</sub>       | 40 μA                |
| Maximum signal size                              | Q<sub>max</sub>      | 10,000,000 electrons |
| Loss (m=1)                                       | Q<sub>loss</sub>     | 8,000 electrons      |
| Fat zero                                         | Q<sub>fz</sub>       | 56,000 electrons     |
| Noise                                            | &lt;q<sub>n</sub>&gt; | 250 electrons        |

Table 1. Quantities of Interest in Experimental Devices
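As a rough numerical check, the Section 2 expressions can be evaluated with the Table 1 inputs. The Python sketch below is illustrative only: the function name `table1_estimates` is hypothetical, and a temperature of 300 K is assumed because the paper does not state one.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def table1_estimates(c_n=50e-15, t_c=40e-9, q_max_electrons=1.0e7, temp_k=300.0):
    """Evaluate the Section 2 expressions; charges are returned in electrons.

    temp_k = 300 K is an assumption; no temperature is given in the paper.
    """
    thermal_charge = K_B * temp_k * c_n / Q_E   # kT*C_n/q, in coulombs
    i_cs = q_max_electrons * Q_E / t_c          # current source Q_max/T_c, in amperes
    ratio = i_cs * t_c / thermal_charge         # q*I_cs*T_c / (kT*C_n), dimensionless
    q_fz = thermal_charge * math.log(ratio)     # minimum fat zero, in coulombs
    noise = math.sqrt(Q_E * q_fz)               # (q*Q_fz)^(1/2), in coulombs
    return {
        "i_cs_uA": i_cs * 1e6,
        "thermal_charge_e": thermal_charge / Q_E,
        "q_fz_e": q_fz / Q_E,
        "noise_e": noise / Q_E,
    }


if __name__ == "__main__":
    for name, value in table1_estimates().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions the current source comes out near 40 μA, the prefactor $kTC_n/q$ near 8,000 electrons (the magnitude listed for the loss entry), the fat zero within a few percent of 56,000 electrons, and the noise within a few percent of 250 electrons.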

## **3. EXPERIMENTAL RESULTS**

#### 3.1 Test vehicle design

To experimentally demonstrate the wire transfer concept and to explore the effect of various design parameters, a test vehicle chip was designed, fabricated and tested. The test vehicle consists of eight wire transfer shift registers configured as shown below in Fig. 2. A shift register, as shown, would not likely be used in an actual signal processing circuit due to the preponderance of wire transfer stages, but does allow expedient exploration of wire transfer performance.

A double-poly, double-metal, n-channel surface technology was used with a transfer electrode length of 3.5  $\mu$ m. The devices were fabricated on a p/p+ substrate (with the p-layer nominally 10  $\Omega$ -cm) in a commercial foundry process. All shift registers have a fill-and-spill input stage at the front end and terminate with a two-stage source-follower output amplifier with on-chip sample-and-hold. The amplifier drives a 1 M $\Omega$  - 22 pF oscilloscope directly.

The nominal shift register has 32 stages (each as shown in Fig. 2) with a B1 gate length of 3.0 $\mu$m and a B2 gate length of 2.5 $\mu$m. An 8-stage and a 96-stage shift register with the same geometry were included. The 96-stage register is snaked, resulting in a few interconnect wiring lengths of several hundred microns. Two 32-stage shift registers with altered geometry were also designed. One has B1/B2 lengths of 2.5/2.5 $\mu$m (smaller barrier) and the other has B1/B2 of 3.5/3.0 $\mu$m. Each of these five shift registers uses the second level of metal for interconnect between stages. The total node capacitance for each interconnect (including junction capacitance) was estimated to be 0.05 pF. Two other shift registers were included to test the effect of node capacitance. One has additional second-level metal wiring capacitance to simulate an interconnect length of 150 $\mu$m and the second has an intentional bootstrap capacitance between the node and B1 in an attempt to reduce the effect of transient substrate currents on node bias. A photograph of the test vehicle is shown in Fig. 3a, and a close-up of the nominal shift register stage in Fig. 3b.

#### 3.2 Test procedure

Several wafers from several lots were tested at the wafer level using a wafer prober. Two wafers were selected for dicing and packaging. The packaged devices were tested in a shielded test box using clock voltages derived from a Pulse Instruments PI-5800 timing generator and PI-453 MOS CCD clock drivers. These drivers have a maximum slew rate of approximately 0.2 V/nsec.

Fig. 2. Schematic illustration of wire-transfer shift register.

Fig. 3a. Wire transfer test vehicle chip.

Fig. 3b. Close-up of shift register.

The shift registers were operated as three-phase devices with overlapping waveforms. Each phase had a 50% duty cycle with a total period of six clock cycles on the PI-5800. Thus, the wire-transfer portion occupied two clock cycles, or one clock cycle for each of the two steps. In this paper, the single clock cycle time $(T_c)$ is reported as the operating speed of the device, whereas the total time to transfer from one stage to the next (one wire transfer and two CCD transfers) takes six clock cycles.
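To make the relation between the reported cycle time and the stage-to-stage delay concrete, the following Python sketch multiplies out the six-cycle period. The helper name `stage_timing` is hypothetical, and the end-to-end figure ignores the input and output stages, which is a simplifying assumption.

```python
def stage_timing(t_c, n_stages, cycles_per_stage=6):
    """Return (time per stage, end-to-end latency) in the same units as t_c.

    cycles_per_stage = 6 follows the text: one wire transfer (two cycles)
    plus two CCD transfers make up a six-cycle period.
    """
    stage_time = cycles_per_stage * t_c
    return stage_time, n_stages * stage_time


if __name__ == "__main__":
    for n in (8, 32, 96):
        per_stage, total = stage_timing(40e-9, n)
        print(f"{n:2d} stages: {per_stage * 1e9:.0f} ns per stage, "
              f"{total * 1e6:.2f} us end to end")
```

At the shortest cycle time of 40 nsec, each stage therefore takes 240 nsec, and a packet would need roughly 23 μsec to traverse the 96-stage register under these assumptions.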

Electrode B1 was typically d.c. biased at 1.0 volts using a precision power supply, electrode B2 was clocked with an adjustable peak voltage typically 1.0 volts higher than B1, and electrodes P1, P2, and P3 were clocked with a common peak voltage of typically 16 volts. The substrate was grounded and the low level for all clocked signals was ground.

Charge transfer efficiency was characterized for worst case operation in which a single full charge packet is loaded into a shift register with 100 empty preceding charge packets. The output charge packet is distorted by loss due to subthreshold effects, wire-transfer efficiency, and CCD transfer efficiency. The measured CTE is approximately given in this case by:

$$CTE = \{ Q_1 / (Q_1 + Q_2 + Q_3 + \cdots) \}^{1/m}$$

where  $Q_1$  is the first charge packet,  $Q_2$  is the first trailing charge packet,  $Q_3$  the second trailing charge packet, etc. Fig. 4 is an oscilloscope photograph showing the output of the 8-stage, 32-stage, and 96-stage shift registers for a clock cycle time of 40 nsec.
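The CTE expression can be sketched in Python as below. The function name `estimate_cte` is hypothetical, and the exponent is interpreted here as one over the number of transfers the packet undergoes, which is an assumption since the text does not define it at this point. The packet values in the example are synthetic, not measured data.

```python
def estimate_cte(packets, n_transfers):
    """Estimate per-transfer CTE from the leading packet and its trailers.

    packets      -- [Q1, Q2, Q3, ...] in any consistent unit (e.g. output volts)
    n_transfers  -- number of transfers (assumed meaning of the exponent 1/m)
    """
    if n_transfers <= 0:
        raise ValueError("n_transfers must be positive")
    total = sum(packets)
    if packets[0] <= 0 or total <= 0:
        raise ValueError("leading packet and total charge must be positive")
    return (packets[0] / total) ** (1.0 / n_transfers)


if __name__ == "__main__":
    # Synthetic example: build a leading packet that has lost charge at a
    # known per-transfer efficiency, then recover that efficiency.
    true_cte, n = 0.9985, 32
    leading = true_cte ** n
    trailing = 1.0 - leading
    print(estimate_cte([leading, trailing], n))
```

Normalizing by the sum of the leading and trailing packets means the estimate does not depend on the absolute gain of the output amplifier, only on the ratio of the observed packet amplitudes.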





#### 3.3 Results

The CTE was measured for each shift register for several clock cycle times. The CTE was found to be nearly independent of cycle time for the cycle times tested, which ranged from 40 nsec to 0.3 msec, provided the voltage on B1 was adjusted to be higher for the shorter cycle times. Over these times, the CTE for the nominal geometry shift registers was approximately 0.9985 with an experimental accuracy of 0.0005. For the altered geometry shift registers, the smaller barrier length had lower CTE than the nominal register, and the longer barrier length had higher CTE. However, the spread was only approximately 0.0005, close to the limits of experimental accuracy and was not consistent. For the registers with additional node capacitance, the CTE was degraded. The shift register with additional wiring capacitance exhibited CTE of approximately 0.9970 and the shift register with the bootstrap capacitance had a CTE of approximately 0.9960. Furthermore, the bootstrap arrangement seemed more susceptible to low frequency noise.
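To give a feel for what these per-transfer figures mean over a full register, the sketch below compounds a CTE over a number of transfers. It assumes, as a simplification, one transfer per stage with the same efficiency at every stage; the function name `retained_fraction` is hypothetical.

```python
def retained_fraction(cte, n_transfers):
    """Fraction of the original packet remaining in the leading position
    after n_transfers, assuming the same CTE at every transfer."""
    return cte ** n_transfers


if __name__ == "__main__":
    cases = {"nominal": 0.9985,
             "added wiring capacitance": 0.9970,
             "bootstrap capacitance": 0.9960}
    for label, cte in cases.items():
        for n in (8, 32, 96):
            print(f"{label:>26s}, {n:2d} transfers: {retained_fraction(cte, n):.3f}")
```

With the nominal value of 0.9985, roughly 87% of the packet would remain in the leading position after 96 transfers under this simplified model, which illustrates why a long chain of wire-transfer stages is useful as a test structure rather than as a practical circuit.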

The nominal geometry 96-stage wire-transfer shift register was also tested for other combinations of duty cycle, where duty cycle is defined as the ratio of number of full charge packets to number of empty packets. For duty cycles ranging from 0.01 to 100, the CTE of the register appeared to remain constant.

For a fixed cycle time, the effect of electrode B1 and B2 bias voltages was investigated. The measured CTE as a function of these voltages is shown in Fig. 5. The voltage on electrode B1 is not overly critical provided it is greater than the MOS threshold voltage. The voltage range for B2 was more robust provided it was biased greater than B1.



Fig. 5. Effect of barrier gate bias on CTE.

A wafer with a buried-channel implant was also tested. The major difference between the buried-channel and surface-channel devices (from the perspective of the wire-transfer model) is that the node is n+/n and there is no depletion region edge in the heavily doped region. Thus, the effect of carrier trapping in the vicinity of the junction should be reduced. The node capacitance is also lower implying lower loss due to subthreshold effects. Indeed, the buried-channel devices were found to have CTE as high as 0.9999 which is attributed to these factors (rather than increased speed) since the CTE in the surface-channel devices was not a function of operating frequency.

The noise floor in the measurements was dominated by residual 60 Hz and RF noise in the test station, with a value on the order of 2 mV. Thus the measured dynamic range (defined as 20 log SNR) was 62 dB. However, using the expression above one obtains a theoretical dynamic range of 104 dB.
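The dynamic range definition used here (20 log SNR) can be inverted to see what signal-to-noise ratios the quoted figures correspond to. The Python sketch below uses hypothetical helper names; the implied full-scale output voltage is an inference from the 2 mV noise floor and the 62 dB figure, not a value stated in the paper.

```python
import math


def dynamic_range_db(signal, noise):
    """Dynamic range as defined in the text: 20*log10(signal/noise)."""
    return 20.0 * math.log10(signal / noise)


def snr_from_db(db):
    """Invert the definition: signal-to-noise ratio for a given dynamic range."""
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    noise_floor_v = 2e-3  # measured noise floor from the text, volts
    measured_snr = snr_from_db(62.0)
    print(f"62 dB  -> SNR of about {measured_snr:.0f}")
    print(f"implied full-scale output: {noise_floor_v * measured_snr:.2f} V (inferred)")
    print(f"104 dB -> SNR of about {snr_from_db(104.0):.3g}")
```

By this definition the measured 62 dB corresponds to a signal roughly 1,260 times the noise floor, while the theoretical 104 dB corresponds to a ratio of about 1.6 × 10<sup>5</sup>.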

## **4. SUMMARY**

A structure for the virtual transfer of charge packets across metallic wires has been investigated. The observed CTE for the structure ranged between 0.998 and 0.999 for surface channel devices, and was as high as 0.9999 for buried channel devices. The devices were found to be robust with respect to bias voltages and clock waveforms.

It is evident from analysis and experiment that smaller node capacitances are advantageous for performance. Speed and noise can both be improved with reduced capacitance. The inclusion of a shield gate (B2) to prevent B1 barrier lowering improves CTE. It appears that without overtly sacrificing speed, making the B1 gate length longer improves performance.

The wire-transfer process allows for the topological crossing of signal charges in charge domain signal processing circuitry. The wire-transfer structure also facilitates corner turning, changing channel width, and the programmable steering of charge packets. The wire-transfer structure also makes charge summation and charge packet splitting readily achievable. The structure provides a degree of design flexibility and methodology previously denied CCD circuit designers.

## **5. ACKNOWLEDGMENTS**

The author gratefully acknowledges the assistance of S. Kemeny in the SPICE simulation of the output amplifier. The assistance of Dr. R. Bredthauer of Ford Aerospace in the fabrication of these devices was invaluable.

## **6. REFERENCES**

1. E.R. Fossum, "Architectures for focal-plane image processing," Opt. Eng. 28(8), pp. 865-871 (1989).
2. For example, see other papers in these proceedings.
3. E.R. Fossum, "Charge-coupled computing for focal-plane image preprocessing," Opt. Eng. 26(9), pp. 916-922 (1987).
4. For example, see C. Berglund and H. Boll, "Performance limitations of the IGFET bucket-brigade shift register," IEEE Trans. Electron Devices ED-19(7), pp. 852-860 (1972).

186 / SPIE Vol. 1242 Charge-Coupled Devices and Solid State Optical Sensors (1990)