# Frame-Transfer CMOS Active Pixel Sensor with Pixel Binning

Zhimin Zhou, Student Member, IEEE, Bedabrata Pain, Member, IEEE, and Eric R. Fossum, Senior Member, IEEE

Abstract—The first frame-transfer CMOS active pixel sensor (APS) is reported. The sensor architecture integrates an array of active pixels with an array of passive memory cells. Charge integration amplifier-based readout of the memory cells permits binning of pixels for variable resolution imaging. A  $32 \times 32$  element prototype sensor with 24- $\mu$ m pixel pitch was fabricated in 1.2- $\mu$ m CMOS and demonstrated.

## I. INTRODUCTION

THE CMOS active pixel image sensor (APS) has permitted the realization of a camera-on-a-chip with high performance [1]. Each pixel contains an active amplifier that buffers the photosignal and drives a column-bus readout architecture. Both photodiode and photogate pixels have been explored [2]. Other CMOS-based active pixels for current-mode readout [3] and logarithmic companding [4] have also been reported. It has been suggested that on-chip frame memory would enhance the ability to perform certain image processing tasks on chip [5] including frame-to-frame difference encoding, motion detection, and variable resolution imaging.

In-pixel memory has been used for motion detection for passive pixels [6], in CMOS APS by a slight change in timing [7], and in more complex compression approaches [8]. In-pixel memory has two major drawbacks. First, data retention can be degraded by both stray light and stray carriers, similar to the origin of smear in an interline transfer CCD. Second, in-pixel memory results in either low fill-factor or large pixels. In this work, a small prototype CMOS APS with separate on-chip frame memory is implemented for the first time to demonstrate the concept and investigate architecture and performance issues. Variable resolution imaging by binning pixels is readily implemented by the architecture. The work represents the first report of a frame-transfer CMOS APS.

## II. SENSOR DESIGN AND OPERATION

The structure of the sensor is shown in Fig. 1. The full signal chain is shown in Fig. 2. The pixel is implemented as a photogate-type active pixel. Charge is integrated under the photogate and then transferred to the floating diffusion for readout. Row decoder logic on the side of the array is used to select a particular row for readout and apply the proper sequence of signals to enable correlated double sampling readout. The exact sequence has been well reported previously [2]. The pixel delivers two sequential voltage signals to the vertical column bus: a reference (reset) level and a signal level. The difference of these two voltage levels is proportional to the charge integrated under the photogate according to the conversion gain of the floating diffusion and source-follower combination.

Z. Zhou and B. Pain are with the Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109 USA.

E. R. Fossum is with Photobit, La Crescenta, CA 91214 USA.

Publisher Item Identifier S 0018-9383(97)06934-7.

At the bottom of each column of pixels is an ac-coupled source-follower (henceforth called the buffer). When the pixel reset level is present on the column bus, the clamp switch MB1 is closed, clamping the input of the buffer to VCLP. The clamp switch is then opened and the pixel signal level is applied to the column bus. The buffer output is thus reduced by an amount proportional to the photosignal. Voltage offset from the buffer is suppressed by the column charge integration amplifier circuit, as described later.

Below the buffer is the array of memory cells. Each row of memory cells corresponds to a row of pixels in the APS array. The memory cell is a simple passive sample-and-hold switch and capacitor. When a row from the APS array is being read out, a corresponding row of memory cells is selected by R\_Sel. The voltage from the buffer is sampled onto the memory cell capacitor  $C_M$ . When R\_Sel is deactivated, the photosignals from the corresponding APS row are held on the row of memory cell capacitors. The estimated noise for sampling the signal onto the sample and hold capacitor is given by the kTC noise expressed in electrons

$$\langle \sigma_n^2 \rangle = kTC_M/q^2. \tag{1}$$

The charge gain G from the pixel to the memory cell sample-and-hold capacitor is

$$G = Q_M / Q_{\text{photo}} = C_M ag / q \tag{2}$$

where  $Q_M$  is the change in charge on the memory cell capacitor,  $Q_{\text{photo}}$  is the photosignal collected under the photogate,  $C_M$  is the memory cell capacitance, a is the gain of the buffer stage, g is the pixel conversion gain (V/electron) and q is the electron charge. The sensor is designed to achieve a charge gain G greater than 10. The large charge gain allows the use of a passive pixel memory cell without significant cost in noise since the *pixel-referred* noise (in electrons r.m.s.) on the memory capacitor is  $\langle \sigma_n^2 \rangle^{1/2}/G$ .
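As a brief illustration, (2) can be recovered by following a packet of $N$ photoelectrons through the chain. The symbols $N$, $\Delta V_{\text{pix}}$, and $\Delta V_{\text{buf}}$ are introduced here only for this sketch and do not appear elsewhere in the paper.

$$
\begin{aligned}
Q_{\text{photo}} &= Nq, \\
\Delta V_{\text{pix}} &= gN, \\
\Delta V_{\text{buf}} &= a\,\Delta V_{\text{pix}} = agN, \\
Q_M &= C_M\,\Delta V_{\text{buf}} = C_M agN, \\
G &= \frac{Q_M}{Q_{\text{photo}}} = \frac{C_M ag}{q}.
\end{aligned}
$$

Dividing the r.m.s. sampling noise of (1) by $G$ then refers it back to the pixel, which is why a charge gain well above unity keeps the passive memory cell from dominating the noise.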

Manuscript received October 19, 1996; revised May 20, 1997. The review of this paper was arranged by Editor C. V. Stancampiano.

Fig. 1. Frame-transfer APS architecture.

The frame-transfer operation consists of sequentially selecting rows in the CMOS APS, reading the pixels in the row, and writing the signals to the corresponding memory cell row. Total time to transfer a single row of pixel signals to a row of memory cells is approximately 2 $\mu$s. (This is equivalent to a vertical transfer rate of 0.5 MHz in a frame-transfer CCD.) Thus, to perform frame transfer on an array of 32 rows takes approximately 64 $\mu$s. For an array with 512 rows, it would take approximately 1 ms. Unlike a frame-transfer (FT) CCD, all rows need not be transferred, and they need not be selected sequentially. Faster transfer times are possible in future designs.
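The transfer times quoted above follow directly from the per-row figure. The short sketch below reproduces that arithmetic; the function names are hypothetical and the 2 $\mu$s row time is the approximate value given in the text, not a measured constant.

```python
ROW_TRANSFER_TIME_S = 2e-6  # approximate per-row transfer time quoted in the text


def frame_transfer_time(rows, row_time_s=ROW_TRANSFER_TIME_S):
    """Time to copy `rows` pixel rows into the frame memory, one row at a time."""
    return rows * row_time_s


def equivalent_vertical_rate_hz(row_time_s=ROW_TRANSFER_TIME_S):
    """Row-transfer rate, comparable to the vertical clock rate of an FT-CCD."""
    return 1.0 / row_time_s


if __name__ == "__main__":
    for rows in (32, 512):
        print(f"{rows} rows: {frame_transfer_time(rows) * 1e6:.0f} us")
    print(f"equivalent rate: {equivalent_vertical_rate_hz() / 1e6:.1f} MHz")
```

Because rows need not all be transferred, passing a smaller `rows` value models a windowed transfer under the same per-row assumption.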

The frame-transfer operation allows the CMOS APS to capture images in "snapshot" fashion, thus eliminating flicker caused by indoor lighting, and motion-induced image skew. Unlike frame transfer in a CCD, no smear is introduced by the frame-transfer operation. The penalty for an FT-APS, as in an FT-CCD, is an increase in chip area for the frame memory.

Readout of the frame memory is performed using charge integration amplifiers. Each column has its own charge integration amplifier. The voltage output of the amplifier is sampled onto a capacitor. Charge from these column capacitors is read out using a global charge integration amplifier, as shown in Fig. 2.

The column charge integration amplifier (CCIA) consists of a standard folded-cascode op-amp [9] and a switched-capacitor (SC) network. A distinguishing feature of the folded-cascode op-amp and SC network is that they are designed to fit within the relatively narrow column pitch of 24 $\mu$m. Additional circuitry is added to compensate not only for the op-amp input offset but also for the signal mismatch error prestored in the memory cells. This suppresses fixed pattern noise (FPN) in the image sensor.

FPN suppression in the CCIA stage proceeds as follows. Initially, switches B1, B4, and M1 are closed, sampling the clamp voltage and source-follower offset onto a dummy memory capacitor. Switches C1, C3, C4, and C7 are also closed, putting the op-amp into a unity-gain mode, where the output of the op-amp is the op-amp input offset voltage. Switches C1, C3, and C7 are opened, and C2, C4, and C6 are closed. The charge from the dummy memory capacitor is then transferred to the CCIA by opening B4 and closing C5. The capacitor $C_{C2}$ (4 pF) now has a voltage across it that includes both the op-amp and buffer offset. Switches C4, C5, and C6 are opened, and switch C7 is closed, effectively flipping the capacitor $C_{C2}$ and negating the offsets. This causes the output of the op-amp to become equal to V+, the reference voltage. This FPN suppression sequence needs to be performed only often enough to compensate for droop on capacitor $C_{C2}$, but typically is performed once per frame.

To commence readout of signals stored in the frame memory, switch C1 is momentarily closed to reset the integrator. Charge is then transferred to the integrator by selecting a memory capacitor via switch M1, and closing switch C5. If the change in buffer output signal from a pixel is $V_{sig}$ (not the voltage stored on the memory capacitor), then the integrator output changes in proportion to $V_{sig}$ when the memory capacitor is selected and integrated. The change is proportional also to the ratio of $C_{C1}$ to $C_M$, designed to be unity (each 0.5 pF). Capacitor sizing mismatch leads to nonunity gain and second-order effects that limit the performance of the CCIA. These effects are partially compensated by a second clamp circuit at the output of the CCIA formed by $C_{C3}$ and switch C8. The op-amp voltage is sampled onto the capacitor $C_{\rm MC}$ via sampling switch C9 and level shifting of the clamp circuit $C_{C3}$ (2 pF) and C8. $C_{\rm MC}$ is designed for a value of 2 pF, leading to a net voltage gain of 0.5 from $C_M$ to $C_{\rm MC}$. While transferring the charge from $C_M$ to $C_{\rm MC}$ via the CCIA helps reduce the parasitic capacitance in the subsequent horizontal circuit described below, its true utility is in the binning operation described later.

The column capacitors $C_{\rm MC}$ are sequentially selected for readout by the column decoder circuit. A given column is selected for readout by switch C\_Sel. Prior to activating C\_Sel, the horizontal charge integrating amplifier (HCIA) is reset by pulsing RSTO, shorting its feedback capacitor $C_O$ (1 pF). When C\_Sel is activated for a particular column, the charge from the capacitor is converted to a voltage through the capacitance $C_O$, resulting in a net voltage gain of 2 from $C_{\rm MC}$ (unity gain from $C_M$). The process continues until all columns that are desired to be read out have been selected. Column selection need not be sequential, nor does it need to be scanned in a particular direction.
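The stage gains described above can be tallied in a few lines. This is a minimal sketch that uses only the capacitor values and the 0.5 stage gain stated in the text. It assumes that $C_{C1}$ acts as the CCIA integration capacitor and that the HCIA gain is the ratio of $C_{\rm MC}$ to $C_O$. The function and constant names are hypothetical.

```python
# Capacitor values quoted in the text (farads).
C_M = 0.5e-12   # memory cell capacitor
C_C1 = 0.5e-12  # CCIA integration capacitor (assumed role)
C_MC = 2e-12    # column sampling capacitor
C_O = 1e-12     # HCIA feedback capacitor

# Net voltage gain from C_M to C_MC as stated in the text; it already
# includes the unity CCIA integrator gain computed by ccia_gain().
GAIN_CM_TO_CMC = 0.5


def ccia_gain(c_m=C_M, c_c1=C_C1):
    """Integrator gain set by the memory-to-integration capacitor ratio."""
    return c_m / c_c1


def hcia_gain(c_mc=C_MC, c_o=C_O):
    """Horizontal amplifier gain, assumed to be C_MC / C_O."""
    return c_mc / c_o


def net_gain_from_memory():
    """Overall voltage gain from the memory cell to the HCIA output."""
    return GAIN_CM_TO_CMC * hcia_gain()


if __name__ == "__main__":
    print("CCIA integrator gain:", ccia_gain())
    print("HCIA gain:", hcia_gain())
    print("net gain from C_M:", net_gain_from_memory())
```

With the nominal values the chain returns unity overall, which matches the "unity gain from $C_M$" statement; capacitor mismatch would show up as a deviation from these ratios.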

The total read noise for the readout of nonbinned pixels is calculated as follows:

$$\langle V_{n,o}^2 \rangle = \frac{3kT}{C_M} \alpha_O^2 \left\{ \frac{C_M}{C_{\rm MC}} + \left( \frac{C_{\rm MC}}{C_{\rm MC} + C_{\rm C3}} \right)^2 \times \left( \alpha^2 [1 + (1+\alpha)^2] + \frac{C_M}{3C_{\rm C2}} (1+\alpha)^2 \right) \right\} \tag{3}$$

where  $\alpha_O = C_{\rm MC}/C_O$  and  $\alpha = C_M/C_{\rm C1}$ .



Fig. 2. Signal chain of the FT-APS readout.

## III. BINNING OPERATION

CCD's have been operated in a binned mode almost since their inception. Charge from adjacent pixels in a column is summed in the horizontal register. Charge from adjacent columns is summed at the output node. The effective resolution of the CCD is decreased depending on how many pixels are summed in each direction. The signal-to-noise ratio (S/N) is increased by the square root of the number of pixels binned if the noise is dominated by shot noise, and the S/N is increased linearly with the number of pixels binned if the noise is dominated by read noise. In a CCD, the binning operation is noiseless since summation takes place in the charge domain. This is a major advantage for CCD's.
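The two S/N limits for noiseless charge-domain binning can be illustrated numerically. The sketch below uses a simple textbook model (Poisson shot noise on the summed packet plus a single read), which is an assumption rather than a model taken from this paper. The function names and the example signal levels are hypothetical.

```python
import math


def ccd_snr(signal_e, read_noise_e, n_binned=1):
    """S/N when n_binned pixels are summed noiselessly, then read once.

    signal_e      mean photosignal per pixel (electrons)
    read_noise_e  read noise of the output stage (electrons r.m.s.)
    """
    total_signal = n_binned * signal_e
    # Shot-noise variance equals the summed signal; read noise adds once.
    noise = math.sqrt(total_signal + read_noise_e ** 2)
    return total_signal / noise


def binning_gain(signal_e, read_noise_e, n_binned):
    """S/N improvement of the binned readout over a single pixel."""
    return ccd_snr(signal_e, read_noise_e, n_binned) / ccd_snr(signal_e, read_noise_e, 1)


if __name__ == "__main__":
    # Shot-noise-limited case: gain approaches sqrt(4) = 2.
    print(binning_gain(signal_e=1e6, read_noise_e=1.0, n_binned=4))
    # Read-noise-limited case: gain approaches 4.
    print(binning_gain(signal_e=1.0, read_noise_e=100.0, n_binned=4))
```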

An analogous operation can be performed for the FT-APS using the CCIA amplifiers and the HCIA. For vertical direction binning, charge from multiple memory cells can be summed using the CCIA by sequentially accessing the cells using R\_Sel and integrating the charge on the feedback capacitor.

For the binning of pixels in adjacent columns, the HCIA is utilized. In this case, charge from multiple capacitors ($C_{\rm MC}$) is summed in the HCIA by sequentially accessing those capacitors via C\_Sel, without resetting the HCIA between selection of successive columns. Since the charge on each of these capacitors represents a vertically binned signal, the output from the HCIA represents a 2-D binning of pixels of arbitrary kernel size. In the FT-APS, binning of pixels results in an improvement in S/N. However, because the binning process is not noiseless, the improvement in S/N is less than in the case of the CCD. The noise for binning *m* pixels vertically and *n* pixels horizontally can be written as

$$\langle V_{n,o}^2 \rangle = \frac{3kT}{C_M} \alpha_O^2 n \left\{ \frac{C_M}{C_{\rm MC}} + \left( \frac{C_{\rm MC}}{C_{\rm MC} + C_{\rm C3}} \right)^2 \times \left( \alpha^2 [m + (1+\alpha)^2] + \frac{C_M}{3C_{\rm C2}} (1+\alpha)^2 \right) \right\}. \tag{4}$$
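Equation (4) can be evaluated numerically with the capacitor values quoted in Section II. The sketch below is a direct transcription of (4), which reduces to (3) for m = n = 1. The operating temperature of 300 K is an assumption, since the paper does not state one, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def readout_noise_vrms(m=1, n=1, temp_k=300.0,
                       c_m=0.5e-12, c_mc=2e-12, c_c1=0.5e-12,
                       c_c2=4e-12, c_c3=2e-12, c_o=1e-12):
    """Output-referred read noise (V r.m.s.) from (4).

    m, n     pixels binned vertically and horizontally
    temp_k   ASSUMED temperature; capacitances default to the text's values
    """
    alpha_o = c_mc / c_o   # alpha_O = C_MC / C_O
    alpha = c_m / c_c1     # alpha = C_M / C_C1
    divider_sq = (c_mc / (c_mc + c_c3)) ** 2
    inner = (alpha ** 2 * (m + (1 + alpha) ** 2)
             + c_m / (3 * c_c2) * (1 + alpha) ** 2)
    variance = (3 * K_B * temp_k / c_m) * alpha_o ** 2 * n * (
        c_m / c_mc + divider_sq * inner)
    return math.sqrt(variance)


if __name__ == "__main__":
    print(f"no binning: {readout_noise_vrms() * 1e3:.2f} mV r.m.s.")
    print(f"2 x 2 binning: {readout_noise_vrms(2, 2) * 1e3:.2f} mV r.m.s.")
```

Under these assumptions the unbinned result comes out near 0.4 mV r.m.s., in line with the theoretical figure quoted in Section IV, and the noise grows with kernel size more slowly than the summed signal would.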

Fig. 3. Image of George Washington at full resolution taken at 100 kpixels/s.

The resolution of the sensor is also modified by the binning process in the same way as for a CCD. It should be noted that the variable resolution offered by this approach is different from that previously reported for a multiresolution CMOS APS [10] because in the present case the signal grows as charge is binned. Summation is important for low light conditions where S/N can be improved at the expense of spatial resolution. In the previous work, the binned pixels are averaged, not summed. Averaging is important for common lighting conditions where summation would cause saturation of the sensor output.
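The difference between summing and averaging a kernel can be made concrete with a small software model. This sketch only mirrors the arithmetic on an array of pixel values. It does not model the analog circuit, in which vertical summation happens in the CCIA and horizontal summation in the HCIA. The function name and the sample image are hypothetical.

```python
def bin_pixels(image, m, n, average=False):
    """Combine non-overlapping kernels of m rows by n columns.

    image    2-D list of pixel values
    average  False sums each kernel (the FT-APS behaviour described here);
             True divides by m*n (the averaging readout described in [10])
    Rows or columns that do not fill a complete kernel are dropped.
    """
    rows, cols = len(image), len(image[0])
    binned = []
    for r in range(0, rows - rows % m, m):
        out_row = []
        for c in range(0, cols - cols % n, n):
            total = sum(image[r + i][c + j]
                        for i in range(m) for j in range(n))
            out_row.append(total / (m * n) if average else total)
        binned.append(out_row)
    return binned


if __name__ == "__main__":
    img = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
    print(bin_pixels(img, 2, 2))                # summed kernels
    print(bin_pixels(img, 2, 2, average=True))  # averaged kernels
```

Summed outputs scale with kernel size, which helps in low light but can exceed the output range in bright scenes; averaged outputs stay within the single-pixel range.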

## IV. EXPERIMENTAL RESULTS

A 32 × 32 element APS array and a 32 × 32 cell frame memory were implemented using the HP 1.2  $\mu$ m single-poly, double-metal process with linear capacitor option available through MOSIS. The APS array pixel size was 24 × 24  $\mu$ m with a designed fill factor of 29%. The sensor was measured to have a conversion gain of approximately 6  $\mu$ V/e<sup>-</sup>.

The memory cell size was  $24 \times 28 \ \mu$ m with a memory cell capacitance  $C_M$  of 0.5 pF. Layout of the CCIA and SC network was 24  $\mu$ m in width and 400  $\mu$ m in length. Second level metal was used for routing in the frame memory and used for light shield in the pixel array periphery. Total chip size was  $2.8 \times 4.5$  mm. Power dissipation was measured to be less than 1.6 mW at 400 kpixels/s.

Charge gain can be estimated from (2) to be 15, leading to pixel-referred noise in the memory cell of  $18 e^-$  r.m.s. from kTC sampling noise. Read noise from similar pixels has been measured to be  $13 e^-$  r.m.s. leading to a total pixel-referred cell-write noise of 22 e<sup>-</sup> r.m.s.
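The figures in this paragraph can be checked against (1) and (2). The sketch below does so under two assumptions that are not stated in the paper: a temperature of 300 K and a buffer gain a of 0.8, chosen only because it reproduces the quoted charge gain of about 15. All names are hypothetical.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant (J/K)
Q_E = 1.602176634e-19  # electron charge (C)


def ktc_noise_electrons(c_farads, temp_k=300.0):
    """r.m.s. kTC sampling noise in electrons, from (1)."""
    return math.sqrt(K_B * temp_k * c_farads) / Q_E


def charge_gain(c_m, buffer_gain, conv_gain_v_per_e):
    """Charge gain G = C_M * a * g / q, from (2)."""
    return c_m * buffer_gain * conv_gain_v_per_e / Q_E


def quadrature_sum(*terms):
    """Combine uncorrelated r.m.s. noise terms."""
    return math.sqrt(sum(t * t for t in terms))


C_M = 0.5e-12       # memory cell capacitance from the text (F)
CONV_GAIN = 6e-6    # measured conversion gain from the text (V/electron)
BUFFER_GAIN = 0.8   # ASSUMPTION: not given in the paper

sigma_cell = ktc_noise_electrons(C_M)         # noise on the memory capacitor
gain = charge_gain(C_M, BUFFER_GAIN, CONV_GAIN)
pixel_referred = sigma_cell / gain            # kTC noise referred to the pixel
total_quoted = quadrature_sum(18.0, 13.0)     # using the paper's two figures

if __name__ == "__main__":
    print(f"kTC noise on C_M: {sigma_cell:.0f} e- r.m.s.")
    print(f"charge gain: {gain:.1f}")
    print(f"pixel-referred kTC noise: {pixel_referred:.1f} e- r.m.s.")
    print(f"total cell-write noise: {total_quoted:.1f} e- r.m.s.")
```

With these assumptions the pixel-referred sampling noise comes out at roughly 19 e- r.m.s., close to the quoted 18 e-; the small difference depends on the assumed temperature and buffer gain. Combining the paper's 18 e- and 13 e- in quadrature gives about 22 e-, as stated.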

The CCIA was tested independently and found to be linear over its designed operating range of 1 V swing to better than 1 part in 256 or 0.4%. The CCIA op-amp was biased at 5 $\mu$A (power of 25 $\mu$W). The HCIA was biased at 120 $\mu$A (power of 600 $\mu$W) and was also found to have linearity better than 0.4% over its 1 V nominal range. The memory cells were tested for leakage (dark current) effects and found to retain their value to better than 0.4% over a 1 s interval. This is expected since APS pixels in this fabrication process typically have dark currents below 1 nA/cm<sup>2</sup>. Thus, the components in the FT-APS signal chain can be said to meet or exceed 8-b accuracy, hence achieving one of the design goals.

Fig. 4. Same scene as Fig. 3 taken at lower illumination (a) with FPN suppression circuitry disabled and (b) with FPN suppression circuitry enabled.

Fig. 5. Same scene as Fig. 4(b) but with $2 \times 2$ binning of pixels.

A full-resolution $(32 \times 32)$ image captured from the frame-transfer APS is shown in Fig. 3. The image was captured at 100 kpixels/s due to the maximum speed of the 16-b ADC card used in the acquisition system. Due to a layout error, the first two rows and first two columns of the array were not functional. The sensor was successfully operated up to 400 frames/s, corresponding to an output data rate of 400 kpixels/s. Higher operating speeds can be achieved with improvement to the HCIA. No blooming or smear was observed in the acquired images, though quantitative assessment of these parameters was not performed.

A full-resolution lower illumination level image captured with the FPN suppression circuitry disabled by timing is shown in Fig. 4(a) and can be compared with the same conditions with the FPN suppression circuitry enabled in Fig. 4(b). Good improvement in image quality is demonstrated.

Fig. 6. Improvement in signal with number of pixels binned.

Fig. 7. Improvement in S/N with number of pixels binned.

FPN in the sensor was quantitatively determined by uniformly illuminating the sensor array. With the pixel CDS circuit operational, but with the CCIA FPN suppression circuits deactivated by timing, the FPN in the output image was 40 mV p-p. Activation of the CCIA FPN circuit suppressed the FPN to below 15 mV p-p, resulting in an 8.5-dB improvement. With a saturation level of 1 V, the residual FPN is 1.5% of saturation, or approximately four LSB's in an 8-b system. Further improvement in FPN suppression is desired for the future.
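The FPN figures above are related by simple arithmetic, reproduced in the sketch below. The helper names are hypothetical; the input values are the ones quoted in the text.

```python
import math


def db_ratio(v_before, v_after):
    """Improvement in dB between two voltage amplitudes."""
    return 20.0 * math.log10(v_before / v_after)


def fraction_of_saturation(v, v_sat):
    """Amplitude as a fraction of the saturation level."""
    return v / v_sat


def in_lsbs(v, v_full_scale, bits):
    """Amplitude expressed in LSBs of an ideal converter of `bits` bits."""
    return v / (v_full_scale / 2 ** bits)


if __name__ == "__main__":
    fpn_off, fpn_on, v_sat = 40e-3, 15e-3, 1.0  # volts, from the text
    print(f"improvement: {db_ratio(fpn_off, fpn_on):.1f} dB")
    print(f"residual: {100 * fraction_of_saturation(fpn_on, v_sat):.1f} % of saturation")
    print(f"residual: {in_lsbs(fpn_on, v_sat, 8):.1f} LSB at 8 b")
```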

The same conditions as in Fig. 4(b) were used to demonstrate a $2 \times 2$ pixel binning operation, as shown in Fig. 5. The apparent brightness of the image has increased as expected and S/N was improved. Improvement in output signal level with number of binned pixels is plotted in Fig. 6.

Noise in the sensor was theoretically calculated to be 0.4 mV r.m.s. from (3). Experimentally, the noise was measured to be much higher at 3.1 mV r.m.s. (Table I). It is felt at this time that residual test station noise has likely caused an anomalously high measured noise level but this hypothesis was not proven by the time of writing. Some improvement in S/N with binning was nevertheless observed, as shown in Fig. 7.
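The dynamic range listed in Table I follows from the saturation level and the measured temporal noise. The sketch below reproduces that figure and, for comparison only, shows what the theoretical noise of (3) would imply; the second number is an extrapolation for illustration, not a result reported in the paper. The function name is hypothetical.

```python
import math


def dynamic_range_db(v_sat, v_noise_rms):
    """Dynamic range in dB from saturation level and r.m.s. noise."""
    return 20.0 * math.log10(v_sat / v_noise_rms)


V_SAT = 1.0  # sensor saturation level from the text (V)

measured_dr = dynamic_range_db(V_SAT, 3.1e-3)     # measured noise, 3.1 mV
theoretical_dr = dynamic_range_db(V_SAT, 0.4e-3)  # calculated noise, 0.4 mV

if __name__ == "__main__":
    print(f"dynamic range with measured noise: {measured_dr:.1f} dB")
    print(f"dynamic range if noise matched theory: {theoretical_dr:.1f} dB")
```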

## V. CONCLUSION

The first frame-transfer CMOS APS has been demonstrated. The sensor integrates an active pixel array with a passive memory cell array to permit frame-transfer operation. Charge integration amplifiers at the bottom of each column and a horizontal charge integration amplifier permit binning of pixel signals from arbitrary kernel sizes during the readout process. The results from the experimental sensor help illuminate options for future, larger frame-transfer APS devices with smart on-chip functions.

## REFERENCES

- [1] E. R. Fossum, "CMOS image sensors: Electronic camera-on-a-chip," in IEEE IEDM Tech. Dig., Dec. 1995, pp. 17-25.
- [2] S. K. Mendis, S. E. Kemeny, R. C. Gee, B. Pain, Q. Kim, and E. R. Fossum, "CMOS active pixel image sensors for highly integrated imaging systems," IEEE J. Solid-State Circuits, vol. 32, pp. 187-197, Feb. 1997.
- [3] R. D. McGrath, V. Clark, P. Duane, L. McIlrath, and W. Washkurak, "Current-mediated, current-reset $768 \times 512$ active pixel sensor array," in 1997 IEEE ISSCC Dig. Tech. Papers, San Francisco, CA, Feb. 1997, pp. 182-183.
- [4] N. Ricquier and B. Dierickx, "Pixel structure with logarithmic response for intelligent and flexible imager architectures," Microelectron. Eng., vol. 19, pp. 631-634, 1992.
- [5] E. R. Fossum, "Architectures for focal-plane image processing," Opt. Eng., vol. 28, no. 8, pp. 865-871, 1989.
- [6] A. Simoni, G. Torelli, F. Maloberti, A. Sartori, S. Plevridis, and A. Birbas, "A single-chip optical sensor with analog memory for motion detection," IEEE J. Solid-State Circuits, vol. 30, pp. 800-805, July 1995.
- [7] A. Dickinson, B. Ackland, E-S. Eid, D. Inglis, and E. R. Fossum, "A  $256 \times 256$  CMOS active pixel image sensor with motion detection," in 1995 ISSCC Dig. Tech. Papers, 1995, pp. 226-227.
- [8] K. Aizawa, H. Ohno, Y. Egi, T. Hamamoto, M. Hatori, and J. Yamazaki, "On sensor video compression," presented at 1995 IEEE Workshop CCD's and Advanced Image Sensors, Dana Point, CA, Apr. 20-22, 1995.
- [9] K. R. Laker and W. Sansen, Design of Analog Integrated Circuits and Systems. New York: McGraw-Hill, 1994.
- [10] S. E. Kemeny, B. Pain, R. Panicacci, L. Matthies, and E. R. Fossum, "CMOS active pixel sensor array with programmable multiresolution readout," presented at 1995 IEEE Workshop CCD's and Advanced Image Sensors, Dana Point, CA, Apr. 20-22, 1995.

Zhimin Zhou (S'97), for a photograph and biography, see this issue, p. 1763.

Bedabrata Pain (M'95), for a photograph and biography, see this issue, p.

Eric R. Fossum (S'80-M'84-SM'91), for a photograph and biography, see this issue, p. 1698.

TABLE I

SUMMARY OF THE FT-APS PERFORMANCE

| Parameter | Value |
|-----------|-------|
| Integrator nonlinearity | <0.4% |
| Sensor saturation | 1 V |
| Temporal noise | 3.1 mV r.m.s. |
| Dynamic range | 50 dB |
| Conversion gain | 6 $\mu$V/e<sup>-</sup> |
| FPN without suppression | 40 mV |
| FPN with suppression | 15 mV |
| Memory leakage | <4 mV/s |
| Power consumption | 1.6 mW @ 400 frames/s |