## 7.1 An 18.6Gb/s Double-Sampling Receiver in 65nm CMOS for Ultra-Low-Power Optical Communication

Meisam Honarvar Nazari, Azita Emami-Neyestanak

California Institute of Technology, Pasadena, CA

Optical chip-to-chip interconnects have recently attracted considerable interest. As data rates scale to meet increasing bandwidth requirements, the shortcomings of copper channels are becoming more severe. Hybrid integration of optical devices with electronics has been demonstrated to achieve high performance [1-4], and recent advances in silicon photonics have led to fully integrated optical signaling [5]. These approaches pave the way to massively parallel optical communications. Dense arrays of optical detectors require very low-power, sensitive, and compact optical receiver circuits. Existing input-receiver designs, such as transimpedance amplifiers (TIAs), consume large power to achieve high bandwidth and low noise, and can occupy large area because of bandwidth-enhancement inductors. In this work, a compact low-power optical receiver that scales well with technology is designed to explore the potential of optical signaling for future chip-to-chip and on-chip communication.

In most optical receivers, the photodiode current is converted to a voltage signal. A simple resistor can perform the I-V conversion if the resulting RC time constant is on the order of the bit interval ($T_b$) [5]. However, for a given photodiode capacitance and target SNR, the RC limits the bandwidth and hence the data rate. To avoid this problem, TIAs are commonly employed, but they are highly analog, power hungry, and do not scale well with technology. One alternative is an integrating front-end, which eliminates the resistor and thereby breaks the bandwidth trade-off. However, this technique suffers from voltage headroom limitations and requires short-length DC-balanced inputs [6]. In this work, we design an RC front-end with a time constant that is much larger than $T_b$ ($RC \gg T_b$) without imposing any requirement on the input data. The voltage across the resistor is sampled at the end of two consecutive bit times ($V_n$, $V_{n+1}$) and these samples are compared to resolve each bit ($\Delta V_s = V_{n+1} - V_n > 0$ results in 1 and $\Delta V_s < 0$ results in 0). Similar to [6], this double-sampling technique allows demultiplexing by use of multiple clock phases and samplers, as shown in Fig. 7.1.1. The RC at the front-end prevents out-of-range input voltages caused by long sequences of ones or zeros. However, it causes the voltage difference ($\Delta V_s$) to be input-dependent. For instance, a one after a long sequence of zeros generates a larger $\Delta V_s$ than a one after a long sequence of ones, as shown in Fig. 7.1.2. To resolve this problem, the offset of the sense amplifier is dynamically modulated to provide a constant voltage at its input regardless of the data sequence.
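The input dependence of $\Delta V_s$ can be illustrated with a minimal behavioral sketch. It assumes an idealized first-order RC settling model, works with signal magnitude only (polarity at the photodiode node is ignored), and uses illustrative levels: `v_hi = 0.22` corresponds to a hypothetical 100µA swing across 2.2kΩ. The function names and the mid-range starting point are assumptions made for this example, not part of the design.

```python
import math

def end_of_bit_voltages(bits, v_hi=0.22, v_lo=0.0, rc=550e-12, t_b=1 / 14.2e9):
    """End-of-bit voltages of an idealized first-order RC front-end.

    v_hi and v_lo are the assumed steady-state levels for all-ones and
    all-zeros inputs (signal magnitude only; values are illustrative).
    """
    a = math.exp(-t_b / rc)      # fraction of the settling error left after one bit
    v = (v_hi + v_lo) / 2        # assumed starting point: mid-range
    samples = [v]
    for bit in bits:
        target = v_hi if bit else v_lo
        v = target + (v - target) * a   # exponential settling toward the target
        samples.append(v)
    return samples

def double_sample(samples):
    """Compare consecutive samples; return (delta_v list, decided bits)."""
    deltas = [samples[n + 1] - samples[n] for n in range(len(samples) - 1)]
    return deltas, [1 if d > 0 else 0 for d in deltas]

bits = [0] * 10 + [1] * 10
deltas, decided = double_sample(end_of_bit_voltages(bits))
print("decoded correctly:", decided == bits)
print(f"first one after zeros: {deltas[10] * 1e3:.1f} mV")
print(f"tenth consecutive one: {deltas[19] * 1e3:.1f} mV")
```

In this model the sign of the difference still identifies each bit, but its magnitude shrinks geometrically during a run of identical bits, so a comparator with finite sensitivity would eventually fail on long runs.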

Figure 7.1.1 shows the top-level architecture of the receiver. The input current from the photodiode is integrated over the parasitic capacitor, while the shunt resistor ($R_s$) limits the voltage. $R_s$ can be adjusted to prevent saturation at high optical powers. The resulting voltage from the integrator is sampled by a bank of four sample/holds (S/H). Each S/H is composed of a PMOS switch and the parasitic capacitor ($C_s$) of the following stage. $C_s$ is chosen to be about 15fF to optimize the trade-off between the S/H speed and kT/C noise. An amplifier with 6dB of gain provides isolation between the sensitive sampling node and the sense amplifier to minimize kick-back. This also creates a constant common-mode voltage and prevents input-dependent offset. A StrongARM sense amplifier [7] is employed to achieve high sampling rate and low power. This sense amplifier has a separate offset cancellation for mismatch compensation. Figure 7.1.3 shows the details of the dynamic offset modulation technique. A differential pair at the input of the sense amplifier introduces an offset that is proportional to the difference between the sampled voltage ($V_{S0}$) and a reference voltage ($V_{REF}$). $V_{REF}$ is defined as the average of the maximum ($V_{DD} - R_s \times I_{zero}$) and minimum ($V_{DD} - R_s \times I_{one}$) voltages at $V_{PD}$. Figure 7.1.2 provides an example to illustrate the operation of the dynamic offset modulation. As a worst case, a long sequence of ones followed by a long sequence of zeros is considered. The first one after zeros generates a large voltage change at $V_{PD}$. As the number of successive ones increases, this change decays exponentially due to $R_s$ and $C_{PD}$. As shown in Fig. 7.1.2, if the maximum voltage at the output of the amplifier ($\Delta V_{AMP}$) is equal to $\Delta V_{max}$, dynamic offset modulation introduces an offset so that the sense amplifier differential input is $\Delta V_{max}/2$, regardless of the previous bits. For instance, an offset equal to $-\Delta V_{max}/2$ is applied when $\Delta V_{AMP} = \Delta V_{max}$, no offset is applied when $\Delta V_{AMP} = \Delta V_{max}/2$, and an offset equal to $\Delta V_{max}/2$ is applied if $\Delta V_{AMP} = 0$.
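The effect of dynamic offset modulation can be sketched behaviorally as follows. This is an idealized model, not the circuit: it assumes first-order RC settling, signal magnitude only, unity amplifier gain (the 6dB stage is ignored), and an offset gain chosen as `1 - exp(-t_b/rc)`. The text states only that the offset is proportional to the difference between the sampled voltage and $V_{REF}$, so that proportionality constant, the function name `simulate`, and the voltage levels are assumptions for illustration.

```python
import math

def simulate(bits, v_hi=0.22, v_lo=0.0, rc=550e-12, t_b=1 / 14.2e9):
    """Return raw and offset-corrected sample differences for a bit sequence."""
    a = math.exp(-t_b / rc)
    gain = 1.0 - a                  # assumed proportionality constant of the offset
    v_ref = (v_hi + v_lo) / 2       # average of the two extreme levels
    v = v_ref                       # assumed starting point: mid-range
    raw, corrected = [], []
    for bit in bits:
        target = v_hi if bit else v_lo
        v_next = target + (v - target) * a
        dv = v_next - v             # double-sampling difference
        offset = gain * (v - v_ref) # offset tracks the first sample
        raw.append(dv)
        corrected.append(dv + offset)
        v = v_next
    dv_max = gain * (v_hi - v_lo)   # largest possible single-bit difference
    return raw, corrected, dv_max

bits = [0] * 12 + [1] * 12 + [0, 1, 1, 0, 1, 0, 0, 1]
raw, corrected, dv_max = simulate(bits)
print(f"raw |dV| range: {min(map(abs, raw)) * 1e3:.2f} to {max(map(abs, raw)) * 1e3:.2f} mV")
print(f"corrected |dV|: {abs(corrected[0]) * 1e3:.2f} mV for every bit")
```

Under these assumptions the corrected difference is $+\Delta V_{max}/2$ for every one and $-\Delta V_{max}/2$ for every zero, which reproduces the three offset cases listed above.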

The prototype is fabricated in a 65nm CMOS technology and occupies less than 0.0028mm², as shown in Fig. 7.1.4. It is composed of two receivers, one with a photodiode emulator and one for optical testing with a photodiode. In the first version, an emulator mimics the photodiode current with an on-chip switchable current source. A bank of capacitors ($C_{pd}$) is integrated to emulate the parasitic capacitances (photodiode plus bonding). The functionality of the receiver is first validated using the emulator and PRBS7, PRBS9, and PRBS15 sequences. $R_s$ and $C_{pd}$ are chosen to be 2.2kΩ and 250fF (RC = 550ps). Figure 7.1.3 shows how the BER changes with the input current at 14.2Gb/s, 16.7Gb/s, 20Gb/s, and 24Gb/s. The receiver achieves a sensitivity of about 75µA at 14.2Gb/s, which degrades to 160µA at 24Gb/s, mainly because the bit interval, and hence the integration time, decreases. The voltage sensitivity of the receiver is measured to be about 13mV up to 22Gb/s and degrades to 17mV at 24Gb/s.
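As a quick arithmetic check of the front-end numbers, the sketch below computes the RC time constant, its ratio to the bit interval at each tested data rate, and the kT/C noise of the 15fF sampling capacitor. The 300K temperature and the helper names are assumptions for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def rc_time_constant(r_ohm, c_farad):
    """RC time constant in seconds."""
    return r_ohm * c_farad

def bit_interval(rate_bps):
    """Bit interval T_b in seconds for a given data rate."""
    return 1.0 / rate_bps

def ktc_noise_rms(c_farad, temp_kelvin=300.0):
    """RMS sampled thermal noise voltage, sqrt(kT/C); 300 K is assumed."""
    return math.sqrt(K_BOLTZMANN * temp_kelvin / c_farad)

rc = rc_time_constant(2.2e3, 250e-15)
print(f"RC = {rc * 1e12:.0f} ps")
for rate in (14.2e9, 16.7e9, 20e9, 24e9):
    t_b = bit_interval(rate)
    print(f"{rate / 1e9:5.1f} Gb/s: T_b = {t_b * 1e12:5.1f} ps, RC/T_b = {rc / t_b:4.1f}")
print(f"kT/C noise on 15 fF: {ktc_noise_rms(15e-15) * 1e3:.2f} mV rms")
```

With these values RC is roughly 8 to 13 times the bit interval over the tested rates, consistent with the RC ≫ $T_b$ design choice, and the sampled thermal noise is about 0.5mV rms, well below the measured voltage sensitivity.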

The receiver is wire-bonded to a high-speed photodiode and tested at different data rates. The photodiode, bonding pad, wire-bond, and the circuitry are estimated to introduce more than 200fF of capacitance. Figure 7.1.4 shows the optical test setup. The optical beam from a DFB laser diode is modulated by a high-speed Mach-Zehnder modulator and coupled to the photodiode through a single-mode fiber. As the beam forms a Gaussian profile upon leaving the fiber, the gap between the fiber tip and the photodetector causes optical intensity loss. This, combined with the optical connector's loss, is estimated to introduce about 3 to 4dB reduction in optical power at the photodiode. Figure 7.1.5 shows how the sensitivity of the receiver changes with data rate. Note that the coupling loss is not considered in this plot. The receiver achieves a sensitivity better than -12.5dBm at 10Gb/s, which degrades to -7.3dBm at 18.6Gb/s. The maximum achievable data rate in this experiment is mainly limited by the performance of the external optical intensity modulator. The maximum optical power before the receiver goes into saturation is about 0dBm. The receiver power consumption (including all clock buffers) at different data rates is also shown in Fig. 7.1.5. The power increases linearly with the data rate as the receiver employs mostly digital blocks. The receiver offers a peak power efficiency of 0.36mW/Gb/s at a 20Gb/s data rate. The tables in Fig. 7.1.6 summarize the performance of the optical receiver and compare it with prior art.
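The power-efficiency and sensitivity figures can be cross-checked with a few lines of arithmetic. The sketch below uses the power dissipation values listed in the performance summary of Fig. 7.1.6 and the standard dBm-to-milliwatt conversion; the helper names are chosen for this example.

```python
# Power dissipation (mW) at each data rate (Gb/s), from the performance summary.
POWER_MW = {14.2: 5.7, 16.7: 6.5, 20.0: 7.3, 24.0: 9.6}

def efficiency_mw_per_gbps(power_mw, rate_gbps):
    """Energy cost per bit expressed as mW per Gb/s (numerically equal to pJ/bit)."""
    return power_mw / rate_gbps

def dbm_to_mw(p_dbm):
    """Convert optical power from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)

eff = {rate: efficiency_mw_per_gbps(p, rate) for rate, p in POWER_MW.items()}
best_rate = min(eff, key=eff.get)
for rate, value in eff.items():
    print(f"{rate:5.1f} Gb/s: {value:.3f} mW/Gb/s")
print(f"best efficiency at {best_rate} Gb/s")

for p_dbm in (-12.5, -7.3, 0.0):
    print(f"{p_dbm:6.1f} dBm = {dbm_to_mw(p_dbm) * 1e3:7.1f} uW")
```

The lowest ratio occurs at 20Gb/s (7.3mW / 20Gb/s ≈ 0.365mW/Gb/s), which matches the quoted peak efficiency after rounding.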

The double-sampling optical receiver with dynamic offset modulation consumes less than 0.36mW/Gb/s while operating at 18.6Gb/s. The architecture is well suited to highly scaled technologies. Experimental results validate the feasibility of the receiver for ultra-low-power, high-data-rate, and highly parallel optical links.

## Acknowledgements:

The authors acknowledge the support of NSF, FCRP, Cosemi Tech Inc, and STMicroelectronics. M. Nazari would like to thank Z. Safarian for constant help and support.

## References:

- [1] C. L. Schow et al., "Low-Power 16x10 Gb/s bi-directional single chip CMOS optical transceivers operating at < 5 mW/Gb/s/link," *J. Solid-State Circuits*, vol. 44, no. 1, pp. 301-313, Jan. 2009.
- [2] T. Takemoto et al., "A Compact 4x25-Gb/s 3.0 mW/Gb/s CMOS-based optical receiver for board-to-board interconnects," *J. Lightwave Technology*, vol. 28, no. 23, pp. 3343-3350, Dec. 2010.
- [3] I. A. Young et al., "Optical I/O technology for Tera-scale computing," *J. Solid-State Circuits*, vol. 45, no. 1, pp. 235-248, Jan. 2010.
- [4] F. Liu et al., "10Gbps, 530fJ/b optical transceiver circuits in 40nm CMOS," *Symp. on VLSI Circuits Dig. Tech. Papers*, pp. 290-291, Jun. 2011.
- [5] D. Kucharski et al., "10Gb/s 15mW optical receiver with integrated Germanium photodetector and hybrid inductor peaking in 0.13µm SOI CMOS Technology," *ISSCC Dig. Tech. Papers*, pp. 360-361, Feb. 2010.
- [6] S. Palermo et al., "A 90 nm CMOS 16 Gb/s transceiver for optical interconnects," *J. Solid-State Circuits*, vol. 43, no. 5, pp. 1235-1246, May 2008.
- [7] J. Montanaro et al., "A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor," *J. Solid-State Circuits*, vol. 31, no. 11, pp. 1703-1714, Nov. 1996.



Figure 7.1.1: Receiver top-level architecture.

Figure 7.1.2: Double-sampling and dynamic offset modulation technique.

Figure 7.1.3: Receiver front-end and input current sensitivity at different data rates with PRBS9.

Figure 7.1.4: Optical test set-up, micrograph of the receiver with bonded photodiode. Panel labels: hybrid integration of the photodiode and the receiver; SMF.

Figure 7.1.5: Optical sensitivity, power efficiency, and input and output diagram.

| Performance Summary                                                                           | Value                      |
|-----------------------------------------------------------------------------------------------|----------------------------|
| Technology                                                                                    | 65nm CMOS                  |
| Supply Voltage                                                                                | 1.2V                       |
| Max Data Rate*                                                                                | 24Gb/s                     |
| RX C<sub>in</sub>                                                                             | 200fF                      |
| RX Sensitivity (BER<10<sup>-12</sup>), input current at 14.2Gb/s, 16.7Gb/s, 20Gb/s, 24Gb/s    | 76µA, 100µA, 130µA, 160µA  |
| Area                                                                                          | 50µm×50µm                  |
| Power Dissipation at 14.2Gb/s, 16.7Gb/s, 20Gb/s, 24Gb/s                                       | 5.7mW, 6.5mW, 7.3mW, 9.6mW |

<sup>\*</sup> Employing on-chip current emulator

## Comparison Table

|                    | This work             | [1]*                | [2]                | [4]        | [5]                | [6]*                 |
|--------------------|-----------------------|---------------------|--------------------|------------|--------------------|----------------------|
| Technology         | 65nm                  | 130nm               | 65nm               | 40nm       | 130nm SOI          | 90nm                 |
| Data Rate          | 18.6Gb/s              | 12.5Gb/s            | 25Gb/s             | 10Gb/s     | 10Gb/s             | 16Gb/s               |
| Power              | 0.4mW/Gb/s            | 3.5mW/Gb/s          | 3mW/Gb/s           | 0.4mW/Gb/s | 1.5mW/Gb/s         | 1.4mW/Gb/s           |
| Area               | 0.0028mm <sup>2</sup> | 0.15mm <sup>2</sup> | 0.4mm <sup>2</sup> | -          | 0.9mm <sup>2</sup> | 0.025mm <sup>2</sup> |
| RX C <sub>in</sub> | 200fF                 | 110fF               | -                  | <60fF      | <20fF              | 440fF                |
| Sensitivity        | -7.3dBm**             | -8.5dBm***          | -7.3dBm            | -15dBm     | -                  | -5.4dBm              |

<sup>\*</sup> Requires 8B/10B encoded data to ensure DC balance

<sup>\*\*</sup> Coupling loss is not considered @18.6Gb/s

<sup>\*\*\*</sup> Sensitivity @ 10Gb/s

Figure 7.1.6: Receiver performance and comparison with the prior art.