# Novel Low-Voltage Low-Power Full-Swing BiCMOS Circuits

Muhammad S. Elrabaa, Student Member, IEEE, Michael S. Obrecht, and Mohamed I. Elmasry, Fellow, IEEE

Abstract—A novel BiCMOS full-swing circuit technique with superior performance over CMOS down to 1.5 V is proposed. A conventional noncomplementary BiCMOS process is used. The proposed pull-up configuration is based on a capacitively coupled feedback circuit. Several pull-down options were examined and compared, and the results are reported. Several cells were implemented using the novel circuit technique; simple buffers, logic gates, and master-slave latches. Their performance, regarding speed, area, and power, was compared to that of CMOS for different technologies and supply voltages. Both device and circuit simulations were used. A design procedure for the feedback circuit and the effects of scaling on that procedure were studied and reported.

## I. INTRODUCTION

As VLSI technology scales down, the supply voltage also scales down. Although bipolar devices do not usually suffer any significant loss in performance with scaling, BiCMOS circuits utilizing them do. They suffer losses in speed and output voltage swing. Meanwhile, the high performance digital applications of the future will demand both speed and low power consumption at low supply voltages [1]. These demands are yet to be met by BiCMOS circuits.

At low supply voltages, fast and full-swing BiCMOS output waveforms become essential for both the speed and the static power dissipation of the driven CMOS gates. The speed of CMOS is greatly affected by the input slew rate, and if the input swing is only partial, subthreshold leakage currents cause a nonzero static power dissipation.

Currently, there are two major circuit techniques used in BiCMOS full-swing circuits. The first technique utilizes shunting of the output BJT drivers in one of two configurations: base-emitter shunting (using MOS or resistors) or collector-emitter shunting (using MOS) [2]–[4]. The shunting devices could be controlled by the output (via feedback) as in [4]. In all cases, this class of full-swing BiCMOS circuits suffers from slow switching during the last portion of the output transition and poorer low-voltage performance [2]–[4].

The second technique utilizes PNP BJT's in one of two configurations. The first configuration is the emitter-follower configuration where a PNP is used in the pull-down section and an NPN is used in the pull-up section; both emitters connected to the output node [5]. In the second configuration, the common-emitter configuration, the PNP is in the pull-up section and the NPN is in the pull-down section [5]–[7]. This class of circuits requires complementary BiCMOS processing. Also to achieve full-swing at low supply voltages, the PNP, used in the common-emitter configuration, has to be saturated during transients [5], [7]. This results in an excess power consumption which is not used to charge/discharge the output (the emitter provides a large, unnecessary, base current).

Manuscript received December 4, 1992; revised September 27, 1993. This work was supported in part by research grants from NSERC, ITRC, and MICRONET.

M. S. Elrabaa and M. I. Elmasry are with the VLSI Research Group, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1.

M. S. Obrecht is with the VLSI Research Group, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1, on leave from the Institute of Pure and Applied Mechanics, Novosibirsk, Russia.

IEEE Log Number 9214770.

The proposed novel circuits reported in this paper utilize a conventional emitter-follower configuration combined with a positive capacitively coupled feedback technique to achieve output swings very close to the rail supply for voltages down to 1.5 V. They have a more efficient usage of power and do not require PNP's or any other special processing. In Section II the concept of operation of the novel circuits is presented and compared to that of conventional BiCMOS circuits. Also in that section, the operation of the novel circuits was verified using both circuit and device simulations. In Section III, different building blocks that are used in high performance digital subsystems were implemented using the novel circuit techniques. Their performance was evaluated and compared to that of optimized CMOS blocks with similar functionality. The performance comparisons included delay, area, and power comparisons for several technologies and supply voltages. Finally, in Section IV, a brief discussion on the design of the feedback circuitry and the effects of technology and voltage scaling on the design process is presented.

## II. CONCEPT OF OPERATION

#### A. Conventional BiCMOS Circuits

Fig. 1. The conventional BiCMOS pull-up circuit; (a) schematics, (b) the transient response of the base and output voltages, and (c) the transient response of the base and emitter currents.

Conventionally, BiCMOS circuits utilized an emitter-follower configuration for the pull-up section as in Fig. 1(a). In such circuits, the maximum output voltage ($V_{omax}$) is limited by the $V_{BE}$ drop across the base-emitter junction of the BJT driver. The HSPICE [9] transient circuit simulation results for the base and emitter voltages of such circuits, during output pull-up, are shown in Fig. 1(b). As the input falls, the PMOS starts conducting, charging the base-emitter junction to $V_{BE(on)}$, and the bipolar starts to conduct, raising the output node voltage. As the output rises, the base voltage also rises and approaches $V_{DD}$. The PMOS starts to turn off and the base current decreases and finally changes direction; Fig. 1(c). At this point, there is still a collector current due to the remaining base (minority carriers) charge. As a result, the output voltage continues to rise until it reaches a value that is slightly above $V_{\rm DD} - V_{\rm BE}$ by the end of the transition. The voltage of the almost floating base increases above $V_{\rm DD}$, following the output voltage. However, the PMOS starts to conduct in the reverse direction, removing some of the extra base charge and bringing the base voltage back to $V_{\rm DD}$; Fig. 1(b). The base charge, and hence the final value of the output voltage, depends on the value of the load capacitance $C_L$, the technology parameters, and the supply voltage [8]. For low $C_L$ and/or scaled down technologies and supply voltages, $V_{\rm omax}$ is relatively lower.
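To see why the $V_{BE}$ drop matters more as the supply scales, the following minimal sketch evaluates the first-order limit $V_{omax} \approx V_{DD} - V_{BE}$ for several supplies. The fixed 0.7 V value for $V_{BE(on)}$ and the function names are assumptions made for illustration; as noted above, the real final level sits slightly above this limit and depends on $C_L$ and the technology.

```python
# Assumption: V_BE(on) is a fixed 0.7 V. The actual final output level
# depends on the load capacitance, technology, and supply voltage.
V_BE_ON = 0.7  # volts (assumed)


def conventional_vomax(vdd, vbe=V_BE_ON):
    """First-order V_omax of an emitter-follower pull-up: V_DD - V_BE."""
    return vdd - vbe


def swing_loss_fraction(vdd, vbe=V_BE_ON):
    """Fraction of the supply voltage lost to the V_BE drop."""
    return vbe / vdd


for vdd in (5.0, 3.3, 2.0, 1.5):
    print(f"V_DD = {vdd:.1f} V: V_omax ~ {conventional_vomax(vdd):.2f} V, "
          f"loss ~ {100 * swing_loss_fraction(vdd):.0f}% of V_DD")
```

Under this assumption the lost fraction of the swing grows from a small share of a 5 V supply to nearly half of a 1.5 V supply.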

#### B. The Novel Circuits

Referring to Fig. 1(a), if the PMOS is turned off before the end of the output transition, the charge leakage from the base would be significantly reduced. Hence $V_{\rm omax}$ would increase. In fact, if the PMOS is turned off, half of its channel charge would be injected into the base. This will boost the collector current for an additional period of time and increase $V_{\rm omax}$. Also, if an additional source of charge is used to inject extra charge into the base, $V_{\rm omax}$ will increase even further.

The proposed pull-up circuits are shown in Fig. 2 with an NMOS pull-down section. Several pull-down circuits will be studied later. They use the above-mentioned techniques to achieve full and fast output swing for supply voltages down to 1.5 V. In Fig. 2, a feedback switch consisting of MP2 and MN1 is used to turn the PMOS (MP1) off by the end of the output transition, and a positive feedback capacitance $C_{fb}$ is used as an additional source of base charge. The feedback switch, the feedback inverters in1 and in2, and $C_{fb}$ will be referred to as the feedback circuit.

The novel circuits work as follows: As the output voltage rises, a charge is stored in $C_{fb}$. When the output approaches the maximum value limited by the $V_{BE}$ drop, the feedback circuit injects this charge back into the BJT base. The base voltage rises, so the BJT continues to conduct and the output voltage reaches the rail voltage. As $C_{fb}$ starts injecting charge into the base, the feedback switch starts to turn the PMOS off, minimizing charge losses and injecting more charge into the base. Also, turning the PMOS off reduces the pull-down time since MN will have to discharge only the BJT base. Feedback timing is controlled by the sizing of the feedback inverters and will be discussed later.



Fig. 2. Different implementations of the novel BiCMOS circuits utilizing the positive dynamic feedback.

The circuits in Figs. 2(b) and (c) are noninverting, while 2(a) is inverting. Circuit 2(a), however, would draw a substantial current from the driving gate (like MOS pass logic) during switching, which may cause a dip in the input voltage. Circuit 2(b) is obtained by adding a CMOS gate to the input of 2(a), the simplest of all, while in 2(c), the CMOS gate drives the pull-down section, and only an NMOS block is used to drive the pull-up section to reduce the parasitics at the source node of MN1.

A merged BiPMOS device was used for both the NPN and MP1 for two reasons: 1) To not lose additional charge in the MP1 drain-substrate diode when the base node voltage goes above $V_{\rm DD}$, and 2) to reduce the overall area. The dotted section of Fig. 2(a) is shown in Fig. 3. The merged device used had a width of 20 $\mu$m and the depths of emitter, base, buried layer, and *P*-substrate were 0.01, 0.11, 0.6, and 1.0 $\mu$m, respectively. The bipolar gain was enhanced by reducing the base current by decreasing the surface recombination velocity at the emitter contact. The drain depth was about 0.08 $\mu$m and the effective metallurgical channel length was 0.38 $\mu$m.

Fig. 3. A cross section of the merged BiPMOS device used in the simulations.

TABLE I

The Device Parameters of the Three Generic BiCMOS Technologies

| CMOS |  |  |  |  | Bipolar (min. size) |  |  |  |  |
|------|------|------|------|------|------|------|------|------|------|
| L<sub>eff</sub> (μm) | V<sub>th</sub> NMOS (V) | V<sub>th</sub> PMOS (V) | t<sub>ox</sub> (nm) | (fF/μm) | β | τ<sub>F</sub> (ps) | I<sub>k</sub> (mA) | C<sub>c</sub> (fF) | C<sub>s</sub> (fF) |
| 0.8 | 0.8 | -0.8 | 15 | 0.4 | 100 | 10 | 2.5 | 9.5 | 25 |
| 0.5 | 0.56 | -0.6 | 12 | 0.6 | 100 | 7 | 1.8 | 7.5 | 18 |
| 0.2 | 0.35 | -0.35 | 7 | 1 | 100 | 4 | 1.0 | 3.5 | 8.5 |

The operation of the novel pull-up circuits was verified using both circuit and device simulations. Circuit simulations showed that the novel circuits can achieve swings very close to the rail voltage for different technologies and supplies, as shown in Fig. 4. The HSPICE parameters of those generic BiCMOS technologies are in Table I. The output is fast for the whole duration of transients, i.e., there is no slow portion at the end of the transition as in circuits that utilize MOS shunting (e.g., in [4]). Also, Fig. 5 shows the emitter current of the BJT, during the pull-up and pull-down transients, at a frequency of 250 MHz. During pull-down, the emitter current is zero, indicating that the excess minority carriers trapped in the collector, due to saturation, do not affect the pull-down transients. This is because, unlike the circuit in [7], where a PNP collector drives the output, here an NPN emitter drives the output.

Fig. 4. The pull-up transient response of the novel circuit, for several technologies and supply voltages (solid lines), and for $C_L = 0.3$ pF. For the (0.8 $\mu$m, 5 V) and (0.5 $\mu$m, 3.3 V) technologies, the dotted lines represent the response of the circuit in [4] with an MOS output shunt.

Fig. 5. The emitter current of the pull-up BJT during pull-up and pull-down at a 250 MHz frequency.

Fig. 6. The transient response of the novel pull-up circuit with and without turning the PMOS off. Also the effect of $C_{fb}$ is shown.

A two-dimensional transient device simulator, TRASIM [10], that was developed from DC device simulators for MOS and bipolar devices [11], [12], was used to check $V_{omax}$ and the effects on it of turning off the PMOS and of the feedback capacitance. The structure in Fig. 3 was simulated. The gate voltage, $V_G$, was changed from $V_{DD}$ to zero and back to $V_{DD}$ to simulate the feedback switch. As $V_G$ increases back to $V_{DD}$, $V_{fb}$ is increased from zero to $V_{DD}$. The start of the feedback action is defined as the time when both $V_G$ and $V_{fb}$ change from low to high. The initial values of the base and emitter voltages were set to zero. Fig. 6 shows that turning the PMOS off increases the swing significantly, and adding the feedback capacitance increases it even further. Although this figure might seem to understate the importance of the feedback capacitance, since it only increased the swing by about 0.2 V, it should be noted that the subthreshold slope of state-of-the-art MOSFET's is about 90 mV/decade. This means that the use of $C_{fb}$ will reduce the leakage currents in CMOS gates driven by the output by about two orders of magnitude. Also, Fig. 6 shows that the output voltage does not have a "slower" portion; instead, its slope actually increases after the start of feedback.
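The two-orders-of-magnitude estimate can be checked with a short calculation. The sketch below assumes an ideal exponential subthreshold characteristic, in which the leakage current changes by one decade for every subthreshold slope $S$ of gate drive, and assumes the driven device stays in the subthreshold region; the function name is illustrative.

```python
def leakage_reduction_factor(delta_v, subthreshold_slope):
    """Factor by which subthreshold current drops when the gate drive
    of the leaking device is reduced by delta_v.

    delta_v: improvement in output swing, in volts
    subthreshold_slope: slope S, in volts per decade of current
    Assumes an ideal exponential subthreshold characteristic.
    """
    return 10.0 ** (delta_v / subthreshold_slope)


DELTA_SWING = 0.2  # V, swing gained by adding C_fb (from the Fig. 6 discussion)
SLOPE = 0.090      # V/decade, subthreshold slope quoted in the text

decades = DELTA_SWING / SLOPE
factor = leakage_reduction_factor(DELTA_SWING, SLOPE)
print(f"{decades:.1f} decades, i.e. a factor of about {factor:.0f}")
```

With these inputs the reduction comes out slightly above two decades, consistent with the estimate in the text.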








Fig. 7. The 2-D hole distribution in the merged BiPMOS device at different instants during pull-up transient. (a) Before the feedback. (b) Just after the start of the feedback. (c) At the end of transition.

Device simulations were also used to check for latch-up resulting from a parasitic PNP in the merged BiPMOS devices that was reported to be a potential source of latch-up [13]. There are two parasitic PNP's to be considered. One is under the PMOS gate between the base and source, and the other is under the base between the base and the substrate. From the 2-D hole distribution at different points of time during transients, Fig. 7, the following could be noticed:

1) There is a PNP between the source and the base. It enhances the performance before the feedback by increasing $I_b$. However, after the start of feedback, it reverses direction, saturates (Fig. 7(b)) and removes some of the base charge, hence, decreasing the voltage swing. This PNP could be eliminated using special layout techniques for the merged structure [13].
2) Also, after the start of feedback, holes are injected into the collector, due to saturation. However, hole injection into the substrate remains insignificant, as Fig. 7(b) shows.
3) By the end of transition, the hole concentration in the N-well, under the gate and base, falls by a few orders of magnitude (Fig. 7(c)). This means that there are no longer any parasitic PNP's and hence no runaway conditions.

Fig. 8. The 2-D electron concentration in the base for two instants after the start of feedback. (a) Just after the start of the feedback. (b) At the end of transition.

The 2-D electron concentration in the base is shown in Fig. 8 for two instants after the start of the feedback. The BJT continues to conduct after the start of the feedback, as indicated by the large concentration gradient in the base, Fig. 8(a). By the end of the transition, the electron concentration in the base drops by several orders of magnitude, as in Fig. 8(b). This means that the pull-down circuit will not be affected, confirming the results obtained from circuit simulations.

## III. PERFORMANCE COMPARISONS

In this section the performance of three types of circuits implemented using the novel circuit techniques will be evaluated for several design conditions. These circuits are simple buffers, AND gates, and master-slave D latches.

#### A. Simple Buffers

Fig. 9. The average delay of the circuits in Fig. 2 compared to CMOS and using the (0.2 $\mu$m, 2 V) technology.

1) Delay Comparison: The delay time, $T_D$, the average of rise and fall times, measured from input $= V_{DD}/2$ to output $= V_{\rm DD}/2$, was calculated from HSPICE simulations for the circuits in Fig. 2 for the (0.2 $\mu$m, 2 V) technology as a function of $C_L$. $T_D$ was also calculated for an optimized CMOS buffer with the same input capacitance as the three BiCMOS buffers. The CMOS buffer was limited to one or two stages only (depending on the value of $C_L$), for practical area considerations. Also, for all circuits, the rise and fall times were approximately equal. The results are shown in Fig. 9. The novel circuits outperformed CMOS down to 0.2 pF load capacitance, something that was not achieved by any other reported BiCMOS circuit at 2 V. The speed-up of the novel circuits over the CMOS buffer starts to increase as $C_L$ increases, but it then starts to decrease and finally CMOS becomes faster. However, at such a point the CMOS buffer chain becomes excessively large. Similar behavior was recently reported for conventional BiCMOS buffers when compared with optimized CMOS buffer chains [14].
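The delay definition above can be applied to sampled simulation waveforms as in the following minimal sketch. It is not the measurement script used for the reported results; the helper names, the linear interpolation between samples, and the toy waveform values are assumptions for illustration only.

```python
def crossing_time(times, values, level, rising):
    """Linearly interpolated time at which `values` first crosses `level`."""
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(times, vin, vout, vdd, out_rising, inverting):
    """Delay from input = V_DD/2 to output = V_DD/2 for one transition."""
    half = vdd / 2.0
    # For an inverting gate the input edge is opposite to the output edge.
    in_rising = out_rising != inverting
    t_in = crossing_time(times, vin, half, in_rising)
    t_out = crossing_time(times, vout, half, out_rising)
    return t_out - t_in


def average_delay(t_rise, t_fall):
    """T_D as the mean of the pull-up and pull-down delays."""
    return 0.5 * (t_rise + t_fall)


# Toy waveforms (hypothetical values): time in ps, voltages in V, V_DD = 2 V.
times = [0.0, 100.0, 200.0, 300.0, 400.0]
vin_fall = [2.0, 0.0, 0.0, 0.0, 0.0]
vout_rise = [0.0, 0.0, 1.0, 2.0, 2.0]

t_rise = propagation_delay(times, vin_fall, vout_rise, 2.0,
                           out_rising=True, inverting=True)
print(f"pull-up delay = {t_rise:.0f} ps")
```

The same `propagation_delay` call with `out_rising=False` gives the pull-down delay, and `average_delay` combines the two into $T_D$.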

Two other pull-down circuits were tested in conjunction with the novel pull-up circuit. One circuit is similar to the pull-up circuit, without the feedback capacitance, and the other is similar to the one in [7], as shown in Fig. 10. The delays of these two circuits and that of circuit 2(a) and the optimized CMOS buffer, are shown in Fig. 11. The novel pull-up/pull-down circuit achieved the highest speed-up over CMOS for higher load capacitance. The circuit with NMOS pull-down achieved good speed-up at lower load capacitance and had the least area among the three BiCMOS circuits.

2) Power Comparison: The average power dissipation of the three BiCMOS circuits, 2(a), 10(a), and 10(b), and the CMOS buffer at 100 MHz is shown in Fig. 12 as a function of  $C_L$ . Circuit 2(a) has the least power dissipation. The two other BiCMOS circuits have power consumptions that are very close to that of the CMOS buffer, especially circuit 10(a).

Fig. 10. The novel pull-up circuit combined with (a) a novel pull-down circuit, and (b) a pull-down circuit similar to the one in [5].

Fig. 11. The average delay of the circuits in Figs. 10 and 2(a) compared to CMOS for the (0.2 $\mu$m, 2 V) technology.

3) Delay versus Supply Voltage: The delay of circuit 10(a) was calculated as a function of the supply voltage, $V_{\rm DD}$, and compared to that of the two-stage CMOS buffer for the three BiCMOS technologies, as shown in Fig. 13. The areas of the two circuits were kept approximately equal and the value of $C_L$ was approximately equivalent to a fanout of 4. For the 0.8 $\mu$m technology the novel BiCMOS circuit did not operate below 2 V supply voltage. However, it outperformed the CMOS buffer down to that voltage. As for the 0.5 $\mu$m technology, the novel circuit outperformed CMOS down to about 1.7 V. At 0.2 $\mu$m it even outperformed CMOS down to 1.5 V. These results were not achieved by any previously reported BiCMOS circuits that do not utilize the highly expensive PNP's.

#### B. AND Gates

Fig. 12. Power versus $C_L$ for the circuits in Figs. 10 and 2(a) compared to CMOS for the (0.2 $\mu$m, 2 V) technology.

Fig. 13. Delay versus supply voltage of circuit 10(a) and the CMOS buffer for three BiCMOS technologies.

A multi-input AND gate was implemented using the novel circuit technique with slight modifications to increase the speed. As Fig. 14 shows, the feedback switch is slightly different. MP2 is connected as a load and the NMOS MN1 was placed at the bottom of the NMOS logic block so that a PMOS logic block, in the pull-up section, is not needed. There is no static power dissipation since the feedback turns MN1 off before the end of the transition. This configuration saves area and enhances speed by reducing the parasitic capacitance at the BiPMOS gate. The feedback capacitance was dropped since this circuit was intended for the 0.5 $\mu$m, 3.3 V technology. Turning the PMOS off was sufficient to make the output voltage reach 3.1 V. This value of $V_{\text{omax}}$ did not decrease as the number of inputs was increased up to 7. An NMOS base-emitter shunt is used to discharge the NPN base. Since $V_{\rm th}$ of the NMOS is smaller than $V_{\rm BE(on)}$, the BJT will be turned off at the end of pull-up and would remain off during the pull-down. The pull-down section is similar to that in [7] with a few modifications. MN2 is diode connected and the unnecessary NMOS logic block was removed to reduce area and parasitic capacitance at the base of the BJT.



Fig. 14. An AND gate implemented using the novel circuit technique.



Fig. 15. The speed-up and area ratio between the novel BiCMOS AND and the CMOS NAND for the (0.5  $\mu m,$  3.3 V) technology.

The pull-down section of the AND gate operates as follows: starting with high inputs and output, as one or more inputs change to low, the corresponding PMOS in the pull-down section turns on, hence turning the bipolar on, which starts discharging the output node. This continues until the feedback inverter turns MP3 off and MN2 starts discharging the bipolar base and turning it off. Meanwhile, the shunting NMOS in the pull-up section keeps the BJT off. At the end of the transition, the feedback switch in the pull-up circuit is turned on and the circuit is ready for a pull-up transition. The operation of the pull-up section is as described in Section II. However, in this circuit, the NMOS shunting the BJT is turned on at the end of the transition such that the circuit is ready for the next pull-down transition.

The sizing of transistors MP2 and MN2 is very important. MP2 should be small enough not to slow the NMOS logic chain, yet large enough to prevent or reduce glitches at the BiPMOS gate node. MN2 should be small enough not to slow the parallel PMOS logic block from turning the BJT on, yet large enough to discharge the BJT in adequate time (depending on the frequency).

The speed-up and the area ratio between the novel BiCMOS AND gate and a CMOS NAND with equal input capacitance for the (0.5  $\mu$ m, 3.3 V) technology are shown as a function of the number of inputs in Fig. 15.

The speed-up is defined as

$$\text{Speed-Up Factor} = \frac{\text{Delay of CMOS Gate}}{\text{Delay of Novel BiCMOS Gate}}.$$



Fig. 16. A master-slave latch implemented using the novel circuit technique.

Fig. 15 shows that for a single input, the speed-up factor is about 1.7 and the area ratio is about 4.3. For six-input gates, the speed-up factor reaches a maximum of 2.7 and the area ratio drops to about 1.6. Hence in macrocells (e.g., adders, ALU's, etc.) implemented using the novel techniques, as the number of inputs per logic gate increases, the overall speed-up over the CMOS implementation increases while the relative area penalty decreases. This is confirmed by results reported in [1], where the speed-up of a BiCMOS carry lookahead adder over a CMOS one increased as the average fan-in per logic gate increased. The technique reported here, however, in addition to being faster than CMOS, would consume less area, rendering it more attractive.

#### C. Master-Slave D Flip-Flop

A master-slave D flip-flop that utilizes a single-phase clocking scheme, implemented using the novel circuit techniques, is shown in Fig. 16. Single-phase clock operation is essential for future RISC processor applications [1]. A clocked inverter is used to hold the output when the inputs are disabled by either the clock (CK) or the inverted clock $(\overline{CK})$ [1]. The pull-up section is similar to that of the AND gate, except for the PMOS MP3 at the input, which is used to prevent glitches at the gate node of the BiPMOS device. Glitches may occur when the input is low and the clock goes high. This also eliminates the need for a clocked inverter at the input. Hence the area and the delay are reduced (no series PMOSFET's needed). The PMOS in the feedback switch is now controlled by the feedback inverter. The NMOS shunting the BJT is controlled by the pull-down section. Unlike the AND gate or the gate in [7], a new arrangement is used for the pull-down section. It consists of a regular feedback switch connected in series with another switch that is controlled by the input and the clock. This arrangement eliminates the need for stacking three PMOSFET's on top of each other, which would slow the circuit and hinder its low-voltage operation.

This BiCMOS flip-flop, like the one in [1], does not suffer in performance if there is a clock skew between CK and  $\overline{CK}$ . However, it will have smaller area and faster output response even for low supply voltages.

Fig. 17. The write and total delays of the BiCMOS and CMOS latches versus the supply voltage.

The performance of the novel BiCMOS D flip-flop was compared to that of a CMOS single-phase clocked D flip-flop with the same input capacitance and an approximately equal area for several supply voltages. Using the 0.5 $\mu$m BiCMOS technology and a fanout of 4, the write and total delays of both circuits are reported in Fig. 17 as a function of the supply voltage. The write time is the time required to transfer the data from the input to the output of the master latch. The total delay time is the time required to transfer the data from the input to the slave latch. The novel BiCMOS flip-flop not only outperformed the CMOS one by almost a factor of 2 in total delay down to 1.5 V supply, but it also had a smaller write time.

Hence the new circuit techniques can be used to implement buffers, logic gates, and master-slave latches that exceed the speed of CMOS for equal input capacitance and silicon area, and for different technologies and supply voltages.

## IV. THE DESIGN OF THE FEEDBACK CIRCUIT

The feedback circuit consists of three parts: 1) the feedback switch, MN1 and MP2 in Fig. 2; 2) the feedback capacitance $C_{fb}$; and 3) the feedback inverters, in1 and in2 in Fig. 2. The size of MN1 equals that of the other NMOS transistors in the NMOS series logic block at the input, which are sized according to the input capacitance (loading on the driving gate) specifications. MP2 is minimally sized such that it turns MP1 off neither prematurely nor too late; either case would lead to a smaller output swing. The sizing of MP2 is strongly coupled to that of the feedback inverter in1, so the NMOS in in1 and MP2 are sized simultaneously such that the feedback switch is turned off when the output is about $V_{\text{DD}} - V_{\text{BE(on)}}$. If in1 is also used to control the pull-down section (as in the circuits of Figs. 9, 14, and 16), its PMOS should be sized such that the pull-down feedback switch is also turned off at the proper time. The discharging NMOSFET's, used to discharge the bases of the pull-up/down BJT's, should be sized minimally and according to the frequency of operation such that they discharge the BJT's base in adequate time, yet do not load the circuit significantly. Inverter in2 is sized depending on the value of $C_{fb}$: the larger $C_{fb}$ is, the larger the PMOS in in2 should be. However, the width of the NMOS in in2 is set close to the minimum width.

The value of $C_{fb}$ depends on many factors, such as the loading, the technology, and the supply voltage. $C_{fb}$ should be able to hold enough charge to charge the NPN base to about 0.7 V above $V_{\rm DD}$ and supply the base charge needed to make the NPN continue conducting till the output reaches $V_{DD}$. For a base-collector junction capacitance $C_{bc}$, the minimum $C_{fb}$ required to charge the base to $V_{DD} + 0.7$ V would be

$$C_{fb}(\min) = \frac{0.7\,C_{bc}}{V_{DD}}. \tag{1}$$
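Equation (1) can be read as a charge balance: a capacitor $C_{fb}$ whose driven plate swings by $V_{DD}$ delivers a charge $C_{fb}V_{DD}$, which must at least equal the charge $0.7\,C_{bc}$ needed to raise the base by 0.7 V. As a rough numerical check, the sketch below evaluates (1) for a 3 V supply and a $C_{bc}$ of 20 fF, the example values used in this section, and compares the result with a 15 fF feedback capacitor. The function and variable names are illustrative.

```python
def cfb_min(c_bc, vdd, v_boost=0.7):
    """Minimum feedback capacitance from (1): v_boost * C_bc / V_DD.

    c_bc: base-collector junction capacitance, in farads
    vdd: supply voltage, in volts
    v_boost: required base overdrive above V_DD, in volts (0.7 V in the text)
    """
    return v_boost * c_bc / vdd


C_BC = 20e-15      # F, example base-collector capacitance
VDD = 3.0          # V, example supply voltage
C_FB_USED = 15e-15 # F, example feedback capacitance

c_min = cfb_min(C_BC, VDD)
margin = C_FB_USED / c_min
print(f"C_fb(min) ~ {c_min * 1e15:.1f} fF, "
      f"15 fF is about {margin:.1f}x the minimum")
```

The margin over $C_{fb}(\min)$ covers the base charge that must also be supplied to keep the NPN conducting, which (1) alone does not account for.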

Noting that  $C_{fb}$  is also assisted by the charge injected by the turning off of the PMOS MP1,  $C_{fb}$  does not need to be much greater than the value of  $C_{fb}(\min)$ . This is important since increasing the value of  $C_{fb}$  decreases the output slew rate and hence increases the delay, as Fig. 6 shows. This means that for a supply voltage of 3 V and a  $C_{bc}$  of about 20 fF, a  $C_{fb}$  of 15 fF would probably be sufficient for the circuit to achieve a full swing without compromising the speed significantly. For very high loads,  $C_{fb}$  would have to be increased above the value of  $C_{fb}(\min)$ , especially if the charging PMOS is not very large. The amount of charge supplied to the NPN base by the PMOS when it turns off,  $\Delta Q$ , could be roughly estimated as half of the total channel charge under the gate, i.e.,

$$\Delta Q = \frac{1}{2} C_{\rm ox} V_{\rm DD}$$

where $C_{ox}$ is the gate oxide capacitance of MP1. If this capacitance is about 40 fF and $V_{DD}$ is 3 V, $\Delta Q$ would supply the NPN with an average base current of about 0.6 mA for a hundred picoseconds, which is usually sufficient to achieve full swing. It should be noted, however, that a portion of that current would be injected into the collector of the now saturated NPN. Hence the circuit designer should not rely only on $\Delta Q$ to achieve full swing, especially at lower supply voltages, as Fig. 6 shows.
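The 0.6 mA figure follows directly from the expression for $\Delta Q$. The sketch below reproduces the arithmetic using the values quoted above; the function names are illustrative, and the estimate ignores the portion of the charge lost to the collector of the saturated NPN.

```python
def pmos_turnoff_charge(c_ox, vdd):
    """Charge injected into the base when MP1 turns off: half the channel charge."""
    return 0.5 * c_ox * vdd


def average_current(charge, duration):
    """Average current delivered by `charge` over `duration`."""
    return charge / duration


C_OX = 40e-15      # F, gate oxide capacitance of MP1 (value quoted in the text)
VDD = 3.0          # V
DURATION = 100e-12 # s, one hundred picoseconds

delta_q = pmos_turnoff_charge(C_OX, VDD)
i_base = average_current(delta_q, DURATION)
print(f"delta Q = {delta_q * 1e15:.0f} fC, "
      f"average base current = {i_base * 1e3:.1f} mA")
```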

The redesign of the feedback circuit as the technology scales is not straightforward. This is because supply voltage scaling does not usually follow that of the horizontal dimensions, hence the value of $C_{fb}(\min)$ in (1) above will not remain constant with scaling. However, a simple analysis reveals the following: although both $C_{bc}$ and $V_{DD}$ scale down in different proportions that may cause the value of $C_{fb}(\min)$ to increase, the amount of charge required to keep the NPN conducting will become smaller with scaling. This is because the base will be shallower, its area will be smaller, and the collector doping will be higher, leading to smaller base charge, smaller leakage surface for the injected charge, and less parasitic PNP latch-up, respectively. This means that the required value of $C_{fb}$ will probably not increase significantly with scaling. For this work, the values of $C_{fb}$ used were around 20, 30, and 50 fF for the (0.8 $\mu$m, 5 V), the (0.5 $\mu$m, 3.3 V), and the (0.2 $\mu$m, 2 V) technologies, respectively.

## V. CONCLUSION

Novel BiCMOS full-swing circuits that utilize a positive capacitively coupled feedback technique and have superior performance over CMOS down to a $V_{\rm DD}$ of 1.5 V were presented. Their operation was analyzed and confirmed using both circuit and device simulations. Three types of circuits were implemented using the novel circuit techniques and their performance was compared to CMOS. The novel circuits achieved higher speeds for the same input capacitance and area. The novel circuits outperformed CMOS down to 1.5 V supply, for both simple and complex gates. For multi-input gates, it was found that as the number of inputs increases, the speed-up of the novel circuits over their CMOS counterparts increases while the area difference decreases, rendering the novel circuits attractive for logic sub-system implementation. The design procedure of the feedback circuitry used in the novel circuits was discussed and the effects of scaling on that procedure were summarized. It was shown that, as the technology scales down, the value of the feedback capacitance and hence the sizes of the feedback inverters will not increase.

#### REFERENCES

- [1] K. Yano et al., "3.3-V BiCMOS circuit techniques for 250-MHz RISC arithmetic modules," IEEE J. Solid-State Circuits, vol. 27, pp. 373-381, 1992.
- [2] S. H. Embabi, A. Bellaouar, and M. I. Elmasry, *Digital BiCMOS Integrated Circuits Design*. Kluwer Academic, 1993.
- [3] Y. Nishio et al., "A BiCMOS logic gate with positive feedback," in ISSCC Tech. Dig., 1989, pp. 116-117.
- [4] H. Hara et al., "0.5 µm 2M-transistor BipnMOS channelless gate array," in ISSCC Tech. Dig., 1991, pp. 148-149.
- [5] Hyun Shin, "Performance comparison of driver configurations and full-swing techniques for BiCMOS logic circuits," IEEE J. Solid-State Circuits, vol. 25, pp. 863-865, 1990.
- [6] S. H. Embabi et al., "New full-voltage-swing BiCMOS buffers," IEEE J. Solid-State Circuits, vol. 25, pp. 150-153, 1991.
- [7] M. Hiraki et al., "A 1.5 V full-swing BiCMOS logic circuit," in ISSCC Tech. Dig., 1992, pp. 48-49.
- [8] T. Arnborg, "Performance predictions of scaled BiCMOS gates using physical simulation," IEEE J. Solid-State Circuits, vol. 27, pp. 754-760, 1992.
- [9] HSPICE User's Manual, Meta-Software, Inc., Campbell, CA, 1990.
- [10] M. S. Obrecht and M. I. Elmasry, "TRASIM-a compact and efficient two-dimensional transient simulator for arbitrary planar semiconductor devices," to be published.
- [11] M. S. Obrecht, "SIMOS-two-dimensional steady-state simulator for MOS devices," Solid-State Electronics, Software Survey Section, vol. 32, no. 6, 1989.
- [12] M. S. Obrecht and J. M. Teven, "BISIM-a program for steady-state two-dimensional modeling of various bipolar devices," Solid-State Electronics, Software Survey Section, vol. 34, no. 7, 1991.
- [13] H. Momose et al., "Characterization of speed and stability of BiNMOS gates with a bipolar and PMOSFET merged structure," IEDM Tech. Dig., 1990, pp. 231-234.
- [14] M. S. Elrabaa and M. I. Elmasry, "Design and optimization of buffer chains and logic circuits in a BiCMOS environment," IEEE J. Solid-State Circuits, vol. 27, pp. 792-801, 1992.



Muhammad S. Elrabaa (S'90) was born in Khartoum, Sudan, on August 6, 1968. He received the B.Sc. degree in computer engineering from Kuwait University, Kuwait, in 1989 and the M.A.Sc. degree in electrical engineering from the University of Waterloo, Waterloo, Ontario, Canada. He is currently studying towards the Ph.D. degree at the University of Waterloo. His research interests include low-power low-voltage BiCMOS circuits and ECL circuits.



Michael S. Obrecht was born in Kholmsk, Sakhalin, Russia, on January 14, 1953. He received the M.Sc. degree in physics and applied mathematics in 1975 and the Ph.D. degree in theoretical physics in 1983, both from the Novosibirsk State University, Novosibirsk, Russia. He has worked in the area of semiconductor device numerical modeling for the last 10 years.

Currently, he is a Senior Scientist at the Institute of Pure and Applied Mechanics, Russian Academy of Sciences, Novosibirsk, Russia. In 1991, he joined the Department of Electrical and Computer Engineering of the University of Waterloo as a Visiting Research Associate Professor. He has authored and coauthored over 20 papers. He has developed software tools for two-dimensional process-device simulation which are being distributed worldwide. Recently, he has been working on numerical modeling of semiconductor devices for BiCMOS circuits. His research interest is mainly in new efficient algorithms for steady-state and transient, 2D and 3D semiconductor device simulation.



Mohamed I. Elmasry (S'69–M'73–SM'79–F'88) was born in Cairo, Egypt, on December 24, 1943. He received the B.Sc. degree from Cairo University, Cairo, Egypt, and the M.A.Sc. and Ph.D. degrees from the University of Ottawa, Ottawa, Ontario, Canada, all in electrical engineering, in 1965, 1970, and 1974, respectively.

He has worked in the area of digital integrated circuits and system design for the last 27 years. He worked for Cairo University from 1965 to 1968 and for Bell-Northern Research, Ottawa, Canada, from 1972 to 1974. He has been with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada, since 1974, where he is a Professor and founding Director of the VLSI Research Group. He has a cross appointment with the Department of Computer Science, where he is a Professor. He has held the NSERC/BNR Research Chair in VLSI design at the same university since 1986. He has served as a consultant to research laboratories in Canada and the United States, including AT&T Bell Labs, GE, CDC, Ford Microelectronics, Linear Technology, Xerox, and BNR, in the area of LSI/VLSI digital circuit/subsystem design. During sabbatical leaves from the University of Waterloo he has been at the Micro Components Organization, Burroughs Corporation (Unisys), San Diego, CA; Kuwait University, Kuwait; and the Swiss Federal Institute of Technology, Lausanne, Switzerland. He has authored and coauthored over 200 papers on integrated circuit design and design automation. He holds several patents. He is editor of the IEEE Press books Digital MOS Integrated Circuits (1981), Digital VLSI Systems (1985), Digital MOS Integrated Circuits II (1991), and Analysis and Design of BiCMOS Integrated Circuits (1993). He is also author of Digital Bipolar Integrated Circuits (John Wiley, 1983) and a coauthor of Digital BiCMOS Integrated Circuits (Kluwer, 1992). He has served in many professional organizations in different positions, including the chairmanship of the Technical Advisory Committee of the Canadian Microelectronics Corporation. He is a founding member of the Canadian Conference on VLSI and the International Conference on Microelectronics, and is the founding President of Pico Electronics Inc.

Dr. Elmasry is a member of the Association of Professional Engineers of Ontario and is a fellow of the IEEE for his "contributions to digital integrated circuits."