# Split-Gate Logic Circuits for Multi-Threshold Technologies

Muhammad E. S. Elrabaa, and Mohamed I. Elmasry\*

Electrical Engineering Dept., United Arab Emirates University P.O.Box 17555, Al-Ain, UAE

\* VLSI Group, University of Waterloo, Waterloo, Ontario, CANADA

### ABSTRACT

A new family of dual-Vt static CMOS circuits, the Split-Gate dual-Vt (SG-DVT) logic, is devised. Its performance is compared to that of all-low-Vt, all-high-Vt, and other dual-Vt circuits in terms of speed and energy consumption (both static and dynamic). The SG-DVT circuits achieved speeds close to those of the all-low-Vt circuits, lower leakage (both stand-by and active) than other dual-Vt circuits, and a lower dependence of leakage on the logic block input patterns.

# I. INTRODUCTION

Many circuit techniques for reducing the energy consumption of digital CMOS LSIs have been developed. The majority of these techniques focus on reducing the supply voltage as the most favorable way of reducing energy consumption. However, to prevent speed degradation, threshold voltages also had to be scaled down, which increased the leakage currents and the static power consumption. It was also shown that variations in the threshold voltages diminish the resulting energy-delay improvements [1-2]. This means that threshold voltages would have to be scaled further to get a significant improvement in the energy-delay product, increasing the static power consumption even further.

Several techniques for reducing standby leakage currents in low-Vt (LVT) CMOS circuits were proposed. These techniques range from utilizing high-Vt (HVT) MOS devices to gate the -off the gating devices by

applying a negative $V_{GS}$ [4]. The second technique requires the added complexity of multiple supply voltages. Similar gating techniques preserve the logic state in standby mode by adding either large resistors [5] or diodes [6] in parallel with the gating devices. All these techniques require the non-trivial task of sizing the gating devices [7-8], which significantly impacts both speed and area. Also, they require a very high ratio of standby time to active time to be effective (they do not improve active leakage).

Other methods for leakage reduction exploit the fact that the leakage of series-stacked low-Vt devices is much smaller than that of non-stacked devices. Hence an input vector that gives a "minimum" leakage is selected via either a statistically based search algorithm [9] or a genetic algorithm [10]. The resulting leakage reduction is due to the reverse-biased $V_{GS}$ that results from internal source nodes charging up above ground (for NMOS) and below VDD (for PMOS). This reverse bias, however, takes a relatively long time to develop and cut the leakage completely. Hence this technique still requires a high ratio of standby time to active time to be effective. Also, every time the design changes, the "minimum" leakage input vectors have to be re-calculated using the above-mentioned time-consuming algorithms.

Another power reduction technique utilizes automatic feedback control to set the supply voltage at a minimum value that achieves the operating frequency [11]. This technique involves the design of a complex feedback control loop, speed detectors, and highly efficient on-chip DC-to-DC converters. This would have to be repeated for different logic blocks operating at different

speed makes the design process more difficult since the designer does not have a specific supply voltage to target.

In this paper a new dual-Vt (DVT) static CMOS circuit technology, the split-gate DVT (SG-DVT), is introduced. The performance of the SG-DVT circuits is compared to that of all-HVT (AllHVT), all-LVT (AllLVT), and other DVT circuits with equal gate size and load capacitance. All results were obtained using HSPICE<sup>®</sup> simulations with a 0.25 $\mu$m, 2.5 V CMOS technology. The high-Vt devices had thresholds of 540 mV and -580 mV for the NMOS and PMOS devices, respectively. The thresholds of the low-Vt devices were set 300 mV lower than those of their HVT counterparts. This corresponds to an increase of about 25% in the saturation currents of the LVT devices over those of the HVT devices at the specified supply.

In section II, the circuit topologies of the SG-DVT circuits are introduced and their operation is briefly explained. In section III energy/delay comparisons between the new circuits and the other DVT circuit options are presented. This is followed by a leakage comparison between these circuits in section IV. This includes static and active leakage comparisons and the leakage effect on energy consumption per operation. Finally, conclusions are provided in section V.

# **II. SG-DVT Circuits**

The two SG-DVT techniques examined in this work are demonstrated on 2-input NAND gates in Figure 1. MOS devices drawn with thicker gate lines are LVT, while the others (with thin gate lines) are HVT. The gate can be split in two ways, giving two types of DVT gates. In the first type, the SG1-DVT, the gate is split into an AllHVT gate and an AllLVT gate connected in parallel, as shown in Figure 1(a). In the second type, the SG2-DVT, the gate is also split into two gates: one with an LVT N-block and an HVT P-block (LVTN-HVTP) and another with an LVT P-block and an HVT N-block (LVTP-HVTN), as shown in Figure 1(b). The output of the LVTN-HVTP gate (the type 1 output), which has a faster high-to-low edge, drives all the PMOS devices in the subsequent gates. The output of the LVTP-HVTN gate (the type 2 output), which has a faster low-to-high edge, drives all the NMOS devices in the subsequent gates. Thus both types of devices are driven by signals with the appropriate fast edge. This also means that the SG2-DVT gate has two outputs, which makes it appropriate only for dense logic, where the wiring capacitance is insignificant and the outputs do not need to be routed over long distances.



Fig. 1. The two SG-DVT circuits.

Another DVT circuit that will be compared to the SG-DVT circuits is the Alt-Gates DVT option, which mixes AllLVT and AllHVT gates approximately equally along the logic path. This results from using LVT gates to implement the critical paths in a logic block; the other paths would then, on average, contain equal mixtures of LVT and HVT gates. Hence both the delay of the Alt-Gates logic path and its static (leakage) power are expected to be halfway between those of the AllLVT and AllHVT options.

All the above gates are fully compatible with other CMOS circuits, both dynamic and static (i.e. can drive or be driven by conventional CMOS circuits).

# **III. Energy/Delay Comparisons**

## **Delay Comparison:**

The delays of the above-mentioned DVT circuits, along with those of the AllHVT and AllLVT circuits, were evaluated using 31-stage ring oscillators of 2-input NAND gates with a fan<sub>out</sub> of 1. Ring oscillators were used because they give a fairly accurate estimate of the average gate delay, including the effects of output loading and input-waveform slope. All circuits had equal input capacitance (equal total gate areas). NAND gates were selected as test vehicles to account for series-gating effects such as the body effect and the reverse biasing (negative V<sub>GS</sub>) of series-connected devices in the off state due to leakage. NAND gates are also widely used in CMOS designs. The delay was evaluated as a function of the P/N ratio (defined as the ratio of W<sub>PMOS</sub> to W<sub>NMOS</sub> of the NAND gate) and is shown in Figure 2, normalized to the delay of the AllHVT circuit at a P/N ratio of 0.5.
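The ring-oscillator measurement described above can be sketched numerically. The sketch below relies on the standard relation for an N-stage ring oscillator, T = 2·N·t<sub>d</sub>, and then normalizes the per-stage delays to a reference circuit. The function names and the oscillation periods are hypothetical placeholders, not values taken from the simulations.

```python
def average_stage_delay(period_s: float, n_stages: int = 31) -> float:
    """Average gate delay of a ring oscillator with period T: t_d = T / (2 * N)."""
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverting stages")
    if period_s <= 0:
        raise ValueError("period must be positive")
    return period_s / (2 * n_stages)


def normalize_delays(delays: dict, reference: str) -> dict:
    """Express every delay relative to the chosen reference entry."""
    ref = delays[reference]
    return {name: value / ref for name, value in delays.items()}


# Hypothetical oscillation periods in seconds (placeholders, not measured data).
periods = {"ref_circuit": 3.1e-9, "other_circuit": 2.48e-9}

delays = {name: average_stage_delay(t) for name, t in periods.items()}
normalized = normalize_delays(delays, "ref_circuit")

for name, value in normalized.items():
    print(f"{name}: {delays[name] * 1e12:.1f} ps per stage, normalized {value:.2f}")
```

Each signal edge must travel around the ring twice per period, which is why the period is divided by twice the stage count.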



Fig. 2. The normalized delay versus the P/N ratio at equal input capacitance.

The following could be observed from this figure:

- 1. The delays of the AllHVT and AllLVT circuits represent the upper and lower delay limits, respectively.
- 2. The optimum P/N ratio of all DVT circuits falls between 1.35 and 1.45, similar to standard CMOS.
- 3. The AllLVT had an optimum delay that is about 22% lower than that of the AllHVT. This is consistent with published delay analysis data for series-connected MOSFET circuits [12]. The AllLVT to AllHVT delay ratio can be approximated as:

$$\text{Delay ratio} = \frac{(V_{DD} - V_{thHVT})^{n}}{(V_{DD} - V_{thLVT})^{n}}$$

where $n$ is the saturation velocity index, which ranges from 1 for very-short-channel MOS devices with total velocity saturation to 2 for long-channel devices with no velocity saturation [13]. For a 0.25 $\mu$m technology the value of $n$ is in the range 1.3 to 1.5 and increases for series-connected devices such as those in the NAND gates used in this work [12]. A delay ratio of 0.78 for the values of the LVT and HVT threshold voltages corresponds to an $n$ of about 1.6.

- 4. The SG1-DVT and the Alt-Gates circuits are very close and achieved an improvement in the minimum delay equal to about half that of the AllLVT (~11%).
- 5. The SG2-DVT, however, achieved an astonishing 20% improvement in the optimum delay over the AllHVT. Hence, although only half of the SG2-DVT transistors are LVT, its speed improvement is within 2% of that of the AllLVT. This is due to the fast edges driving both types of devices in the gate, as explained in section II.
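The delay-ratio approximation in observation 3 can be evaluated with a few lines of Python. This is a minimal sketch, assuming that gate delay scales inversely with the saturation current, which is proportional to (V<sub>DD</sub> - V<sub>th</sub>)<sup>n</sup>, and using only the nominal NMOS thresholds quoted in the introduction (body effect and the PMOS thresholds are ignored). The function names are illustrative.

```python
VDD = 2.5                  # supply voltage in volts (from the text)
VTH_HVT = 0.54             # NMOS high-Vt threshold in volts (from the text)
VTH_LVT = VTH_HVT - 0.30   # low-Vt threshold, set 300 mV lower (from the text)


def delay_ratio(n, vdd=VDD, vth_lvt=VTH_LVT, vth_hvt=VTH_HVT):
    """AllLVT-to-AllHVT delay ratio, taking delay as proportional to 1 / (VDD - Vth)**n."""
    return ((vdd - vth_hvt) / (vdd - vth_lvt)) ** n


def current_ratio(n, vdd=VDD, vth_lvt=VTH_LVT, vth_hvt=VTH_HVT):
    """LVT-to-HVT saturation-current ratio under the same power-law assumption."""
    return 1.0 / delay_ratio(n, vdd, vth_lvt, vth_hvt)


# Sweep the velocity-saturation index over its physical range (1 to 2).
for n in (1.0, 1.3, 1.5, 1.6, 2.0):
    print(f"n = {n:.1f}: delay ratio = {delay_ratio(n):.3f}, "
          f"current ratio = {current_ratio(n):.3f}")
```

Under these assumptions, $n = 1.6$ gives a delay ratio of roughly 0.8 and a saturation-current increase of roughly 25%, in the neighborhood of the simulated figures quoted in the text.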

## **Energy-Delay Product Comparison:**

Figure 3 shows the energy-delay (ED) products of the different ring oscillators. Again, the performance of the DVT circuits lies between that of the AllHVT and AllLVT circuits.



Fig. 3. The Energy-Delay product versus the P/N ratio at equal input capacitance.

The following can be deduced from this figure:

- 1. The AllLVT achieved its best ED performance at a P/N ratio of 1.7, not at the P/N ratio of 1.35 that yields the minimum delay. The AllHVT achieved its best ED performance at a P/N ratio of 1.4, the same ratio that yielded its minimum delay.
- 2. The SG2-DVT had a relatively higher ED product than the other DVT implementations for two reasons: 1) the difference between the edge rates of the inputs to the NMOS and PMOS blocks, which causes a slight increase in the rush-through currents (currents from VDD to GND) during switching, and 2) the Fan<sub>out</sub> of 1. As the Fan<sub>out</sub> increases, the rush-through power becomes a negligible part of the total switching power. Nevertheless, the SG2-DVT circuit still achieved a lower ED product than the AllHVT circuit.

# **IV. Leakage Comparison**

## DC (static) leakage:

The average leakage of the different circuits is shown in Table 1 normalized to that of the AllHVT. The leakage was measured using chains of 31 NAND gates at the optimum-delay P/N ratio of each circuit and using DC simulations. Hence it accounts for the reverse biasing of the stacked NMOS devices due to the charging of internal source nodes.

| AllLVT | Alt-Gates | SG1-DVT | SG2-DVT |
|--------|-----------|---------|---------|
| 1444   | 724       | 760     | 760     |

Table 1. DC leakage at the optimum-delay P/N ratio, normalized to that of the AllHVT circuit.

The following points can be observed from these results:

- 1) All DVT circuits have similar average DC leakage, about half that of the AllLVT.
- 2) The above results represent the average leakage, obtained by averaging the leakage resulting from the two possible inputs to the NAND chains. For the AllLVT, SG1-DVT and SG2-DVT circuits this does not make any difference, since the chain leakage is the same for both inputs. However, it makes a huge difference for the Alt-Gates circuit. This is because the leakage varies significantly depending on whether the LVT NMOS stack or the HVT NMOS stack is leaking. The normalized leakage of the Alt-Gates circuit was found to change from 275 to 1171 (4.3x), depending on the input state. The SG-DVT average leakage is slightly higher than that of the Alt-Gates due to the splitting of the SG-DVT gate. This reduces the reverse biasing at the internal nodes compared to the Alt-Gates, hence the higher average leakage.
- 3) If the logic path were composed of different gates, then the AllLVT implementation would also have a leakage input dependency. The SG-DVT implementations, however, would have leakage that is independent of the input pattern and close to half that of the AllLVT.
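The arithmetic behind these observations can be checked directly from the numbers reported in Table 1 and in observation 2. The short sketch below only re-derives ratios from those published values; the variable names are illustrative.

```python
# Normalized DC leakage from Table 1 (AllHVT = 1).
table1 = {"AllLVT": 1444, "Alt-Gates": 724, "SG1-DVT": 760, "SG2-DVT": 760}

# Alt-Gates leakage for the two possible chain inputs (from observation 2).
alt_gates_by_input = (275, 1171)

# Average over both input states; Table 1 lists 724, presumably a rounding difference.
alt_average = sum(alt_gates_by_input) / len(alt_gates_by_input)

# Input-pattern dependency of the Alt-Gates leakage.
alt_spread = max(alt_gates_by_input) / min(alt_gates_by_input)

# Leakage of every option relative to the AllLVT.
relative_to_lvt = {name: value / table1["AllLVT"] for name, value in table1.items()}

print(f"Alt-Gates average over both inputs: {alt_average:.0f}")
print(f"Alt-Gates max/min leakage: {alt_spread:.2f}x")
for name, ratio in relative_to_lvt.items():
    print(f"{name}: {ratio:.2f} of the AllLVT leakage")
```

The SG-DVT entries come out at about 0.53 of the AllLVT leakage and the Alt-Gates average at about 0.50, which matches the "about half" statement, while the Alt-Gates value alone swings by more than a factor of four with the input state.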

## Active leakage:

The above results assume that the stand-by time is very long. In most cases, this is not so. Figure 4 shows the leakage of the four circuits in Table 1, normalized to the DC leakage of the AllHVT, as a function of the stand-by time elapsed after switching. The figure shows that the DC value of the leakage is only attained after 0.2 $\mu$s of idle time. Still, the relative difference between the DVT circuits and the AllLVT remains the same (about half). Also, for all circuits, the starting leakage is 67% higher than the DC value. This shows the importance of taking the stand-by time into account in leakage evaluation. The effect of this active leakage on the energy consumption per operation (EPO) of these circuits is shown in Figure 5.



Fig. 4. The normalized leakage versus elapsed stand-by measured from end of switching.



Fig. 5. Normalized Energy per Operation vs. Activity Factor at a nominal frequency of 1GHz.

This figure shows the EPO as a function of the activity factor (AF) at a nominal frequency of 1 GHz, normalized to that of the AllHVT at 100% AF. An AF of 100% means the circuit switches $2\times10^9$ times per second. As the AF decreases, the EPO increases significantly for all circuits except the AllHVT, due to the effect of leakage. However, with the exception of the SG2-DVT, the EPO of the DVT circuits remains significantly below that of the AllLVT. The EPO of the SG2-DVT starts at a value equal to that of the AllLVT but becomes significantly lower once the AF drops below 2%. This is again due to the unity fan<sub>out</sub>, which overemphasizes the rush-through currents at high AF. Hence, at low AF the SG2-DVT circuit would achieve 90% of the AllLVT speed gain at a significantly lower EPO. The EPO difference grows as the AF decreases further.
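The trend described above can be illustrated with a simple first-order model. This is a sketch under stated assumptions, not the model used for Figure 5: it treats the EPO as a fixed switching energy plus the leakage energy accumulated between two operations, and it takes the leakage power as constant even though Figure 4 shows that it decays during idle time. The per-gate energies and leakage powers are hypothetical placeholders; only the rate of $2\times10^9$ switching events per second at 100% AF comes from the text.

```python
SWITCH_RATE_AT_FULL_AF = 2e9  # switching events per second at AF = 100% (from the text)


def energy_per_operation(e_switch_j, p_leak_w, activity_factor):
    """Switching energy plus the leakage energy spent between two operations."""
    if not 0.0 < activity_factor <= 1.0:
        raise ValueError("activity factor must be in (0, 1]")
    operations_per_second = activity_factor * SWITCH_RATE_AT_FULL_AF
    return e_switch_j + p_leak_w / operations_per_second


# Hypothetical per-gate parameters (placeholders, not values from the paper).
circuits = {
    "low_leakage": {"e_switch_j": 10e-15, "p_leak_w": 1e-9},
    "high_leakage": {"e_switch_j": 9e-15, "p_leak_w": 1e-6},
}

# Normalize to the low-leakage circuit at 100% activity factor.
reference = energy_per_operation(activity_factor=1.0, **circuits["low_leakage"])

for af in (1.0, 0.1, 0.01):
    row = {
        name: energy_per_operation(activity_factor=af, **params) / reference
        for name, params in circuits.items()
    }
    print(f"AF = {af:>5.0%}: " + ", ".join(f"{k} = {v:.2f}" for k, v in row.items()))
```

With these placeholder values the high-leakage circuit starts with a slightly lower EPO at full activity but is overtaken as the AF falls, which is the qualitative behavior the paragraph above attributes to leakage.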

# V. Conclusion

A new type of dual-Vt logic, the Split-Gate logic, is introduced. Two flavors of the SG-DVT were developed: the SG1-DVT and the SG2-DVT. The SG1-DVT achieved performance identical to that of regular DVT circuits, except that it reduces the dependency of leakage on the logic block input pattern. Hence one can get the speed and leakage advantages of regular DVT circuits without performing the laborious task of determining the optimum input pattern every time there is a design change in the logic block. The SG2-DVT achieved 90% of the speed gains of the AllLVT option at half the leakage. The SG2-DVT is especially suited to dense logic blocks since it has two outputs per gate. These outputs, however, can be tightly routed together since they have the same polarity, just different edge speeds. This circuit suffers from relatively higher rush-through currents due to the slow-edge signals that feed the LVT devices in these gates. This problem, however, becomes insignificant at a Fan<sub>out</sub> larger than 1.

An important parameter, the energy per operation (EPO), was used to evaluate the effect of active leakage on energy consumption. The results showed that although the AllLVT achieved the lowest energy-delay product, the dual-Vt circuits achieved much lower energy per operation, especially at low activity factors. Hence it is not sufficient to consider only the energy-delay product when evaluating the power performance of circuits. This is especially true since most circuits in many applications have very low activity factors.

# **VI. REFERENCES**

- [1] D. Frank, et al., "Supply and threshold Voltage Optimization for Low Power Design," 1997 ISLPED, pp. 317-322, 1997.
- [2] R. Gonzalez, et al., "Supply and threshold Voltage Scaling for Low Power CMOS," IEEE JSSC, vol. 32, pp. 1210-1216.
- [3] S. Mutoh, et al., "A 1-V Power Supply High-Speed Digital Circuit Technology with Multi-threshold Voltage CMOS," IEEE JSSC, vol. 30, pp. 847-854, 1995.
- [4] M. Stan, "Low-Threshold CMOS Circuits with Low Standby Currents," 1998 ISLPED, pp. 97-98, 1998.
- [5] M. Horiguchi, et al., "Switched-Source-Impedance CMOS Circuit for Low Standby Subthreshold Current Giga-Scale LSIs," IEEE JSSC, vol. 28, pp. 1131-1135, 1993.
- [6] H. Makino, et al., "An Auto-Backgate-Controlled MT-CMOS Circuit," in 1998 Sym. On VLSI Circuits, pp. 42-43, 1998.
- [7] J. Kao, et al., "Transistor Sizing Issues and Tool for Multi-Threshold CMOS Technology," 34th Design Automation Conf., pp. 409-414, June 1997.
- [8] J. Kao, et al., "MTCMOS Hierarchical Sizing Based on Mutual Exclusive Discharge Patterns," 35th Design Automation Conf., pp. 495-500, June 1998.
- [9] J. Halter and F. Najm, "A Gate-Level Leakage Power Reduction Method for Ultra-Low-Power CMOS Circuits," 1997 Custom Integrated Circuits Conf., pp. 475-478.
- [10] Z. Cheng, et al., "Estimation of Standby Leakage Power in CMOS Circuits Considering Accurate Modeling of Transistor Stacks," 1998 ISLPED, pp. 1-6.
- [11] T. Kuroda, et al., "Variable Supply-Voltage Scheme for Low-Power High-Speed CMOS Digital Design," IEEE JSSC, vol. 33, pp. 454-462.
- [12] T. Sakurai and A. R. Newton, "Delay Analysis of Series-Connected MOSFET Circuits," IEEE JSSC, vol. 26, pp. 122-131.
- [13] T. Sakurai and A. R. Newton, "Alpha-Power Law MOSFET Model and Its Application to CMOS Inverter Delay and Other Formulas," IEEE JSSC, vol. 25, pp. 584-593.