# A Low-Power and High-Speed Quaternary Interconnection Link using Efficient Converters

Jean-Marc Philippe, Sébastien Pillement, Olivier Sentieys IRISA - University of Rennes 1 (ENSSAT) 6, rue de Kerampont 22300 Lannion, France {jphilipp, pillemen, sentieys}@irisa.fr

Abstract—We introduce a new quaternary link including a binary-to-quaternary encoder and a quaternary-to-binary decoder in voltage-mode multiple-valued logic (MVL). This link reduces the transistor count compared to existing designs and has no DC current path. The complete link was simulated with SPICE in two recent technologies. It also shows advantages in power consumption for global interconnects compared to full-swing signaling binary systems (up to 46% less energy consumption). Its low propagation delay is a further advantage in the design of high-speed on-chip links.

## I. INTRODUCTION

In modern CMOS technologies, interconnects account for a significant part of the power consumption (up to 50%) [1] and of the chip area. New constraints (such as cost or speed) on systems-on-chip (SoC) in deep submicron technologies call for low-power and high-speed interconnects. Recent research has focused on reducing the interconnect area as well as the pin requirements. One idea is to increase the data rate on a wire by using more than two logic states: this research field is called multiple-valued logic (MVL). The same idea is used to design high-speed inter-chip links using pulse-amplitude modulation (PAM) [2].

MVL offers advantages in terms of interconnect area reduction. It can also increase the link bandwidth by transferring more information per cycle on a single wire. The field is very broad; in this article we focus on the interconnect aspect of MVL.

MVL designs can be roughly classified into two categories: current-mode [3] and voltage-mode circuits [4], [5], [6], [7], [8]. The common idea of these circuits is to have more than two logic levels, which are encoded into current or voltage levels. Many papers illustrating the design of encoders and decoders for voltage-mode MVL designs have appeared in recent years.

Generally, these articles deal with quaternary links, in which two binary signals (MSB and LSB) are converted into one four-level (i.e. quaternary) signal (Q0) that propagates on a wire. This quaternary signal is received by a decoder which converts it back into binary signals (Fig. 1). $V_{swing}$ is the voltage swing of the binary link. Quaternary links are a good compromise between noise tolerance and wire number reduction.

The number of wires can be halved without decreasing the bandwidth. This is a significant advantage, since the designer can leave more space between adjacent wires to reduce the cross-coupling capacitances.

This paper introduces a low-power and high-speed quaternary link using a new converter architecture. It is designed in recent technologies to meet the requirements of SoC.

The remainder of this paper is organized as follows. Section II briefly reviews some of the existing implementations of quaternary links. Our approach and its feasibility are explained in Section III. Section IV deals with the implementations of the encoder and the decoder. We present the experimental results in Section V and, finally, Section VI concludes.

Fig. 1. (a) Binary and corresponding quaternary signals and (b) the logic correspondence between binary and quaternary.

## II. RELATED WORKS ON VOLTAGE MODE CONVERTERS

The solutions which can be found in the literature for converting two binary signals into a quaternary one are composed of two parts: a binary-to-quaternary encoder and a quaternary-to-binary decoder. A basic description of the link is shown in Fig. 2.



Fig. 2. Basic description of the link.

In [5], the author uses a switch-based design for the binary-to-quaternary encoder. This solution is simple and ensures good signal quality, but it has a DC current path due to the generation of the voltage references. Additionally, it needs 20 transistors (without the voltage reference generation). One interesting scheme is proposed in [4]. It has only four transistors (eight if the complete set of configurations is needed). It consists of two inverters in parallel with a common output, which is the output of the encoder. One inverter is controlled by the MSB and the other one by the LSB. The main problem of this encoder is that it has DC current paths for particular combinations of the inputs. A more recent encoder with 10 transistors is shown in [6]. It consists of five inverters supplied with appropriate voltages to control the output voltage. It is quite slow due to the double inversion of the binary signals. Solutions therefore exist for the design of binary-to-quaternary encoders.

The main source of interest is the design of quaternary-to-binary converters. Basically, they consist of a bank of comparators driven by the quaternary signal. The comparators can have external reference voltages [5] (generated or provided by additional power supplies) or internal references [4], [7]. In [7], the authors use three inverters with modified switching thresholds. These thresholds are set using proper transistor sizing. There are two main drawbacks with this approach: it requires very large transistors, and the outputs of the inverters are awkward to use, so additional circuitry is needed to properly drive the following transistors of the complete decoder. A solution derived from algebraic properties is described in [6]. It is a combination of different simple operators but it needs 26 transistors.

Other solutions have been studied. The underlying observation is that standard CMOS transistors are poorly suited to MVL, since their gates can be driven with a voltage lower than the power supply voltage. This can create current paths and increase the power consumption. In [8], for example, the authors describe a multiple-input floating-gate MOSFET to solve the problem. It consists of multiple input gates which are coupled with a floating gate. It achieves a reduction of approximately 75% in transistor count compared with existing solutions, but the design was evaluated only in an old AMI $1.5\mu m$ process.

## III. DESCRIPTION OF THE BASIC ELEMENTS AND FEASIBILITY

This paper proposes a new way to design the comparators of the decoder. The basic approach is the same as in [7]: it consists of a bank of inverters. However, the switching thresholds of the inverters are not set by transistor sizing; instead, the transistor thresholds are set by the fabrication process. This technique is used successfully in [10], where the authors design ternary cells using the SUS-LOC (Supplementary Symmetrical Logic Circuit Structure) methodology [9]. They present three pairs of transistors with modified thresholds to meet the SUS-LOC methodology. A prototype chip has been fabricated using an SOI (Silicon On Insulator) CMOS technology [11] from UCL (Université Catholique de Louvain-La-Neuve).

We use three pairs of transistors with modified thresholds to design the inverters. The required thresholds for the decoder design are given for two technologies in Table I. The power supply voltage is 1.2 V in $0.13\mu m$ and 1.8 V in $0.18\mu m$.

| Transistor Name | $V_{TH}$ (V), $0.13 \mu m$ | $V_{TH}$ (V), $0.18 \mu m$ |
|-----------------|----------------------------|----------------------------|
| Pm              | -0.52                      | -0.78                      |
| Nm              | 0.52                       | 0.78                       |
| P+              | -0.92                      | -1.38                      |
| N+              | 0.92                       | 1.38                       |
| P-              | -0.12                      | -0.18                      |
| N-              | 0.12                       | 0.18                       |

TABLE I Voltage threshold for each transistor in the two technologies.

Note that, because of variations inherent to the fabrication process, these thresholds can vary within limits without compromising the functionality of the system. Such variations only affect the noise margin by shifting the switching thresholds of the comparators. The switching thresholds of the inverters can also be finely tuned by setting proper W/L ratios. The required threshold voltages are determined by the formulas described in [9] (Eq. 1 and Eq. 2 for PMOS and NMOS respectively).

$$V_{TH}(PMOS) = Vi - (Vo - (OP \times LSV)) \tag{1}$$

$$V_{TH}(NMOS) = Vi - (Vo + (OP \times LSV)) \tag{2}$$

Vi is the input logic level voltage limit the transistor must respond to and Vo is the required output logic voltage level of the transistor. OP is the overlap percentage and it is set in this paper to 70%. The LSV is the logic step voltage between two consecutive quaternary levels.
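As a quick check of Eq. 1 and Eq. 2, the Python sketch below recomputes the thresholds of Table I. The pairing of each transistor with an input level Vi is not stated in the text: it is an assumption inferred from the table values, as are the helper names `pmos_vth`, `nmos_vth` and `table_thresholds`. The sketch also assumes equally spaced quaternary levels (LSV equal to one third of the supply voltage), that every PMOS pulls its output to the supply voltage and that every NMOS pulls it to ground.

```python
OP = 0.70  # overlap percentage used in this paper


def pmos_vth(vi, vo, lsv, op=OP):
    """Eq. 1: required threshold voltage of a PMOS transistor."""
    return vi - (vo - op * lsv)


def nmos_vth(vi, vo, lsv, op=OP):
    """Eq. 2: required threshold voltage of an NMOS transistor."""
    return vi - (vo + op * lsv)


def table_thresholds(vdd):
    """Recompute the six thresholds of Table I for a supply voltage vdd (V)."""
    lsv = vdd / 3.0                       # logic step voltage
    level = [i * lsv for i in range(4)]   # quaternary levels 0..3

    # ASSUMPTION (inferred from Table I, not stated in the text):
    # index of the quaternary level used as Vi for each transistor.
    pmos_vi = {"Pm": 1, "P+": 0, "P-": 2}   # Vo = vdd for every PMOS
    nmos_vi = {"Nm": 2, "N+": 3, "N-": 1}   # Vo = 0 V for every NMOS

    result = {}
    for name, idx in pmos_vi.items():
        result[name] = round(pmos_vth(level[idx], vdd, lsv), 2)
    for name, idx in nmos_vi.items():
        result[name] = round(nmos_vth(level[idx], 0.0, lsv), 2)
    return result


if __name__ == "__main__":
    for vdd in (1.2, 1.8):
        print(vdd, table_thresholds(vdd))
```

Under these assumptions the computed values agree with both columns of Table I, which suggests that the table was derived with OP = 70% and equally spaced levels.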

## IV. DESCRIPTION OF THE CIRCUITS

#### A. Encoder

The encoder converts two bits into a quaternary-valued signal. It uses three power supplies, as shown in Fig. 3.



Fig. 3. Quaternary encoder.

This encoder needs only 10 transistors (8 for the encoder and 2 for inverting the LSB input). Standard transistors are used in the experiments, but transistors of type P- and N- from Table I can be used instead to reduce the number of masks.

We use the coding presented in Fig. 1b where 0, 1, 2 and 3 are considered to be the four logic levels of the quaternary link.

This design ensures that only one branch conducts at a time and yields a very stable quaternary signal. The MSB signal drives the central inverter to select one node between A and B. The voltage at node A or B is determined by the LSB signal, which drives two switches (using Pass-Transistor Logic) to select the appropriate voltage.

SPICE simulation results in $0.13\mu m$ are given in Fig. 4. They are obtained by loading the output of the encoder with a 1mm wire (modelled with a $\pi 3$ model) and the decoder. All the transitions are represented in this signal. It can be seen that the output of the encoder is stable at 500MHz, so it can be used in a high-speed link.



Fig. 4. SPICE simulation results for the encoder with a 500MHz input.

#### B. Decoder

The decoder that we propose is composed of 12 transistors (6 custom transistors and 6 standard ones) and it has a very small area. As shown in Fig. 5a, it is composed of three modified inverters whose inputs are the quaternary signal, and an XOR gate designed using Pass-Transistor Logic (Fig. 5b).



Fig. 5. (a) Quaternary decoder and (b) XOR gate using Pass-Transistor Logic.

The power supply of the decoder is the power supply of the circuit (i.e. V3). The Q0 quaternary signal drives the bank of inverters. Thanks to the modified thresholds, the inverters make it possible to isolate each of the four levels and hence to obtain well-formed binary signals at their outputs.

The first inverter determines the MSB of the quaternary signal. The thresholds of the two transistors are close to the middle of the voltage swing ( $V_{swing}$ ) and are symmetrical.

The other inverters are used to determine the LSB. The output of the second inverter is a logical 1 only if the input is the quaternary level 0 and 0 otherwise. The output of the third inverter is a logical 1 for all inputs except for the quaternary level 3. The LSB is given by applying the XOR function to these two signals.
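The behaviour described above can be summarised with a small behavioural model in Python. It is only a sketch: the inverters are treated as ideal comparators whose switching thresholds lie halfway between adjacent quaternary levels (an assumption, not values from the SPICE simulations), and the function names are illustrative. Because Fig. 1b is not reproduced here, the model returns the raw output of the first inverter and does not decide whether the MSB is that output or its complement.

```python
def inverter(v_in, v_switch):
    """Ideal inverter: logical 1 below the switching threshold, 0 above it."""
    return 1 if v_in < v_switch else 0


def level_voltage(level, vdd=1.2):
    """Voltage of quaternary level 0..3, assuming equally spaced levels."""
    return level * vdd / 3.0


def decode(v_in, vdd=1.2):
    """Return (first inverter output, LSB) for a quaternary input voltage."""
    step = vdd / 3.0
    # ASSUMPTION: ideal thresholds midway between adjacent levels.
    inv1 = inverter(v_in, 1.5 * step)  # mid-swing inverter (MSB path)
    inv2 = inverter(v_in, 0.5 * step)  # 1 only for quaternary level 0
    inv3 = inverter(v_in, 2.5 * step)  # 1 for every level except 3
    lsb = inv2 ^ inv3                  # XOR of the second and third inverters
    return inv1, lsb


if __name__ == "__main__":
    for level in range(4):
        print(level, decode(level_voltage(level)))
```

In this model the LSB takes the values 0, 1, 1, 0 for quaternary levels 0 to 3, so a transition between adjacent levels changes at most one decoded bit.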

SPICE simulation results in  $0.13\mu m$  are given in Fig. 6. Each of the decoder outputs is loaded by a standard inverter. The input is generated using SPICE with a fall time and a rise time of 0.1ns for all the transitions.



Fig. 6. SPICE simulation results for the decoder with a 500MHz input.

It can be seen that the decoder can operate with a high-frequency input and hence is well suited to the design of high-speed on-chip links.

## V. PERFORMANCES

We have simulated the entire link with SPICE using both UMC  $0.13\mu m$  and  $0.18\mu m$  CMOS technologies. The encoder and the decoder are linked by a wire modelled using the  $\pi 3$  model. All transistors are designed with common W of  $12\lambda$  for the PMOS and  $6\lambda$  for the NMOS. We can expect an improvement of our link by optimizing these dimensions. The link was modelled using UMC rules for a metal-2 layer with a power supply voltage of 1.2V and 1.8V respectively. The intermediate voltage levels were equally distributed between the ground and the power supply voltage.

#### A. Energy Consumption

This section compares the proposed quaternary interconnect link structure with two binary structures modelled as two inverters connected by the same wire model. The MSB and LSB inputs are two random signals with a level duration of 10ns to measure the energy consumption of the system with a very long wire up to 10mm. Such a long wire is not useful in current VLSI circuits so we have also designed a quaternary repeater (not presented in this article). The input rise and fall times are 0.1ns for all signals. The outputs of the decoder and of the two last inverters in the binary links have a load of one common inverter ( $W = 12\lambda$  for the PMOS and  $6\lambda$  for the NMOS). The energy consumption is detailed in Fig. 7 for UMC  $0.13\mu m$  and in Fig. 8 for UMC  $0.18\mu m$ , for wire lengths from 1mm to 10mm.



Fig. 7. Energy Consumption (pJ) as a function of the interconnect wire length (mm) for the  $0.13\mu m$  technology.



Fig. 8. Energy Consumption (pJ) as a function of the interconnect wire length (mm) for the  $0.18\mu m$  technology.

These two figures show that the proposed quaternary link consumes less energy than two binary ones. The gain is up to 43% in a $0.13\mu m$ technology and 46% in a $0.18\mu m$ technology with a 10mm wire.
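A first-order estimate helps to see where a saving on the wire can come from. The Python sketch below is not the method used in this paper: it assumes ideal supplies for the intermediate levels, uniformly random and independent symbols, and counts only the energy $\frac{1}{2} C \Delta V^2$ dissipated when a wire of capacitance $C$ changes by $\Delta V$. It ignores the converters, the output loads, leakage and short-circuit currents, all of which are included in the simulated figures, so it should be read as an idealised wire-only bound. The function name `mean_dissipation` is illustrative.

```python
from fractions import Fraction
from itertools import product


def mean_dissipation(num_levels):
    """Average energy dissipated per cycle on one wire, in units of C * Vswing**2.

    ASSUMPTION: levels are equally spaced between 0 and Vswing, and the
    previous and next symbols are independent and uniformly distributed.
    """
    step = Fraction(1, num_levels - 1)
    pairs = list(product(range(num_levels), repeat=2))
    mean_squared_step = sum(((a - b) * step) ** 2 for a, b in pairs) / len(pairs)
    return mean_squared_step / 2  # (1/2) * C * dV**2 per transition


if __name__ == "__main__":
    two_binary_wires = 2 * mean_dissipation(2)
    one_quaternary_wire = mean_dissipation(4)
    print("two binary wires   :", two_binary_wires)
    print("one quaternary wire:", one_quaternary_wire)
    print("ratio              :", one_quaternary_wire / two_binary_wires)
```

The estimate only indicates the direction of the effect; the simulated gains reported above are smaller because they account for the whole link.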

#### B. Delay

We measured the total propagation delay of the interconnection link, the propagation delay and the rise and fall times for both the encoder and the decoder. In this testbench, the input signals have the same rise and fall times of 0.1ns.

We define the propagation delay as the delay between the time the input reaches 50% of its transition and the time the output reaches 50% of its transition, even for a transition in the quaternary case. The rise (or respectively fall) time is defined as the time needed for a signal to increase from 10% to 90% (or decrease from 90% to 10%) of its maximal transition value.
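These definitions can be applied to sampled waveforms as in the following Python sketch. It assumes piecewise-linear waveforms given as lists of times (ns) and voltages (V); the function names and the example waveforms are illustrative and are not taken from the testbench. Fractions are measured from the start level to the end level of a transition, so the same code handles rising and falling edges as well as partial-swing quaternary transitions.

```python
def crossing_time(times, values, level):
    """First time at which a piecewise-linear waveform reaches `level`."""
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 == level:
            return t0
        if (v0 - level) * (v1 - level) < 0:
            # Linear interpolation inside the segment that crosses the level.
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    if values[-1] == level:
        return times[-1]
    raise ValueError("waveform never reaches the requested level")


def transition_level(v_start, v_end, fraction):
    """Voltage reached after `fraction` of a transition from v_start to v_end."""
    return v_start + fraction * (v_end - v_start)


def propagation_delay(t_in, v_in, in_levels, t_out, v_out, out_levels):
    """Delay between the 50% points of the input and output transitions."""
    t_in50 = crossing_time(t_in, v_in, transition_level(*in_levels, 0.5))
    t_out50 = crossing_time(t_out, v_out, transition_level(*out_levels, 0.5))
    return t_out50 - t_in50


def edge_time(times, values, levels):
    """Rise or fall time: from 10% to 90% of the transition."""
    t10 = crossing_time(times, values, transition_level(*levels, 0.1))
    t90 = crossing_time(times, values, transition_level(*levels, 0.9))
    return t90 - t10


if __name__ == "__main__":
    # Hypothetical example: quaternary input stepping from 0 V to 0.4 V in
    # 0.1 ns, binary output rising from 0 V to 1.2 V between 0.2 and 0.3 ns.
    t_in, v_in = [0.0, 0.1, 1.0], [0.0, 0.4, 0.4]
    t_out, v_out = [0.0, 0.2, 0.3, 1.0], [0.0, 0.0, 1.2, 1.2]
    print(propagation_delay(t_in, v_in, (0.0, 0.4), t_out, v_out, (0.0, 1.2)))
    print(edge_time(t_out, v_out, (0.0, 1.2)))
```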

1) Propagation delay of the global interconnection link: We measured the propagation delay of the global interconnection link (the encoder, a 1mm wire and the decoder) for all the possible transitions of the binary inputs. The worst case results are 1.18ns for UMC  $0.13\mu m$  and 0.965ns for UMC  $0.18\mu m$ .

2) The encoder: We measured the propagation delay and the rise and fall times of the encoder for all the transitions. The worst case results are presented in Table II. The encoder is loaded by a 1mm wire and the decoder.

| Technology                            | $0.13 \mu m$ | $0.18 \mu m$ |
|---------------------------------------|--------------|--------------|
| Rise or fall time (ns)                | 0.560        | 0.772        |
| Propagation delay of the encoder (ns) | 0.256        | 0.359        |

TABLE II Propagation delay and rise and fall times of the encoder.

| Measure                | Transition | MSB, $0.13 \mu m$ | LSB, $0.13 \mu m$ | MSB, $0.18 \mu m$ | LSB, $0.18 \mu m$ |
|------------------------|------------|-------------------|-------------------|-------------------|-------------------|
| Rise or fall time (ps) | 1          | X                 | 172               | X                 | 184               |
|                        | 2          | 300               | X                 | 234               | X                 |
|                        | 3          | X                 | 419               | X                 | 527               |
|                        | 4          | X                 | 661               | X                 | 486               |
|                        | 5          | 208               | X                 | 247               | X                 |
|                        | 6          | X                 | 132               | X                 | 183               |
|                        | 7          | 296               | 356               | 233               | 394               |
|                        | 8          | 68.5              | 132               | 86.1              | 178               |
|                        | 9          | 39.5              | X                 | 53.4              | X                 |
|                        | 10         | 50.5              | X                 | 72.7              | X                 |
|                        | 11         | X                 | 173               | X                 | 184               |
|                        | 12         | 53.9              | 438               | 65.5              | 523               |
|                        | 13         | 194               | 104               | 248               | 151               |
| Propagation delay (ps) | 1          | X                 | 246               | X                 | 263               |
|                        | 2          | 138               | X                 | 135               | X                 |
|                        | 3          | X                 | 435               | X                 | 565               |
|                        | 4          | X                 | 680               | X                 | 441               |
|                        | 5          | 95.9              | X                 | 153               | X                 |
|                        | 6          | X                 | 232               | X                 | 344               |
|                        | 7          | 147               | 109               | 151               | 159               |
|                        | 8          | 31.2              | 249               | 48.2              | 361               |
|                        | 9          | 47.0              | X                 | 56.2              | X                 |
|                        | 10         | 46.0              | X                 | 60.5              | X                 |
|                        | 11         | X                 | 247               | X                 | 263               |
|                        | 12         | 33.8              | 476               | 42.8              | 574               |
|                        | 13         | 110               | 91.4              | 169               | 137               |

TABLE III Propagation delay and rise and fall times of the decoder.

3) The decoder: We measured the propagation delay and the rise and fall times for all the possible quaternary transitions. Our benchmark is composed of signals with rise and fall times of 0.1ns. Hence the slope is not the same for all transitions. All delays are given in Table III. Each transition is identified by a number given in Fig. 9. An X in Table III means that the input transition causes no transition at the corresponding output.



Fig. 9. Quaternary input signal transitions used in the testbench.

The worst case propagation delays of the proposed encoder and decoder are better than the worst case propagation delays of other solutions (which are greater than 2ns).
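The worst-case decoder delay can be read directly from Table III. The Python sketch below does this on values transcribed from Tables II and III; the data layout and the function name `worst_case` are illustrative choices, and `None` stands for an X entry.

```python
# Propagation delays (ps) transcribed from Table III, transitions 1 to 13.
# None marks an "X" entry (no transition at that output).
DECODER_DELAY_PS = {
    "0.13um": {
        "MSB": [None, 138, None, None, 95.9, None, 147,
                31.2, 47.0, 46.0, None, 33.8, 110],
        "LSB": [246, None, 435, 680, None, 232, 109,
                249, None, None, 247, 476, 91.4],
    },
    "0.18um": {
        "MSB": [None, 135, None, None, 153, None, 151,
                48.2, 56.2, 60.5, None, 42.8, 169],
        "LSB": [263, None, 565, 441, None, 344, 159,
                361, None, None, 263, 574, 137],
    },
}

# Worst-case encoder propagation delays (ps) from Table II.
ENCODER_DELAY_PS = {"0.13um": 256, "0.18um": 359}

REFERENCE_PS = 2000  # the 2ns figure quoted for other solutions


def worst_case(tech):
    """Return (delay_ps, output, transition number) of the slowest decoder case."""
    candidates = [
        (delay, output, index + 1)
        for output, delays in DECODER_DELAY_PS[tech].items()
        for index, delay in enumerate(delays)
        if delay is not None
    ]
    return max(candidates)


if __name__ == "__main__":
    for tech in DECODER_DELAY_PS:
        delay, output, transition = worst_case(tech)
        print(tech, "decoder:", delay, "ps on", output, "for transition", transition)
        print(tech, "encoder:", ENCODER_DELAY_PS[tech], "ps")
        print(tech, "below 2ns:", max(delay, ENCODER_DELAY_PS[tech]) < REFERENCE_PS)
```

In both technologies the slowest decoder case is an LSB transition, and every entry of Tables II and III is below 2ns.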

## VI. CONCLUSION

A new quaternary link is presented in this paper. This approach can increase the bandwidth of a link or enable the designer to save silicon area, because it halves the number of required wires. It can also be used to increase the inter-wire distance, and thus reduce cross-talk noise. The link was simulated with SPICE models in two recent UMC technologies. It consumes up to 46% less energy than a full-swing signaling system for long global interconnects. Thanks to its low propagation delay, the link is also well suited to the design of high-speed interconnects.

## REFERENCES

- [1] H. Zhang, V. George, and J. M. Rabaey, "Low-Swing On-Chip Signaling Techniques: Effectiveness and Robustness," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 8, no. 3, pp. 264–272, June 2000.
- [2] M. Pedram and J. M. Rabaey, *Power Aware Design Methodologies*. Kluwer Academic Publishers, June 2002, ch. 8, pp. 201–239.
- [3] K. W. Current, "Current-Mode CMOS Multiple-Valued Logic Circuit," *IEEE Journal of Solid-State Circuits*, vol. 29, no. 2, pp. 95–107, February 1994.
- [4] N. R. Shanbhag, D. Nagchoudhuri, R. E. Siferd, and G. S. Visweswaran, "Quaternary Logic Circuits in 2μm CMOS Technology," *IEEE Journal of Solid-State Circuits*, vol. 25, no. 3, pp. 790–799, June 1990.
- [5] K. W. Current, "Memory Circuits for Multiple Valued Logic Voltage Signal," in 25th International Symposium on Multiple-Valued Logic (ISMVL'95), Bloomington (USA), May 1995, pp. 52–57.
- [6] I. M. Thoidis, D. Soudris, I. Karafyllidis, and A. Thanailakis, "The Design of Low-Power Multiple-Valued Logic Encoder and Decoder Circuits," 6th IEEE International Conference on Electronics, Circuits and Systems (ICECS '99), vol. 3, pp. 1623–1626, September 1999.
- [7] Y. B. Guo and K. W. Current, "Voltage Comparator Circuits for Multiple-Valued CMOS Logic," in 32nd IEEE International Symposium on Multiple-Valued Logic (ISMVL'02), Boston (USA), May 2002, pp. 67–73.
- [8] A. Srivastava and H. N. Venkata, "Quaternary to Binary Bit Conversion CMOS Integrated Circuit Design using Multiple-Input Floating Gate MOSFETs," *INTEGRATION, The VLSI Journal*, vol. 36, no. 3, pp. 87–101, October 2003.
- [9] E. D. Olson, "Supplementary Symmetrical Logic Circuit Structure," in 29th International Symposium on Multiple-Valued Logic (ISMVL'99), Breisgau (Germany), May 1999, pp. 42–49.
- [10] E. Kinvih-Boh, M. Aline, O. Sentieys, and E. D. Olson, "MVL circuit design and characterization at the transistor level using SUS-LOC," in 33rd International Symposium on Multiple-Valued Logic (ISMVL'03), Tokyo (Japan), May 2003, pp. 105–110.
- [11] D. Flandre, S. Adriaensen, A. Afzalian, J. Laconte, D. Levacq, C. Renaux, L. Vancaillie, J.-P. Raskin, L. Demes, P. Delatte, V. Dessard, and G. Picun, "Intelligent SOI CMOS Integrated Circuits and Sensors of Heterogeneous Environments and Applications," in *1st IEEE International Conference on Sensors (Sensors 2002)*, vol. 2, Orlando (USA), June 2002, pp. 1407–1412.