# Synthesis of regular computational fabrics with ambipolar CNTFET technology

Michele De Marchi<sup>\*</sup>, Shashikanth Bobba<sup>\*</sup>, M. Haykel Ben Jamaa<sup>||</sup>, Giovanni De Micheli<sup>\*</sup> \*LSI, EPFL, Lausanne, 1015, Switzerland, {michele.demarchi, shashikanth.bobba, giovanni.demicheli}@epfl.ch <sup>||</sup>CEA-LETI-MINATEC, 17, Rue des Martyrs, F-38054 Grenoble, France, haykel.ben-jamaa@cea.fr

Abstract—In this paper, we report on a physical design of regular fabrics with ambipolar CNTFET devices. We evaluate three medium-grain cells built with ambipolar CNTFETs with in-field controllable polarities. We designed regular layouts with these cells under 32nm technology rules and performed technology mapping and routing of a set of benchmark circuits. The CNTFET-based cells were then compared with an existing configurable cell of similar grain size, the Actel ACT1 logic brick, simulated with a 32nm MOSFET model. We obtained delays up to about $2\times$ better than those obtained with ACT1 after normalization to the intrinsic technology delay. After the technology mapping and routing steps, we report performance up to about $8\times$ better in terms of area $\times$ normalized delay for the CNTFET-based cells over the Actel ACT1 cell.

## I. INTRODUCTION

Bulk CMOS technologies are predicted to face crucial technological challenges in the next decade. At the same time, novel devices such as *Carbon Nanotube Field Effect Transistors* (CNTFETs) and *Silicon Nanowire Field Effect Transistors* (SNWFETs), which do not suffer from the same constraints, are expected to play a primary role as devices in future ultra-large scale integration technologies. The interest in these devices is motivated not only by their small size, but also by their superior characteristics, such as quasi-ballistic transport, steep sub-threshold slopes and one-dimensional channel geometry [1].

Among the types of CNTFETs demonstrated in the literature, double-gate ambipolar CNTFETs are four-terminal devices in which a second gate terminal controls the device polarity. These devices combine performance exceeding that of current scaled MOSFETs with the possibility of controlling the device polarity by electrostatic doping of the nanotubes [2].

Various attempts to exploit the unique characteristics of these devices have been reported in the literature. In [3], the symmetric characteristic of ambipolar CNTFETs is exploited to build a single-transistor XOR gate. In [4], the authors construct dynamic logic gates that are configured by setting the polarity of the CNTFETs, and in [5] an interconnection scheme is presented to implement complex circuits with these configurable gates. In [6], a static logic design methodology using ambipolar CNTFETs with controllable polarities is described. There, ambipolar CNTFETs are employed to produce logic gates with high expressive power and low area occupation, i.e. gates capable of implementing binate functions such as XOR, or complex combinations of XORs, with few devices and simple topologies. In [7], the application of ambipolar CNTFETs with in-field controllable polarities to the design of regular fabrics with static logic was investigated. Various medium-grained logic gates were used as configurable gates to perform technology mapping on a set of benchmark circuits. In particular, gates with an And-Or-Inverter structure were shown to be the most efficient among gates including at most 12 CNTFETs.

In this paper, we implement a physical design of regular fabrics using a set of configurable logic gates built with ambipolar CNTFET technology with 32nm technology rules. Three cells were chosen among a library of static complementary CNTFET-based logic gates. Technology mapping of a set of benchmark circuits and interconnect routing design steps were performed with each of the three cells. Cells were placed as regular arrays of the same cell. For the sake of comparison, the same design flow was applied to an existing cell, the Actel ACT1 [8], due to its similar grain size to the CNTFET cells. The ACT1 cell was simulated with 32nm MOSFET devices.

All analyzed CNTFET cells outperformed the ACT1 block in terms of area occupation and delay. Although the higher complexity of the ACT1 block allows circuits to be mapped with fewer instances than the CNTFET-based cells, the smaller area of our cells and their efficient implementation of the XOR operator resulted in better delay and area figures than those of the ACT1 cell. In particular, the CNTFET cells demonstrated up to about 8× improvement in area × normalized delay over the Actel ACT1 cell.

This paper is structured as follows. Section II provides a background on ambipolar CNTFET static logic and regular fabrics. Section III describes the implementation and evaluation design flow for CNTFET-based configurable cells. Section IV summarizes the results of the analysis we performed. Section V concludes the paper.

## II. BACKGROUND AND MOTIVATION

In [6], a static logic design methodology based on ambipolar CNTFETs was introduced. This methodology exploits the in-field configurability unique to ambipolar CNTFETs to produce logic gates with high expressive power, capable of implementing binate functions such as XOR at a low area cost, while still providing all the advantages of complementary static logic such as CMOS.

Logic gates built with this methodology are particularly suited to implementing regular fabrics, due to their intrinsic symmetry and high expressive power. Figure 1 shows two types of regular structure in which these gates can be embedded. The first (Figure 1a) is an FPGA architecture, where logic bricks are interleaved with interconnect channels, which can be configured by means of antifuses or SRAM memory cells [8]. The second (Figure 1b) is the structured ASIC, in which the logic cells are tightly packed and prestructured and only the higher-level masks can be configured [9]. Structured ASICs are very attractive as they offer a middle ground between costly full-custom ASICs and less efficient FPGAs.

Fig. 1. Regular structures with two different alternating logic bricks. (a) Island-style FPGA and (b) structured ASIC style.

By exploiting the symmetry in conductance between n-type and p-type devices, CNTFET complementary logic gates can be designed to be intrinsically symmetric, e.g. a NOR gate (Figure 2b) can be built from a NAND gate (Figure 2a) by simply mirroring its layout vertically. Moreover, CNTFETs have a channel which is isolated from the substrate, and do not require wells to obtain proper functionality. This enables the construction of a layout consisting of a chessboard-like tiling of dual logic gates, i.e. a logic cell and the cell produced by switching the topologies of the *pull-up* (PU) and *pull-down* (PD) networks, without significantly reducing the overall macroregularity of the layout.
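As a functional illustration of this duality, the following sketch models the swap of the PU and PD network topologies as taking the Boolean dual of the gate function, i.e. complementing every input and the output. This is a logic-level abstraction assumed here for illustration; the helper names (`dual`, `nand2`, `nor2`, `same_function`) are hypothetical, and the sketch says nothing about layout or sizing.

```python
from itertools import product


def dual(gate):
    """Return the Boolean dual of a gate: complement all inputs and the output."""
    def dual_gate(*inputs):
        return 1 - gate(*(1 - x for x in inputs))
    return dual_gate


def nand2(a, b):
    return 1 - (a & b)


def nor2(a, b):
    return 1 - (a | b)


def same_function(f, g, n_inputs):
    """Compare two gates over every input combination."""
    return all(f(*xs) == g(*xs) for xs in product((0, 1), repeat=n_inputs))


# The dual of NAND2 behaves as NOR2, matching the NAND/NOR pair of Figure 2.
print(same_function(dual(nand2), nor2, 2))  # prints True
```

Applying `dual` twice returns the original function, which is why a cell and its dual can alternate freely in a tiling.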

## III. AMBIPOLAR CNTFET REGULAR FABRIC DESIGN

In a regular structure, such as a gate array or structured ASIC, a single logic gate can be used to efficiently implement complex logic circuits by configuring the way its inputs are connected. By feeding the inputs of a gate with logic constants (1 or 0) or by connecting two or more inputs together, smaller logic functions can be implemented using a larger gate. A library of *sub-functions* can thus be extracted from a complex logic gate. A more complex gate will generally implement a higher number of *sub-functions*. Nonetheless, this increase in complexity comes at the cost of a larger area.

Fig. 2. A NOR2 gate layout (b) is derived from a NAND2 layout (a) by simple vertical mirroring.

TABLE I

SELECTED GATES, AND THEIR DUALS, WITH THREE TRANSMISSION GATES OR TRANSISTORS IN THE PU AND PD NETWORKS. THE NUMBER OF *sub-functions* IS LISTED FOR EACH CELL. FOR THE DUAL CELLS, THE SUM OF THE NUMBERS OF *sub-functions* OF THE CELL AND ITS DUAL IS LISTED.

| Gate   | Function                                           | Number of sub-functions |
|--------|----------------------------------------------------|-------------------------|
| F1     | $\overline{((A \oplus D) + B) \cdot C}$            | 12                      |
| F1dual | $\overline{A + (B \oplus D) \cdot C}$              | 15                      |
| F2     | $\overline{(A+B)\cdot(C\oplus D)}$                 | 11                      |
| F2dual | $\overline{(A \oplus D) + B \cdot C}$              | 17                      |
| F3     | $\overline{((A \oplus D) + B) \cdot (C \oplus E)}$ | 26                      |
| F3dual | $\overline{(A \oplus D) + (B \oplus E) \cdot C}$   | 44                      |

TABLE II

LIBRARY OF *sub-functions* IMPLEMENTED BY CELL F1.

| Sub-function | Expression |
|--------------|------------|
| F1-0  | $A$ |
| F1-1  | $A$ |
| F1-2  | $\overline{A} + A \cdot \overline{B}$ |
| F1-3  | $\overline{A} \cdot \overline{B} + A$ |
| F1-4  | $\overline{A} \cdot \overline{B}$ |
| F1-5  | $A \cdot \overline{B}$ |
| F1-6  | $\overline{A} \cdot \overline{B} + A \cdot B$ |
| F1-7  | $\overline{A} \cdot \overline{C} + A \cdot \overline{B} + A \cdot B \cdot \overline{C}$ |
| F1-8  | $\overline{A} \cdot \overline{B} + \overline{A} \cdot B \cdot \overline{C} + A \cdot \overline{B}$ |
| F1-9  | $\overline{A} \cdot \overline{B} + \overline{A} \cdot B \cdot \overline{C} + A \cdot \overline{B} \cdot \overline{C} + A \cdot B$ |
| F1-10 | $\overline{A} \cdot \overline{B} \cdot \overline{C} + A \cdot B \cdot \overline{C}$ |
| F1-11 | $\overline{A} \cdot \overline{C} + A \cdot \overline{B} \cdot \overline{C}$ |
| F1-12 | $\overline{A} \cdot \overline{B} \cdot \overline{C} + A \cdot \overline{B}$ |
| F1-13 | $\overline{A} \cdot \overline{B} \cdot \overline{C} + \overline{A} \cdot \overline{B} \cdot C \cdot \overline{D} + \overline{A} \cdot B \cdot \overline{D} + A \cdot \overline{B} \cdot \overline{D} + A \cdot B \cdot \overline{C} + A \cdot B \cdot C \cdot \overline{D}$ |
| F1-14 | $\overline{A} \cdot \overline{B} \cdot \overline{D} + \overline{A} \cdot B \cdot \overline{C} \cdot \overline{D} + A \cdot \overline{B} \cdot \overline{C} \cdot \overline{D} + A \cdot B \cdot \overline{D}$ |

Logic block size can range from small-grain cells of a few transistors, such as NAND and NOR gates, to coarse-grain complex cells including memory elements, adders and similar complex structures. We limited our analysis to medium-grained cells, including at most three elements (transistors or transmission gates) in the PU and PD networks, as these allow us to highlight the structural advantage given by configurable-polarity transistors [7].

As we have seen in Section II, CNTFET static logic is particularly suited to build configurable gates to implement chessboard-like regular fabric layouts with alternating dual cells. Dual cells, used together, provide a higher number of implemented functions than a single gate. Since dual CNTFET gates can be produced by mirroring their layout, it is possible to produce chessboard-like layouts which are more regular than their CMOS counterparts, without modifying transistor sizes to obtain dual gates.
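The chessboard arrangement described above can be pictured with a small sketch that alternates a cell and its dual over a rectangular array. The function name `chessboard` and the string labels are hypothetical and chosen only for illustration; the sketch captures the placement pattern, not any physical layout detail.

```python
def chessboard(rows, cols, cell="F1", dual_cell="F1dual"):
    """Build a rows x cols array in which a cell and its dual alternate."""
    return [
        [cell if (r + c) % 2 == 0 else dual_cell for c in range(cols)]
        for r in range(rows)
    ]


# Every cell is surrounded horizontally and vertically by its dual.
for row in chessboard(2, 3):
    print(" ".join(row))
```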

For this study, we limited our experiments to three distinct libraries formed by cells F1, F2 and F3 and their respective dual gates, listed in Table I. Each of these cells is a universal gate implementing a small library of logic *sub-functions* which can lead to optimal logic synthesis. Table II lists all the *sub-functions* in the library formed by the F1 logic structure, which can be employed to synthesize any combinational logic circuit.
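The extraction of *sub-functions* by tying inputs to constants or to each other can be sketched as a brute-force enumeration. The example below is a minimal illustration for cell F1 of Table I, under assumptions of our own: functions that differ only by a renaming of their variables are counted once, constant functions are discarded, and the names `canonical` and `sub_functions` are hypothetical. The count it prints depends on these conventions and is not expected to reproduce the numbers of Table I.

```python
from itertools import permutations, product

N_VARS = 4  # free variables x0..x3 available for wiring
ALL_XS = list(product((0, 1), repeat=N_VARS))


def f1(a, b, c, d):
    """Cell F1 from Table I: NOT(((A xor D) or B) and C)."""
    return 1 - (((a ^ d) | b) & c)


def canonical(fn):
    """Truth table of fn(x0..x3), made independent of variable naming."""
    table = {xs: fn(*xs) for xs in ALL_XS}
    return min(
        tuple(table[tuple(xs[p] for p in perm)] for xs in ALL_XS)
        for perm in permutations(range(N_VARS))
    )


def sub_functions(gate, n_inputs):
    """Non-constant functions reachable by tying inputs to 0, 1 or shared variables."""
    found = set()
    # Source codes: 0 and 1 are logic constants, 2..5 select variables x0..x3.
    for sources in product(range(2 + N_VARS), repeat=n_inputs):
        def wired(*xs, sources=sources):
            return gate(*(s if s < 2 else xs[s - 2] for s in sources))
        table = canonical(wired)
        if len(set(table)) > 1:  # discard constant 0 and constant 1
            found.add(table)
    return found


library = sub_functions(f1, 4)
print(len(library))
```

For instance, tying B and D to 0 and C to 1 turns F1 into an inverter, and tying B to 0 and C to 1 yields an XNOR of A and D.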

Figure 3 shows the schematic view of the three considered cells, F1, F2 and F3, after transistor sizing. The gates are built with a static, complementary logic approach similar to CMOS. Figure 4a shows the circuit symbol of the double-gate, ambipolar CNTFET device. The device includes a *control gate* (CG), which controls channel conduction, and a *polarity gate* (PG), which controls device polarity. By coupling two ambipolar CNTFETs, a *transmission gate* structure can be built (Figure 4b), allowing the inclusion of binate operators (XOR) in the cells. In our analysis, we accounted for the need to provide double-rail inputs to the transmission gates that implement the XOR operator, which is present in each of the analyzed gates. Inverters were included at the inputs of each cell to provide the negated signals.

Fig. 3. Schematic view of cells F1, F2 and F3. Transistor sizings are reported in red.

Fig. 4. Double gate, ambipolar CNTFET circuit symbol (a) and transmission gate logic convention (b).
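The role of the two gates and of the double-rail inputs can be illustrated with a switch-level sketch. The polarity convention used below (PG at logic 1 gives an n-type device that conducts when CG is 1, PG at logic 0 gives a p-type device that conducts when CG is 0) and the wiring of the two devices are assumptions made for this example, not details taken from the figures; the function names are hypothetical.

```python
def cntfet_conducts(cg, pg):
    """Assumed convention: n-type when PG=1 (on for CG=1), p-type when PG=0 (on for CG=0)."""
    return cg == pg


def transmission_gate_conducts(a, b):
    """Two parallel ambipolar devices driven by double-rail signals a, not-a, b, not-b."""
    dev1 = cntfet_conducts(cg=a, pg=1 - b)   # assumed wiring: CG = a, PG = not b
    dev2 = cntfet_conducts(cg=1 - a, pg=b)   # assumed wiring: CG = not a, PG = b
    return dev1 or dev2


# Under these assumptions the gate conducts exactly when a XOR b is 1.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, transmission_gate_conducts(a, b))
```

With this wiring the two devices always have opposite polarities and switch on together, which is the usual motivation for a transmission gate; it also shows why both the true and the negated form of each input must be available.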

To provide a meaningful comparison with an existing technology, we considered a configurable gate of similar grain size to our CNTFET cells, the Actel ACT1 [8] (Figure 5).

## IV. SIMULATION RESULTS

Figure 6 shows the design flow we considered for our analysis. The initial libraries consist of the list of *sub-functions* implemented by each cell, F1, F2, F3 and ACT1.

The libraries were characterized using SignalStorm Library Characterizer (Version 5.20) to obtain the gate timing information. At this stage, the 32nm technology rules of the Stanford CNFET transistor model [10] were employed to simulate cells F1, F2 and F3. For the physical information of the library, we designed the basic cell with the back-end design rules offered by the open-source 45nm Nangate library (v1.3). Layout techniques that are immune to misaligned CNTs [11] were incorporated while designing the cells. Synopsys Design Compiler (B-2008.09-SP3) was used to map the RTL of the benchmarks onto the target cell library. Cadence SoC Encounter (Version 07.10) was used as the physical synthesis engine to study the impact of the interconnect parasitics introduced by the routing of the mapped netlist. For the ACT1 cell, we followed the same design flow as for the CNTFET cells, extracting delay information from SPICE simulations with the 32nm MOSFET Predictive Technology Model [12].

Fig. 5. The Actel ACT1 configurable logic cell.

Fig. 6. The considered design flow.

In Figure 7 we report the delay estimation summary for each of the considered cells. Delays are reported as a sum of the delays of a set of benchmark circuits (mostly taken from the ISCAS'85 set) after the technology mapping design step and after the interconnect routing of the circuits. Note that all delays are normalized to the intrinsic technology delays, respectively 0.59ps for CNTFETs and 3.00ps [13] for 32nm MOSFETs used to implement the ACT1 cell.
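The normalization amounts to dividing each absolute delay by the intrinsic delay of the corresponding technology. The sketch below uses the two intrinsic delays quoted above; the dictionary keys and function names are hypothetical, and the normalized values in the usage lines are the add16 post-mapping entries of Table III for F1 and ACT1.

```python
# Intrinsic technology delays in picoseconds, as quoted in the text.
INTRINSIC_DELAY_PS = {"CNTFET": 0.59, "MOSFET_32NM": 3.00}


def normalize(delay_ps, technology):
    """Express an absolute delay as a multiple of the intrinsic technology delay."""
    return delay_ps / INTRINSIC_DELAY_PS[technology]


def denormalize(normalized_delay, technology):
    """Recover the absolute delay in picoseconds from a normalized value."""
    return normalized_delay * INTRINSIC_DELAY_PS[technology]


# add16 after technology mapping: 105.1 (F1) and 149.3 (ACT1), normalized.
print(denormalize(105.1, "CNTFET"))       # about 62 ps
print(denormalize(149.3, "MOSFET_32NM"))  # about 448 ps
```

The normalized figures therefore compare how well each cell architecture uses its underlying devices, rather than the raw speed of the two technologies.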

All three ambipolar CNTFET cells showed better performance than the Actel ACT1 cell in terms of normalized delay. In particular, F2 demonstrated the best results, showing a normalized delay of 46.9% of that of the ACT1 cell after technology mapping and of 48.2% after interconnect routing. We attribute the slightly lower performance of the F3 cell to the higher complexity caused by the presence of two transmission gates in its schematic. As observed in [7], gates with two or more transmission gates (or, equivalently, XOR operators) cannot be used for technology mapping as efficiently as smaller gates such as F2. Finally, when comparing F1 and F2, we observed that benchmarks can be mapped with a similar number of cells of either type. This result is explained by the comparable sets of *sub-functions* implemented by the two gates. When looking at delays, however, we see a distinct advantage of F2 over F1, mostly due to the lower average delay of cell F2.

Fig. 7. Normalized delays after technology mapping and after routing for the CNTFET cells and ACT1.

TABLE III

Results after technology mapping and routing for a set of benchmark circuits. Delays are normalized to the intrinsic delays (0.59ps for F1, F2, F3 and 3.00ps for ACT1). For each cell, Map and P&R are the normalized delays after technology mapping and after place and route, N. is the number of cells and Area is the cell area. Total values are also provided as a percentage of the ACT1 results.

| Bench | F1 Delay Map | F1 Delay P&R | F1 Cells N. | F1 Cells Area | F2 Delay Map | F2 Delay P&R | F2 Cells N. | F2 Cells Area | F3 Delay Map | F3 Delay P&R | F3 Cells N. | F3 Cells Area | ACT1 Delay Map | ACT1 Delay P&R | ACT1 Cells N. | ACT1 Cells Area |
|-------|-------|--------|-------|--------|-------|--------|-------|--------|-------|--------|-------|--------|-------|--------|-------|--------|
| add16 | 105.1 | 364.4  | 400   | 287.5  | 76.8  | 275.3  | 434   | 311.9  | 74.3  | 284.3  | 584   | 503.7  | 149.3 | 605.5  | 371   | 1697.9 |
| add32 | 146.4 | 532.0  | 837   | 601.6  | 101.0 | 455.9  | 859   | 617.4  | 107.0 | 452.7  | 1012  | 872.8  | 212.4 | 968.6  | 655   | 2997.7 |
| add64 | 175.4 | 838.1  | 1991  | 1430.9 | 133.9 | 667.5  | 1978  | 1421.6 | 141.7 | 968.4  | 2995  | 2583.1 | 250.5 | 1608.8 | 2122  | 9711.5 |
| C1355 | 105.8 | 361.2  | 410   | 294.7  | 59.7  | 216.9  | 307   | 220.6  | 79.5  | 293.9  | 375   | 323.4  | 144.5 | 510.6  | 271   | 1240.3 |
| C1908 | 131.4 | 538.6  | 579   | 416.1  | 86.3  | 450.8  | 522   | 375.2  | 112.1 | 392.4  | 378   | 326.0  | 189.9 | 823.1  | 253   | 1157.9 |
| C2670 | 124.2 | 558.5  | 849   | 610.2  | 85.3  | 403.2  | 899   | 646.1  | 83.6  | 315.2  | 515   | 444.2  | 112.6 | 367.4  | 439   | 2009.1 |
| C3540 | 173.7 | 744.1  | 1200  | 862.4  | 114.0 | 597.3  | 1213  | 871.8  | 140.1 | 541.8  | 1033  | 890.9  | 242.2 | 1595.0 | 726   | 3322.6 |
| C5315 | 113.8 | 586.4  | 1938  | 1392.8 | 80.7  | 432.4  | 1960  | 1408.7 | 97.5  | 415.6  | 1204  | 1038.4 | 174.1 | 900.2  | 808   | 3697.9 |
| C6288 | 435.4 | 1234.2 | 2284  | 1641.5 | 332.5 | 1175.8 | 2299  | 1652.3 | 434.0 | 1234.2 | 3117  | 2688.4 | 818.2 | 2406.0 | 2096  | 9592.6 |
| C7552 | 122.5 | 623.4  | 2581  | 1855.0 | 78.4  | 471.0  | 2469  | 1774.5 | 98.5  | 547.7  | 1627  | 1403.3 | 165.9 | 821.4  | 1058  | 4842.0 |
| dalu  | 83.0  | 443.1  | 749   | 538.3  | 56.1  | 401.5  | 788   | 566.3  | 75.8  | 498.3  | 691   | 596.0  | 118.9 | 759.7  | 477   | 2183.0 |
| des   | 105.8 | 845.3  | 3366  | 2419.1 | 61.1  | 458.3  | 3699  | 2658.5 | 77.2  | 980.0  | 3197  | 2757.3 | 140.7 | 1401.2 | 2095  | 9588.0 |
| i10   | 175.0 | 877.5  | 1883  | 1353.3 | 122.3 | 610.3  | 1957  | 1406.5 | 139.8 | 761.8  | 1816  | 1566.3 | 256.9 | 1457.2 | 1286  | 5885.5 |
| i8    | 104.2 | 1158.8 | 3194  | 2295.5 | 68.9  | 637.8  | 3272  | 2351.6 | 76.3  | 577.5  | 747   | 644.3  | 127.0 | 818.6  | 578   | 2645.3 |
| Total | 2101  | 9705   | 22261 | 15999  | 1456  | 7254   | 22656 | 16283  | 1737  | 8263   | 19291 | 16638  | 3103  | 15043  | 13235 | 60571  |
| %ACT1 | 67.7  | 64.5   | 168.2 | 26.4   | 46.9  | 48.2   | 171.2 | 26.9   | 56.0  | 54.9   | 145.8 | 27.5   | 100.0 | 100.0  | 100.0 | 100.0  |

Table III reports the detailed results of the characterized benchmark circuits. The benchmark circuits were mapped with each cell of Table I plus the Actel ACT1 cell. Delays after technology mapping and after interconnect routing steps are provided. Compared to the work in [7], we also considered more accurate gate areas, estimating the overhead due to power supply rails and input/output ports.

Finally, in terms of overall area $\times$ normalized delay, the CNTFET-based cells F1, F2 and F3 improved on the Actel ACT1 by $5.9\times$, $7.7\times$ and $6.6\times$, respectively.
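These factors follow from the %ACT1 row of Table III, taking the post-routing delay and the cell area of each CNTFET cell relative to ACT1. The short sketch below reproduces the arithmetic; the dictionary layout and the function name `area_delay_gain` are hypothetical.

```python
# Post-routing delay and area totals as a percentage of ACT1 (Table III, %ACT1 row).
PCT_OF_ACT1 = {
    "F1": {"delay_pr": 64.5, "area": 26.4},
    "F2": {"delay_pr": 48.2, "area": 26.9},
    "F3": {"delay_pr": 54.9, "area": 27.5},
}


def area_delay_gain(cell):
    """Improvement factor over ACT1 in area x normalized delay."""
    entry = PCT_OF_ACT1[cell]
    relative_product = (entry["delay_pr"] / 100.0) * (entry["area"] / 100.0)
    return 1.0 / relative_product


for name in PCT_OF_ACT1:
    print(name, round(area_delay_gain(name), 1))
# F1 5.9, F2 7.7, F3 6.6
```

The gain comes mostly from area: each CNTFET cell array occupies roughly a quarter of the ACT1 area, while the delay ratio contributes a further factor between about 1.5 and 2.1.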

## V. CONCLUSION

In this paper, we presented a physical design of regular fabrics using ambipolar CNTFETs with in-field controlled polarities. We chose three medium grain-size configurable cells which we used to map a set of benchmark circuits. After technology mapping, routing was performed over regular layouts designed with each analyzed cell. CNTFET-based cells were compared with the Actel ACT1 cell. 32nm technology rules were chosen for the transistors, while the interconnect was defined using 45nm back-end design rules.

Regular layout evaluation showed improvements of up to about $2\times$ in terms of normalized delay for the CNTFET cells compared to ACT1. Area × normalized delay values for routed circuits were up to about 8× better than those of the ACT1 cell.

Although a number of technological issues still need to be addressed in order to achieve defect-free, large-scale integration of CNTFET devices, we believe that a preliminary analysis of a physical design using these devices is a useful step towards evaluating them at the logic and circuit level.

## ACKNOWLEDGMENT

We acknowledge partial support from grant: ERC-2009-AdG-246810.

## REFERENCES

- [1] Y.-M. Lin et al., "Novel structures enabling bulk switching in carbon nanotube FETs," in *62nd DRC*, June 2004, pp. 133–134 vol.1.
- [2] R. Martel et al., "Ambipolar electrical transport in semiconducting single-wall carbon nanotubes," *Phys. Rev. Lett.*, vol. 87, no. 25, p. 256805, Dec. 2001.
- [3] R. Sordan et al., "Exclusive-OR gate with a single carbon nanotube," *Appl. Phys. Lett.*, vol. 88, no. 5, p. 053119, 2006.
- [4] I. O'Connor et al., "Ultra-fine grain reconfigurability using CNTFETs," Dec. 2007, pp. 194–197.
- [5] P.-E. Gaillardon et al., "Interconnection scheme and associated mapping method of reconfigurable cell matrices based on nanoscale devices," July 2009, pp. 69–74.
- [6] M. H. Ben Jamaa et al., "Novel library of logic gates with ambipolar CNTFETs: Opportunities for multi-level logic synthesis," in *DATE 2009*, 2009, pp. 622–627.
- [7] M. De Marchi et al., "Regular fabric design with ambipolar CNTFETs for FPGA and structured ASIC applications," June 2010, pp. 65–70.
- [8] J. Rose et al., "Architecture of field-programmable gate arrays," *Proc. IEEE*, vol. 81, no. 7, pp. 1013–1029, Jul. 1993.
- [9] Y. Ran et al., "On designing via-configurable cell blocks for regular fabrics," in *Proc. DAC '04*. ACM, 2004, pp. 198–203.
- [10] "Stanford University CNTFET model." http://nano.stanford.edu/models.php.
- [11] S. Bobba et al., "Design of compact imperfection-immune CNFET layouts for standard-cell-based logic synthesis," Apr. 2009, pp. 616–621.
- [12] "32nm bulk CMOS Predictive Technology Model." http://ptm.asu.edu/.
- [13] J. Deng et al., "Carbon nanotube transistor circuits: Circuit-level performance benchmarking and design options for living with imperfections," in *ISSCC 2007*, Feb. 2007, pp. 70–588.