# Regular Fabric Design with Ambipolar CNTFETs for FPGA and Structured ASIC Applications

Michele De Marchi, EPFL, Lausanne, Switzerland. Email: michele.demarchi@epfl.ch

M. Haykel Ben Jamaa, CEA-LETI-MINATEC, 17, Rue des Martyrs, F-38054 Grenoble, France. Email: haykel.ben-jamaa@cea.fr

Giovanni De Micheli, EPFL, Lausanne, Switzerland. Email: giovanni.demicheli@epfl.ch

*Abstract*: In this paper, we propose for the first time the application of ambipolar CNTFETs with in-field controllable polarities to design regular fabrics with static logic. We exploit the high expressive power provided by complementary static logic built with ambipolar CNTFETs to design compact and efficient configurable gates. After evaluating a polarity-aware logic design for the configurable gates, we selected a number of gates with an And-Or-Inverter structure and produced a first comparison with existing medium-grained logic blocks, such as the Actel ACT1 and 4-input LUTs [1]. Preliminary evaluation of our gates indicates improvements of around 47% over the ACT1 and of about 18× with respect to 4-input LUTs in terms of area×normalized delay.

## I. INTRODUCTION

As CMOS technologies are predicted to face major scalability challenges in the next few years, novel devices such as *Carbon Nanotube Field Effect Transistors* (CNTFETs) are receiving increasing attention due to their promising characteristics, such as quasi-ballistic transport, steep sub-threshold slopes and one-dimensional channel geometry [2]. Among the types of CNTFETs demonstrated in the literature, double-gate ambipolar CNTFETs are four-terminal devices where a second gate terminal is added to enable the control of the device polarity. These devices combine performance exceeding that of current scaled MOSFETs with the possibility to control the device polarity by electrostatic doping of the nanotubes [3]. Various approaches that exploit the unique characteristics of these devices have been proposed in the literature.
In [4], a logic gate is presented in which the symmetric characteristic of ambipolar CNTFETs is exploited to build a single-transistor XOR gate. In [5], the authors construct dynamic logic gates that are configured by setting the polarity of the CNTFETs, and in [6] an interconnection scheme is presented to implement complex circuits with these configurable gates. In [7], a novel static logic design methodology using ambipolar CNTFETs with controllable polarities is investigated, enabling the design of multi-level logic circuits. The design methodology presented in [7] uses Transmission Gates (TGs) to produce logic gates with high expressive power and low area occupation, i.e. gates capable of implementing binate functions such as XOR, or complex combinations of XORs, with few resources and simple topologies.

The objective of this paper is to determine a set of configurable logic gates built with ambipolar CNTFETs which can be implemented in a regular layout fabric, and to evaluate their performance. Structured ASICs, as defined by [8], and FPGAs represent design styles where our technology can have significant impact. For the first time, we explore the performance of various medium-grained configurable logic gates designed with this technology. To compare our implementation with existing technologies, we consider the Actel ACT1 logic block, since the ACT1 cell is qualitatively similar to our gates in grain size and in the set of derivable logic functions. This comparison can give only a coarse approximation, but it can be used to gauge the applicability of our technology. Moreover, we compare our gates to the cell derived from 4-input Look-Up Tables (4-LUTs) as in [1]. We show improvements of up to 47% in Area×Normalized delay Product over the ACT1 block [9] and of around 18× in comparison with 4-LUTs.

This paper is structured as follows. Section II provides a background on ambipolar CNTFET static logic and regular fabrics.
Section III describes the design of the configurable logic blocks for regular fabrics. Section IV describes the implementation and characterization of the configurable gates. Section V concludes the paper.

## II. BACKGROUND AND MOTIVATION

In [7], a design methodology was introduced, consisting of a static complementary logic in which the configurable polarity of ambipolar CNTFETs is exploited to produce logic gates with high expressive power. These gates are capable of implementing binate functions such as XOR at a low area cost, while still providing all the advantages of complementary static logic such as CMOS. Logic gates built with this methodology are particularly suited to implementing regular fabrics, due to their intrinsic symmetry and high expressive power.

Figure 1 shows two types of regular structure in which these gates can be embedded. The first one (Figure 1a) is an FPGA architecture, where logic bricks are interleaved with interconnect channels, which can be configured by means of antifuses or SRAM memory cells [9]. The second architecture (Figure 1b) is called structured ASIC: the logic cells are tightly packed and prestructured, and only the higher-level masks can be configured [8]. Structured ASICs are attractive because they offer a middle ground between costly full-custom ASICs and less efficient FPGAs.

In both FPGAs and structured ASICs, interconnect complexity might limit the density of cells which can be used to map a circuit. However, this limitation can be estimated quantitatively only when fabrication process parameters have been established. Thus, in this work, we measure area as the number of cells effectively needed to implement the circuit, without considering the overhead area caused by unused cells.

Fig. 1. Regular structures with two different alternating logic bricks. (a) Island-style FPGA and (b) structured ASIC style.
Thanks to the symmetric conductance of n-type and p-type CNTFETs, CNTFET logic gates are intrinsically symmetric, e.g. a NOR gate (Figure 2b) can be built from a NAND gate (Figure 2a) by simply rotating its layout by 180°. Moreover, CNTFETs have a channel which is isolated from the substrate, and do not require wells to obtain proper functionality. This enables the construction of a layout consisting of a chessboard-like tiling of dual logic gates, i.e. a logic cell and its dual produced by switching the topologies of the *pull-up* (PU) and *pull-down* (PD) networks, without significantly reducing the overall macro-regularity of the layout.

Fig. 2. A NOR2 gate layout (b) is derived from a NAND2 layout (a) by simply rotating it by $180^{\circ}$.

## III. AMBIPOLAR CNTFET CONFIGURABLE LOGIC GATES

The high expressive power of CNTFET static logic makes it well suited to building configurable gates that can be arranged in arrays to produce regular fabrics. In this work, we apply this logic design methodology to introduce a novel set of configurable gates to be used to design regular fabrics.

### A. Static Ambipolar Logic

The ambipolar CNTFET complementary static logic family was first introduced in [7]. It exploits the tunable polarity of ambipolar CNTFETs to produce logic gates which implement binate functions such as XNOR (shown in Figure 3c) with low area occupation, thus producing gates with high expressive power. Figure 3a shows the ambipolar CNTFET circuit symbol, where the *Polarity Gate* (PG) controls the device polarity and the *Control Gate* (CG) modulates the channel conductivity, together with the logic level convention used for the PG. The library is built with a static, complementary logic approach similar to CMOS, with the addition of TGs (Figure 3b) consisting of two CNTFETs with controlled polarities. Table I presents the library which can be implemented by using a maximum of three transistors or TGs in the PU (or equivalently in the PD) networks.

TABLE I. THE 46-GATE STATIC LOGIC LIBRARY.

| Gate | Function | Gate | Function |
|------|----------|------|----------|
| F00 | $A$ | F23 | $\overline{A + (B \oplus D) \cdot C}$ |
| F01 | $\overline{A \oplus B}$ | F24 | $\overline{(A \oplus D) + (B \oplus D) \cdot C}$ |
| F02 | $\overline{A + B}$ | F25 | $\overline{A + (B \oplus D) \cdot (C \oplus D)}$ |
| F03 | $\overline{A \cdot B}$ | F26 | $\overline{(A \oplus D) + (B \oplus D) \cdot (C \oplus D)}$ |
| F04 | $\overline{(A \oplus B) + C}$ | F27 | $\overline{(A \oplus D) \cdot B \cdot C}$ |
| F05 | $\overline{(A \oplus B) \cdot C}$ | F28 | $(A \oplus D) \cdot (B \oplus D) \cdot C$ |
| F06 | $\overline{(A \oplus B) + (A \oplus C)}$ | F29 | $(A \oplus D) \cdot (B \oplus D) \cdot (C \oplus D)$ |
| F07 | $(A \oplus B) \cdot (A \oplus C)$ | F30 | $\overline{(A \oplus D) + (B \oplus E) + C}$ |
| F08 | $\overline{(A \oplus B) + (C \oplus D)}$ | F31 | $(A \oplus D) + (B \oplus D) + (C \oplus E)$ |
| F09 | $(A \oplus B) \cdot (C \oplus D)$ | F32 | $((A \oplus D) + (B \oplus E)) \cdot C$ |
| F10 | $\overline{A + B + C}$ | F33 | $\overline{((A \oplus D) + B) \cdot (C \oplus E)}$ |
| F11 | $(A + B) \cdot C$ | F34 | $((A \oplus D) + (B \oplus D)) \cdot (C \oplus E)$ |
| F12 | $\overline{A + B \cdot C}$ | F35 | $((A \oplus D) + (B \oplus E)) \cdot (C \oplus D)$ |
| F13 | $A \cdot B \cdot C$ | F36 | $(A \oplus D) + (B \oplus E) \cdot C$ |
| F14 | $\overline{(A \oplus D) + B + C}$ | F37 | $A + (B \oplus D) \cdot (C \oplus E)$ |
| F15 | $(A \oplus D) + (B \oplus D) + C$ | F38 | $(A \oplus D) + (B \oplus E) \cdot (C \oplus E)$ |
| F16 | $(A \oplus D) + (B \oplus D) + (C \oplus D)$ | F39 | $(A \oplus D) + (B \oplus E) \cdot (C \oplus D)$ |
| F17 | $\overline{((A \oplus D) + B) \cdot C}$ | F40 | $(A \oplus D) \cdot (B \oplus E) \cdot C$ |
| F18 | $((A \oplus D) + (B \oplus D)) \cdot C$ | F41 | $(A \oplus D) \cdot (B \oplus D) \cdot (C \oplus E)$ |
| F19 | $((A \oplus D) + B) \cdot (C \oplus D)$ | F42 | $\overline{(A \oplus D) + (B \oplus E) + (C \oplus F)}$ |
| F20 | $((A \oplus D) + (B \oplus D)) \cdot (C \oplus D)$ | F43 | $((A \oplus D) + (B \oplus E)) \cdot (C \oplus F)$ |
| F21 | $\overline{(A + B) \cdot (C \oplus D)}$ | F44 | $(A \oplus D) + (B \oplus E) \cdot (C \oplus F)$ |
| **F22** | $(A \oplus D) + B \cdot C$ | F45 | $(A \oplus D) \cdot (B \oplus E) \cdot (C \oplus F)$ |

Fig. 3. Ambipolar CNTFETs. (a) Symbol and PG logic level convention; (b) Transmission gate and (c) XNOR gate.

### B. Signal-Polarity-Aware Design

The implementation of this library requires TGs to be fed with dual polarity inputs. In circuit implementations, this translates into a large number of inverters and of dual rail interconnects. To understand how this requirement affects performance, and to find a design methodology to implement configurable cells for regular fabrics, we analyzed areas and delays under three different input and output conditions, shown in Figure 4. We will refer to these conditions as designs (a), (b) and (c) throughout the paper:

- (a) FO4 condition, with an output load of 4 unloaded gates equal to the one under measurement;
- (b) FO4 with an extra inverter at the output of the gate under analysis;
- (c) FO4 with an inverter at any input which requires both polarities (transmission gates).

Fig. 4. Standard cell polarity-aware design. (a) inverters are in independent cells; (b) inverter included at the output of each cell; (c) inverters at the input of cells only when double polarity inputs are required.

Design (a) is the simplest and requires inverter cells to provide the dual rail signals for the TGs. Since cells produce only single rail outputs, part of the interconnect will be single rail and part dual rail. Design (b) reduces the number of cells by inserting inverters directly at the output of gates. In this case, the size of the inverters which produce the negated signals cannot be optimized at design time. Although the number of cells is reduced with respect to design (a), dual rail interconnect is always necessary. Finally, design (c) is a configuration which does not require dual rail interconnect. Even if the number of inverters is larger than in cases (a) and (b), their size is self-optimized, since a unit size inverter is added only when needed to drive a gate input. Moreover, since inverters are typically inserted as buffers in regular layouts, we expect their cost in terms of area to be compensated by reduced signal noise and better delay predictability.

### C. CNTFET Static Logic Gates for Regular Fabrics

Each configurable gate is defined by its logic function. In a regular architecture, such as a structured ASIC, the logic gates are pre-configured and only part of the interconnect can be user-configured. By configuring the interconnect, each input of a gate can be fed with either the output of another gate or with a constant value (0 or 1). Each configurable gate is thus capable of implementing a set of *sub-functions* with a number of inputs smaller than or equal to that of the logic function which represents it. For each gate, we can define a dual gate as the one produced by simply swapping the topologies of its PU and PD networks.
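The notion of sub-functions, and of a gate paired with its dual, can be made concrete with a small truth-table enumeration. The following is a minimal sketch, not the procedure used in the paper: the helper names (`sub_functions`, `table_of`) are hypothetical, only constant ties (0 or 1) are considered, and functions are distinguished by input position, so the resulting counts are not expected to match any count reported in this paper. F21 and F22 are taken from Table I, with the inverting form assumed for F22.

```python
from itertools import product

def f21(a, b, c, d):
    """F21 from Table I: NOT((A + B) . (C xor D))."""
    return int(not ((a or b) and (c ^ d)))

def f22(a, b, c, d):
    """F22, paired with F21 in the F21F22 tiling (inverting form assumed)."""
    return int(not ((a ^ d) or (b and c)))

def table_of(fn, n_inputs):
    """Truth table of fn over all 2**n_inputs input combinations."""
    return tuple(fn(*xs) for xs in product((0, 1), repeat=n_inputs))

def sub_functions(gate, n_inputs):
    """Distinct truth tables obtained by tying any subset of inputs to 0 or 1.

    Assumption: an input is either left free (None) or tied to a constant;
    sharing one signal between several inputs is not modelled here.
    """
    tables = set()
    for ties in product((None, 0, 1), repeat=n_inputs):
        def tied(*xs, ties=ties):
            args = [x if t is None else t for x, t in zip(xs, ties)]
            return gate(*args)
        tables.add(table_of(tied, n_inputs))
    return tables

subs_f21 = sub_functions(f21, 4)
subs_f22 = sub_functions(f22, 4)

# Tying C=0, D=1 turns F21 into a NOR of A and B.
nor_ab = table_of(lambda a, b, c, d: int(not (a or b)), 4)
# Tying A=1 turns F21 into an XNOR of C and D.
xnor_cd = table_of(lambda a, b, c, d: int(not (c ^ d)), 4)

print(len(subs_f21), len(subs_f21 | subs_f22))
```

Under these assumptions, the union of the two sets is strictly larger than either set alone, which illustrates why a tiling that alternates a gate with its dual offers more implementable functions than a single-gate tiling.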
As we have seen in Section II, CNTFET static logic is particularly suited to building configurable gates for chessboard-like regular fabric layouts with alternating dual cells. Dual cells, used together, provide a higher number of implemented functions than a single gate. Since dual CNTFET gates can be produced by simply rotating a layout by 180°, it is possible to produce chessboard-like layouts which are more regular than their CMOS counterparts, without modifying transistor sizes to obtain dual gates.

From the 46-gate static logic library, we selected a number of gates which could be used as bricks to design regular fabrics (shown in bold in Table I). We included the gates which:

1. contain at least one transmission gate and
2. cannot be implemented by another logic gate with the same topology by feeding two or more of its inputs with a single signal.

For example, function $F06 = \overline{(A \oplus B) + (A \oplus C)}$ can be implemented from $F08 = \overline{(A \oplus B) + (C \oplus D)}$ by feeding inputs A and C with the same external signal.

In order to evaluate gates with a higher complexity than those from the 46-gate library, we also propose four gates (two gates plus their duals) with four transistors or TGs in the PU and PD networks (see Table II). These gates have the advantage of implementing a high number of *sub-functions*, with low *redundancy* between a gate and its dual, i.e. the set of *sub-functions* implemented by the gate only partially overlaps with that of its dual gate. For example, a layout consisting of both gates G3 and G4 can implement 77% more sub-functions than a layout including only gate G3.

TABLE II. SELECTED GATES WITH FOUR TGS OR TRANSISTORS IN THE PU AND PD NETWORKS. G2 AND G4 ARE RESPECTIVELY THE DUALS OF G1 AND G3.

| Gate | Function |
|------|----------|
| G1 | $\overline{(A \cdot B) + (C \cdot (D \oplus E))}$ |
| G2 | $\overline{(A + B) \cdot (C + (D \oplus E))}$ |
| G3 | $\overline{((A + B) \cdot (C \oplus D)) + (E \oplus F)}$ |
| G4 | $\overline{((A \cdot B) + (C \oplus D)) \cdot (E \oplus F)}$ |

In Figure 5 (bottom), we show the schematics of gates F21 and G3, indicating the sizing of each transistor. TG sizing refers to the size of each one of the transistors implementing the TG. Figure 5 (top) shows the approximate layouts for the two gates. Since n and p wells are not needed in this technology, layouts can be made more compact than in CMOS. Moreover, n-type and p-type transistors do not need to be separated in distinct zones of the cell layouts, which enables more optimized designs.

Fig. 5. On top, layout views of ambipolar CNTFET logic gates (a) G3 and (b) F21. At the bottom, the respective schematics, indicating the size of each transistor.

## IV. SIMULATION RESULTS

This section presents the results of the simulations we performed to evaluate various configurable logic gates. After a preliminary evaluation of the library of 46 gates, we compare the performance of the logic gates when used as bricks to implement regular fabrics. We then compare the most efficient gates with the Actel ACT1 block and 4-input look-up tables. Finally, we compare the best regular tiling, F21F22, with standard cells in the same technology.

### A. Logic Library Characterization

For each gate of the library in Table I we evaluated areas (normalized to the unit size transistor) and delays in the worst (w) and average (a) cases. We performed SPICE simulations using the *Stanford CNTFET model* [10], with a minimum feature size of 32nm for the CNTFETs. It is very hard to make a fair comparison of the technology described here with other existing technologies.
For this reason we present a comparison with 32nm CMOS for which, even though the devices have a different structure, cells can be compared to those built with CNTFETs in terms of area to a first approximation. We constructed a library including the 7 CMOS gates (shown in italics in Table I) which can be built with the same topology as for CNTFETs, with no more than three transistors in every PU and PD network. All CMOS cells were simulated using the 32nm CMOS *Predictive Technology Model* [11].

In Figure 6, we show a comparison of the average values of area and normalized delay over the whole library for design (a) and for CMOS. All delays are shown after normalization to the intrinsic technology delay, with $\tau_{\rm CNTFET}=0.59{\rm ps}$ and $\tau_{\rm CMOS}=3.0{\rm ps}$ [12]. Our simulations show a 39% normalized delay reduction for ambipolar CNTFETs with respect to CMOS. Even if only 7 gates with this topology can be constructed in CMOS, the average gate area comparison shows that CNTFET cells use a similar amount of resources to CMOS while producing a larger number of functions. If we consider the *Area×Normalized delay Product* (ANP) in the average case, we obtain an improvement of 40% for design (a) over CMOS.

Fig. 6. Comparison of average values of area and normalized delay between design (a) and 32nm CMOS.

In Figure 7, we show the average values of area, normalized delay (average case) and normalized delay (worst case) over the 46-gate library for designs (b) and (c). When comparing these two designs, we observed a reduction of 3.3% in area of design (c) compared to (b). At the same time, we observed a much more efficient exploitation of inverters in design (c), obtaining a reduction of 30.5% in the average and 25.8% in the worst case gate delay average.

Fig. 7. Comparison of area and normalized delay between design (b) and design (c).
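The delay normalization and the ANP metric used in this subsection reduce to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and only the two intrinsic delays are taken from the text. It also shows how a reduction in normalized delay translates into an absolute speed ratio once the intrinsic delay ratio is taken into account, assuming the absolute delay is the normalized delay multiplied by the intrinsic delay of the technology.

```python
# Intrinsic technology delays quoted in the text, in picoseconds.
TAU_CNTFET_PS = 0.59
TAU_CMOS_PS = 3.0

def normalized_delay(delay_ps, tau_ps):
    """Delay expressed in units of the intrinsic technology delay."""
    return delay_ps / tau_ps

def anp(area, norm_delay):
    """Area x Normalized delay Product (area in unit-transistor sizes)."""
    return area * norm_delay

def improvement(new_value, reference_value):
    """Fractional improvement of new_value over reference_value."""
    return 1.0 - new_value / reference_value

def absolute_speedup(norm_delay_reduction):
    """Absolute delay ratio CMOS/CNTFET for a given normalized-delay reduction.

    Assumes absolute delay = normalized delay x intrinsic delay, so the
    ratio is (tau_CMOS / tau_CNTFET) / (1 - reduction).
    """
    intrinsic_ratio = TAU_CMOS_PS / TAU_CNTFET_PS  # about 5.08
    return intrinsic_ratio / (1.0 - norm_delay_reduction)

intrinsic_ratio = TAU_CMOS_PS / TAU_CNTFET_PS
print(f"intrinsic delay ratio: {intrinsic_ratio:.2f}")
print(f"absolute speed ratio for a 39% reduction: {absolute_speedup(0.39):.2f}")
```

Because the library averages of area and delay are computed separately, the ANP improvement is not required to equal the product of the individual average improvements.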
Because inverters in design (b) have a pre-defined sizing (the fan-out of a gate cannot be known before it is mapped in a circuit), even gates with low fan-out are penalized by the presence of an inverter at the output, which increases the average gate area.

We used the ABC logic synthesis system [14] to perform logic minimization and technology mapping over a set of benchmark circuits taken from the ISCAS-85 set [13]. With the results of technology mapping, a meaningful comparison can be made among all three design configurations (a), (b) and (c). In Figure 8, we show a summary of the percent improvement over 32nm CMOS in terms of area, normalized delay and ANP for the three design configurations. The percentages represent the average circuit values obtained through technology mapping on a set of benchmark circuits with each design configuration (a), (b) and (c). All the design configurations show a considerable improvement over CMOS in terms of ANP, between $\sim 60\%$ (design (b)) and $\sim 90\%$ (design (a)). Since the average values for the single gates of the library showed an improvement of only 40% in terms of ANP over the CMOS library, the larger gain observed on mapped circuits indicates that the increased expressive power of ambipolar CNTFETs improves performance substantially.

As we expected, design configurations (a) and (c) give the best results, and the performance of these two configurations is very similar. Given the considerations presented in Section III-B, we can consider design (c) a valid option for the implementation of an efficient library of standard cells using ambipolar CNTFETs.

### B. Characterization of Configurable Gates

The complete list of configurable logic gates we characterized is shown in Table III.
For each gate, we give the number of inputs $N_{\rm In}$, the number of implemented sub-functions $N_{\rm f}$, area and average normalized delays for design configurations (b) and (c) (see Section III-B). We chose to evaluate the performance of gates implemented with designs (b) and (c), since design configuration (a) relies on separate inverter cells, which we assume not to be present in the regular fabric. Thus, for design (a), a number of cells would have to be used to implement the required inverters, causing a great loss in terms of area.

Fig. 8. Percent improvement of CNTFET gates over CMOS in terms of area, normalized delay and ANP.

TABLE III. CONFIGURABLE GATES FOR REGULAR FABRICS, IN SINGLE CELL CONFIGURATION OR DUAL CELL TILING (E.G. F04F05). FOR THE ACT1 AND 4-LUT, A SINGLE AREA AND A SINGLE DELAY VALUE ARE GIVEN, RELATIVE TO 32NM CMOS.

| Name | $N_{\rm In}$ | $N_{\rm f}$ | Gate area, Des(b) | Gate area, Des(c) | Avg. delay, Des(b) | Avg. delay, Des(c) |
|------|------|-------|--------|--------|--------|--------|
| F04 | 3 | 8 | 15.67 | 11.67 | 11.67 | 9.73 |
| F04F05 | 3 | 12 | 15.67 | 11.67 | 11.67 | 9.73 |
| F08 | 4 | 16 | 17.33 | 17.33 | 9.69 | 7.01 |
| F08F09 | 4 | 28 | 17.33 | 17.33 | 9.69 | 7.01 |
| F14 | 4 | 12 | 22.00 | 18.00 | 15.22 | 10.74 |
| F14F27 | 4 | 20 | 22.00 | 18.00 | 15.22 | 10.74 |
| F17 | 4 | 16 | 20.33 | 16.33 | 13.47 | 9.47 |
| F17F23 | 4 | 20 | 20.33 | 16.33 | 13.47 | 9.47 |
| F21 | 4 | 15 | 20.67 | 16.67 | 10.26 | 7.15 |
| F21F22 | 4 | 24 | 20.67 | 16.67 | 10.26 | 7.15 |
| F30 | 5 | 28 | 24.00 | 24.00 | 13.89 | 10.05 |
| F30F40 | 5 | 52 | 24.00 | 24.00 | 13.89 | 10.05 |
| F32 | 5 | 32 | 21.67 | 21.67 | 13.65 | 9.70 |
| F32F37 | 5 | 52 | 21.67 | 21.67 | 13.65 | 9.70 |
| F33 | 5 | 48 | 22.00 | 22.00 | 11.26 | 7.91 |
| F33F36 | 5 | 84 | 22.00 | 22.00 | 11.26 | 7.91 |
| F42 | 6 | 47 | 26.00 | 30.00 | 11.52 | 8.33 |
| F42F45 | 6 | 90 | 26.00 | 30.00 | 11.52 | 8.33 |
| F43 | 6 | 105 | 23.33 | 27.33 | 11.00 | 7.82 |
| F43F44 | 6 | 105 | 23.33 | 27.33 | 11.00 | 7.82 |
| G1 | 5 | 34 | 25.33 | 21.33 | 14.45 | 11.20 |
| G1G2 | 5 | 44 | 25.33 | 21.33 | 14.45 | 11.20 |
| G3 | 6 | 105 | 29.33 | 29.33 | 9.01 | 7.39 |
| G3G4 | 6 | 186 | 29.33 | 29.33 | 9.01 | 7.39 |
| ACT1 | 8 | 702 | 33.00 | | 16.08 | |
| LUT4 | 4 | 65536 | 1690.00 | | 49.61 | |

*1) Comparison among Configurable Gates:* In Figure 10, we present a summary of the total values of ANP obtained by mapping our benchmark circuit set using each one of the logic gates of Table III. Results are shown for every logic gate, with single or dual tiling, for design configuration (c). Note that the x-axis labels in Figure 10 refer to the logic cell for the single-gate layout, so for dual-gate layouts, the respective dual cells are also present. From this plot, we can clearly extract two groups of cells, one group showing high efficiency (F17, F21, F33, G3) and one presenting lower-than-average efficiency (F14, F30, F42). In Table IV, we summarize these gates with their respective representative functions. We do not report the same results for the gates constructed with design configuration (b), since their performance was on average 30% lower, in terms of ANP, than that of design (c).

TABLE IV. SETS OF LOW AND HIGH ANP CONFIGURABLE LOGIC GATES.

| High-efficiency gate | Function | Low-efficiency gate | Function |
|------|----------|------|----------|
| F17 | $\overline{((a \oplus b) + c) \cdot d}$ | F14 | $\overline{(a \oplus b) + c + d}$ |
| F21 | $\overline{(c + d) \cdot (a \oplus b)}$ | F30 | $\overline{(a \oplus b) + (c \oplus d) + e}$ |
| F33 | $\overline{((a \oplus b) + e) \cdot (c \oplus d)}$ | F42 | $\overline{(a \oplus b) + (c \oplus d) + (e \oplus f)}$ |
| G4 | $\overline{((a \cdot b) + (c \oplus d)) \cdot (e \oplus f)}$ | | |

If we consider XORs as if they were single literals, an immediate observation from these data is that the low-efficiency gates all share a common function structure, consisting of a sum of three terms, while all efficient gates have an *And-Or-Inverter* (AOI) main structure. This also explains the high performance of the G3G4 tiling, which is more complex but can implement all *sub-functions* implemented by tilings F21F22 and F33F36.

*2) Comparison with ACT1 and 4-LUT:* We synthesized the same set of logic circuits used in Section IV-B1 and mapped them using 6 different libraries. The first 4 libraries are formed by the gates marked as high-efficiency gates in Table IV (F17, F21, F33, G3) and their respective dual gates (F23, F22, F36, G4). The 5<sup>th</sup> library is formed by logic blocks following the same architecture as the Actel ACT1 block [9]. The last library is formed by 4-input LUTs. The last two libraries are realized in a 32nm CMOS technology. The first 4 libraries are realized in the ambipolar CNT technology assuming a lithography pitch of 32nm. We compared the results of the logic synthesis without performing any placement and routing steps. Figure 9 presents the average ANP saving over the whole set of synthesized circuits, measured on the mapping with the 4 ambipolar CNT libraries (F17/F23, F21/F22, F33/F36 and G3/G4), with respect to the circuits mapped with the ACT1 blocks.
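The pairing of each gate with its dual in these libraries matters because the two cells cover different sub-functions. As a spot check of the statement in Section IV-B1 that the G3G4 tiling covers the functions of the F21F22 and F33F36 tilings, the following minimal sketch compares truth tables after tying selected inputs of G3 or G4 to constants. The input assignments, the helper `equivalent`, and the inverting forms assumed for F22, F33 and F36 are choices made here for illustration; they are not taken from the paper.

```python
from itertools import product

# Gate functions from Tables I and II (inverting forms assumed throughout).
def g3(a, b, c, d, e, f):
    return int(not (((a or b) and (c ^ d)) or (e ^ f)))

def g4(a, b, c, d, e, f):
    return int(not (((a and b) or (c ^ d)) and (e ^ f)))

def f21(a, b, c, d):
    return int(not ((a or b) and (c ^ d)))

def f22(a, b, c, d):
    return int(not ((a ^ d) or (b and c)))

def f33(a, b, c, d, e):
    return int(not (((a ^ d) or b) and (c ^ e)))

def f36(a, b, c, d, e):
    return int(not ((a ^ d) or ((b ^ e) and c)))

def equivalent(fn, gn, n_inputs):
    """True if fn and gn agree on every combination of n_inputs bits."""
    return all(fn(*xs) == gn(*xs) for xs in product((0, 1), repeat=n_inputs))

# Each entry ties some inputs of G3 or G4 to constants (hypothetical wiring).
checks = {
    "F21 from G3 with E=F=0": equivalent(
        f21, lambda a, b, c, d: g3(a, b, c, d, 0, 0), 4),
    "F22 from G4 with E=0, F=1": equivalent(
        f22, lambda a, b, c, d: g4(b, c, a, d, 0, 1), 4),
    "F33 from G4 with B=1": equivalent(
        f33, lambda a, b, c, d, e: g4(b, 1, a, d, c, e), 5),
    "F36 from G3 with B=0": equivalent(
        f36, lambda a, b, c, d, e: g3(c, 0, b, e, a, d), 5),
}

for name, ok in checks.items():
    print(name, "->", ok)
```

This only checks the four representative functions; covering every sub-function of the smaller tilings follows because any further constant tie on F21, F22, F33 or F36 can be applied to the corresponding G3 or G4 input as well.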
Considering normalized delays, we obtain an ANP reduction of 39% for the G3G4 dual tiling and 47% for the F21F22 dual tiling. If we consider absolute delays, with an advantage of CNTFETs of 5.1× over 32nm CMOS, we obtain a performance 8× higher for the G3G4 dual tiling and 9× higher for the F21F22 dual tiling than CMOS. For the 4-LUT, which we implemented with the topology of [1], the advantage given by the easier optimization of circuit mapping was not sufficient to offset the large size and delay of the LUT. For this reason, in terms of ANP, we obtained an advantage of more than 10× for all the high performance gates listed in Table IV (18× for F21F22 with design (c) configuration).

Fig. 9. Percent reduction in terms of ANPs for the high performance gates listed in Table IV with respect to the Actel ACT1 block.

Fig. 10. Comparison, in terms of ANP, between single-gate layout and dual-gate layout for the different single input polarity cells (design (c)). X-axis labels refer to the logic cell for the single-gate layout, so for dual-gate layouts, the respective dual cells are also present.

*3) Comparison with Standard Cells:* To better evaluate the potential of ambipolar CNTFET based configurable gates, we compared the regular fabrics with a standard cell circuit implementation in the same technology. In Figure 11, we present this comparison for the best CNTFET-based configurable gate, F21. In the plot, we compare the ANP values, on average over a set of mapped benchmark circuits, for all four types of tiling implemented with F21, i.e. single and dual tiling with the gates in either design configuration (b) or (c). For standard cell technology mapping, we give the values for all three design configurations (a), (b) and (c) and for the 32nm CMOS library.

Fig. 11. Comparison of normalized ANP for logic gate F21 in single and dual tiling configuration with the Standard Cell libraries produced with design configurations (a), (b) and (c) of Section III-B.
As we can see, every configuration using ambipolar CNTFETs is more efficient than the CMOS library, even considering normalized delays. This gives us a first confirmation of the efficacy of the regular fabric implementation. As we expected, however, we observe a non-negligible loss of performance for the regular tiling. For example, the design (c) standard cell library for ambipolar CNTFETs performs about 2× more efficiently than the F21F22 tiling in design (c) configuration in terms of ANP, on average over a set of circuits mapped with the respective libraries.

## V. CONCLUSION

In this paper, we evaluated a novel application of ambipolar CNTFETs with in-field controllable polarities to produce configurable logic gates for regular fabric design. Preliminary simulation of a 46-gate logic library in ambipolar CNTFET static logic was carried out. Results of technology mapping were then analyzed for each standard cell library. A number of cells were selected and evaluated as possible gates for regular fabric design. For each configurable gate, results of technology mapping were compared with the standard cell libraries, the Actel ACT1 block and the 4-LUT. Gate evaluation showed that an improvement of 47% over the ACT1 block and of about 18× with respect to the 4-LUT in terms of ANP can be obtained with an And-Or-Inverter architecture. Standard cells maintained an ANP advantage of about 2× over the most efficient configurable gate.

This research shows that the high expressive power of controllable-polarity-CNTFET-based logic produces significant performance advantages over CMOS. This justifies further efforts to evaluate other aspects of the novel logic architectures analyzed in this work, such as power consumption, the influence of interconnect on performance, and technology integration procedures.

## ACKNOWLEDGMENT

We acknowledge partial support from grant ERC-2009-AdG-246810. The authors would like to thank Prof. Kartik Mohanram for the valuable discussions.

## REFERENCES

- [1] Y. Hu et al., "Design, synthesis and evaluation of heterogeneous FPGA with mixed LUTs and macro-gates," in Proc. ICCAD. Piscataway, NJ, USA: IEEE Press, 2007, pp. 188–193.
- [2] Y.-M. Lin et al., "Novel Structures Enabling Bulk Switching In Carbon Nanotube FETs," in 62nd DRC, June 2004, pp. 133–134 vol. 1.
- [3] R. Martel et al., "Ambipolar electrical transport in semiconducting single-wall carbon nanotubes," Phys. Rev. Lett., vol. 87, no. 25, p. 256805, Dec. 2001.
- [4] R. Sordan et al., "Exclusive-OR gate with a single carbon nanotube," App. Phys. Lett., vol. 88, no. 5, p. 053119, 2006. [Online]. Available: http://link.aip.org/link/?APL/88/053119/1
- [5] I. O'Connor et al., "Ultra-fine grain reconfigurability using CNTFETs," Dec. 2007, pp. 194–197.
- [6] P.-E. Gaillardon et al., "Interconnection scheme and associated mapping method of reconfigurable cell matrices based on nanoscale devices," July 2009, pp. 69–74.
- [7] M. H. Ben Jamaa et al., "Novel Library of Logic Gates with Ambipolar CNTFETs: Opportunities for Multi-Level Logic Synthesis," in DATE 2009, 2009, pp. 622–627.
- [8] Y. Ran et al., "On Designing Via-Configurable Cell Blocks for Regular Fabrics," in Proc. DAC '04, 2004, pp. 198–203.
- [9] J. Rose et al., "Architecture of Field-Programmable Gate Arrays," Proc. IEEE, vol. 81, no. 7, pp. 1013–1029, Jul. 1993.
- [10] "Stanford University CNTFET Model." [Online]. Available: http://nano.stanford.edu/models.php
- [11] "Predictive Technology Model (PTM) for 32nm Bulk CMOS Technology Node," 2009. [Online]. Available: http://ptm.asu.edu/
- [12] J. Deng et al., "Carbon Nanotube Transistor Circuits: Circuit-Level Performance Benchmarking and Design Options For Living with Imperfections," in ISSCC 2007, Feb. 2007, pp. 70–588.
- [13] F. Brglez et al., "A Neutral Netlist of 10 Combinational Benchmark Circuits," in ISCAS 1985, 1985, pp. 695–698.
- [14] Berkeley Logic Synthesis and Verification Group, "ABC: A System for Sequential Synthesis and Verification," Release 70930. [Online]. Available: http://www.eecs.berkeley.edu/~alanmi/abc/