# Design and Architectural Assessment of 3-D Resistive Memory Technologies in FPGAs

Pierre-Emmanuel Gaillardon, *Member, IEEE*, Davide Sacchetto, Giovanni Betti Beneventi, *Member, IEEE*, M. Haykel Ben Jamaa, *Member, IEEE*, Luca Perniola, *Member, IEEE*, Fabien Clermidy, *Member, IEEE*, Ian O'Connor, *Senior Member, IEEE*, and Giovanni De Micheli, *Fellow, IEEE* 

Abstract-Emerging nonvolatile memories (ENVMs) such as phase-change random access memories (PCRAMs) or oxide-based resistive random access memories (OxRRAMs) are promising candidates to replace Flash and Static Random Access Memories in many applications. This paper introduces a novel set of building blocks for field-programmable gate arrays (FPGAs) using ENVMs. We propose an ENVM-based configuration point, a look-up table structure with reduced programming complexity and a high-performance switchbox arrangement. We show that these blocks yield an improvement in area and write time of up to $3 \times$ and $33 \times$, respectively, versus a regular Flash implementation. By integrating the designed blocks in an FPGA, we demonstrate an area and delay reduction of up to 28% and 34%, respectively, on a set of benchmark circuits. These reductions are due to the ENVM 3-D integration and to their low on-resistance state value. Finally, we survey many flavors of the technologies and we show that the best results in terms of area and delay are obtained with Pt/TiO<sub>2</sub>/Pt stack, while the lowest leakage power is achieved by InGeTe stack.

Index Terms—3-D integration, nonvolatile memory, oxide memory, phase-change memory, programmable logic arrays, RRAM.

## I. INTRODUCTION

AMONG the emerging nonvolatile memories (ENVMs), phase-change random access memories (PCRAMs), and oxide-based resistive random access memories (OxRRAMs) are considered today as the most promising candidates for next

Manuscript received March 12, 2012; revised September 21, 2012; accepted October 21, 2012. Date of publication November 12, 2012; date of current version January 4, 2013. This work was supported in part by the French National Research Agency under Grant ANR-08-SEGI-12 "NANOGRAIN" and in part by the European Research Council under Grant ERC-2009-AdG-246810 "NANOSYS." The review of this paper was arranged by Associate Editor C. A. Moritz.

P.-E. Gaillardon and G. De Micheli are with the Laboratory of Systems Integration, Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland (e-mail: pierre-emmanuel.gaillardon@epfl.ch; giovanni.demicheli@epfl.ch).

D. Sacchetto is with the Laboratory of MicroElectronic Systems, Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland (e-mail: davide.sacchetto@epfl.ch).

G. B. Beneventi was with Commissariat à l'Energie Atomique et aux Energies Alternatives, 38054 Grenoble, France. He is now with University of Bologna, Viale del Risorgimento 2, 40136 Bologna, Italy (e-mail: gbbeneventi@arces.unibo.it).

M. H. Ben Jamaa, L. Perniola, and F. Clermidy are with Commissariat à l'Energie Atomique et aux Energies Alternatives, 38054 Grenoble, France (e-mail: haykel.ben-jamaa@cea.fr; luca.perniola@cea.fr; fabien.clermidy@cea.fr).

I. O'Connor is with the Ecole Centrale de Lyon, 69134 Ecully Cedex, France, and also with the Ecole Polytechnique de Montréal, Montreal, QC H3T 1J4, Canada (e-mail: ian.oconnor@ec-lyon.fr).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TNANO.2012.2226747

generation of nonvolatile memory (NVM) applications [1]. These ENVMs belong to a family of two terminal devices that can store information as an internal resistive state, which depends on the physical properties and electrical behavior of the underlying materials.

The interest in ENVMs is motivated by various advantages that this technology offers as compared to the traditional NVM mainstream. These advantages mainly concern better footprint scalability (down to a few nanometers), faster programming time (of the order of a few nanoseconds), and an enhanced endurance (up to  $10^9$  programming cycles). Furthermore, some demonstrators have recently been presented to showcase the viability of high-density standalone memories based on PCRAM technology from an industrial perspective. Hence, 45-nm 1-Gb [2] and 42-nm 1-Gb PCRAM technologies [3] have now become a reality.

The focus in this study is on the utilization of ENVMs in reconfigurable logic circuits, such as field-programmable gate arrays (FPGAs). The reason behind this choice is the fact that in reconfigurable logic, up to 40% of the area is dedicated to the storage of configuration signals, leading to a large cost of reconfiguration in terms of area and routing delay [4]. Traditionally, the configuration is serially loaded from an external NVM into SRAM cells distributed throughout the circuit [4]. As a result, configuration at power-up is a time-consuming operation, which prevents putting the FPGA in sleep mode when fast wake-up is required. Nonvolatile memories like Flash memories can be used to address this issue [5]. However, combining Flash and CMOS requires a hybrid and costly technology. Contrary to Flash memories, ENVMs can be fabricated in the back-end-of-line (BEOL) process, i.e., like metal layers. BEOL integration allows us to move all the configuration memory onto the top of the chip, and leads to a clear reduction in terms of area, as demonstrated in [6] and [7].

With the recent development of ENVM technology, a number of novel FPGA building blocks and architectures have been proposed in the past few years. For example, in [7], the configuration SRAMs are enhanced by spin-transfer-torque magnetic RAMs (STT-MRAMs). In [8], a simple memory node storing the reconfiguration signals by means of two resistive memories and one selection transistor was detailed. This structure was extensively used in [6] to implement the nonvolatile configuration of the circuit. In addition, routing structures based on ENVMs have shown promise. In [9], a cross point for switchboxes (SBs), using PCRAMs as nonvolatile switches, is proposed to route signals through low-resistive paths, or to isolate them by means of high-resistive paths. The concept of routing elements based on ENVM switches was then exploited in [10] and [11] for timing optimization in FPGAs.

In this paper, 1) we extend the set of ENVM-based building blocks for FPGAs and study their performance at the circuit level. More precisely, we add to the previously described memory and routing structures, a look-up table (LUT) that efficiently combines logic, programming, and addressing. Then, 2) we study the impact of different ENVM technologies on FPGA performances. Indeed, the diversity of ENVM structures and materials makes the range of electrical properties extremely wide. Among the different ENVM candidates, this paper focuses only on PCRAM and OxRRAM technologies. Hence, the study of the impact of the various materials on system-level performance metrics represents an opportunity for designers and technologists to identify the most promising candidates early in the development cycle.

Both the performance of the proposed blocks and their impact on FPGA circuits are studied and compared to the equivalent SRAM and Flash implementations. We show that the ENVM blocks reduce area by a factor of up to $3\times$ and the write time by a factor of up to $33\times$ as compared to Flash. The compact dimensions of the ENVM-based device reduce the size of FPGA blocks and routing channels. This yields an area reduction of up to 28% for complex benchmark circuits while the good on-resistance properties provide a gain of up to 34% in delay. Several technology flavors are surveyed. Among a large choice of materials, we show that the Pt/TiO<sub>2</sub>/Pt stack gives the lowest area and delay, while the InGeTe material leads to the lowest leakage power.

The organization of this paper is the following. Section II surveys FPGA architecture and explains the motivations of this study. Section III gives an overview of PCRAM and OxRRAM technologies. Then, in Section IV, we present the novel building blocks for FPGAs based on ENVMs. These single blocks are evaluated with respect to competing technologies in Section V. Architectural benchmarking and impact of technologies are detailed in Section VI. In Section VII, we discuss potential opportunities given by the evolution of the technology. Finally, in Section VIII, we draw some conclusions.

## II. ARCHITECTURAL BACKGROUND AND MOTIVATION

In this section, we introduce the baseline FPGA architecture and highlight its limitations.

### A. FPGA Architecture

FPGAs are regular circuits typically consisting of several identical configurable logic blocks (CLBs) surrounded by reconfigurable interconnect lines [4]. As depicted in Fig. 1, every CLB is formed by a set of N basic logic elements (BLEs). A BLE is a K-input LUT whose output can be routed to any other LUT input with or without an intermediate registration phase through a flip-flop. Every CLB has I inputs coming from other CLB outputs and from external signals to the CLBs. All design parameters N, K, and I can be set by the FPGA architect depending on the targeted system granularity. The overall delay and area strongly depend on those parameters, which have been extensively investigated in [12]. The routing part of the FPGA



Fig. 1. Baseline field programmable gate arrays architecture [4].



Fig. 2. Field programmable gate arrays area/delay/power repartition per block [13].

is formed by large channels of width W, interleaved between the CLBs. The channels cross each other and the signals can be routed within the FPGA using SBs. A SB is a matrix at the intersection of channels, which is made of reconfigurable switches.
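To give a feel for how the parameters N, K, and I drive the amount of configuration storage, the following minimal sketch counts the configuration bits of one CLB. Only the $2^K$ truth-table bits per LUT follow directly from the description above. The fully populated local crossbar, the binary-encoded input multiplexers, the one-bit flip-flop bypass, and the example values N = 10, K = 4, I = 22 are illustrative assumptions, not figures taken from this paper.

```python
from math import ceil, log2


def clb_config_bits(n: int, k: int, i: int) -> dict:
    """Rough count of configuration bits in one CLB.

    Assumptions (not from the paper): each LUT input selects one of the
    I CLB inputs or N BLE outputs through a binary-encoded multiplexer,
    and each BLE uses one bit to choose the registered or direct output.
    """
    lut_bits = n * 2 ** k                 # truth table: 2^K bits per K-input LUT
    bypass_bits = n                       # flip-flop bypass select, one per BLE
    select_bits = ceil(log2(i + n))       # bits needed to encode one of I+N sources
    input_mux_bits = n * k * select_bits  # one multiplexer per LUT input
    total = lut_bits + bypass_bits + input_mux_bits
    return {
        "lut": lut_bits,
        "bypass": bypass_bits,
        "input_mux": input_mux_bits,
        "total": total,
    }


# Illustrative parameter values only.
print(clb_config_bits(n=10, k=4, i=22))
```

Under these assumptions, the input multiplexers already need more configuration bits than the LUT truth tables, which is in line with the weight of configuration memory discussed in the next subsection.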

### B. FPGA Structural Hurdles

Fig. 2 presents the area/delay/power breakdown of the various components of a baseline SRAM-based island-style FPGA. It is worth noticing that the configuration memories occupy roughly half of the area in both the logic blocks and the routing resources. Logic blocks occupy only 22% of the whole area including their own configuration memory. Only 14% of the total area is then used for actual computation. In addition to consuming most of the die area, programmable routing significantly contributes to FPGA hurdles. In [13], interconnect delays are reported to account for roughly 80% of the total path delay. Programmable routing also contributes to the high power consumption of FPGAs, with more than 60% of the total dynamic power consumption.

From all these observations, the FPGA architecture can be improved by working on memories and their efficient combination with logic.

### C. Technologies of Programmable Elements

Most of today's FPGAs are implemented in a standard CMOS process. Thus, switches are made of transistors or gated buffers, controlled by a reconfiguration signal stored in an SRAM cell. However, the drawbacks of this SRAM-based solution are its high power consumption and its intrinsic volatility. Hence, for some specific applications, Flash technology is combined with the CMOS process and switches are made of Flash transistors [5]. However, the Flash-based solution is technologically expensive because of the costly integration of Flash devices in a CMOS process. Thanks to their lower integration costs, ENVMs have been proposed to build nonvolatile latches useful for FPGA

 TABLE I

 Device Properties of Different Resistive Memories Technologies (Extracted for a 70 nm × 70 nm Element)

| Technology                   | Family | References      | $R_{\text{SET}}$ (@1 V) (Ω) | $R_{\text{RESET}}$ (@1 V) (Ω) | Max. $I_{\text{prog}}$ (@RESET) (A) | Prog. trans. width (@45 nm) (µm) | Max. writing time (@SET) (ns) | Fail-time temp. (@10 years) | Endurance (cycles) |
|------------------------------|--------|-----------------|-----------------------------|-------------------------------|-------------------------------------|----------------------------------|-------------------------------|-----------------------------|--------------------|
| GST                          | PCRAM  | [2, 17]         | 5k                          | 410k                          | 1.96m                               | 1.57                             | 500                           | 85°C                        | $10^{9}$           |
| N-GST (~5%)                  | PCRAM  | [21, 22]        | 50k                         | (*) 410k                      | 0.49m                               | 0.39                             | 100                           | (*) 95°C                    | (*) $10^{9}$       |
| 10%SiO<sub>x</sub>/SiN-GST   | PCRAM  | [24, 25]        | 10k                         | 1.25M                         | 0.69m                               | 0.54                             | 1500                          | 235°C                       | (*) $10^{9}$       |
| GeTe                         | PCRAM  | [18-20]         | 50                          | 410k                          | 1.86m                               | 1.49                             | 100                           | 100°C                       | (*) $10^{6}$       |
| GeTeC4%                      | PCRAM  | [19]            | 2.5k                        | 410k                          | 1.76m                               | 1.41                             | 250                           | 100°C                       | (*) $10^{6}$       |
| GeTeC10%                     | PCRAM  | [19]            | 10k                         | 410k                          | 1.37m                               | 1.1                              | 1000                          | 130°C                       | (*) $10^{6}$       |
| GeTeN2%                      | PCRAM  | [20]            | 1.25k                       | 410k                          | 1.96m                               | 1.57                             | 125                           | 125°C                       | (*) $10^{6}$       |
| GeTeN4%                      | PCRAM  | [20]            | 1.25k                       | 410k                          | 1.96m                               | 1.57                             | 125                           | 155°C                       | (*) $10^{6}$       |
| Doped InGeTe                 | PCRAM  | [26]            | 10k                         | 4.1M                          | 1.96m                               | 1.57                             | (*) 125                       | 150°C                       | (*) $10^{4}$       |
| Pt/TiO<sub>2</sub>/Pt        | OxRRAM | [29, 35]        | 10                          | 1M                            | <100µ                               | 0.08                             | 5                             | 85°C                        | $10^{6}$           |
| Pt/HfO<sub>2</sub>/TiN       | OxRRAM | [This work, 35] | 2.5k                        | 5k                            | <100µ                               | 0.08                             | 5                             | 85°C                        | (*) $10^{6}$       |
| Pt/TiO<sub>2</sub>/TiN       | OxRRAM | [This work, 35] | 200                         | 1k                            | <100µ                               | 0.08                             | 5                             | 85°C                        | (*) $10^{6}$       |

configuration [7]. Besides FPGA implementation, recent works introduced flip-flop and latch circuits integrating OxRRAM cells [14], [15].

In this study, the main idea is not only to replace the memory elements but also to introduce low on-resistance ENVMs into the logic data paths of switching elements instead of MOS transistors. Considering that *n*-type transistors have a resistance of around 4 k$\Omega$ in CMOS 45-nm technology [16], we will show that an ENVM element is an attractive solution, with a low-resistance state reported in the range of 10 $\Omega$ to a few k$\Omega$, depending on the technology (see Table I).
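The comparison with the 4 k$\Omega$ transistor can be checked directly against Table I. The short sketch below copies the $R_{\text{SET}}$ column of Table I and lists the technologies whose low-resistance state falls below the transistor value. The shortened dictionary keys and the function name are illustrative choices.

```python
R_NMOS_OHM = 4_000.0  # n-type transistor resistance at 45 nm, as quoted in the text

# R_SET at 1 V in ohms, copied from Table I (keys shortened for readability).
R_SET_OHM = {
    "GST": 5e3,
    "N-GST": 50e3,
    "SiOx/SiN-GST": 10e3,
    "GeTe": 50.0,
    "GeTeC4%": 2.5e3,
    "GeTeC10%": 10e3,
    "GeTeN2%": 1.25e3,
    "GeTeN4%": 1.25e3,
    "Doped InGeTe": 10e3,
    "Pt/TiO2/Pt": 10.0,
    "Pt/HfO2/TiN": 2.5e3,
    "Pt/TiO2/TiN": 200.0,
}


def lower_than_nmos(r_set: dict, r_ref: float = R_NMOS_OHM) -> list:
    """Return the technologies with R_SET below r_ref, lowest resistance first."""
    candidates = [name for name, r in r_set.items() if r < r_ref]
    return sorted(candidates, key=r_set.get)


for name in lower_than_nmos(R_SET_OHM):
    print(f"{name:14s} {R_SET_OHM[name]:8.1f} ohm")
```

Seven of the twelve surveyed stacks fall below the transistor resistance, with Pt/TiO<sub>2</sub>/Pt and GeTe lowest.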

## III. TECHNOLOGICAL OVERVIEW OF PHASE-CHANGE AND OXIDE-BASED MEMORIES

As presented in Section I, we focus on phase-change memories and oxide-based resistive memories. In this section, we give a brief overview of the memory mechanisms and integration flows of both technologies.

### A. Phase-Change Memory Technology

A PCRAM device relies on the unique property of chalcogenide alloys used as active materials in the memory cell stack. Chalcogenide alloys are semiconducting glasses made of group VI elements of the periodic table, such as sulphur, selenium, and tellurium, which show reversible phase-change capability [2], [17]–[26]. By means of a careful control of Joule heating in the active layer, it is possible to switch the chalcogenide material between two stable configurations, i.e., a high-conductive ordered polycrystalline state (called SET) and a low-conductive disordered amorphous one (RESET). Fig. 3 illustrates both amorphous and crystalline states investigated by transmission electron microscopy.

Even if most of the research work on memory devices has focused up to now on the chalcogenide $Ge_2Sb_2Te_5$ (GST), novel alloys such as GeTe [18], $GeTeC\alpha_{\%}$ [19], or $GeTeN\alpha_{\%}$ [20] (with $\alpha$ representing the percentage of C or N, respectively) expand the range of the properties reachable by the different materials. The various chalcogenide materials and PCRAM-based device properties surveyed in this study are summarized in Table I.

Fig. 3. PCRAM device structure [23] RESET and SET state (top) and corresponding programming pulses (bottom).

It is worth mentioning that these materials have been integrated by various manufacturers into a range of device architectures with various geometries and electrodes. In order to provide synthetic physical parameters and easily capture the behavior of the various alternatives, the various properties have been normalized to a 70 nm $\times$ 70 nm node size (equal to the dimensions of a VIA1 in 45-nm process [16]). Hence, resistances and programming currents are computed using the intrinsic material resistivities and SET programming current densities. Where measurements are not available, the numbers are extrapolated from the material exhibiting the closest structural and electrical properties. An asterisk identifies these numbers.
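The normalization to a 70 nm $\times$ 70 nm element can be sketched as follows. The current scaling (current density times node area) follows the description above. The resistance formula assumes a uniform film conducting vertically through a thickness that the text does not specify, so `thickness_m` is a hypothetical parameter. All function names are illustrative.

```python
NODE_SIDE_M = 70e-9  # side of the normalized square element (VIA1 in a 45-nm process)


def node_area_m2(side_m: float = NODE_SIDE_M) -> float:
    """Cross-section of the normalized memory element."""
    return side_m * side_m


def programming_current_a(j_a_per_m2: float, side_m: float = NODE_SIDE_M) -> float:
    """Programming current obtained from a current density."""
    return j_a_per_m2 * node_area_m2(side_m)


def implied_current_density(i_prog_a: float, side_m: float = NODE_SIDE_M) -> float:
    """Inverse operation: current density implied by a normalized current."""
    return i_prog_a / node_area_m2(side_m)


def resistance_ohm(rho_ohm_m: float, thickness_m: float,
                   side_m: float = NODE_SIDE_M) -> float:
    """R = rho * t / A for a uniform film (assumed geometry, hypothetical thickness)."""
    return rho_ohm_m * thickness_m / node_area_m2(side_m)


# Current density implied by the 1.96 mA entry of Table I (GST).
j_gst = implied_current_density(1.96e-3)
print(f"implied current density: {j_gst:.2e} A/m^2")
```

For the GST entry of Table I, the implied density is about $4 \times 10^{11}$ A/m², and the 0.49 mA entry of N-GST corresponds to one quarter of that value.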

### B. Oxide Memory Technologies

Transition-element oxide memory technologies base their working principle on the change of their resistance state due to a modification of the conductivity property of the oxide itself. The switching of OxRRAMs depends on several parameters including the nature of the switchable oxide and the chemical nature of the top and bottom electrodes [27]. Two major groups of OxRRAMs can be identified by considering the physical mechanism that drives the modification of the resistance state.

The first group consists of two-terminal OxRRAM devices based on transition metal oxides, such as  $SiO_2$ , HfO<sub>2</sub> [28], or Al<sub>2</sub>O<sub>3</sub> [30], sandwiched between metal electrodes whose switching does not depend on the polarity of the applied voltage. This mechanism, known as unipolar, can be explained by a



Fig. 4. Resistive switching through I-V sweeps for planar Pt/TiO<sub>2</sub>/Pt layer realized with 270-nm/80-nm/270-nm thicknesses [29].

metallic filament formation mechanism related to the solid-state redox reactions stimulated by the applied electric field [31].

A second group relies on the redistribution of oxygen vacancies in oxide layers, such as TiO<sub>2</sub> [32] or HfO<sub>2</sub> [27], upon application of a voltage, which causes the switching from an insulating to a metallic state. In the particular TiO<sub>2</sub> example, the diffusion of oxygen vacancies transforms the TiO<sub>2</sub> volume into a highly conductive TiO<sub>2-x</sub> layer, thus reducing the total resistance of the oxide layer. Upon application of an electric field with opposite polarity, the oxygen redistribution is driven toward the opposite electrode and the total resistance increases again as the proportion of stoichiometric $TiO_2$ increases with respect to $TiO_{2-x}$. Since the writing of this cell relies on the application of opposite voltage polarities, the writing mechanism is often labeled as bipolar. As an illustration, the characteristics of a planar Pt/TiO<sub>2</sub>/Pt stack are shown in Fig. 4. By sweeping voltages from negative to positive values, the devices hold the high resistance state until a transition to a low resistance state occurs. After this event, the voltage can be increased with no effect on the internal state. When moved backward toward the negative voltage region, the device is reset to the original high resistance state.
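The bipolar sweep described above can be captured by a two-state behavioral sketch. The resistance values are those listed for Pt/TiO<sub>2</sub>/Pt in Table I. The threshold voltages `v_set` and `v_reset` are hypothetical placeholders rather than measured values, and the abrupt two-state switching is a simplification of the real I-V characteristic.

```python
class BipolarCell:
    """Two-state behavioural model of a bipolar resistive cell.

    The thresholds are hypothetical; real devices switch gradually and
    their thresholds depend on the stack and on the sweep conditions.
    """

    def __init__(self, r_on=10.0, r_off=1e6, v_set=1.0, v_reset=-1.0):
        self.r_on, self.r_off = r_on, r_off
        self.v_set, self.v_reset = v_set, v_reset
        self.low_resistance = False  # start in the high-resistance state

    def apply(self, v: float) -> float:
        """Apply a voltage, update the state, and return the resulting current."""
        if v >= self.v_set:
            self.low_resistance = True    # positive polarity: SET
        elif v <= self.v_reset:
            self.low_resistance = False   # negative polarity: RESET
        r = self.r_on if self.low_resistance else self.r_off
        return v / r


def sweep(cell: BipolarCell, voltages) -> list:
    """Return (voltage, low_resistance flag, current) for each step of a sweep."""
    trace = []
    for v in voltages:
        current = cell.apply(v)  # state is updated before it is recorded
        trace.append((v, cell.low_resistance, current))
    return trace


# Negative-to-positive sweep followed by the return toward negative voltages.
for v, low, i in sweep(BipolarCell(), [-1.5, -0.5, 0.5, 1.0, 1.5, 0.5, -0.5, -1.0]):
    print(f"V = {v:+.1f} V  low-resistance = {low}  I = {i:+.3e} A")
```

The state only changes when a threshold of the appropriate polarity is crossed, which reproduces the hysteresis described for Fig. 4 in a simplified form.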

Some OxRRAM cells have been built and characterized. Their parameters are summarized in Table I.

### C. Storage Element Integration Flow

One of the big advantages of ENVM technologies is their CMOS compatibility. Indeed, the materials involved in ENVMs are deposited at low temperature, compatible with the metal line process. As an illustration, a schematic cross section of a cointegrated ENVM-CMOS transistor is shown in Fig. 5. As in standalone NOR arrays, this memory cell includes a storage node and a selector transistor in series (i.e., 1-resistor-1-transistor configuration). The memory element may be fabricated either just after the Si contact formation step or after the first steps of interconnections (e.g., on top of Metal 1 interconnect level).



Fig. 5. Cross-sectional schematic showing the integration of an ENVM device (PCRAM stack example). The device includes a phase-change layer (PC) with bottom (BEC) and top (TEC) electrode contacts and is integrated between the M1 and M2 interconnection levels in the back-end-of-line. The MOSFET selector (bottom) is fabricated in the front-end-of-line.



Fig. 6. (a) ENVM-based memory node. (b) Node in read configuration. (c) Node in write configuration.

## IV. DESIGN OF ENVM-BASED FPGA

In this section, we describe how to efficiently use resistive memory technologies in logic and routing circuitries.

### A. 1T2R Basic Memory Node

First, we present an elementary circuit, based on ENVM, and used to put configuration memories above a reprogrammable circuit. Such a memory node is designed for a one-by-one replacement of traditional SRAM and thus to drive MUltipleXer (MUX) inputs or pass gates. The memory node is programmed by injecting a certain current through it, while the information has to be read as a voltage level.

1) Concept: The basic memory node is presented in Fig. 6(a). The circuit consists of two resistive memory nodes connected in a voltage divider configuration between two fixed voltage lines $(L_A \text{ and } L_B)$. The memories are used in a complementary manner, in order to improve reliability. Reliability is required because, for compactness, the output is not restored by an inverter. A transistor is also connected between ground and the output node of the cell. It is used to select the node during the programming phase. The output Y is designed to place a fixed voltage on a conventional standard cell input (i.e., a high or a low logic level). Read operations are intrinsic to the structure, while programming is an external operation to perform on the cell.

2) Read Operation: A voltage divider is used in this topology to intrinsically realize the conversion from a bit of data stored in the variable resistance to a voltage level. Fig. 6(b) presents a configuration example where the node stores a logic level high (noted "1"). Voltage lines $L_A$ and $L_B$ are,



Fig. 7. Line sharing illustration in standalone-memory-like architecture.

respectively, connected to  $V_{dd}$  and  $V_{ss}$ . The programming transistor is turned OFF by assigning a logic level low (noted "0") to the *Prog\_enable* signal, thereby disconnecting the ground from the output. The resistive memory  $R_1$ , connected to the  $V_{dd}$  line, is configured to the low resistivity state. The other memory  $R_2$ , connected to  $V_{ss}$ , is in the high resistivity state. As a consequence, a voltage divider is configured and the output node is charged close to the voltage of the branch with a high conductivity. The logic levels (respectively, high and low) depend on  $R_{\rm ON}$  and  $R_{\rm OFF}$  (respectively, resistance values in the low resistive SET and high resistive RESET states) as in the following relations:

$$"1" = V_{dd} - \frac{R_{\rm ON}}{R_{\rm ON} + R_{\rm OFF}} (V_{dd} - V_{ss})$$
$$"0" = \frac{R_{\rm ON}}{R_{\rm ON} + R_{\rm OFF}} (V_{dd} - V_{ss}).$$
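As a quick numerical check of these relations, the sketch below evaluates both logic levels for two entries of Table I. The low level is written with an explicit $V_{ss}$ offset, which vanishes when $V_{ss} = 0$ V. The function name and the choice of $V_{dd} = 1$ V are illustrative.

```python
def logic_levels(r_on: float, r_off: float, vdd: float = 1.0, vss: float = 0.0):
    """Return (high, low) output voltages of the complementary voltage divider."""
    drop = r_on / (r_on + r_off) * (vdd - vss)  # voltage lost across the ON device
    high = vdd - drop   # R1 (to Vdd) in the ON state, R2 (to Vss) in the OFF state
    low = vss + drop    # complementary configuration
    return high, low


# (R_ON, R_OFF) pairs in ohms, copied from Table I.
for name, r_on, r_off in [("GeTe", 50.0, 410e3), ("Pt/HfO2/TiN", 2.5e3, 5e3)]:
    high, low = logic_levels(r_on, r_off)
    print(f"{name:12s} high = {high:.4f} V  low = {low:.4f} V")
```

With GeTe the levels sit within a fraction of a millivolt of the rails, whereas the small $R_{\rm OFF}/R_{\rm ON}$ ratio of Pt/HfO<sub>2</sub>/TiN leaves levels at roughly two thirds and one third of the supply. This illustrates why a large resistance ratio matters when the output is not restored by an inverter.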

It is also worth noticing that in continuous read operation, a current will be established through the resistors. This leads to a passive current consumption through the structure based on the following relation:

$$I_{\text{leak}} = \frac{V_{dd} - V_{ss}}{R_{\text{ON}} + R_{\text{OFF}}} \approx \frac{V_{dd} - V_{ss}}{R_{\text{OFF}}}.$$

This static current can be reduced by the choice of a memory technology like doped InGeTe (see Table I) maximizing the resistivity, as well as sizing the memory node to maximize its  $R_{\rm OFF}$  value.
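The static current relation can be evaluated for a few stacks of Table I to see how strongly the choice of material matters. This is a minimal sketch at $V_{dd} = 1$ V and $V_{ss} = 0$ V. The selection of four technologies and the function names are illustrative.

```python
def leakage_current(r_on: float, r_off: float, vdd: float = 1.0, vss: float = 0.0) -> float:
    """Static current through the two series resistive memories."""
    return (vdd - vss) / (r_on + r_off)


def leakage_approx(r_off: float, vdd: float = 1.0, vss: float = 0.0) -> float:
    """Approximation valid when R_OFF is much larger than R_ON."""
    return (vdd - vss) / r_off


# (R_ON, R_OFF) in ohms, copied from Table I.
TECHNOLOGIES = {
    "GeTe": (50.0, 410e3),
    "Doped InGeTe": (10e3, 4.1e6),
    "Pt/TiO2/Pt": (10.0, 1e6),
    "Pt/HfO2/TiN": (2.5e3, 5e3),
}

for name, (r_on, r_off) in TECHNOLOGIES.items():
    exact = leakage_current(r_on, r_off)
    approx = leakage_approx(r_off)
    print(f"{name:13s} exact = {exact * 1e6:8.3f} uA  approx = {approx * 1e6:8.3f} uA")

lowest = min(TECHNOLOGIES, key=lambda n: leakage_current(*TECHNOLOGIES[n]))
print("lowest static current:", lowest)
```

Among these entries, doped InGeTe gives the smallest static current because of its large $R_{\rm OFF}$. The approximation degrades for Pt/HfO<sub>2</sub>/TiN, where $R_{\rm ON}$ is not negligible with respect to $R_{\rm OFF}$.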

3) Write Operation: Fig. 6(c) presents the programming phase of the node. The programming transistor is first turned ON by setting the *Prog\_enable* signal to logic level "1," so that the lines  $L_A$  and  $L_B$  are disconnected from the power lines and connected to the programming unit. Then, the programming unit applies a current sequence  $I_{\text{progA}}$  and  $I_{\text{progB}}$  to the resistive memories to change their states. Programming currents are drained to ground. As each cell has its own selection transistor, the lines can be shared in a standalone-memory-type architecture (see Fig. 7), yielding an efficient layout strategy.

### B. Architecture of Resistive Memory-Based Look-Up Tables

While it is possible to create a look-up table by simply replacing the SRAM memories with the basic memory node presented earlier, we further propose an efficient implementation that supports multiplexer sharing for both normal use and programming operations.

1) Normal Operation: The look-up table architecture based on resistive memory is depicted in Fig. 8(a). In normal operation mode, the signal *Prog\_enable* (P) is pulled down to ground,



Fig. 8. Resistive memory-based look-up table. (a) General structure. (b) Structure in normal mode. (c) Structure in write mode.

resulting in the circuit of Fig. 8(b). A voltage $V_R$ is then applied to the top electrodes of the resistive memories, while the node $V_{\text{sense}}$ is pulled down to ground through the sense resistance $R_0$. $V_R$ is the read voltage chosen to ensure that the current flowing through the structure is below any configuration thresholds. The address signal corresponds to the multiplexer control signal. Hence, depending on the address, a unique path will be created between the $V_R$ line and ground through the selected node and the sense resistance $R_0$. As a voltage drop might be present on the node $V_{\rm sense}$, the value of $R_0$ should be chosen to ensure that the voltage levels of node $V_{\text{sense}}$ will correctly trigger the level restoration output inverter. Typically, $R_0$ should be an average between the on-state and off-state resistance values of the resistive memories. Unlike the previous memory node, complementary storage of the state is not required. As the output level is restored by an inverter, constraints regarding the storage might be relaxed.
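The choice of $R_0$ can be illustrated with a small sketch of the read path. It assumes an ideal multiplexer with no series resistance, takes $R_0$ as the arithmetic mean of the two resistance states, and places the inverter threshold at $V_R/2$. These three points, together with the value $V_R = 0.5$ V, are assumptions made for illustration. The resistances are the GeTe values of Table I.

```python
def v_sense(r_mem: float, r0: float, v_r: float) -> float:
    """Voltage on the sense node: divider between the selected memory and R0."""
    return v_r * r0 / (r0 + r_mem)


def sense_levels(r_on: float, r_off: float, v_r: float, r0=None):
    """Return (V_sense for an ON cell, V_sense for an OFF cell, R0 used)."""
    if r0 is None:
        r0 = 0.5 * (r_on + r_off)  # assumed: arithmetic mean of both states
    return v_sense(r_on, r0, v_r), v_sense(r_off, r0, v_r), r0


V_R = 0.5  # hypothetical read voltage, assumed below any programming threshold
v_on, v_off, r0 = sense_levels(r_on=50.0, r_off=410e3, v_r=V_R)
threshold = V_R / 2  # assumed switching point of the restoring inverter
print(f"R0 = {r0:.0f} ohm, V_sense(ON) = {v_on:.4f} V, V_sense(OFF) = {v_off:.4f} V")
print("levels separated by the inverter threshold:", v_off < threshold < v_on)
```

With a large resistance ratio, this choice of $R_0$ places the ON level close to $V_R$ and the OFF level close to $V_R/3$, so the two levels fall on opposite sides of a mid-rail threshold.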

2) Write Operation: The main advantage of the proposed LUT structure is that it shares the multiplexer for both read and write operations. Fig. 8(c) presents the LUT in write mode. By pulling up the signal P, the node  $V_{\text{sense}}$  is grounded, in order to allow the programming current to be drained, while the programming unit ( $V_W$  line) is connected to the top electrode of the memories. The multiplexer is then used to address the single RRAM to program by allowing the programming current to flow.
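The sharing of the multiplexer between read and write can be summarized by a behavioral sketch in which a single address decoder serves both modes. The mapping of the low-resistance state to a logic "1", the mid-rail inverter threshold, and the class and method names are assumptions. The default resistances are the GeTe values of Table I.

```python
class ResistiveLUT:
    """Behavioural model of a K-input LUT built from 2^K resistive cells."""

    def __init__(self, k: int, r_on: float = 50.0, r_off: float = 410e3):
        self.k = k
        self.r_on, self.r_off = r_on, r_off
        self.cells = [r_off] * (2 ** k)   # all cells start in the OFF state
        self.r0 = 0.5 * (r_on + r_off)    # sense resistance (assumed mean value)

    def _select(self, address: int) -> int:
        """Multiplexer shared by both modes: validate and return the cell index."""
        if not 0 <= address < len(self.cells):
            raise IndexError("address out of range")
        return address

    def write(self, address: int, bit: int) -> None:
        """Write mode (P = 1): program the addressed cell through the multiplexer."""
        self.cells[self._select(address)] = self.r_on if bit else self.r_off

    def read(self, address: int, v_r: float = 0.5) -> int:
        """Normal mode (P = 0): sense the addressed cell through R0."""
        r_mem = self.cells[self._select(address)]
        v_sense = v_r * self.r0 / (self.r0 + r_mem)
        return int(v_sense > v_r / 2)     # restored logic level (assumed polarity)


# Program a 2-input AND function, then read it back.
lut = ResistiveLUT(k=2)
for address in range(4):
    lut.write(address, int(address == 3))
print([lut.read(address) for address in range(4)])
```

Because `_select` is the only path to a cell in both methods, no separate programming decoder is modeled, which mirrors the reduced programming complexity claimed for the structure.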

### C. Architecture of Resistive Memory-Based SBs

As mentioned in Section II, the most important structures in FPGA routing are SBs. These structures are typically built using SRAM-configured pass-gates.

1) Overview: In this block, we propose to merge the pass-gate and the programming SRAM with a single memory element. The resulting structure is shown in Fig. 9(a) on a $2 \times 2$ Wilton SB [40]. Each link is built with a similar structure as for standard FPGAs, whereby two-terminal ENVMs are used in the place of pass-transistors and their associated SRAM configuration points. An on-connection between two wires is realized by using the wires to program the memory connecting them to a low resistance.

2) Write Operation: In the proposed SB architectures, ENVMs replace traditional SRAMs. Consequently, the structure can no longer be programmed with an independent path, such as a shift register. Access to the primary inputs/outputs of the SB is mandatory. The required voltages and timing pulses to



Fig. 9. (a) RRAM-based $2 \times 2$ SB architecture. (b) Input and (c) output drivers for RRAM-based crossbar.

program the ENVMs may be applied through the drivers at the primary nodes [depicted in Fig. 9(a)].

Fig. 9(b) and (c) show a possible structure for these drivers, which interface the signal channels to the programming unit electrically. As with the other blocks, the programming unit generates the required configuration waveforms. In addition to these interface nodes, addressing circuits are required to ensure the sequential programming of the set of memories in the SB. However, the complexity of these circuits is equivalent to the shift registers utilized for SRAM-based FPGA programming.
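The sequential programming of the SB memories can be sketched with a simple behavioral model in which each link between two terminals is a single resistive cell. The link list used here connects every pair of four terminals and is a hypothetical topology chosen for brevity, not the Wilton pattern of Fig. 9(a). The class name, method names, and the GeTe resistance values are likewise illustrative.

```python
from itertools import combinations


class ResistiveSwitchBox:
    """Behavioural model of a switchbox whose links are two-terminal ENVM cells."""

    def __init__(self, links, r_on: float = 50.0, r_off: float = 410e3):
        self.r_on, self.r_off = r_on, r_off
        # One cell per link, keyed by the unordered pair of terminals it joins.
        self.state = {frozenset(link): r_off for link in links}

    def program(self, connected) -> int:
        """Visit every cell in turn and set it ON or OFF; return the step count."""
        wanted = {frozenset(pair) for pair in connected}
        unknown = wanted - set(self.state)
        if unknown:
            raise KeyError(f"no memory cell for links: {sorted(map(sorted, unknown))}")
        steps = 0
        for link in self.state:           # sequential addressing, one cell per step
            self.state[link] = self.r_on if link in wanted else self.r_off
            steps += 1
        return steps

    def resistance(self, a: str, b: str) -> float:
        """Resistance of the cell joining terminals a and b."""
        return self.state[frozenset((a, b))]


# Hypothetical single-track switchbox with terminals on the four sides.
sb = ResistiveSwitchBox(combinations(["N", "E", "S", "W"], 2))
steps = sb.program([("N", "E")])
print(steps, sb.resistance("E", "N"), sb.resistance("N", "S"))
```

The step count grows linearly with the number of cells, which is comparable in spirit to shifting one configuration bit per SRAM cell in a conventional FPGA.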

## V. CIRCUIT-LEVEL CHARACTERIZATION

We will now evaluate the performance metrics at the circuit level. After validating the behavior of the blocks by electrical simulations, their performances will be compared to their equivalent counterparts. We expect significant improvement in area because of the replacement of area-hungry SRAMs by small footprint ENVMs. In this section, we focus on the block metrics such as area, programming time, programming energy, and leakage power.

### A. Transient Simulation

Transient simulations have been performed in order to validate the global behavior of the structures. A behavioral compact model of PCRAMs [33] has been used to allow the fast prediction of the behavior of the circuits. Note that a compact model of OxRRAMs is also available [34]. Such electrical simulations demonstrate the functionality of the blocks and also account for the impact of ENVM programming at the circuit level. For the sake of brevity, only the functional validation of the 1T2R basic memory node is reported.

Fig. 10 shows the electrical results of a typical memory usage case. In this electrical simulation, $V_{dd}$ is set to 500 mV, while $V_{ss}$ is tied to 0 V. Note that this $V_{dd}$ value has been chosen to show that a very low power supply can be used without restriction, thanks to the voltage divider arrangement of the node. More generally, any voltage lower than the memory programming threshold voltage can be used, making the node fully compatible with current technology logic levels. In region A, the elementary node is initially configured to apply a high logic level "1" at the output node. At that time, the memories are configured according to Fig. 6(b). The memory node is then reprogrammed in region B by the sequential application of a RESET pulse on the memory connected to $V_{dd}$ and a SET pulse on the other one, while the programming transistor is ON. This flips the memory content of the structure. In the final read operation, region C, a low logic level "0" is then sensed at the output node.

Fig. 10. 1T2R Basic Memory Node transient simulation.
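The logic levels produced by the voltage divider arrangement can be illustrated with a minimal sketch. It assumes an ideal resistive divider between $V_{dd}$ and $V_{ss}$ with no load current; the 50 Ω on-resistance is the GeTe value quoted later in the paper, while the 1 MΩ off-resistance and the function name `divider_output` are illustrative assumptions, not values or names taken from the simulation of Fig. 10.

```python
def divider_output(v_dd, v_ss, r_top, r_bottom):
    """Output voltage of an ideal, unloaded resistive divider.

    r_top is the memory tied to v_dd, r_bottom the memory tied to v_ss.
    """
    return v_ss + (v_dd - v_ss) * r_bottom / (r_top + r_bottom)


R_ON = 50.0      # ohms, GeTe on-resistance quoted in the paper
R_OFF = 1.0e6    # ohms, ASSUMED off-resistance for illustration only

V_DD, V_SS = 0.5, 0.0  # supply rails used in the Fig. 10 simulation

# Region A: memory tied to V_dd is ON, the other is OFF -> logic "1".
v_one = divider_output(V_DD, V_SS, r_top=R_ON, r_bottom=R_OFF)

# Region C: after RESET on the V_dd-side memory and SET on the other,
# the two states are swapped -> logic "0".
v_zero = divider_output(V_DD, V_SS, r_top=R_OFF, r_bottom=R_ON)

print(f"logic 1 level: {v_one:.6f} V")
print(f"logic 0 level: {v_zero:.6f} V")
```

Under these assumptions the output sits within a fraction of a millivolt of either rail, which is why the node can operate from a supply well below the programming threshold.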

#### B. Performance Characterization

In order to evaluate the impact of the different blocks at the circuit level, we compare their performances with their traditional counterparts.

1) Methodology: To characterize the ENVM-based building blocks, we evaluated their performance metrics in terms of area, write time, programming energy, and leakage power. Note that, in FPGAs, the information stored in the configuration memories is never read back. Indeed, this information is only used locally, and the read operation is therefore intrinsic. Hence, unlike for standalone memories, the performance evaluation does not include reading time and reading energy. The performance extraction is based on the node complexity, expressed in terms of the basic elements required to realize the circuit. The area is extracted from basic layout considerations using CMOS 45-nm technology rules [16] and expressed in units of the squared half-pitch ($F^2$) to give values independent of the lithography node. Timing is obtained by electrical simulations using a behavioral compact model [33]. The programming energy is extracted from the ITRS [35], while the leakage power is computed at $V_{dd}$ equal to 1 V, considering all static currents involved. GeTe-based PCRAM (see Table I) has been used as the baseline ENVM technology for the evaluation. Indeed, while GeTe exhibits a maturity equivalent to GST, it demonstrates better data retention properties. This makes it a good candidate for embedded applications such as FPGAs. Comparisons to building blocks traditionally used in FPGAs, such as CMOS SRAM 5T cells [4] and Flash memory elements [5], are then used to evaluate the structures. The associated numbers are extrapolated from the ITRS [35]. Note that we are dealing with nonvolatile memories. Hence, we will stress the comparison with regard to Flash.

TABLE II

1T2R BASIC MEMORY NODE PERFORMANCE EVALUATION

|                | Cell | Area<br>(F<sup>2</sup>) | Write time<br>(ns) | Prog. energy<br>(pJ) | Leakage<br>at 1 V (nW) |
|----------------|------|-------------------------|--------------------|----------------------|------------------------|
| SRAM           | 5T   | 196                     | 0.2                | $5 \times 10^{-4}$   | 142                    |
| Flash cell     | 2T   | 84                      | 1 000              | 100                  | 210                    |
| ENVM cell      | 1T2R | 28                      | 60                 | 12                   | 625                    |
| Flash vs. ENVM | -    | $\times$ 3              | $\times$ 16.6      | $\times$ 8.3         | $\times$ 0.34          |

TABLE III

LOOK-UP TABLE PERFORMANCE EVALUATION (4 BITS LUT)

|                | Cell   | Area<br>(F<sup>2</sup>) | Write time<br>(ns) | Prog. energy<br>(pJ) | Leakage<br>at 1 V (nW) |
|----------------|--------|-------------------------|--------------------|----------------------|------------------------|
| SRAM           | 120T   | 4 396                   | 3.2                | $8 \times 10^{-3}$   | 2 272                  |
| Flash cell     | 72T    | 2 156                   | 16 000             | 1 600                | 3 360                  |
| ENVM cell      | 44T17R | 1 288                   | 480                | 96                   | 819                    |
| Flash vs. ENVM | -      | $\times$ 1.7            | $\times$ 33.3      | $\times$ 16.6        | $\times$ 4.1           |

2) *1T2R Memory Blocks:* Table II shows the characterization results in terms of area, write time, programming energy, and leakage power for the proposed 1T2R basic memory node and its traditional FPGA counterparts.

We consider that all these elements drive an equal load. We see that the proposed cell is the most compact solution, even with the impact of the programming current on the access transistor. This advantage is due to the reduction of the memory front-end footprint to a single transistor, compared to five for the SRAM cell and two for the Flash solution (one pull-up transistor coupled to a floating-gate transistor). It is also worth highlighting that ENVMs offer a significant reduction in write time and programming energy among nonvolatile memory technologies. In the context of our technological hypotheses, it is possible to reduce the area by $3\times$, the write time by $16.6\times$, and the programming energy by $8.3\times$ compared to an equivalent Flash technology. However, the leakage power drawn by the node is around $3\times$ higher than that of Flash. The 1T2R node has a high and continuous leakage current due to its voltage divider structure. Note that the node has been sized to maximize the resistance $R_{OFF}$ by using the smallest dimensions allowed by the technology. In order to further reduce the leakage, materials with higher resistivity might be envisaged.
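The link between the divider resistance and the static leakage can be made concrete with a rough estimate. The sketch below assumes that the leakage of the node is dominated by the current flowing through the two series memories, so that $P = V_{dd}^2 / (R_{ON} + R_{OFF})$; the paper's figures consider all static currents, so this is only a first-order reading of Table II. The function names are illustrative.

```python
def divider_leakage_w(v_dd, r_on, r_off):
    """Static power of two series resistors across the supply (first-order model)."""
    return v_dd ** 2 / (r_on + r_off)


def series_resistance_for(v_dd, leakage_w):
    """Total series resistance that would dissipate leakage_w at v_dd."""
    return v_dd ** 2 / leakage_w


V_DD = 1.0                 # volts, supply used for the leakage figures
ENVM_LEAKAGE_W = 625e-9    # 1T2R node leakage from Table II
FLASH_LEAKAGE_W = 210e-9   # Flash cell leakage from Table II

# Series resistance implied by the 1T2R leakage under the ASSUMED model.
r_implied = series_resistance_for(V_DD, ENVM_LEAKAGE_W)

# Series resistance that would be needed to match the Flash cell leakage.
r_needed = series_resistance_for(V_DD, FLASH_LEAKAGE_W)

print(f"implied R_ON + R_OFF: {r_implied / 1e6:.2f} Mohm")
print(f"needed to match Flash: {r_needed / 1e6:.2f} Mohm")
print(f"required increase: x{r_needed / r_implied:.2f}")
```

Within this simplified model, matching the Flash leakage would require roughly a threefold increase of the series resistance, which is consistent with the suggestion to look at materials with higher resistivity.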

3) Logic Blocks: Table III shows the characterization results for a four-input look-up table and compares it to traditional FPGA logic blocks. We see that the proposed LUT structure is again the most compact one, with a gain over Flash of $1.7\times$. Write time is reduced by $33.3\times$ and programming energy by $16.6\times$. In addition, leakage power is reduced by $4.1\times$. In a standard LUT, memories store logic values, which are then selected by a multiplexer. In such an implementation, the memories consume leakage power to retain information. In the proposed RRAM LUT, only the currently selected memory makes a major contribution to leakage, leading to a significant gain. Finally, we stress that the structure's efficiency is even better if we consider that the multiplexer, used for normal LUT operations, is also used as an addressing device for programming operations. Hence, such a structure further reduces the complexity of the writing scheme.

TABLE IV

TECHNOLOGY PERFORMANCE EVALUATION ($2 \times 2$ WILTON SB [40])

|                | Cell  | Area<br>(F<sup>2</sup>) | Write time<br>(ns) | Prog. energy<br>(pJ) | Leakage<br>at 1 V (nW) |
|----------------|-------|-------------------------|--------------------|----------------------|------------------------|
| SRAM           | 72T   | 2 688                   | 2.4                | $6 \times 10^{-3}$   | 1 704                  |
| Flash cell     | 24T   | 1 008                   | 12 000             | 1 200                | 0                      |
| ENVM cell      | 8T12R | 866                     | 360                | 72                   | 0                      |
| Flash vs. ENVM | -     | $\times$ 1.2            | $\times$ 33.3      | $\times$ 16.6        | -                      |

4) Routing Blocks: Table IV shows the characterization results for the proposed SB arrangement and its traditional FPGA counterparts. Note that only the specific access circuits required for programming have been included. Indeed, common programming circuits (such as address decoders and shift registers) are not included in the estimation, as they are shared by all the considered technologies. We see that the proposed SB is still the most compact solution, with a gain of $1.2\times$ compared to an equivalent Flash technology. The ENVM-based SB uses fewer transistors in its internal structure than the Flash-based structure (8 versus 24). However, while this ratio should lead to a gain of $3\times$, we should bear in mind that the transistors used for ENVMs are larger, in order to allow large programming currents to flow through the structure. Furthermore, as the memories are directly used to perform the routing operation, no leakage power is dissipated by the SBs (i.e., no permanent leakage path exists in the structure), which is of significant interest for power reduction.
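The "Flash vs. ENVM" rows of Tables II to IV follow directly from the tabulated values. The short sketch below recomputes them; the numbers are copied from the tables, while the dictionary layout and the function name `gains` are illustrative choices. Note that plain rounding gives 16.7 where the tables print 16.6, which suggests the printed values were truncated.

```python
METRICS = ("area_F2", "write_ns", "energy_pJ", "leak_nW")

# (area, write time, programming energy, leakage) copied from Tables II-IV.
TABLES = {
    "1T2R node (Table II)": {
        "flash": (84, 1000, 100, 210),
        "envm": (28, 60, 12, 625),
    },
    "4-input LUT (Table III)": {
        "flash": (2156, 16000, 1600, 3360),
        "envm": (1288, 480, 96, 819),
    },
    "2x2 Wilton SB (Table IV)": {
        "flash": (1008, 12000, 1200, 0),
        "envm": (866, 360, 72, 0),
    },
}


def gains(flash, envm):
    """Flash/ENVM ratio per metric; None when the ENVM value is zero."""
    return {
        name: (f / e if e else None)
        for name, f, e in zip(METRICS, flash, envm)
    }


results = {block: gains(v["flash"], v["envm"]) for block, v in TABLES.items()}

for block, ratio in results.items():
    shown = ", ".join(
        f"{name}: " + ("n/a" if r is None else f"x{r:.2f}")
        for name, r in ratio.items()
    )
    print(f"{block}: {shown}")

# Transistor-count ratio of the SB (24 Flash transistors versus 8 for ENVM).
sb_transistor_ratio = 24 / 8
```

Comparing `sb_transistor_ratio` with the SB area ratio makes the effect of the larger programming transistors visible: a $3\times$ reduction in transistor count translates into an area gain of only about $1.2\times$.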

## VI. ARCHITECTURAL IMPACT

In the previous section, we studied the blocks at the circuit level. In this section, we move to the architectural level and study the impact of the blocks on the FPGA architecture.

#### A. Methodology

We used a set of logic circuits taken from the MCNC benchmark [36], which we first synthesized using the ABC tool [37]. We then performed the technology mapping with a library of four-input LUTs (K = 4), also using ABC. Subsequently, we packed the mapped circuit into CLBs with N = 10 BLEs per CLB and I = 22 external inputs using AA-PACK [38]. Finally, the placement and routing were carried out using VPR6.0 [39]. Each benchmark is first implemented with SRAM-based LUTs and MUXs in the CMOS 45-nm process [16], using a pass-gate design. Then, we replace the SRAM-based building blocks with their ENVM counterparts. The first material stack studied is the GeTe phase-change material. Subsequently, the impact of each of the proposed blocks is assessed. Finally, all the surveyed technologies are benchmarked with the same approach, in order to find the most suitable technology for future applications.
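The cluster parameters quoted above can be collected in a small record to make their relationships explicit. The sketch below is an illustration only: the class and property names are hypothetical, and the relation $I = K(N+1)/2$ is a rule of thumb from the FPGA architecture literature for choosing the number of cluster inputs. Whether the authors derived I = 22 from it is an assumption, although the values do agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterParams:
    """Logic-cluster parameters of the island-style FPGA used in the flow."""

    k: int  # inputs per LUT
    n: int  # BLEs per CLB
    i: int  # external inputs per CLB

    @property
    def lut_config_bits(self) -> int:
        # A K-input LUT stores one configuration bit per input combination.
        return 2 ** self.k

    @property
    def clb_lut_config_bits(self) -> int:
        # LUT configuration bits only; routing and MUX bits are not counted.
        return self.n * self.lut_config_bits

    @property
    def rule_of_thumb_inputs(self) -> int:
        # ASSUMED sizing rule: I = K * (N + 1) / 2.
        return self.k * (self.n + 1) // 2


params = ClusterParams(k=4, n=10, i=22)

print("bits per LUT:", params.lut_config_bits)
print("LUT bits per CLB:", params.clb_lut_config_bits)
print("rule-of-thumb inputs:", params.rule_of_thumb_inputs)
```

Each of these LUT configuration bits is a candidate for replacement by an ENVM-based storage element, which is why the memory contribution dominates the area discussion that follows.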

#### B. Architectural Impact Over CMOS-SRAM

We mapped the benchmarks onto both CMOS SRAM-based and GeTe-PCRAM-based FPGAs. The ENVM-based FPGAs use the 1T2R basic memory node, ENVM-based LUTs, and SBs. The area estimation, expressed in units of minimal-size transistors, is shown in Fig. 11. A minimal-size transistor corresponds to the minimal *n*-type transistor used in the design (here of area 0.118 $\mu$m<sup>2</sup>, corresponding to a W = 210 nm and L = 45 nm transistor as per [16]). The evaluation includes the specific programming devices (sized according to the material requirements; see Table I). However, the circuits shared by the various technologies are not included. The benchmarks show an area reduction ranging from 19% to 23%, with 20% on average. The main benefit of using ENVMs instead of SRAM cells is the compact area of the cell, which reduces the silicon real estate occupied by peripheral programming circuits.

Fig. 11. Area estimation for FPGAs synthesized with GeTe-phase change memory- and SRAM-based circuits.

Fig. 12. Delay estimation for FPGAs synthesized with GeTe-phase change memory- and SRAM-based circuits.

The critical path delay estimation for the same benchmark set is shown in Fig. 12. The simulations show a critical path delay reduction ranging from 17% to 34%, with 24% on average. The origin of this remarkable delay reduction is twofold. First, it is due to the low on-resistance of ENVM technologies: the internal on-resistance of an *n*-type transistor (extracted from our design kit) is 3.8 k$\Omega$, while the GeTe-based PCRAM technology exhibits an on-resistance of 50 $\Omega$. This makes the ENVM-based FPGA potentially faster than its SRAM-based counterpart, given the lower-resistance path for data through the routing pass-gate multiplexers. Second, the area reduction shown in Fig. 11 allows for a reduction of the logic block size and consequently a lower wire delay. The respective contributions of the blocks to the area and delay improvements are analyzed in the next subsection.

Fig. 13. Contribution of the different resistive circuits to the area and delay reduction.
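The effect of the on-resistance on a routed path can be illustrated with a first-order Elmore delay estimate. The sketch below models a routing path as a uniform RC ladder in which each stage contributes one switch plus a wire segment. The two switch resistances are the values quoted in the text; the number of stages, the wire resistance, and the node capacitance are arbitrary assumptions chosen for illustration, so the result is not a reproduction of the 17% to 34% critical-path figures.

```python
def elmore_delay_s(n_stages, r_switch_ohm, r_wire_ohm, c_node_f):
    """Elmore delay of a uniform RC ladder with n_stages identical stages.

    Each stage adds (switch + wire) resistance in series and one lumped
    capacitance to ground; node j sees j stages of upstream resistance.
    """
    r_stage = r_switch_ohm + r_wire_ohm
    return sum(j * r_stage * c_node_f for j in range(1, n_stages + 1))


R_PASS_GATE = 3.8e3   # ohms, n-type transistor on-resistance (from the text)
R_GETE_ON = 50.0      # ohms, GeTe PCRAM on-resistance (from the text)

N_STAGES = 4          # ASSUMED number of switches along the path
R_WIRE = 500.0        # ohms, ASSUMED wire resistance per segment
C_NODE = 5e-15        # farads, ASSUMED lumped capacitance per node

d_sram = elmore_delay_s(N_STAGES, R_PASS_GATE, R_WIRE, C_NODE)
d_envm = elmore_delay_s(N_STAGES, R_GETE_ON, R_WIRE, C_NODE)
routing_reduction = 1.0 - d_envm / d_sram

print(f"pass-gate path: {d_sram * 1e12:.1f} ps")
print(f"ENVM path:      {d_envm * 1e12:.1f} ps")
print(f"routing-only delay reduction: {routing_reduction:.1%}")
```

This estimate covers only the routed segment. A full critical path also contains logic delays that the switch resistance does not affect, so the overall reduction is smaller than the routing-only figure.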

#### C. Breakdown of ENVM Element Contributions

In the preceding sections, we mentioned that RRAM building blocks lead to significant reductions in area and critical path delay. However, it is of high interest to distinguish the various contributions of each circuit type. Fig. 13 depicts the contribution of the ENVM-based circuits to the area and delay improvement over CMOS SRAM-based circuits.

As suggested previously, the gain in area is mainly due to the memory circuits. Indeed, a one-to-one replacement of the SRAM by the 1T2R basic memory node accounts for 62.5% of the saved area, while SBs and LUTs contribute 11% and 26.5%, respectively. Regarding delay, it is worth pointing out that the SB improvement accounts for 90% of the total critical path delay reduction. Effectively, the proposed SBs replace the CMOS pass-gates with low-resistance memories. Such an improvement drastically reduces the impact of the programmable switches along the data path and thus improves the electrical properties of the global wires. We also point out that the other blocks contribute around 10% of the critical path reduction. While these blocks are not directly on the data path, we have seen that they lead to an area reduction of the circuit. As a side effect, this reduction shortens the global wires, which decreases parasitics and improves electrical performance.
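To give a feeling for what these shares mean in absolute terms, the sketch below applies them to the average reductions reported in Section VI-B (20% in area and 24% in delay). It assumes that the shares of Fig. 13 apply uniformly to the averages, which the paper does not state explicitly; the names used are illustrative.

```python
AVG_AREA_REDUCTION = 0.20    # average area reduction over the benchmarks
AVG_DELAY_REDUCTION = 0.24   # average critical-path delay reduction

# Shares of the total saving attributed to each block type (from the text).
AREA_SHARES = {"1T2R memory nodes": 0.625, "LUTs": 0.265, "SBs": 0.11}
DELAY_SHARES = {"SBs": 0.90, "other blocks": 0.10}


def split(total, shares):
    """Distribute a total reduction across blocks according to their shares."""
    return {block: total * share for block, share in shares.items()}


area_points = split(AVG_AREA_REDUCTION, AREA_SHARES)
delay_points = split(AVG_DELAY_REDUCTION, DELAY_SHARES)

for block, value in area_points.items():
    print(f"area  - {block}: {value * 100:.1f} percentage points")
for block, value in delay_points.items():
    print(f"delay - {block}: {value * 100:.1f} percentage points")
```

Under this assumption, the memory nodes alone would account for about 12.5 of the 20 percentage points of area saved, and the SBs for about 21.6 of the 24 percentage points of delay saved.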

#### D. Impact of Various Technology Flavors

ENVM technologies are of high interest for programmable applications due to their compact area and low on-resistance. While the GeTe technology demonstrated good performance improvements for FPGA applications, several other materials could be envisaged within the same evaluation framework. Fig. 14 ranks the different technologies surveyed in Table I using three metrics: overall circuit area, critical path delay, and the memory contribution to leakage power per CLB. All these figures are averaged over the complete benchmark set. Leakage power is a significant concern in modern FPGAs and clearly needs to be addressed, given that the proposed 1T2R basic memory nodes can create leaky voltage dividers throughout the circuit.

Regarding the total area, the surveyed ENVM technologies globally lead to the same gain compared to the SRAM technology. Note that memories requiring a low programming current demonstrate the best results. The Pt/TiO<sub>2</sub>/Pt stack demonstrates an average gain in area of 24%. In addition, the best results in terms of critical path delay reduction are also achieved by the Pt/TiO<sub>2</sub>/Pt stack, with a reduction ranging from 18% to 34%, with 24% on average. This material demonstrates an on-resistance as low as 100 $\Omega$. In addition, the stack allows small access transistors. As highlighted, this plays a role in shortening the interconnections and further decreasing the delay. However, the off-resistance of the technology is in the range of 1 M$\Omega$. The off-resistance has a direct impact on the memory leakage in the 1T2R basic memory node. It is thus of high priority to focus on technologies with higher $R_{\rm OFF}$ values. The best technology for this purpose is the InGeTe memory stack, with an 85% reduction of the memory leakage power per CLB compared to SRAM circuits. Hence, the choice of materials has a direct influence on the architecture performance and can thus be considered as a new lever for performance tuning.

Fig. 14. Variation of total area, critical path delay, and memory leakage power among the surveyed technologies (numbers averaged over the whole benchmark set).

## VII. OVERALL DISCUSSION

The presented building blocks can be employed to store configuration data in programmable circuits (1T2R basic memory node and LUTs) and to create high-performance routing elements (SBs). Even though PCRAM and OxRRAM technologies are approaching the maturity required for mass production, research and development on materials and cell design still play a key role for nonvolatile resistive memories. Indeed, we showed that various flavors of the technologies lead to different gains in area, delay, and power. Hence, the optimal choice might depend on the application context and the targeted performance metrics. The range of technological choices is so broad that many further improvements may come from the technology itself. For future FPGAs, we can envisage improvements in both technology and cell design to achieve better overall characteristics, such as higher programming speed or lower programming current. For example, the programming current influences the size of the selection devices and the write energy. New cell designs could reduce the programming energy, as shown in [41], where a sublithographic heater is used to increase current density. With such considerations, technology developments and architectural design will be strongly correlated, so that new work methodologies are required, such as the fast evaluation methodology for emerging technologies used in this study.

## VIII. CONCLUSION

This paper introduces a novel family of building blocks based on resistive memories, designed to replace traditional circuits in reconfigurable logic circuits. A basic memory node, a look-up table, and an SB have been proposed, all using ENVMs to reduce the impact of memory on area and to improve the electrical performance of the data path. We have shown that the proposed solution reduces the size of the building blocks by a factor of up to $3\times$ compared to traditional Flash memories. We evaluated the impact on FPGA design and showed that area and critical path delay could be reduced by up to 28% and 34%, respectively, due to the compactness of ENVMs and the speed of ENVM-based SBs. Finally, we surveyed different technologies and showed that the PCRAM using InGeTe material leads to the lowest leakage power, while the OxRRAM Pt/TiO<sub>2</sub>/Pt stack gives the best area and delay improvements.

#### ACKNOWLEDGMENT

The authors would like to thank J. Luu and J. Rose from the University of Toronto for providing access to a prerelease version of the VTR6.0 flow with timing evaluation.

#### REFERENCES

- [1] G. W. Burr, B. N. Kurdi, J. C. Scott, C. H. Lam, K. Gopalakrishnan, and R. S. Shenoy, "Overview of candidate device technologies for storage-class memory," *IBM J. Res. Develop.*, vol. 52, no. 4/5, pp. 449–464, Jul./Sep. 2008.
- [2] G. Servalli, "A 45 nm generation phase change memory technology," in *Proc. IEEE Int. Electron Dev. Meet. Tech. Dig.*, 2009, pp. 113–116.
- [3] S. H. Lee, H. C. Park, M. S. Kim, H. W. Kim, M. R. Choi, H. G. Lee, J. W. Seo, S. C. Kim, S. G. Kim, S. B. Hong, S. Y. Lee, J. U. Lee, Y. S. Kim, K. S. Kim, J. I. Kim, M. Y. Lee, H. S. Shin, S. J. Chae, J. H. Song, H. S. Yoon, J. M. Oh, S. K. Min, H. M. Lee, K. R. Hong, J. T. Cheong, S. N. Park, J. C. Ku, H. S. Shin, Y. S. Sohn, S. K. Park, T. S. Kim, Y. K. Kim, K. W. Park, C. S. Han, H. W. Kim, W. Kim, H. J. Kim, K. S. Choi, J. H. Lee, and S. J. Hong, "Highly productive PCRAM technology platform and full chip operation: Based on 4F<sup>2</sup> (84 nm Pitch) cell scheme for 1 Gb and beyond," in *Proc. IEEE Int. Electron Dev. Meet. Tech. Dig.*, 2011, pp. 3.3.1–3.3.4.
- [4] V. Betz, A. Marquardt, and J. Rose, *Architecture and CAD for Deep-Submicron FPGAs*. New York: Kluwer, 1999.
- [5] K. J. Han, N. Chan, S. Kim, B. Leung, V. Hecht, B. Conquist, D. Shum, A. Tilke, L. Pescini, M. Stiftinger, and R. Kakoschke, "A novel flash-based FPGA technology with deep trench isolation," in *Proc. Non-Volatile Semicond. Memory Workshop*, Aug. 2007, pp. 26–30.
- [6] Y. Y. Liauw, Z. Zhang, W. Kim, A. El Gamal, and S. Wong, "Nonvolatile 3D-FPGA with monolithically stacked RRAM-based configuration memory," in *Proc. Int. Solid-State Circuit Conf. Tech. Dig.*, 2012, pp. 406–408.
- [7] Y. Guillemenet, L. Torres, and G. Sassatelli, "Non-volatile run-time field-programmable gate array structures using thermally assisted switching magnetic random access memories," *IET Comput. Digit. Tech.*, vol. 4, no. 3, pp. 211–226, 2010.
- [8] P.-E. Gaillardon, M. H. ben-Jamaa, M. Reyboz, G. Betti Beneventi, F. Clermidy, L. Perniola, and I. O'Connor, "Phase-change-memory-based storage elements for configurable logic," in *Proc. Int. Conf. Field-Programmable Technol.*, 2010, pp. 17–20.
- [9] P.-E. Gaillardon, M. H. Ben-Jamaa, G. Betti Beneventi, F. Clermidy, and L. Perniola, "Emerging memory technologies for reconfigurable routing in FPGA architecture," in *Proc. Int. Conf. Electron., Circuits, Syst.*, 2010, pp. 62–65.
- [10] S. Tanachutiwat, M. Liu, and W. Wang, "FPGA based on integration of CMOS and RRAM," *IEEE Trans. Very Large Scale Integr. Syst.*, vol. 19, no. 11, pp. 2023–2032, Nov. 2011.
- [11] J. Cong and B. Xiao, "mrFPGA: A novel FPGA architecture with memristor-based reconfiguration," in *Proc. Int. Symp. Nanoscale Architect.*, 2011, pp. 1–8.

- [12] E. Ahmed, "The effect of logic block granularity on deep-submicron FPGA performance and density," M.S. thesis, Dept. Electr. Comput. Eng., Univ. Toronto, 2001.
- [13] M. Lin, A. El Gamal, Y.-C. Lu, and S. Wong, "Performance benefits of monolithically stacked 3-D FPGA," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 26, no. 2, pp. 216–229, Feb. 2007.
- [14] S. Yamamoto and S. Sugahara, "Nonvolatile delay flip-flop based on spin-transistor architecture and its power-gating applications," *Jap. J. Appl. Phys.*, vol. 49, pp. 090204-1–090204-3, 2010.
- [15] W. Robinett, M. Pickett, J. Borghetti, Q. Xia, G. Snider, G. Medeiros-Ribeiro, and R. S. Williams, "A memristor-based nonvolatile latch circuit," *Nanotechnology*, vol. 21, no. 23, pp. 235203-1–235203-6, 2010.
- [16] (2011). [Online]. Available: http://www.eda.ncsu.edu/wiki/FreePDK/
- [17] R. Fallica, J.-L. Battaglia, S. Cocco, C. Monguzzi, A. Teren, C. Wiemer, E. Varesi, R. Cecchini, A. Gotti, and M. Fanciulli, "Thermal and electrical characterization of materials for phase-change memory cells," *J. Chem. Eng. Data*, vol. 54, pp. 1698–1701, 2009.
- [18] L. Perniola, V. Sousa, A. Fantini, E. Arbaoui, A. Bastard, M. Armans, A. Fargeix, C. Jahan, J.-F. Nodin, A. Persico, D. Blachier, A. Toffoli, S. Loubriat, E. Gourvest, G. Betti Beneventi, H. Feldis, S. Maitrejean, S. Lhostis, A. Roule, O. Cueto, G. Reimbold, L. Poupinet, T. Billon, B. De Salvo, D. Bensahel, P. Mazoyer, R. Annunziata, P. Zuliani, and F. Boulanger, "Electrical behavior of phase-change memory cells based on GeTe," *IEEE Electron Devices Lett.*, vol. 31, no. 5, pp. 488–490, May 2010.
- [19] G. B. Beneventi, L. Perniola, V. Sousa, E. Gourvest, S. Maitrejean, J. C. Bastien, A. Bestard, B. Hyot, A. Fargeix, C. Jahan, J. F. Nodin, A. Persico, A. Fantin, D. Blachier, A. Toffoli, S. Loubriat, A. Roule, S. Lhostis, H. Feldis, G. Reimbold, T. Billon, B. De Salvo, L. Larcher, P. Pavan, D. Bensahel, P. Mazoyer, R. Annunziata, P. Zuliani, and F. Boulanger, "Carbon-doped GeTe: A promising material for phase-change memories," *Solid State Electron.*, vol. 65–66, pp. 197–204, 2011.
- [20] A. Fantini, L. Perniola, V. Sousa, E. Gourvest, S. Maitrejean, J. C. Bastien, A. Bestard, B. Hyot, A. Fargeix, C. Jahan, J. F. Nodin, A. Persico, A. Fantin, D. Blachier, A. Toffoli, S. Loubriat, A. Roule, S. Lhostis, H. Feldis, G. Reimbold, T. Billon, B. De Salvo, L. Larcher, P. Pavan, D. Bensahel, P. Mazoyer, R. Annunziata, P. Zuliani, and F. Boulanger, "N-doped GeTe as performance booster for embedded phase-change memories," in *Proc. IEEE Int. Electron Dev. Meet. Tech. Dig.*, 2010, pp. 644–647.
- [21] H. Horii, J. H. Yi, J. H. Park, Y. H. Ha, I. G. Baek, S. O. Park, Y. N. Hwang, S. H. Lee, Y. T. Kim, K. H. Lee, U.-I. Chung, and J. T. Moon, "A novel cell technology using N-doped GeSbTe films for Phase-Change RAM," in *Proc. Symp. Very Large Scale Integr. Technol.*, 2003, pp. 177–178.
- [22] S. J. Ahn, Y. J. Song, C. W. Jeong, J. M. Shin, Y. Fai, Y. N. Hwang, S. H. Lee, K. C. Ryoo, S. Y. Lee, J. H. Park, H. Horii, Y. H. Hq, J. H. Yi, B. J. Kuh, G. H. Koh, G. T. Jeong, H. S. Jeong, K. Kim, and B. I. Ryu, "Highly manufacturable high density Phase Change Memory of 64 Mb and beyond," in *Proc. IEEE Int. Electron Dev. Meet. Tech. Dig.*, 2004, pp. 907–910.
- [23] G. W. Burr, M. J. Breitwisch, M. Franceschini, D. Garetto, K. Gopalakrishnan, B. Jackson, B. Kurdi, C. Lam, L. A. Lastras, A. Padilla, B. Rejendran, S. Raoux, and R. S. Shenoy, "Phase change memory technology," *J. Vacuum Sci. Technol. B*, vol. 28, no. 2, pp. 223–262, 2010.
- [24] W. Czubatyi, T. Lowrey, S. Kostylev, and I. Asano, "Current reduction in ovonic memory devices," in *Proc. Eur. Symp. Phase Change Ovon. Sci.*, 2006, pp. 143–152.
- [25] W. Czubatyi, S. J. Hudgens, C. Dennison, C. Shell, and T. Lowrey, "Nanocomposite phase-change memory alloys for very high temperature data retention," *IEEE Electron Dev. Lett.*, vol. 31, no. 8, pp. 869–871, Aug. 2010.
- [26] T. Morikawa, K. Kurotsuchi, M. Kinoshita, N. Matsuzaki, Y. Matsui, Y. Fuiisaki, S. Hanzawa, A. Kotabe, M. Terao, H. Moriya, T. Iwasaki, M. Matsuoka, F. Nitta, M. Moniwa, T. Koga, and N. Takaura, "Doped InGeTe phase change memory featuring stable operation and good data retention," in *Proc. IEEE Int. Electron Dev. Meet. Tech. Dig.*, 2007, pp. 307–310.
- [27] C. Cagli, J. Buckley, V. Jousseaume, T. Cabout, A. Salaun, H. Grampeix, J. F. Nodin, H. Feldis, A. Persico, J. Cluzel, P. Lorenzi, L. Massari, R. Rao, F. Irrera, F. Aussenac, C. Carabasse, M. Coue, P. Calka, E. Martinez, L. Perniola, P. Blaise, Z. Fang, Y. H. Yu, G. Ghibaudo, D. Deleruyelle, M. Bocquet, C. Muller, A. Padovani, O. Pirrotta, L. Vandelli, L. Larcher, G. Reimbold, and B. De Salvo, "Experimental and theoretical study of electrode effects in HfO<sub>2</sub> based RRAM," in *Proc. IEEE Int. Electron Dev. Meet. Tech. Dig.*, 2011, pp. 658–661.
- [28] Y. Kim and J. Lee, "Reproducible resistance switching characteristics of hafnium oxide-based nonvolatile memory devices," *J. Appl. Phys.*, vol. 104, no. 11, pp. 114–115, Dec. 2008.

- [29] D. Sacchetto, M. Zervas, Y. Temiz, G. De Micheli, and Y. Leblebici, "Resistive programmable through silicon vias for reconfigurable 3D fabrics," *IEEE Trans. Nanotechnol.*, vol. 11, no. 1, pp. 8–11, Jan. 2012.
- [30] W. Zhu, T. P. Chen, Y. Liu, M. Yang, S. Zhang, W. L. Zhang, and S. Fung, "Charging-induced changes in reverse current-voltage characteristics of Al/Al-Rich Al<sub>2</sub>O<sub>3</sub> p-Si Diodes," *IEEE Trans. Electron Devices*, vol. 56, no. 9, pp. 2060–2064, Sep. 2009.
- [31] R. Waser and M. Aono, "Nanoionics-based resistive switching memories," *Nature Mater.*, vol. 6, pp. 833–840, Nov. 2007.
- [32] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," *Nature*, vol. 453, pp. 80–83, 2008.
- [33] M. Reyboz, O. Rozeau, L. Perniola, and G. Betti Beneventi, "Compact Modeling of a PCRAM cell," presented at the MOS AK Workshop, Roma, Apr. 2010.
- [34] M. Bocquet, D. Deleruyelle, C. Muller, and J.-M. Portal, "Self-consistent physical modeling of set/reset operations in unipolar resistive-switching memories," *Appl. Phys. Lett.*, vol. 98, pp. 263507-1–263507-3, 2011.
- [35] International Technology Roadmap for Semiconductors, presented at the ITRS Winter Public Conference, HsinChu, Taiwan, 2011.
- [36] (2007). [Online]. Available: http://cadlab.cs.ucla.edu/~kirill/
- [37] (2007). [Online]. Available: http://www.eecs.berkeley.edu/~alanmi/abc/
- [38] J. Luu, J. Anderson, and J. Rose, "Architecture description and packing for logic blocks with hierarchy, modes and complex interconnect," in *Proc. ACM/SIGDA Int. Symp. Field Programmable Gate Arrays Conf.*, 2011, pp. 227–236.
- [39] (2009). [Online]. Available: http://www.eecg.utoronto.ca/vpr/
- [40] S. J. Wilton, "Architectures and algorithms for field-programmable gate arrays with embedded memory," Ph.D. dissertation, Dept. Electr. Comput. Eng., Univ. Toronto, 1997.
- [41] Y. Zhang et al., "An integrated phase change memory cell with Ge nanowire diode for cross-point memory," in *Proc. IEEE Very Large Scale Integr. Tech. Dig.*, 2007, pp. 98–99.



**Pierre-Emmanuel Gaillardon** (S'10–M'11) received the Electrical Engineer degree from Ecole Superieure de Chimie Physique et Electronique de Lyon, Lyon, France, in 2008, the M.Sc. degree from the Institut National des Sciences Appliquées de Lyon, Lyon, France, in 2008, and the Ph.D. degree in electrical engineering from the University of Lyon, Lyon, France, in 2011.

He is currently with the École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, where he is a Research Associate with Prof. Giovanni De Micheli at the Laboratory of Integrated Systems. Previously, he was a Research Assistant at Commissariat à l'Energie Atomique et aux Energies Alternatives. He is currently involved in the Nanosys project. His research activities and interests include digital architecture design based on emerging devices.

He is the recipient of the C-Innov 2011 Best Thesis Award and the Nanoarch 2012 Best Paper Award. He has served as a Technical Program Committee member for the Nanoarch 2012 conference and is a Reviewer for several journals (IEEE TRANSACTIONS ON NANOTECHNOLOGY and IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION) and conferences (International Conference on Electronics, Circuits and Systems 2012 and International Symposium on Circuits and Systems 2013).



**Davide Sacchetto** received the B.S. degree in physics engineering from Politecnico di Torino, Turin, Italy, in 2007, and the joint M.S. degree in micro and nano technologies for integrated systems from the École Polytechnique Fédérale de Lausanne (EPFL), the Institut National Polytechnique de Grenoble, and the Politecnico di Torino, in 2008. Since November 2008, he has been working toward the Ph.D. degree at the Microelectronic System Laboratory and the Integrated System Laboratory, EPFL.

His research interests include novel devices, investigating issues ranging from solid-state microfabrication to circuit implementation. He is currently involved in two projects, the ambipolar vertically-stacked Si nanowire transistors and the Resistive RAM (RRAM) co-integration with CMOS.



**Giovanni Betti Beneventi** (S'08–M'11) received the Ph.D. degree in micro and nanoelectronics from the Institut National Polytechnique de Grenoble, Grenoble, France, and Universita' degli studi di Modena and Reggio Emilia, Reggio Emilia, Italy, in 2011.

In the past, he was with STMicroelectronics, Numonyx, Commissariat à l'Energie Atomique et aux Energies Alternatives, and Micron Technology. He is currently a Research Fellow at Universita' degli Studi di Bologna, Bologna, Italy. Up to now, his research activities have been focused on phase-change memory devices.

**M. Haykel Ben Jamaa** (S'08–M'10) received a double M.S. degree in electrical engineering from Technische Universität München, Munich, Germany, and Ecole Centrale Paris, Paris, France, in 2004, and the Ph.D. degree from Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, in 2009.

He is currently with Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA-Leti), Grenoble, France. His previous research topics include design aspects for nanoelectronics with a tight link to emerging fabrication and packaging technologies, regular logic circuits such as field-programmable gate arrays, emerging memories, and 3-D integration. He is currently a Business Developer with CEA-Leti.

Dr. Jamaa was the recipient of the Electronic Design Automation Outstanding Dissertation Award at the Design, Automation, and Test in Europe (DATE) 2010. He has also been engaged in many conferences as a TPC Member or Chair, including DATE, Design Automation Conference, IEEE TRANSACTIONS ON NANOTECHNOLOGY, IEEE Electron Devices Meeting, Very-Large-Scale Integration (VLSI) System on Chip, and Great Lakes Symposium on VLSI.



Luca Perniola (M'10) was born in 1978 in Florence, in Italy. He received the Laurea degree in nuclear engineering from the Politecnico di Milano, Como, Italy, in 2002, and the Ph.D. degree from the University of Pisa, Pisa, Italy, and the Institut National Polytechnique de Grenoble, Grenoble, France, in 2005, with a thesis on modeling and electrical characterization of discrete-trap nonvolatile memory devices (i.e., nanocrystals).

Since 2005, he has been a Scientist at Commissariat à l'Energie Atomique et aux Energies Alternatives, MINATEC Campus, Grenoble, France, where he has been conducting research on topics related to the nonvolatile memory (NVM) field. In particular, he has focused on the impact of device architecture on memory performance for nitride-based NVM (i.e., SONOS finFET), and more generally on resistive memories (i.e., PCM, OxRAM) with alternative active materials to boost memory performance.



**Fabien Clermidy** (M'10) received the Ph.D. degree from the Institut National Polytechnique de Grenoble, Grenoble, France, in 1999, and the Higher Degree Research Supervisor (supervisor position) from the same University, in 2011.

He is the Head of the Digital Design Laboratory in Commissariat à l'Energie Atomique et aux Energies Alternatives, Grenoble, France. He is currently in charge of a joint ST/LETI multicore project called P2012/STHORM for CEA-LETI. His research interests include communication infrastructure, power management, telecommunication applications, 3-D design, FPGA architectures, and new technologies with a special focus on resistive memories. He has published more than 70 papers in peer-reviewed journals and conferences, 2 books and some book chapters, and is author or co-author of 14 patents.



**Ian O'Connor** (S'95–M'98–SM'07) received the B. Eng. degree in electronics engineering from the University of Essex, U.K., in 1990, the European M.Sc. degree in electronics engineering jointly from the University of Essex, U.K., the Ecole Supérieure d'Ingénieurs d'Electronique et d'Electrotechnique, Paris, France, and Universität Karlsruhe, Germany, in 1992, the Ph.D. degree in electronics from the University of Lille, France, in 1997, and the professoral dissertation (Habilitation à Diriger des Recherches) from Ecole Centrale de Lyon, France, in 2005.

He is currently a Professor for heterogeneous and nanoelectronics systems design in the Department of Electronic, Electrical and Control Engineering, Ecole Centrale de Lyon, Lyon, France, and Head of the Heterogeneous Systems Design Group, Lyon Institute of Nanotechnology, Lyon, France. In 2008, he also held a position as an Adjunct Professor at the Ecole Polytechnique de Montréal, Montreal, Canada. His research interests include novel computing architectures based on emerging technologies, associated with methods for design exploration. He has authored or co-authored around 150 book chapters, journal publications, conference papers and patents, has held various positions of responsibility in the organization of several international conferences, and has been workpackage leader or scientific coordinator for several national and European projects. He also serves as an expert with the French Observatory for Micro and Nano Technologies, International Federation for Information Processing WG10.5 (Design and Engineering of Electronic Systems), and Alliance for digital science and technology.



**Giovanni De Micheli** (F'94) received the Nuclear Eng. degree from Politecnico di Milano, in 1979, and the M.S. and Ph.D. degrees in electrical engineering and computer science from the University of California at Berkeley in 1980 and 1983, respectively.

He is a Professor and the Director of the Institute of Electrical Engineering and of the Integrated Systems Centre at EPF Lausanne, Lausanne, Switzerland. He is a Program Leader of the Nano-Tera.ch program. His research interests include several aspects of design technologies for integrated circuits and systems, such as synthesis for emerging technologies, networks on chips and 3-D integration. He is also interested in heterogeneous platform design including electrical components and biosensors, as well as in data processing of biomedical information. He is the author of *Synthesis and Optimization of Digital Circuits*, McGraw-Hill, 1994, co-author and/or co-editor of eight other books and of more than 500 technical articles. His citation h-index is 76 according to Google Scholar.

He is a Fellow of ACM and a member of the Academia Europaea. He is a member of the Scientific Advisory Board of IMEC and STMicroelectronics. He is the recipient of the 2012 IEEE/CAS Mac Van Valkenburg Award for contributions to theory, practice, and experimentation in design methods and tools and of the 2003 IEEE Emanuel Piore Award for contributions to computer-aided synthesis of digital systems. He also received the Golden Jubilee Medal for outstanding contributions to the IEEE Circuits and Systems Society in 2000, the D. Pederson Award for the best paper in the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN (CAD)/INTEGRATED CIRCUITS AND SYSTEMS (ICAS), in 1987, and several Best Paper Awards, including Design Automation Conference (DAC) in 1983 and 1993, Design, Automation, and Test in Europe (DATE) in 2005, and Nanoscale Architectures in 2010 and 2012. He has served IEEE in several capacities, namely, Division 1 Director during 2008-2009, Co-Founder and President Elect of the IEEE Council on Electronic Design Automation during 2005-2007, President of the IEEE CAS Society in 2003, and Editor in Chief of the IEEE TRANSACTIONS ON CAD/ICAS during 1987-2001. He has been the Chair of several conferences, including DATE in 2010, Public Health Conferences in 2006, IEEE International Conference on Very Large Scale Integration in 2006, DAC in 2000, and IEEE International Conference on Computer Design in 1989.