Redirecting to the source: https://www.linkedin.com/feed/update/urn:li:activity:7154136238881456131/

LinkedIn Post Text: In this article, Gouri Sankar Kar, Subhali Subhechha and Attilio Belmonte present the first experimental demonstration of multilevel, multiply accumulate operations on IGZO-based 2-transistor-1-capacitor (2T1C) and 2T0C cells, important steps towards industrial adoption. Read more: https://lnkd.in/e26dih_m

Media Link: https://www.imec-int.com/en/articles/igzo-based-dram-energy-and-area-efficient-analog-memory-computing
Media Link Content: IGZO-based DRAM for energy and area-efficient analog in-memory computing
IGZO-based DRAM for energy- and area-efficient analog in-memory computing
First experimental demonstration of multilevel, multiply accumulate operations on IGZO-based 2-transistor-1-capacitor (2T1C) and 2T0C cells
Summary
Indium-gallium-zinc-oxide (IGZO)-based two-transistor n-capacitor (2TnC) dynamic random-access memory (DRAM) cells are excellent candidates for analog in-memory computing.
They can accomplish the inference phase of machine learning applications much more efficiently than what is possible today.
In this article, the authors show how IGZO-based 2T1C and its capacitor-less variant (2T0C) can be optimized for high retention time. They demonstrate the possibility of multilevel programming and multiply accumulate operations – important steps towards industrial adoption.
In-memory computing: key to hardware-efficient machine learning
Machine learning, a subset of artificial intelligence, has become integral to our lives. It allows us to learn and reason from data using techniques such as deep neural network algorithms. Machine learning enables data-intensive tasks such as image classification and language modeling, from which many new applications emerge.
There are two phases in the process of machine learning.
First is the training phase, where intelligence is developed by storing and labeling information into weights, a computationally intensive operation usually performed in the cloud. During this phase, the machine-learning algorithm is fed with a given dataset. The weights are optimized until the neural network can make predictions with the desired level of accuracy.
In the second phase, referred to as inference, the machine uses the intelligence stored in the first phase to process previously unseen data. The dominant operations for inference are matrix-vector multiplications of a weight matrix and an input vector. For example, when a model has been trained for image classification, the input vector contains the pixels of the unknown images. The weight matrix comprises all the different parameters by which the images can be identified, stored as weights during the training phase. For large and complex problems, this matrix is organized into different layers. The input data are ‘forwarded’ through the neural net to calculate the output: a prediction of what’s contained in the image (a cat, a human, or a car, for example).
On the technology side, inputs and weights are usually stored in conventional memories and fetched towards the processing unit to perform the multiplications. For complex problems, a gigantic amount of data thus needs to be moved around, compromising power efficiency and speed, and leaving a large carbon footprint.
However, much of this data traffic can be avoided if (some of) the computational work can be done in the memory itself. When implemented in an energy-efficient way, this in-memory computing reduces the dependence of the inference on the cloud, largely improving latency and energy consumption.
A generic architecture for analog in-memory computing
Unlike traditional memory operations, in-memory computation does not happen at the granularity of a single memory element. Instead, it is a cumulative operation performed on a group of memory devices, exploiting the array-level organization, the peripheral circuitry, and the control logic. The common step is a multiply accumulate operation (MAC), which computes the product of two numbers and adds that product to an accumulator.
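As a minimal sketch of the MAC step described above, the following Python snippet builds a matrix-vector product out of individual multiply-accumulate steps. The function name `mac_matvec` and the example numbers are illustrative assumptions, not taken from the article.

```python
def mac_matvec(weights, inputs):
    """Matrix-vector product built from multiply-accumulate (MAC) steps.

    weights[i][j] is the (hypothetical) weight connecting input i to output j.
    """
    n_outputs = len(weights[0])
    outputs = []
    for j in range(n_outputs):
        acc = 0.0  # accumulator for output j
        for i, x in enumerate(inputs):
            acc += weights[i][j] * x  # one MAC: multiply, then add to the accumulator
        outputs.append(acc)
    return outputs


# Hypothetical 2-input, 2-output example
example = mac_matvec([[1.0, 2.0], [3.0, 4.0]], [0.5, 2.0])
```

In a digital processor these steps run one after another; the analog approach discussed next performs the accumulation physically on a shared line.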
While in-memory computation can be performed digitally, this work focuses on its analog implementation, using actual current or charge values. Analog in-memory computing (AiMC) presents several advantages over digital in-memory computing. Provided that multilevel programming is possible, each cell can more easily represent several bits of information (in both weights and inputs), which reduces the number of memory devices needed. Also, following Kirchhoff’s circuit laws, working with charges or currents provides an almost natural way to do the MAC operations.
Figure 1: General concept of matrix-vector multiplications for AiMC (as shown at IMW 2023)
In a generic AiMC architecture, the activation signals from the input (or from the previous layer) are first converted to analog signals using digital-to-analog converters (DACs) on the activation lines (figure 1). The analog activation (act_i) is then multiplied with the weights (w_ij) stored in an array of memory cells. Each cell contributes w_ij·act_i as a current or charge to the summation line. On the summation line, the output is the sum of all the contributions. The output is then converted to digital values. After post-processing, the results are transferred to the next layer or a buffer memory.
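The signal chain described above (DAC, per-cell multiplication, summation, conversion back to digital) can be sketched in a few lines of Python. This is an idealized behavioral model under stated assumptions: the converter resolutions, the full-scale value, and the names `dac`, `adc` and `aimc_layer` are hypothetical and do not come from the article, and noise, variation and IR drop are ignored.

```python
def dac(code, bits=4, v_max=1.0):
    """Convert a digital activation code to an analog level (idealized DAC)."""
    return v_max * code / (2 ** bits - 1)


def adc(value, full_scale, bits=8):
    """Quantize an analog sum to a digital code (idealized ADC with clipping)."""
    clipped = min(max(value, 0.0), full_scale)
    return round(clipped / full_scale * (2 ** bits - 1))


def aimc_layer(weights, act_codes, full_scale):
    """Idealized array: weights[i][j] sits on activation line i and summation line j."""
    acts = [dac(c) for c in act_codes]
    n_sum_lines = len(weights[0])
    out_codes = []
    for j in range(n_sum_lines):
        # Each cell contributes w_ij * act_i; the summation line adds the contributions.
        line_sum = sum(weights[i][j] * acts[i] for i in range(len(acts)))
        out_codes.append(adc(line_sum, full_scale))
    return out_codes


# Hypothetical 2x2 array; only the first activation line is driven (code 15 = full scale).
codes = aimc_layer([[0.2, 0.8], [0.6, 0.4]], act_codes=[15, 0], full_scale=2.0)
```

The choice of `full_scale` matters in practice: it has to cover the largest sum a summation line can carry, otherwise the ADC clips.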
In search of a suitable memory technology
Most AiMC-based machine learning systems today rely on conventional static random-access memory (SRAM) technology. But SRAM-based solutions have proven to be expensive, power-hungry, and challenging to scale for larger computational densities. To overcome these issues, the AI community is investigating alternative memory technologies.
At the 2019 ISSCC and IEDM conferences, imec presented a benchmark study of different memory device technologies for energy-efficient inference applications [1,2]. The analysis connected circuit design with technology options and requirements, projecting an energy efficiency of 10,000 tera-operations per second per Watt (TOPS/W), which is beyond the efficiency of the most advanced digital implementations. The researchers identified high cell resistance or low cell current, a low variation, and small cell area as key parameters.
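To put the projected figure in perspective, the short calculation below converts 10,000 TOPS/W into an energy budget per operation. Since one watt is one joule per second, TOPS/W is the same as tera-operations per joule; the variable names are illustrative only.

```python
TOPS_PER_WATT = 10_000                      # projected efficiency quoted in the article
ops_per_joule = TOPS_PER_WATT * 1e12        # 1 TOPS/W equals 1e12 operations per joule
energy_per_op_joule = 1.0 / ops_per_joule   # joules available for a single operation
energy_per_op_femtojoule = energy_per_op_joule * 1e15
```

The result is about 0.1 fJ per operation, which is the budget that the cell current, the line capacitances and the converters together would have to fit into.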
These specifications limit the use of the most popular cell types, including spin-transfer torque magnetic RAM (STT-MRAM) and resistive RAM (ReRAM). Resistive types of memories store the weights as conductance and encode activation as voltage levels. One of the issues with resistive memories is the IR (voltage) drop occurring on both the activation and summation lines, which affects the output. Additionally, a selector device is required for optimized cell access within the array, increasing the cell area and the challenges for voltage distribution. Phase change memory (PCM or PCRAM) is limited by similar issues. For spin-orbit torque MRAM (SOT-MRAM), the high current needed to switch the device and the cell’s low on/off ratio are disadvantages but not necessarily showstoppers.
Of all investigated memory technologies, the imec researchers identified an indium-gallium-zinc-oxide (IGZO)-based 2-transistor 1-capacitor (2T1C) device as the most promising candidate for AiMC. The 2T1C cell, initially proposed for DRAM applications, has two main advantages over SRAM for AiMC applications. First, it enables significantly lower standby power consumption. Second, IGZO transistors can be processed in the chip’s back-end-of-line (BEOL), where they can be stacked on top of the peripheral circuit located in the front-end-of-line (FEOL). This way, no FEOL footprint is required for building the memory array. Further, the IGZO technology also allows stacking multiple cells on top of each other, enabling a denser array.
Engineering IGZO-based 2T1C devices for AiMC applications
At the 2023 International Memory Workshop (IMW), imec researchers addressed the remaining challenges: optimizing the gain cell’s retention time, exploring the possibility of multilevel programming, and demonstrating the MAC operation in an array configuration [3].
Each memory cell within the weight matrix consists of one capacitor and two IGZO transistors. One transistor serves as the write transistor, used to program the weight as a voltage on the (storage node) capacitor, connected to the gate of the second transistor. The second transistor is designed as the read transistor and acts as a current source element, allowing for a non-destructive read. The current through the read transistor depends on both the activation input and the weight stored in the storage node capacitor. This current, hence, naturally represents the output of the multiply operation (w_ij·act_i). Since the readout current is amplified compared to the storage charge flow, 2T1C cells are also referred to as ‘gain cells’.
Figure 2: Schematic of a 2T1C DRAM gain cell.
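The division of labor between the two transistors can be illustrated with a small behavioral model. This is a toy sketch, not a device model: the class name `GainCell2T1C`, the linear current law and the gain value are assumptions made for illustration, and a real read transistor responds nonlinearly to its gate voltage.

```python
class GainCell2T1C:
    """Behavioral toy model of a 2T1C gain cell (not a device-level model)."""

    def __init__(self, gain_a_per_v=1e-6):
        self.gain_a_per_v = gain_a_per_v  # assumed read-transistor gain, in A per volt
        self.v_sn = 0.0                   # storage-node voltage, i.e. the stored weight

    def write(self, v_weight):
        # The write transistor charges the storage node to the weight voltage.
        self.v_sn = v_weight

    def read(self, activation):
        # Non-destructive read: the storage node is left untouched.
        # Assumed linear response: current proportional to weight * activation.
        return self.gain_a_per_v * self.v_sn * activation


cell = GainCell2T1C()
cell.write(0.5)
i_read = cell.read(activation=1.0)
```

Reading the cell repeatedly returns the same current because `read` never modifies `v_sn`, which mirrors the non-destructive read mentioned above.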
To be suitable for energy-efficient MAC operations, the three key components of the cell need to meet some target specifications: long retention time, low off currents, and suitable on currents.
The retention time of the gain cell determines how long the cell can retain the programmed weight. The longer the retention time, the less frequently the cell must be refreshed, benefiting power consumption. Also, a long retention time is required for multilevel operation, i.e., the ability to store different voltage levels on the storage node capacitor.
The storage node capacitance is determined by the external capacitor, the gate oxide capacitance of the read transistor, and a parasitic capacitance. The programmed weight can change due to leakage currents. This sets requirements on the leakage currents of the external capacitor and the IGZO transistors, requiring low off currents for the latter.
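The link between capacitance, leakage and retention can be made concrete with a first-order estimate: a constant leakage current I shifts the voltage on a capacitance C by ΔV after a time t = C·ΔV / I. The sketch below assumes a constant leakage current and uses purely hypothetical component values; none of the numbers are measurements from the article.

```python
def retention_time_s(c_sn_farad, delta_v_volt, i_leak_ampere):
    """Time for a constant leakage current to shift the storage node by delta_v."""
    return c_sn_farad * delta_v_volt / i_leak_ampere


# Hypothetical contributions to the storage-node capacitance (farads)
c_external = 0.7e-15   # external capacitor
c_gate = 0.2e-15       # gate oxide capacitance of the read transistor
c_parasitic = 0.1e-15  # parasitic capacitance
c_sn = c_external + c_gate + c_parasitic

# Hypothetical tolerated voltage loss (0.1 V) and total leakage current (1e-18 A)
t_ret = retention_time_s(c_sn, delta_v_volt=0.1, i_leak_ampere=1e-18)
```

The estimate shows the two levers named in the text: a larger storage node capacitance or a lower leakage current both lengthen the retention time proportionally.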
The read and write transistors mainly differ in the target on current. While a low on current is required for the read transistor to limit IR drop, the on current of the write transistor must be high enough to program the weight in a reasonable write time, i.e., > 1 µA/µm.
Figure 3: Stack schematics of the write (left) and read (right) transistors (as shown at IMW 2023).
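The write-time requirement mentioned above follows from the same charge balance, t = C·V / I, now with the write transistor's on current as the charging current. The following sketch assumes a constant charging current and a hypothetical capacitance, voltage swing and transistor width; only the 1 µA/µm on-current target comes from the text.

```python
def write_time_s(c_sn_farad, v_swing_volt, i_on_a_per_um, width_um):
    """First-order time to charge the storage node with a constant on current."""
    i_on = i_on_a_per_um * width_um  # total drive current of the write transistor
    return c_sn_farad * v_swing_volt / i_on


# 1 uA/um is the target from the text; 1 fF, 1 V and 1 um width are assumptions.
t_write = write_time_s(c_sn_farad=1e-15, v_swing_volt=1.0,
                       i_on_a_per_um=1e-6, width_um=1.0)
t_write_ns = t_write * 1e9
```

With these assumed values the write takes on the order of a nanosecond; a lower on current would lengthen the write time in proportion.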
Amorphous IGZO-based transistors and capacitors have been engineered to meet the different criteria and have been fabricated at a 300 mm wafer scale. The presented solution is CMOS and BEOL compatible, with no FEOL footprint required for fabricating the memory array. The write transistor’s high on current and low off current were achieved by adopting a gate-last configuration with an oxygen tunnel module and raised source/drain contacts, and by using a relatively thick gate dielectric (15 nm). The read transistor has a thinner a-IGZO channel (5 nm) and a thinner gate dielectric (5 nm). For the external capacitor, the researchers implemented a 9 nm thick Al₂O₃-based metal-insulator-metal (MIM) capacitor.
High retention, multilevel programming, and MAC operation: experimental demonstration
As the read and write transistors are engineered differently, they can ideally be integrated on different layers, leveraging the 3D stackability of the IGZO transistors and facilitating denser arrays. To obtain a proof-of-concept for MAC operations, it is, however, sufficient to implement read and write transistors of similar design, i.e., the design of the write transistors.
First, the retention time and off current of a single 2T1C cell were measured. The experiments revealed a retention time as high as 130 s and a median off current as low as 1.5×10⁻¹⁹ A/µm, originating from the wide bandgap of the IGZO channel material.
Figure 4: Evolution of the storage node voltage (V_SN) for multiple devices used for estimating retention and off current (as shown at IMW 2023).
To demonstrate multilevel operation, different devices were programmed to different weight levels, and the evolution of the storage node voltages was monitored. Even after 400 s, distinct voltage levels could still be observed, showing the ability for single-cell multilevel programming.
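Whether programmed levels stay distinguishable can be checked with a simple decay model. The sketch below assumes, purely for illustration, that each storage node voltage decays exponentially with a common time constant and that levels count as distinct while they remain further apart than a sensing margin; the decay law, the time constant, the programmed levels and the margin are all hypothetical and are not taken from the measurements.

```python
import math


def levels_after(v_levels, t_s, tau_s):
    """Storage-node voltages after t_s seconds of assumed exponential decay."""
    return [v * math.exp(-t_s / tau_s) for v in v_levels]


def min_separation(levels):
    """Smallest gap between neighboring voltage levels."""
    ordered = sorted(levels)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


programmed = [0.2, 0.4, 0.6, 0.8]   # hypothetical weight levels in volts
decayed = levels_after(programmed, t_s=400.0, tau_s=2000.0)
still_distinct = min_separation(decayed) > 0.05   # hypothetical sensing margin
```

In this model the gap between neighboring levels shrinks over time, so the number of usable levels is tied directly to the retention behavior of the cell.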
Next, the 2T1C gain cells were implemented in a 2×2 array configuration to verify the MAC operation. The researchers observed increased read current on the summation line when activating two cells on the same activation line (with equally stored weights on the capacitor nodes). This current was almost equal to the sum of currents obtained after activating each cell individually. The results were extended to 4×2 arrays. In another set of experiments, a change in the summation line’s current was observed when changing the stored weights or the activations. These measurements show that the 2T1C gain cells with IGZO can successfully be used for matrix-vector multiplications in machine-learning applications.
Figure 5: Multilevel MAC operation for a 2×2 array, with storage nodes programmed to different weights (as shown at IMW 2023).
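The additivity check performed in the array experiment can be mirrored in an idealized model. The sketch below assumes two cells that feed a common summation line and a linear cell current (gain × weight × activation); the function name, the gain and the weight values are hypothetical. In the ideal model the combined current equals the sum of the individual currents exactly, whereas the measurement reported it as almost equal.

```python
def summation_current(weights_v, activations, gain_a_per_v=1e-6):
    """Current on one summation line: the sum of the per-cell read currents."""
    return sum(gain_a_per_v * w * a for w, a in zip(weights_v, activations))


weights = [0.5, 0.5]  # equal stored weights on both cells (assumed values, in volts)

i_cell0 = summation_current(weights, [1.0, 0.0])  # only the first cell activated
i_cell1 = summation_current(weights, [0.0, 1.0])  # only the second cell activated
i_both = summation_current(weights, [1.0, 1.0])   # both cells activated together

# Changing a stored weight changes the line current, as in the second experiment.
i_reweighted = summation_current([0.5, 0.25], [1.0, 1.0])
```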
From 2T1C to 2T0C to further reduce cost and area consumption
For the 2T1C cell, a high retention time was achieved by optimizing the transistors and the external capacitor for low off-current and high capacitance, respectively. But earlier work, carried out by imec in the frame of (3D) DRAM applications, proved that a long retention time could also be obtained in a capacitor-less implementation, i.e., in 2T0C gain cells. Thanks to the ultra-low off current in IGZO transistors, long retention is achieved even by only using the gate stack of the read transistor as a storage capacitor. Leaving out the external capacitor has some notable advantages. It lowers the cost and, as the capacitor consumes a considerable area, results in an even smaller footprint. At IEDM 2021, imec presented an IGZO-based 2T0C DRAM cell with >10³ s retention time, a consequence of the very low off current of the IGZO transistors [4].
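To see what these retention figures mean for refresh activity, the following sketch counts how many refreshes a cell would need per day. It assumes, as a simplification, exactly one refresh per retention period with no safety margin; the 130 s and 10³ s inputs are the retention values quoted in the text, and the function name is illustrative.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def refreshes_per_day(retention_s):
    """Minimum refresh count per day, assuming one refresh per retention period."""
    return math.ceil(SECONDS_PER_DAY / retention_s)


refreshes_2t1c = refreshes_per_day(130.0)    # 2T1C retention measured in this work
refreshes_2t0c = refreshes_per_day(1000.0)   # lower bound quoted for the 2T0C cell
```

Under this assumption the count drops from 665 to 87 refreshes per day, which illustrates why longer retention translates directly into lower refresh power.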
Recently, the imec researchers further improved the retention time of IGZO-based 2T0C devices to > 4.5 hours and achieved an off current
10³ s retention, >10¹¹ cycles endurance and L_g scalability down to 14nm,’ A. Belmonte et al., IEDM 2021
[5] ‘Lowest I_OFF