Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices

Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of Analog AI accelerators based on Non-Volatile Memory, in particular Phase Change Memory (PCM), for software-equivalent accurate inference of natural language processing applications. We demonstrate a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers), by combining noise-aware training to combat inherent PCM drift and noise sources, together with reduced-precision digital attention-block computation down to INT6.

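The description combines two techniques: noise-aware training, in which weight perturbations resembling PCM programming noise and drift are injected during training so the network learns to tolerate them, and reduced-precision (INT6) digital computation of the attention block. Below is a minimal Python/NumPy sketch of both ideas; the function names, noise model, and noise magnitude (rel_sigma) are hypothetical illustrations, not the authors' actual implementation.

import numpy as np

def add_pcm_like_noise(weights, rel_sigma=0.05, rng=None):
    # Perturb weights with Gaussian noise scaled to the largest |weight|,
    # a simplified stand-in for PCM programming noise and drift.
    rng = np.random.default_rng() if rng is None else rng
    scale = np.max(np.abs(weights))
    return weights + rng.normal(0.0, rel_sigma * scale, size=weights.shape)

def fake_quant_int6(x):
    # Round a tensor onto a symmetric 6-bit integer grid (values in [-31, 31])
    # and rescale: a generic model of reduced-precision INT6 computation.
    scale = np.max(np.abs(x)) / 31.0
    if scale == 0.0:
        return x
    return np.clip(np.round(x / scale), -31, 31) * scale

def attention_int6(q, k, v):
    # Scaled dot-product attention with INT6 fake quantization applied to
    # the operands of each digital matrix multiply.
    d = q.shape[-1]
    scores = fake_quant_int6(q) @ fake_quant_int6(k).T / np.sqrt(d)
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    return fake_quant_int6(probs) @ fake_quant_int6(v)

# Toy usage: a noisy "analog" weight layer feeding an INT6 attention block.
rng = np.random.default_rng(0)
w = rng.normal(size=(16, 16))
x = rng.normal(size=(4, 16))
h = x @ add_pcm_like_noise(w, rng=rng)   # analog (PCM) matrix-vector product
out = attention_int6(h, h, h)            # reduced-precision digital attention
print(out.shape)                         # (4, 16)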

Bibliographic Details
Main Authors: Spoon, Katie, Tsai, Hsinyu, Chen, An, Rasch, Malte J., Ambrogio, Stefano, Mackin, Charles, Fasoli, Andrea, Friz, Alexander M., Narayanan, Pritish, Stanisavljevic, Milos, Burr, Geoffrey W.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2021
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8287521/
https://www.ncbi.nlm.nih.gov/pubmed/34290595
http://dx.doi.org/10.3389/fncom.2021.675741
Collection: PubMed
Record ID: pubmed-8287521
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Front Comput Neurosci
Published Online: 2021-07-05
Copyright © 2021 Spoon, Tsai, Chen, Rasch, Ambrogio, Mackin, Fasoli, Friz, Narayanan, Stanisavljevic and Burr. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY, https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.