Methods for identifying emergent concepts in deep neural networks

The present perspective discusses methods to detect concepts in internal representations (hidden layers) of deep neural networks (DNNs), such as network dissection, feature visualization, and testing with concept activation vectors (TCAV). I argue that these methods provide evidence that DNNs are ab...

Bibliographic Details
Main author: Räz, Tim
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10318355/
https://www.ncbi.nlm.nih.gov/pubmed/37409048
http://dx.doi.org/10.1016/j.patter.2023.100761
_version_ 1785068019055067136
author Räz, Tim
author_facet Räz, Tim
author_sort Räz, Tim
collection PubMed
description The present perspective discusses methods to detect concepts in internal representations (hidden layers) of deep neural networks (DNNs), such as network dissection, feature visualization, and testing with concept activation vectors (TCAV). I argue that these methods provide evidence that DNNs are able to learn non-trivial relations between concepts. However, the methods also require users to specify or detect concepts via (sets of) instances. This underdetermines the meaning of concepts, making the methods unreliable. The problem could be overcome, to some extent, by systematically combining the methods and by using synthetic datasets. The perspective also discusses how conceptual spaces—sets of concepts in internal representations—are shaped by a trade-off between predictive accuracy and compression. I argue that conceptual spaces are useful, or even necessary, to understand how concepts are formed in DNNs but that there is a lack of method for studying conceptual spaces.
format Online
Article
Text
id pubmed-10318355
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-103183552023-07-05 Methods for identifying emergent concepts in deep neural networks Räz, Tim Patterns (N Y) Perspective The present perspective discusses methods to detect concepts in internal representations (hidden layers) of deep neural networks (DNNs), such as network dissection, feature visualization, and testing with concept activation vectors (TCAV). I argue that these methods provide evidence that DNNs are able to learn non-trivial relations between concepts. However, the methods also require users to specify or detect concepts via (sets of) instances. This underdetermines the meaning of concepts, making the methods unreliable. The problem could be overcome, to some extent, by systematically combining the methods and by using synthetic datasets. The perspective also discusses how conceptual spaces—sets of concepts in internal representations—are shaped by a trade-off between predictive accuracy and compression. I argue that conceptual spaces are useful, or even necessary, to understand how concepts are formed in DNNs but that there is a lack of method for studying conceptual spaces. Elsevier 2023-06-09 /pmc/articles/PMC10318355/ /pubmed/37409048 http://dx.doi.org/10.1016/j.patter.2023.100761 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Perspective
Räz, Tim
Methods for identifying emergent concepts in deep neural networks
title Methods for identifying emergent concepts in deep neural networks
title_full Methods for identifying emergent concepts in deep neural networks
title_fullStr Methods for identifying emergent concepts in deep neural networks
title_full_unstemmed Methods for identifying emergent concepts in deep neural networks
title_short Methods for identifying emergent concepts in deep neural networks
title_sort methods for identifying emergent concepts in deep neural networks
topic Perspective
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10318355/
https://www.ncbi.nlm.nih.gov/pubmed/37409048
http://dx.doi.org/10.1016/j.patter.2023.100761
work_keys_str_mv AT raztim methodsforidentifyingemergentconceptsindeepneuralnetworks