A machine learning toolkit for genetic engineering attribution to facilitate biosecurity
The promise of biotechnology is tempered by its potential for accidental or deliberate misuse. Reliably identifying telltale signatures characteristic to different genetic designers, termed ‘genetic engineering attribution’, would deter misuse, yet is still considered unsolved. Here, we show that re...
Main authors: | Alley, Ethan C.; Turpin, Miles; Liu, Andrew Bo; Kulp-McDowall, Taylor; Swett, Jacob; Edison, Rey; Von Stetina, Stephen E.; Church, George M.; Esvelt, Kevin M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7722865/ https://www.ncbi.nlm.nih.gov/pubmed/33293535 http://dx.doi.org/10.1038/s41467-020-19612-0 |
_version_ | 1783620238472904704 |
---|---|
author | Alley, Ethan C. Turpin, Miles Liu, Andrew Bo Kulp-McDowall, Taylor Swett, Jacob Edison, Rey Von Stetina, Stephen E. Church, George M. Esvelt, Kevin M. |
author_sort | Alley, Ethan C. |
collection | PubMed |
description | The promise of biotechnology is tempered by its potential for accidental or deliberate misuse. Reliably identifying telltale signatures characteristic to different genetic designers, termed ‘genetic engineering attribution’, would deter misuse, yet is still considered unsolved. Here, we show that recurrent neural networks trained on DNA motifs and basic phenotype data can reach 70% attribution accuracy in distinguishing between over 1,300 labs. To make these models usable in practice, we introduce a framework for weighing predictions against other investigative evidence using calibration, and bring our model to within 1.6% of perfect calibration. Additionally, we demonstrate that simple models can accurately predict both the nation-state-of-origin and ancestor labs, forming the foundation of an integrated attribution toolkit which should promote responsible innovation and international security alike. |
format | Online Article Text |
id | pubmed-7722865 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7722865 2020-12-11 Nat Commun Article Nature Publishing Group UK 2020-12-08 /pmc/articles/PMC7722865/ /pubmed/33293535 http://dx.doi.org/10.1038/s41467-020-19612-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | A machine learning toolkit for genetic engineering attribution to facilitate biosecurity |
topic | Article |
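
The description field in this record reports a model brought to "within 1.6% of perfect calibration". As a rough illustration of what such a figure can measure, the sketch below computes expected calibration error (ECE), a common calibration metric. It is an assumption that ECE resembles the measure the authors used; the function name `expected_calibration_error`, the bin count, and the toy inputs are hypothetical and are not taken from the article.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Sample-weighted mean gap between confidence and accuracy per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("need at least one prediction")

    # Each bin accumulates [count, sum of confidences, number correct].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += int(ok)

    total = len(confidences)
    ece = 0.0
    for count, conf_sum, n_correct in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        ece += (count / total) * abs(avg_conf - accuracy)
    return ece


# Toy data (illustrative only): top-1 confidence and whether the guess was right.
toy_conf = [0.95, 0.95, 0.95, 0.95, 0.65, 0.65]
toy_hit = [True, True, True, False, True, False]
toy_ece = expected_calibration_error(toy_conf, toy_hit)
```

An ECE of 0 means that, within every confidence bin, the stated confidence matches the observed accuracy. That property is what lets a predicted probability be weighed against other investigative evidence.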