A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data
Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications o...
Main authors: | Hill, Tom; Unckless, Robert L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America 2019 |
Subjects: | Software and Data Resources |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6829143/ https://www.ncbi.nlm.nih.gov/pubmed/31455677 http://dx.doi.org/10.1534/g3.119.400596 |
_version_ | 1783465486360510464 |
---|---|
author | Hill, Tom Unckless, Robert L. |
collection | PubMed |
description | Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low coverage data, machine learning appears to be more powerful in the detection of CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high coverage data. We also demonstrate how replicating training sets allows a more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long read data. |
format | Online Article Text |
id | pubmed-6829143 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-68291432019-11-06 A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data Hill, Tom Unckless, Robert L. G3 (Bethesda) Software and Data Resources Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low coverage data, machine learning appears to be more powerful in the detection of CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high coverage data. We also demonstrate how replicating training sets allows a more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long read data. Genetics Society of America 2019-08-27 /pmc/articles/PMC6829143/ /pubmed/31455677 http://dx.doi.org/10.1534/g3.119.400596 Text en Copyright © 2019 Hill, Unckless http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data |
topic | Software and Data Resources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6829143/ https://www.ncbi.nlm.nih.gov/pubmed/31455677 http://dx.doi.org/10.1534/g3.119.400596 |