
An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments

Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still...


Bibliographic Details
Main authors: Duitama, Jorge; Quintero, Juan Camilo; Cruz, Daniel Felipe; Quintero, Constanza; Hubmann, Georg; Foulquié-Moreno, Maria R.; Verstrepen, Kevin J.; Thevelein, Johan M.; Tohme, Joe
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3973327/
https://www.ncbi.nlm.nih.gov/pubmed/24413664
http://dx.doi.org/10.1093/nar/gkt1381
author Duitama, Jorge
Quintero, Juan Camilo
Cruz, Daniel Felipe
Quintero, Constanza
Hubmann, Georg
Foulquié-Moreno, Maria R.
Verstrepen, Kevin J.
Thevelein, Johan M.
Tohme, Joe
author_sort Duitama, Jorge
collection PubMed
description Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species.
format Online
Article
Text
id pubmed-3973327
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3973327 2014-04-04 Nucleic Acids Res Methods Online Oxford University Press 2014-04 2014-01-11 /pmc/articles/PMC3973327/ /pubmed/24413664 http://dx.doi.org/10.1093/nar/gkt1381 Text en
© The Author(s) 2014. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments
topic Methods Online