
Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach

The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation...


Bibliographic Details
Main authors: Pattnaik, Swetansu; Vaidyanathan, Srividya; Pooja, Durgad G.; Deepak, Sa; Panda, Binay
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3253117/
https://www.ncbi.nlm.nih.gov/pubmed/22238694
http://dx.doi.org/10.1371/journal.pone.0030080
_version_ 1782220708886085632
author Pattnaik, Swetansu
Vaidyanathan, Srividya
Pooja, Durgad G.
Deepak, Sa
Panda, Binay
author_facet Pattnaik, Swetansu
Vaidyanathan, Srividya
Pooja, Durgad G.
Deepak, Sa
Panda, Binay
author_sort Pattnaik, Swetansu
collection PubMed
description The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable alternative to whole genome runs, but it still requires follow-up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post-alignment variant detection algorithms. However, this approach requires multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives, and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post-alignment stages, and suggest the use of the most suitable combination, determined by a simple framework of pre-existing metrics, to create significant datasets.
format Online
Article
Text
id pubmed-3253117
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-32531172012-01-11 Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach Pattnaik, Swetansu Vaidyanathan, Srividya Pooja, Durgad G. Deepak, Sa Panda, Binay PLoS One Research Article The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable alternative to whole genome runs, but it still requires follow-up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post-alignment variant detection algorithms. However, this approach requires multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives, and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post-alignment stages, and suggest the use of the most suitable combination, determined by a simple framework of pre-existing metrics, to create significant datasets.
Public Library of Science 2012-01-06 /pmc/articles/PMC3253117/ /pubmed/22238694 http://dx.doi.org/10.1371/journal.pone.0030080 Text en Pattnaik et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Pattnaik, Swetansu
Vaidyanathan, Srividya
Pooja, Durgad G.
Deepak, Sa
Panda, Binay
Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach
title Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach
title_full Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach
title_fullStr Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach
title_full_unstemmed Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach
title_short Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach
title_sort customisation of the exome data analysis pipeline using a combinatorial approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3253117/
https://www.ncbi.nlm.nih.gov/pubmed/22238694
http://dx.doi.org/10.1371/journal.pone.0030080
work_keys_str_mv AT pattnaikswetansu customisationoftheexomedataanalysispipelineusingacombinatorialapproach
AT vaidyanathansrividya customisationoftheexomedataanalysispipelineusingacombinatorialapproach
AT poojadurgadg customisationoftheexomedataanalysispipelineusingacombinatorialapproach
AT deepaksa customisationoftheexomedataanalysispipelineusingacombinatorialapproach
AT pandabinay customisationoftheexomedataanalysispipelineusingacombinatorialapproach