
Identification of pathogen genomic variants through an integrated pipeline

BACKGROUND: Whole-genome sequencing represents a powerful experimental tool for pathogen research. We present methods for the analysis of small eukaryotic genomes, including a streamlined system (called Platypus) for finding single nucleotide and copy number variants as well as recombination events....


Bibliographic Details
Main authors: Manary, Micah J, Singhakul, Suriya S, Flannery, Erika L, Bopp, Selina ER, Corey, Victoria C, Bright, Andrew Taylor, McNamara, Case W, Walker, John R, Winzeler, Elizabeth A
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3945619/
https://www.ncbi.nlm.nih.gov/pubmed/24589256
http://dx.doi.org/10.1186/1471-2105-15-63
collection PubMed
description BACKGROUND: Whole-genome sequencing represents a powerful experimental tool for pathogen research. We present methods for the analysis of small eukaryotic genomes, including a streamlined system (called Platypus) for finding single nucleotide and copy number variants as well as recombination events.
RESULTS: We have validated our pipeline using four sets of Plasmodium falciparum drug-resistant data containing 26 clones from 3D7 and Dd2 background strains, identifying an average of 11 single nucleotide variants per clone. We also identify 8 copy number variants with contributions to resistance, and report for the first time that all analyzed amplification events are in tandem.
CONCLUSIONS: The Platypus pipeline provides malaria researchers with a powerful tool to analyze short-read sequencing data. It provides an accurate way to detect SNVs using known software packages, and a novel methodology for detection of CNVs, though it does not currently support detection of small indels. We have validated that the pipeline detects known SNVs in a variety of samples while filtering out spurious data. We bundle the methods into a freely available package.
format Online
Article
Text
id pubmed-3945619
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3945619 2014-03-20 BMC Bioinformatics Software BioMed Central 2014-03-03 /pmc/articles/PMC3945619/ /pubmed/24589256 http://dx.doi.org/10.1186/1471-2105-15-63 Text en
Copyright © 2014 Manary et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Identification of pathogen genomic variants through an integrated pipeline
topic Software