StrAuto: automation and parallelization of STRUCTURE analysis
BACKGROUND: Population structure inference using the software STRUCTURE has become an integral part of population genetic studies covering a broad spectrum of taxa including humans. The ever-expanding size of genetic data sets poses computational challenges for this analysis. Although at least one t...
Main authors: | Chhatre, Vikram E.; Emerson, Kevin J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5366143/ https://www.ncbi.nlm.nih.gov/pubmed/28340552 http://dx.doi.org/10.1186/s12859-017-1593-0 |
_version_ | 1782517537487978496 |
---|---|
author | Chhatre, Vikram E. Emerson, Kevin J. |
collection | PubMed |
description | BACKGROUND: Population structure inference using the software STRUCTURE has become an integral part of population genetic studies covering a broad spectrum of taxa, including humans. The ever-expanding size of genetic data sets poses computational challenges for this analysis. Although at least one tool currently implements parallel computing to reduce the computational load of this analysis, it does not fully automate the replicate STRUCTURE runs required for downstream inference of the optimal K. There is a pressing need for a tool that can deploy population structure analysis on high-performance computing clusters. RESULTS: We present an updated version of the popular Python program StrAuto to streamline population structure analysis using parallel computing. StrAuto implements a pipeline that combines STRUCTURE analysis with the Evanno ΔK analysis and visualization of results using STRUCTURE HARVESTER. Using benchmarking tests, we demonstrate that StrAuto significantly reduces the computational time needed to perform iterative STRUCTURE analysis by distributing runs over two or more processors. CONCLUSION: StrAuto is the first tool to integrate STRUCTURE analysis with post-processing using a pipeline approach in addition to implementing parallel computation, a setup ideal for deployment on computing clusters. StrAuto is distributed under the GNU GPL (General Public License) and is available to download from http://strauto.popgen.org. |
format | Online Article Text |
id | pubmed-5366143 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5366143 2017-03-28 StrAuto: automation and parallelization of STRUCTURE analysis Chhatre, Vikram E. Emerson, Kevin J. BMC Bioinformatics Software BioMed Central 2017-03-24 /pmc/articles/PMC5366143/ /pubmed/28340552 http://dx.doi.org/10.1186/s12859-017-1593-0 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | StrAuto: automation and parallelization of STRUCTURE analysis |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5366143/ https://www.ncbi.nlm.nih.gov/pubmed/28340552 http://dx.doi.org/10.1186/s12859-017-1593-0 |
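
The abstract mentions combining replicate STRUCTURE runs with the Evanno ΔK analysis to infer the optimal K. As a rough illustration only, the sketch below computes ΔK from replicate Ln P(D) values using one common formulation, ΔK = |mean L(K+1) − 2·mean L(K) + mean L(K−1)| / sd(L(K)). The function name `evanno_delta_k`, the input layout, and the toy numbers are assumptions made for this example; this is not code from StrAuto or STRUCTURE HARVESTER, and those tools may differ in detail.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta_K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k is a hypothetical input layout: a dict mapping each K to the
    list of Ln P(D) values from its replicate runs (at least two per K,
    because a sample standard deviation is needed).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        # The second-order difference needs both neighbouring K values.
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            # Delta K is undefined when replicates are identical; skip it.
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


if __name__ == "__main__":
    # Made-up likelihoods purely for demonstration, not real results.
    toy = {
        1: [-1000.0, -1002.0],
        2: [-800.0, -802.0],
        3: [-790.0, -792.0],
        4: [-785.0, -787.0],
    }
    scores = evanno_delta_k(toy)
    best_k = max(scores, key=scores.get)
    print(scores, "best K:", best_k)
```

Note that ΔK cannot be evaluated for the smallest or largest K tested, since each lacks one neighbour, which is why a range of K values wider than the expected answer is normally run.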