Efficient analysis of large datasets and sex bias with ADMIXTURE

BACKGROUND: A number of large genomic datasets are being generated for studies of human ancestry and diseases. The ADMIXTURE program is commonly used to infer individual ancestry from genomic data. RESULTS: We describe two improvements to the ADMIXTURE software. The first enables ADMIXTURE to infer...

Bibliographic Details
Main authors: Shringarpure, Suyash S., Bustamante, Carlos D., Lange, Kenneth, Alexander, David H.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877806/
https://www.ncbi.nlm.nih.gov/pubmed/27216439
http://dx.doi.org/10.1186/s12859-016-1082-x
author Shringarpure, Suyash S.
Bustamante, Carlos D.
Lange, Kenneth
Alexander, David H.
collection PubMed
description BACKGROUND: A number of large genomic datasets are being generated for studies of human ancestry and diseases. The ADMIXTURE program is commonly used to infer individual ancestry from genomic data. RESULTS: We describe two improvements to the ADMIXTURE software. The first enables ADMIXTURE to infer ancestry for a new set of individuals using cluster allele frequencies from a reference set of individuals. Using data from the 1000 Genomes Project, we show that this allows ADMIXTURE to infer ancestry for 10,920 individuals in a few hours (a 5× speedup). This mode also allows ADMIXTURE to correctly estimate individual ancestry and allele frequencies from a set of related individuals. The second modification allows ADMIXTURE to correctly handle X-chromosome (and other haploid) data from both males and females. We demonstrate increased power to detect sex-biased admixture in African-American individuals from the 1000 Genomes Project using this extension. CONCLUSIONS: These modifications make ADMIXTURE more efficient and versatile, allowing users to extract more information from large genomic datasets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1082-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4877806
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4877806 2016-06-07 Efficient analysis of large datasets and sex bias with ADMIXTURE Shringarpure, Suyash S. Bustamante, Carlos D. Lange, Kenneth Alexander, David H. BMC Bioinformatics Software BioMed Central 2016-05-23 /pmc/articles/PMC4877806/ /pubmed/27216439 http://dx.doi.org/10.1186/s12859-016-1082-x Text en © Shringarpure et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Efficient analysis of large datasets and sex bias with ADMIXTURE
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877806/
https://www.ncbi.nlm.nih.gov/pubmed/27216439
http://dx.doi.org/10.1186/s12859-016-1082-x