A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data
BACKGROUND: Low-depth sequencing allows researchers to increase sample size at the expense of lower accuracy. To incorporate uncertainties while maintaining statistical power, we introduce MCPCA_PopGen to analyze population structure of low-depth sequencing data. RESULTS: The method optimizes the ch...
Main authors: | Zhang, Miao; Liu, Yiwen; Zhou, Hua; Watkins, Joseph; Zhou, Jin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2021 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236193/ https://www.ncbi.nlm.nih.gov/pubmed/34174829 http://dx.doi.org/10.1186/s12859-021-04265-7 |
_version_ | 1783714488056283136 |
---|---|
author | Zhang, Miao; Liu, Yiwen; Zhou, Hua; Watkins, Joseph; Zhou, Jin |
author_sort | Zhang, Miao |
collection | PubMed |
description | BACKGROUND: Low-depth sequencing allows researchers to increase sample size at the expense of lower accuracy. To incorporate uncertainties while maintaining statistical power, we introduce MCPCA_PopGen to analyze population structure of low-depth sequencing data. RESULTS: The method optimizes the choice of nonlinear transformations of dosages to maximize the Ky Fan norm of the covariance matrix. The transformation incorporates the uncertainty in calling between heterozygotes and the common homozygotes for loci having a rare allele and is more linear when both variants are common. CONCLUSIONS: We apply MCPCA_PopGen to samples from two indigenous Siberian populations and reveal hidden population structure accurately using only a single chromosome. The MCPCA_PopGen package is available at https://github.com/yiwenstat/MCPCA_PopGen. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04265-7. |
format | Online Article Text |
id | pubmed-8236193 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8236193 2021-06-28 A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data Zhang, Miao; Liu, Yiwen; Zhou, Hua; Watkins, Joseph; Zhou, Jin BMC Bioinformatics Methodology Article BACKGROUND: Low-depth sequencing allows researchers to increase sample size at the expense of lower accuracy. To incorporate uncertainties while maintaining statistical power, we introduce MCPCA_PopGen to analyze population structure of low-depth sequencing data. RESULTS: The method optimizes the choice of nonlinear transformations of dosages to maximize the Ky Fan norm of the covariance matrix. The transformation incorporates the uncertainty in calling between heterozygotes and the common homozygotes for loci having a rare allele and is more linear when both variants are common. CONCLUSIONS: We apply MCPCA_PopGen to samples from two indigenous Siberian populations and reveal hidden population structure accurately using only a single chromosome. The MCPCA_PopGen package is available at https://github.com/yiwenstat/MCPCA_PopGen. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04265-7. BioMed Central 2021-06-26 /pmc/articles/PMC8236193/ /pubmed/34174829 http://dx.doi.org/10.1186/s12859-021-04265-7 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236193/ https://www.ncbi.nlm.nih.gov/pubmed/34174829 http://dx.doi.org/10.1186/s12859-021-04265-7 |