Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing
The human genome contains hundreds of low-copy repeats (LCRs) that are challenging to analyze using short-read sequencing technologies due to extensive copy number variation and ambiguity in read mapping. Copy number and sequence variants in more than 150 duplicated genes that overlap LCRs have been...
Main authors: | Prodanov, Timofey; Bansal, Vikas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9184528/ https://www.ncbi.nlm.nih.gov/pubmed/35680869 http://dx.doi.org/10.1038/s41467-022-30930-3 |
_version_ | 1784724540401647616 |
---|---|
author | Prodanov, Timofey Bansal, Vikas |
author_facet | Prodanov, Timofey Bansal, Vikas |
author_sort | Prodanov, Timofey |
collection | PubMed |
description | The human genome contains hundreds of low-copy repeats (LCRs) that are challenging to analyze using short-read sequencing technologies due to extensive copy number variation and ambiguity in read mapping. Copy number and sequence variants in more than 150 duplicated genes that overlap LCRs have been implicated in monogenic and complex human diseases. We describe a computational tool, Parascopy, for estimating the aggregate and paralog-specific copy number of duplicated genes using whole-genome sequencing (WGS). Parascopy is an efficient method that jointly analyzes reads mapped to different repeat copies without the need for global realignment. It leverages multiple samples to mitigate sequencing bias and to identify reliable paralogous sequence variants (PSVs) that differentiate repeat copies. Analysis of WGS data for 2504 individuals from diverse populations showed that Parascopy is robust to sequencing bias, has higher accuracy compared to existing methods and enables prioritization of pathogenic copy number changes in duplicated genes. |
format | Online Article Text |
id | pubmed-9184528 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-91845282022-06-11 Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing Prodanov, Timofey Bansal, Vikas Nat Commun Article The human genome contains hundreds of low-copy repeats (LCRs) that are challenging to analyze using short-read sequencing technologies due to extensive copy number variation and ambiguity in read mapping. Copy number and sequence variants in more than 150 duplicated genes that overlap LCRs have been implicated in monogenic and complex human diseases. We describe a computational tool, Parascopy, for estimating the aggregate and paralog-specific copy number of duplicated genes using whole-genome sequencing (WGS). Parascopy is an efficient method that jointly analyzes reads mapped to different repeat copies without the need for global realignment. It leverages multiple samples to mitigate sequencing bias and to identify reliable paralogous sequence variants (PSVs) that differentiate repeat copies. Analysis of WGS data for 2504 individuals from diverse populations showed that Parascopy is robust to sequencing bias, has higher accuracy compared to existing methods and enables prioritization of pathogenic copy number changes in duplicated genes. Nature Publishing Group UK 2022-06-09 /pmc/articles/PMC9184528/ /pubmed/35680869 http://dx.doi.org/10.1038/s41467-022-30930-3 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Prodanov, Timofey Bansal, Vikas Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing |
title | Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9184528/ https://www.ncbi.nlm.nih.gov/pubmed/35680869 http://dx.doi.org/10.1038/s41467-022-30930-3 |