Detecting tandem repeat variants in coding regions using code-adVNTR
The human genome contains more than one million tandem repeats (TRs), DNA sequences containing multiple approximate copies of a motif repeated contiguously. TRs account for significant genetic variation, with 50+ diseases attributed to changes in motif number. A few diseases have been shown to be caused...
Main authors: | Park, Jonghun; Bakhtiari, Mehrdad; Popp, Bernt; Wiesener, Michael; Bafna, Vineet |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9379575/ https://www.ncbi.nlm.nih.gov/pubmed/35982790 http://dx.doi.org/10.1016/j.isci.2022.104785 |
_version_ | 1784768702923669504 |
---|---|
author | Park, Jonghun Bakhtiari, Mehrdad Popp, Bernt Wiesener, Michael Bafna, Vineet |
author_facet | Park, Jonghun Bakhtiari, Mehrdad Popp, Bernt Wiesener, Michael Bafna, Vineet |
author_sort | Park, Jonghun |
collection | PubMed |
description | The human genome contains more than one million tandem repeats (TRs), DNA sequences containing multiple approximate copies of a motif repeated contiguously. TRs account for significant genetic variation, with 50+ diseases attributed to changes in motif number. A few diseases have been shown to be caused by small indels in variable number tandem repeats (VNTRs), including poly-cystic kidney disease type 1 (MCKD1) and monogenic type 1 diabetes. However, small indels in VNTRs are largely unexplored, mainly due to the long and complex structure of VNTRs with multiple motifs. We developed a method, code-adVNTR, that utilizes multi-motif hidden Markov models to detect both motif count variation and small indels within VNTRs. In simulated data, code-adVNTR outperformed GATK-HaplotypeCaller in calling small indels within large VNTRs. We used code-adVNTR to characterize coding VNTRs in the 1000 Genomes data, identifying many population-specific variants, and to reliably call MUC1 mutations for MCKD1. |
format | Online Article Text |
id | pubmed-9379575 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-9379575 2022-08-17 Detecting tandem repeat variants in coding regions using code-adVNTR Park, Jonghun Bakhtiari, Mehrdad Popp, Bernt Wiesener, Michael Bafna, Vineet iScience Article The human genome contains more than one million tandem repeats (TRs), DNA sequences containing multiple approximate copies of a motif repeated contiguously. TRs account for significant genetic variation, with 50+ diseases attributed to changes in motif number. A few diseases have been shown to be caused by small indels in variable number tandem repeats (VNTRs), including poly-cystic kidney disease type 1 (MCKD1) and monogenic type 1 diabetes. However, small indels in VNTRs are largely unexplored, mainly due to the long and complex structure of VNTRs with multiple motifs. We developed a method, code-adVNTR, that utilizes multi-motif hidden Markov models to detect both motif count variation and small indels within VNTRs. In simulated data, code-adVNTR outperformed GATK-HaplotypeCaller in calling small indels within large VNTRs. We used code-adVNTR to characterize coding VNTRs in the 1000 Genomes data, identifying many population-specific variants, and to reliably call MUC1 mutations for MCKD1. Elsevier 2022-07-19 /pmc/articles/PMC9379575/ /pubmed/35982790 http://dx.doi.org/10.1016/j.isci.2022.104785 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Park, Jonghun Bakhtiari, Mehrdad Popp, Bernt Wiesener, Michael Bafna, Vineet Detecting tandem repeat variants in coding regions using code-adVNTR |
title | Detecting tandem repeat variants in coding regions using code-adVNTR |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9379575/ https://www.ncbi.nlm.nih.gov/pubmed/35982790 http://dx.doi.org/10.1016/j.isci.2022.104785 |