Profiling the genome-wide landscape of tandem repeat expansions
Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington’s Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome...
Main Authors: | Mousavi, Nima; Shleizer-Burko, Sharona; Yanicky, Richard; Gymrek, Melissa |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735967/ https://www.ncbi.nlm.nih.gov/pubmed/31194863 http://dx.doi.org/10.1093/nar/gkz501 |
_version_ | 1783450443554226176 |
---|---|
author | Mousavi, Nima; Shleizer-Burko, Sharona; Yanicky, Richard; Gymrek, Melissa |
author_facet | Mousavi, Nima; Shleizer-Burko, Sharona; Yanicky, Richard; Gymrek, Melissa |
author_sort | Mousavi, Nima |
collection | PubMed |
description | Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington’s Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS. |
format | Online Article Text |
id | pubmed-6735967 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6735967 2019-09-16 Profiling the genome-wide landscape of tandem repeat expansions Mousavi, Nima; Shleizer-Burko, Sharona; Yanicky, Richard; Gymrek, Melissa Nucleic Acids Res Methods Online Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington’s Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS. Oxford University Press 2019-09-05 2019-06-13 /pmc/articles/PMC6735967/ /pubmed/31194863 http://dx.doi.org/10.1093/nar/gkz501 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Mousavi, Nima; Shleizer-Burko, Sharona; Yanicky, Richard; Gymrek, Melissa Profiling the genome-wide landscape of tandem repeat expansions |
title | Profiling the genome-wide landscape of tandem repeat expansions |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735967/ https://www.ncbi.nlm.nih.gov/pubmed/31194863 http://dx.doi.org/10.1093/nar/gkz501 |