Profiling the genome-wide landscape of tandem repeat expansions

Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington’s Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome...

Bibliographic Details
Main authors: Mousavi, Nima; Shleizer-Burko, Sharona; Yanicky, Richard; Gymrek, Melissa
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735967/
https://www.ncbi.nlm.nih.gov/pubmed/31194863
http://dx.doi.org/10.1093/nar/gkz501
author Mousavi, Nima
Shleizer-Burko, Sharona
Yanicky, Richard
Gymrek, Melissa
collection PubMed
description Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington’s Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS.
format Online
Article
Text
id pubmed-6735967
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6735967 2019-09-16 Profiling the genome-wide landscape of tandem repeat expansions Mousavi, Nima; Shleizer-Burko, Sharona; Yanicky, Richard; Gymrek, Melissa Nucleic Acids Res Methods Online Oxford University Press 2019-09-05 2019-06-13 /pmc/articles/PMC6735967/ /pubmed/31194863 http://dx.doi.org/10.1093/nar/gkz501 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Profiling the genome-wide landscape of tandem repeat expansions
topic Methods Online