Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing

Motivation: Long expansions of short tandem repeats (STRs), i.e. DNA repeats of 2–6 nt, are associated with some genetic diseases. Cost-efficient high-throughput sequencing can quickly produce billions of short reads that would be useful for uncovering disease-associated STRs. However, enumerating S...

Bibliographic Details
Main authors: Doi, Koichiro, Monjo, Taku, Hoang, Pham H., Yoshimura, Jun, Yurino, Hideaki, Mitsui, Jun, Ishiura, Hiroyuki, Takahashi, Yuji, Ichikawa, Yaeko, Goto, Jun, Tsuji, Shoji, Morishita, Shinichi
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3957077/
https://www.ncbi.nlm.nih.gov/pubmed/24215022
http://dx.doi.org/10.1093/bioinformatics/btt647
collection PubMed
description Motivation: Long expansions of short tandem repeats (STRs), i.e. DNA repeats of 2–6 nt, are associated with some genetic diseases. Cost-efficient high-throughput sequencing can quickly produce billions of short reads that would be useful for uncovering disease-associated STRs. However, enumerating STRs in short reads remains largely unexplored because of the difficulty in elucidating STRs much longer than 100 bp, the typical length of short reads.
Results: We propose ab initio procedures for sensing and locating long STRs promptly by using the frequency distribution of all STRs and paired-end read information. We validated the reproducibility of this method using biological replicates and used it to locate an STR associated with a brain disease (SCA31). Subsequently, we sequenced this STR site in 11 SCA31 samples using SMRT™ sequencing (Pacific Biosciences), determined 2.3–3.1 kb sequences at nucleotide resolution and revealed that (TGGAA)- and (TAAAATAGAA)-repeat expansions determined the instability of the repeat expansions associated with SCA31. Our method could also identify common STRs, (AAAG)- and (AAAAG)-repeat expansions, which are remarkably expanded at four positions in an SCA31 sample. This is the first proposed method for rapidly finding disease-associated long STRs in personal genomes using hybrid sequencing of short and long reads.
Availability and implementation: Our TRhist software is available at http://trhist.gi.k.u-tokyo.ac.jp/.
Contact: moris@cb.k.u-tokyo.ac.jp
Supplementary information: Supplementary data are available at Bioinformatics online.
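The abstract mentions building a frequency distribution of all STRs found in short reads. As a rough illustration of that idea only, the following Python sketch tallies tandem repeats with 2–6 nt units in a list of read strings. It is not the TRhist implementation: the function names (`canonical_unit`, `str_histogram`), the regular-expression approach, the minimum of three unit copies, and the merging of rotations and reverse complements into a single key are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumption: a repeat is a 2-6 nt unit followed by at least two more copies.
# The lazy quantifier makes the regex try the shortest candidate unit first.
STR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{2,}")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_unit(unit):
    """Smallest rotation of the unit or of its reverse complement.

    Reads come from either strand and can start anywhere inside a repeat,
    so TGGAA, GGAAT and TTCCA are all counted under the same key.
    """
    revcomp = unit.translate(COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i] for s in (unit, revcomp) for i in range(len(s))]
    return min(rotations)


def str_histogram(reads):
    """Count (canonical unit, repeat length in nt) pairs over all reads."""
    hist = Counter()
    for read in reads:
        for match in STR_PATTERN.finditer(read.upper()):
            unit = match.group(1)
            if len(set(unit)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hist[(canonical_unit(unit), len(match.group(0)))] += 1
    return hist


if __name__ == "__main__":
    # Hypothetical toy reads, not data from the paper.
    reads = ["CCC" + "TGGAA" * 4 + "GT", "TTCCA" * 3, "ACGTACGATC"]
    for (unit, length), count in sorted(str_histogram(reads).items()):
        print(unit, length, count)
```

In a histogram like this, a repeat unit with an unusually large number of reads occupied almost entirely by the repeat would be a candidate expansion. Locating it in the genome would still need the paired-end information described in the abstract, which this sketch does not attempt.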
id pubmed-3957077
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-3957077 2014-03-19 Bioinformatics Original Papers
Oxford University Press 2014-03-15 2013-11-08 /pmc/articles/PMC3957077/ /pubmed/24215022 http://dx.doi.org/10.1093/bioinformatics/btt647 Text en © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Original Papers