Identification of conserved and polymorphic STRs for personal genomes

Bibliographic Details
Main authors: Chen, Chien-Ming, Sio, Chi-Pong, Lu, Yu-Lun, Chang, Hao-Teng, Hu, Chin-Hwa, Pai, Tun-Wen
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304208/
https://www.ncbi.nlm.nih.gov/pubmed/25560225
http://dx.doi.org/10.1186/1471-2164-15-S10-S3
collection PubMed
description BACKGROUND: Short tandem repeats (STRs) are abundant in human genomes. Numerous STRs have been shown to be associated with genetic diseases and gene regulatory functions, and have been selected as genetic markers for evolutionary and forensic analyses. High-throughput next-generation sequencers have fostered new cutting-edge computing techniques for genome-scale analyses, and cross-genome comparisons have facilitated the efficient identification of polymorphic STR markers for various applications.
RESULTS: An automated and efficient system for detecting human polymorphic STRs at the genome scale is proposed in this study. Assembled contigs from next-generation sequencing data were aligned and calibrated according to selected reference sequences. To verify identified polymorphic STRs, human genomes from the 1000 Genomes Project were employed for comprehensive analyses, and STR markers from the Combined DNA Index System (CODIS) and disease-related STR motifs were also applied as cases for evaluation. In addition, we analyzed STR variations for highly conserved homologous genes and human-unique genes. In total, 477 polymorphic STRs were identified from 492 human-unique genes, among which 26 STRs were retrieved and clustered into three different groups for efficient comparison.
CONCLUSIONS: We have developed an online system that efficiently identifies polymorphic STRs and provides novel distinguishable STR biomarkers for different levels of specificity. Candidate polymorphic STRs within a personal genome could be easily retrieved and compared to the constructed STR profile through query keywords, gene names, or assembled contigs.
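The abstract does not spell out the detection algorithm, so the following is only a minimal sketch of the general idea it describes: locate perfect tandem repeats in each sequence, then flag a motif as polymorphic when its copy number differs between samples. The function names (`find_strs`, `copies_by_motif`, `polymorphic_motifs`), the thresholds, and the toy sequences are all assumptions made for illustration. They are not taken from the authors' system, which additionally aligns and calibrates contigs against reference sequences.

```python
import re
from typing import NamedTuple


class STR(NamedTuple):
    motif: str    # repeat unit, e.g. "GATA"
    start: int    # 0-based offset of the first copy
    copies: int   # number of consecutive copies


def find_strs(seq, min_unit=2, max_unit=6, min_copies=3):
    """Find perfect tandem repeats (hypothetical helper, illustrative thresholds)."""
    seq = seq.upper()
    # Lazy {m,n}? tries the shortest unit first; \1{k,} requires k more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        hits.append(STR(motif, m.start(), len(m.group(0)) // len(motif)))
    return hits


def copies_by_motif(seq):
    """Map each motif to its largest copy count in one sequence."""
    best = {}
    for s in find_strs(seq):
        best[s.motif] = max(best.get(s.motif, 0), s.copies)
    return best


def polymorphic_motifs(samples):
    """Return motifs whose copy number differs across samples.

    `samples` maps a sample name to the sequence of the same locus.
    Assumption: the sequences are already comparable (same locus, same
    strand); no alignment or calibration is done here.
    """
    per_sample = {name: copies_by_motif(seq) for name, seq in samples.items()}
    motifs = set().union(*per_sample.values())
    result = {}
    for motif in motifs:
        counts = {name: c.get(motif, 0) for name, c in per_sample.items()}
        if len(set(counts.values())) > 1:
            result[motif] = counts
    return result


# Toy locus with an illustrative tetranucleotide motif (made-up sequences).
samples = {
    "a": "TTC" + "GATA" * 5 + "CCG",
    "b": "TTC" + "GATA" * 7 + "CCG",
}
print(polymorphic_motifs(samples))
```

This sketch only handles perfect repeats and treats rotated motifs (for example GATA and ATAG) as different units, so a real pipeline would need to normalize motifs and tolerate interruptions.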
format Online
Article
Text
id pubmed-4304208
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4304208 2015-02-09
Identification of conserved and polymorphic STRs for personal genomes
Chen, Chien-Ming; Sio, Chi-Pong; Lu, Yu-Lun; Chang, Hao-Teng; Hu, Chin-Hwa; Pai, Tun-Wen
BMC Genomics
Research
BioMed Central 2014-12-12
/pmc/articles/PMC4304208/
/pubmed/25560225
http://dx.doi.org/10.1186/1471-2164-15-S10-S3
Text en
Copyright © 2014 Chen et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research