A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies
Polymorphic Tandem Repeat (PTR) is a common form of polymorphism in the human genome. A PTR is a variation, found in an individual (or in a population), in the number of repeating units of a Tandem Repeat (TR) locus of the genome with respect to the reference genome. Several phenotypic traits...
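The definition above can be made concrete with a minimal sketch: count the back-to-back copies of a repeat unit in a reference allele and in a sample allele, then report the difference. This is an illustration only and not the alignment-based method of the paper; it assumes exact (non-fuzzy) repeats, the function names and the toy sequences are hypothetical, and the 6 bp STR/VNTR threshold is taken from the full description in this record.

```python
def count_tandem_copies(sequence: str, unit: str, start: int = 0) -> int:
    """Count exact back-to-back copies of `unit` beginning at `start`."""
    if not unit:
        raise ValueError("unit must be non-empty")
    copies = 0
    pos = start
    # Advance one unit at a time while the next unit-sized window matches.
    while sequence.startswith(unit, pos):
        copies += 1
        pos += len(unit)
    return copies


def classify_repeat(unit: str) -> str:
    """STR if the unit is at most 6 bp long, VNTR otherwise."""
    return "STR" if len(unit) <= 6 else "VNTR"


def copy_number_difference(reference: str, sample: str, unit: str) -> int:
    """Sample copies minus reference copies; nonzero suggests a polymorphism."""
    return count_tandem_copies(sample, unit) - count_tandem_copies(reference, unit)


# Toy alleles (hypothetical): 5 copies of CAG in the reference, 8 in the sample.
reference_allele = "CAG" * 5 + "TTGA"
sample_allele = "CAG" * 8 + "TTGA"

diff = copy_number_difference(reference_allele, sample_allele, "CAG")
print(classify_repeat("CAG"), diff)  # prints "STR 3"
```

Exact matching keeps the example short; real loci often contain interrupted or fuzzy repeats, which is why dedicated TR detection tools and alignment-based comparison are needed in practice.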
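Main Authors: | Genovese, Loredana M., Geraci, Filippo, Corrado, Lucia, Mangano, Eleonora, D'Aurizio, Romina, Bordoni, Roberta, Severgnini, Marco, Manzini, Giovanni, De Bellis, Gianluca, D'Alfonso, Sandra, Pellegrini, Marco |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5941971/ https://www.ncbi.nlm.nih.gov/pubmed/29770143 http://dx.doi.org/10.3389/fgene.2018.00155 |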
_version_ | 1783321387808587776 |
---|---|
author | Genovese, Loredana M. Geraci, Filippo Corrado, Lucia Mangano, Eleonora D'Aurizio, Romina Bordoni, Roberta Severgnini, Marco Manzini, Giovanni De Bellis, Gianluca D'Alfonso, Sandra Pellegrini, Marco |
author_facet | Genovese, Loredana M. Geraci, Filippo Corrado, Lucia Mangano, Eleonora D'Aurizio, Romina Bordoni, Roberta Severgnini, Marco Manzini, Giovanni De Bellis, Gianluca D'Alfonso, Sandra Pellegrini, Marco |
author_sort | Genovese, Loredana M. |
collection | PubMed |
description | Polymorphic Tandem Repeat (PTR) is a common form of polymorphism in the human genome. A PTR is a variation, found in an individual (or in a population), in the number of repeating units of a Tandem Repeat (TR) locus of the genome with respect to the reference genome. Several phenotypic traits and diseases have been discovered to be strongly associated with or caused by specific PTR loci. PTR are further divided into two main classes: Short Tandem Repeats (STR), when the repeating unit is up to 6 base pairs long, and Variable Number Tandem Repeats (VNTR), for repeating units longer than 6 base pairs. As larger and larger populations are screened via high-throughput sequencing projects, it becomes technically feasible and desirable to explore the association between PTR and a panoply of such traits and conditions. In order to facilitate these studies, we have devised a method for compiling catalogs of PTR from assembled genomes, and we have produced a catalog of PTR for genic regions (exons, introns, UTR and adjacent regions) of the human genome (GRCh38). We applied four different TR discovery software tools to uncover, in the first phase, 55,223,485 TR (after duplicate removal) in GRCh38, of which 373,173 were determined to be PTR in the second phase by comparison with five assembled human genomes. Of these, 263,266 are not included in state-of-the-art PTR catalogs. The new methodology is mainly based on a hierarchical and systematic application of alignment-based sequence comparisons to identify and measure the polymorphism of TR. While previous catalogs focus on the class of STR of small total size, we remove any size restrictions, aiming at the more general class of PTR, and we also target fuzzy TR by using specific detection tools. Like previous catalogs of human polymorphic loci, we orient our catalog toward applications in the discovery of disease-associated loci. Validation by cross-referencing with existing catalogs on common clinically-relevant loci shows good concordance. Overall, this proposed census of human PTR in genic regions is a shared resource (web accessible), complementary to existing catalogs, facilitating future genome-wide studies involving PTR. |
format | Online Article Text |
id | pubmed-5941971 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-59419712018-05-16 A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies Genovese, Loredana M. Geraci, Filippo Corrado, Lucia Mangano, Eleonora D'Aurizio, Romina Bordoni, Roberta Severgnini, Marco Manzini, Giovanni De Bellis, Gianluca D'Alfonso, Sandra Pellegrini, Marco Front Genet Genetics Polymorphic Tandem Repeat (PTR) is a common form of polymorphism in the human genome. A PTR consists in a variation found in an individual (or in a population) of the number of repeating units of a Tandem Repeat (TR) locus of the genome with respect to the reference genome. Several phenotypic traits and diseases have been discovered to be strongly associated with or caused by specific PTR loci. PTR are further distinguished in two main classes: Short Tandem Repeats (STR) when the repeating unit has size up to 6 base pairs, and Variable Number Tandem Repeats (VNTR) for repeating units of size above 6 base pairs. As larger and larger populations are screened via high throughput sequencing projects, it becomes technically feasible and desirable to explore the association between PTR and a panoply of such traits and conditions. In order to facilitate these studies, we have devised a method for compiling catalogs of PTR from assembled genomes, and we have produced a catalog of PTR for genic regions (exons, introns, UTR and adjacent regions) of the human genome (GRCh38). We applied four different TR discovery software tools to uncover in the first phase 55,223,485 TR (after duplicate removal) in GRCh38, of which 373,173 were determined to be PTR in the second phase by comparison with five assembled human genomes. Of these, 263,266 are not included by state-of-the-art PTR catalogs. The new methodology is mainly based on a hierarchical and systematic application of alignment-based sequence comparisons to identify and measure the polymorphism of TR. 
While previous catalogs focus on the class of STR of small total size, we remove any size restrictions, aiming at the more general class of PTR, and we also target fuzzy TR by using specific detection tools. Similarly to other previous catalogs of human polymorphic loci, we focus our catalog toward applications in the discovery of disease-associated loci. Validation by cross-referencing with existing catalogs on common clinically-relevant loci shows good concordance. Overall, this proposed census of human PTR in genic regions is a shared resource (web accessible), complementary to existing catalogs, facilitating future genome-wide studies involving PTR. Frontiers Media S.A. 2018-05-02 /pmc/articles/PMC5941971/ /pubmed/29770143 http://dx.doi.org/10.3389/fgene.2018.00155 Text en Copyright © 2018 Genovese, Geraci, Corrado, Mangano, D'Aurizio, Bordoni, Severgnini, Manzini, De Bellis, D'Alfonso and Pellegrini. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Genovese, Loredana M. Geraci, Filippo Corrado, Lucia Mangano, Eleonora D'Aurizio, Romina Bordoni, Roberta Severgnini, Marco Manzini, Giovanni De Bellis, Gianluca D'Alfonso, Sandra Pellegrini, Marco A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies |
title | A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies |
title_sort | census of tandemly repeated polymorphic loci in genic regions through the comparative integration of human genome assemblies |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5941971/ https://www.ncbi.nlm.nih.gov/pubmed/29770143 http://dx.doi.org/10.3389/fgene.2018.00155 |