A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies

Polymorphic Tandem Repeats (PTR) are a common form of polymorphism in the human genome. A PTR is a variation, found in an individual or in a population, in the number of repeating units at a Tandem Repeat (TR) locus relative to the reference genome. Several phenotypic traits...

Bibliographic Details
Main authors: Genovese, Loredana M., Geraci, Filippo, Corrado, Lucia, Mangano, Eleonora, D'Aurizio, Romina, Bordoni, Roberta, Severgnini, Marco, Manzini, Giovanni, De Bellis, Gianluca, D'Alfonso, Sandra, Pellegrini, Marco
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5941971/
https://www.ncbi.nlm.nih.gov/pubmed/29770143
http://dx.doi.org/10.3389/fgene.2018.00155
author Genovese, Loredana M.
Geraci, Filippo
Corrado, Lucia
Mangano, Eleonora
D'Aurizio, Romina
Bordoni, Roberta
Severgnini, Marco
Manzini, Giovanni
De Bellis, Gianluca
D'Alfonso, Sandra
Pellegrini, Marco
collection PubMed
description Polymorphic Tandem Repeats (PTR) are a common form of polymorphism in the human genome. A PTR is a variation, found in an individual or in a population, in the number of repeating units at a Tandem Repeat (TR) locus relative to the reference genome. Several phenotypic traits and diseases have been found to be strongly associated with, or caused by, specific PTR loci. PTR are divided into two main classes: Short Tandem Repeats (STR), whose repeating unit is up to 6 base pairs long, and Variable Number Tandem Repeats (VNTR), whose repeating unit is longer than 6 base pairs. As ever larger populations are screened in high-throughput sequencing projects, it becomes technically feasible and desirable to explore the association between PTR and a panoply of such traits and conditions. To facilitate these studies, we have devised a method for compiling catalogs of PTR from assembled genomes, and we have produced a catalog of PTR for genic regions (exons, introns, UTR and adjacent regions) of the human genome (GRCh38). In the first phase, we applied four different TR discovery software tools to uncover 55,223,485 TR (after duplicate removal) in GRCh38; in the second phase, 373,173 of these were determined to be PTR by comparison with five assembled human genomes. Of these, 263,266 are not included in state-of-the-art PTR catalogs. The new methodology is based mainly on a hierarchical and systematic application of alignment-based sequence comparisons to identify and measure the polymorphism of TR. While previous catalogs focus on the class of STR of small total size, we remove any size restrictions, aiming at the more general class of PTR, and we also target fuzzy TR by using specific detection tools. Like previous catalogs of human polymorphic loci, we focus our catalog on applications in the discovery of disease-associated loci. Validation by cross-referencing with existing catalogs on common clinically relevant loci shows good concordance. Overall, this proposed census of human PTR in genic regions is a shared, web-accessible resource, complementary to existing catalogs, that facilitates future genome-wide studies involving PTR.
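The two definitions in the description (STR versus VNTR by unit length, and polymorphism as a change in copy number relative to the reference) can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract says relies on alignment-based sequence comparisons; the `TandemRepeat` record, its field names, and the use of integer copy numbers are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

# From the abstract: STR units are up to 6 bp long; VNTR units are longer.
STR_MAX_UNIT_BP = 6


@dataclass(frozen=True)
class TandemRepeat:
    """Hypothetical record for one TR locus (names are illustrative only)."""
    unit: str              # repeating unit, e.g. "CAG"
    reference_copies: int  # copy number in the reference genome (assumed integer)


def repeat_class(tr: TandemRepeat) -> str:
    """Return "STR" if the unit is at most 6 bp long, otherwise "VNTR"."""
    return "STR" if len(tr.unit) <= STR_MAX_UNIT_BP else "VNTR"


def is_polymorphic(tr: TandemRepeat, observed_copies: Sequence[int]) -> bool:
    """Return True if any compared genome differs in copy number from the reference."""
    return any(copies != tr.reference_copies for copies in observed_copies)
```

For example, a locus with unit "CAG" and 20 reference copies would be classed as an STR, and would count as polymorphic here if any of the compared assemblies carried a copy number other than 20. Fuzzy repeats with imperfect or fractional copies are outside what this simplification handles.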
format Online
Article
Text
id pubmed-5941971
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5941971 2018-05-16 Front Genet Genetics Frontiers Media S.A. 2018-05-02 /pmc/articles/PMC5941971/ /pubmed/29770143 http://dx.doi.org/10.3389/fgene.2018.00155 Text en Copyright © 2018 Genovese, Geraci, Corrado, Mangano, D'Aurizio, Bordoni, Severgnini, Manzini, De Bellis, D'Alfonso and Pellegrini. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5941971/
https://www.ncbi.nlm.nih.gov/pubmed/29770143
http://dx.doi.org/10.3389/fgene.2018.00155