Integration and Fixation Preferences of Human and Mouse Endogenous Retroviruses Uncovered with Functional Data Analysis

Endogenous retroviruses (ERVs), the remnants of retroviral infections in the germ line, occupy ~8% and ~10% of the human and mouse genomes, respectively, and affect their structure, evolution, and function. Yet we still have a limited understanding of how the genomic landscape influences integration...

Bibliographic Details
Main Authors: Campos-Sánchez, Rebeca, Cremona, Marzia A., Pini, Alessia, Chiaromonte, Francesca, Makova, Kateryna D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4911145/
https://www.ncbi.nlm.nih.gov/pubmed/27309962
http://dx.doi.org/10.1371/journal.pcbi.1004956
_version_ 1782438094078738432
author Campos-Sánchez, Rebeca
Cremona, Marzia A.
Pini, Alessia
Chiaromonte, Francesca
Makova, Kateryna D.
author_sort Campos-Sánchez, Rebeca
collection PubMed
description Endogenous retroviruses (ERVs), the remnants of retroviral infections in the germ line, occupy ~8% and ~10% of the human and mouse genomes, respectively, and affect their structure, evolution, and function. Yet we still have a limited understanding of how the genomic landscape influences integration and fixation of ERVs. Here we conducted a genome-wide study of the most recently active ERVs in the human and mouse genome. We investigated 826 fixed and 1,065 in vitro HERV-Ks in human, and 1,624 fixed and 242 polymorphic ETns, as well as 3,964 fixed and 1,986 polymorphic IAPs, in mouse. We quantitated >40 human and mouse genomic features (e.g., non-B DNA structure, recombination rates, and histone modifications) in ±32 kb of these ERVs’ integration sites and in control regions, and analyzed them using Functional Data Analysis (FDA) methodology. In one of the first applications of FDA in genomics, we identified genomic scales and locations at which these features display their influence, and how they work in concert, to provide signals essential for integration and fixation of ERVs. The investigation of ERVs of different evolutionary ages (young in vitro and polymorphic ERVs, older fixed ERVs) allowed us to disentangle integration vs. fixation preferences. As a result of these analyses, we built a comprehensive model explaining the uneven distribution of ERVs along the genome. We found that ERVs integrate in late-replicating AT-rich regions with abundant microsatellites, mirror repeats, and repressive histone marks. Regions favoring fixation are depleted of genes and evolutionarily conserved elements, and have low recombination rates, reflecting the effects of purifying selection and ectopic recombination removing ERVs from the genome. In addition to providing these biological insights, our study demonstrates the power of exploiting multiple scales and localization with FDA. These powerful techniques are expected to be applicable to many other genomic investigations.
format Online
Article
Text
id pubmed-4911145
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4911145 2016-07-06 PLoS Comput Biol Research Article Public Library of Science 2016-06-16 /pmc/articles/PMC4911145/ /pubmed/27309962 http://dx.doi.org/10.1371/journal.pcbi.1004956 Text en © 2016 Campos-Sánchez et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Integration and Fixation Preferences of Human and Mouse Endogenous Retroviruses Uncovered with Functional Data Analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4911145/
https://www.ncbi.nlm.nih.gov/pubmed/27309962
http://dx.doi.org/10.1371/journal.pcbi.1004956