Cargando…

Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae)

Endogenous viral elements (EVEs) are viral sequences that have been integrated into the nuclear chromosomes. Endogenous pararetrovirus (EPRV) are a class of EVEs derived from DNA viruses of the family Caulimoviridae. Previous works based on a limited number of genome assemblies demonstrated that EPR...

Descripción completa

Detalles Bibliográficos
Autores principales: de Tomás, Carlos, Vicient, Carlos M.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794742/
https://www.ncbi.nlm.nih.gov/pubmed/36589050
http://dx.doi.org/10.3389/fpls.2022.1011565
_version_ 1784860097526104064
author de Tomás, Carlos
Vicient, Carlos M.
author_facet de Tomás, Carlos
Vicient, Carlos M.
author_sort de Tomás, Carlos
collection PubMed
description Endogenous viral elements (EVEs) are viral sequences that have been integrated into the nuclear chromosomes. Endogenous pararetrovirus (EPRV) are a class of EVEs derived from DNA viruses of the family Caulimoviridae. Previous works based on a limited number of genome assemblies demonstrated that EPRVs are abundant in plants and are present in several species. The availability of genome sequences has been immensely increased in the recent years and we took advantage of these resources to have a more extensive view of the presence of EPRVs in plant genomes. We analyzed 278 genome assemblies corresponding to 267 species (254 from Viridiplantae) using tBLASTn against a collection of conserved domains of the Reverse Transcriptases (RT) of Caulimoviridae. We concentrated our search on complete and well-conserved RT domains with an uninterrupted ORF comprising the genetic information for at least 300 amino acids. We obtained 11.527 sequences from the genomes of 202 species spanning the whole Tracheophyta clade. These elements were grouped in 57 clusters and classified in 13 genera, including a newly proposed genus we called Wendovirus. Wendoviruses are characterized by the presence of four open reading frames and two of them encode for aspartic proteinases. Comparing plant genomes, we observed important differences between the plant families and genera in the number and type of EPRVs found. In general, florendoviruses are the most abundant and widely distributed EPRVs. The presence of multiple identical RT domain sequences in some of the genomes suggests their recent amplification.
format Online
Article
Text
id pubmed-9794742
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-97947422022-12-29 Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae) de Tomás, Carlos Vicient, Carlos M. Front Plant Sci Plant Science Endogenous viral elements (EVEs) are viral sequences that have been integrated into the nuclear chromosomes. Endogenous pararetrovirus (EPRV) are a class of EVEs derived from DNA viruses of the family Caulimoviridae. Previous works based on a limited number of genome assemblies demonstrated that EPRVs are abundant in plants and are present in several species. The availability of genome sequences has been immensely increased in the recent years and we took advantage of these resources to have a more extensive view of the presence of EPRVs in plant genomes. We analyzed 278 genome assemblies corresponding to 267 species (254 from Viridiplantae) using tBLASTn against a collection of conserved domains of the Reverse Transcriptases (RT) of Caulimoviridae. We concentrated our search on complete and well-conserved RT domains with an uninterrupted ORF comprising the genetic information for at least 300 amino acids. We obtained 11.527 sequences from the genomes of 202 species spanning the whole Tracheophyta clade. These elements were grouped in 57 clusters and classified in 13 genera, including a newly proposed genus we called Wendovirus. Wendoviruses are characterized by the presence of four open reading frames and two of them encode for aspartic proteinases. Comparing plant genomes, we observed important differences between the plant families and genera in the number and type of EPRVs found. In general, florendoviruses are the most abundant and widely distributed EPRVs. The presence of multiple identical RT domain sequences in some of the genomes suggests their recent amplification. Frontiers Media S.A. 2022-12-14 /pmc/articles/PMC9794742/ /pubmed/36589050 http://dx.doi.org/10.3389/fpls.2022.1011565 Text en Copyright © 2022 de Tomás and Vicient https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
de Tomás, Carlos
Vicient, Carlos M.
Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae)
title Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae)
title_full Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae)
title_fullStr Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae)
title_full_unstemmed Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae)
title_short Genome-wide identification of Reverse Transcriptase domains of recently inserted endogenous plant pararetrovirus (Caulimoviridae)
title_sort genome-wide identification of reverse transcriptase domains of recently inserted endogenous plant pararetrovirus (caulimoviridae)
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794742/
https://www.ncbi.nlm.nih.gov/pubmed/36589050
http://dx.doi.org/10.3389/fpls.2022.1011565
work_keys_str_mv AT detomascarlos genomewideidentificationofreversetranscriptasedomainsofrecentlyinsertedendogenousplantpararetroviruscaulimoviridae
AT vicientcarlosm genomewideidentificationofreversetranscriptasedomainsofrecentlyinsertedendogenousplantpararetroviruscaulimoviridae