
Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome

BACKGROUND: Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of L...


Bibliographic Details
Main Authors: Alonso, Graciela, Rastrojo, Alberto, López-Pérez, Sara, Requena, Jose M., Aguado, Begoña
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4746890/
https://www.ncbi.nlm.nih.gov/pubmed/26857920
http://dx.doi.org/10.1186/s13071-016-1329-4
_version_ 1782414887229587456
author Alonso, Graciela
Rastrojo, Alberto
López-Pérez, Sara
Requena, Jose M.
Aguado, Begoña
author_facet Alonso, Graciela
Rastrojo, Alberto
López-Pérez, Sara
Requena, Jose M.
Aguado, Begoña
author_sort Alonso, Graciela
collection PubMed
description BACKGROUND: Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of Leishmania major, and subsequently the genome of other related species, was paramount for highlighting these peculiar molecular aspects. Recently, we carried out an analysis of gene expression by massive sequencing of RNA in the L. major promastigote, and data derived from that analysis were suggestive of possible errors in the current genome assembly for this Leishmania species. RESULTS: During the analysis by RNA-Seq of the transcriptome for L. major Friedlin strain, 163,714 reads could not be aligned with the reference genome. Thus, de novo assembly with these reads was carried out and the resulting contigs were further analyzed. After detailed homology searches using available databases, it was postulated that 15 contigs might correspond to genomic sequences lost during the initial genome assembly of the L. major Friedlin strain. This was experimentally confirmed by PCR amplification, cloning and sequencing of the new genomic regions. As a result, we have identified seven regions of the L. major (Friedlin) genome that were lost during the sequence assembly. This led to the uncovering of six new genes (LmjF.15.1475, LmjF.15.0285, LmjF.24.0765, LmjF.14.0860, LmjF.19.0305, and LmjF.27.2035), and correction of the annotation for two others (LmjF.15.1480 and LmjF.27.2030). Our data suggest that these genomic regions probably collapsed during the genome assembly due to the existence of gene duplications and/or repeated regions surrounding the missed genes. CONCLUSION: RNA-seq data helped to reconstruct some genomic regions misassembled during the L. major Friedlin genome assembly, which is otherwise quite robust. 
On the other hand, this study shows that data derived from massive sequencing approaches, including RNA-Seq, should be carefully inspected to improve current genome definition and gene annotations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13071-016-1329-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4746890
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4746890 2016-02-10 Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome Alonso, Graciela Rastrojo, Alberto López-Pérez, Sara Requena, Jose M. Aguado, Begoña Parasit Vectors Research BACKGROUND: Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of Leishmania major, and subsequently the genome of other related species, was paramount for highlighting these peculiar molecular aspects. Recently, we carried out an analysis of gene expression by massive sequencing of RNA in the L. major promastigote, and data derived from that analysis were suggestive of possible errors in the current genome assembly for this Leishmania species. RESULTS: During the analysis by RNA-Seq of the transcriptome for L. major Friedlin strain, 163,714 reads could not be aligned with the reference genome. Thus, de novo assembly with these reads was carried out and the resulting contigs were further analyzed. After detailed homology searches using available databases, it was postulated that 15 contigs might correspond to genomic sequences lost during the initial genome assembly of the L. major Friedlin strain. This was experimentally confirmed by PCR amplification, cloning and sequencing of the new genomic regions. As a result, we have identified seven regions of the L. major (Friedlin) genome that were lost during the sequence assembly. This led to the uncovering of six new genes (LmjF.15.1475, LmjF.15.0285, LmjF.24.0765, LmjF.14.0860, LmjF.19.0305, and LmjF.27.2035), and correction of the annotation for two others (LmjF.15.1480 and LmjF.27.2030). 
Our data suggest that these genomic regions probably collapsed during the genome assembly due to the existence of gene duplications and/or repeated regions surrounding the missed genes. CONCLUSION: RNA-seq data helped to reconstruct some genomic regions misassembled during the L. major Friedlin genome assembly, which is otherwise quite robust. On the other hand, this study shows that data derived from massive sequencing approaches, including RNA-Seq, should be carefully inspected to improve current genome definition and gene annotations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13071-016-1329-4) contains supplementary material, which is available to authorized users. BioMed Central 2016-02-08 /pmc/articles/PMC4746890/ /pubmed/26857920 http://dx.doi.org/10.1186/s13071-016-1329-4 Text en © Alonso et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Alonso, Graciela
Rastrojo, Alberto
López-Pérez, Sara
Requena, Jose M.
Aguado, Begoña
Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome
title Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome
title_full Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome
title_fullStr Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome
title_full_unstemmed Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome
title_short Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome
title_sort resequencing and assembly of seven complex loci to improve the leishmania major (friedlin strain) reference genome
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4746890/
https://www.ncbi.nlm.nih.gov/pubmed/26857920
http://dx.doi.org/10.1186/s13071-016-1329-4
work_keys_str_mv AT alonsograciela resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome
AT rastrojoalberto resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome
AT lopezperezsara resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome
AT requenajosem resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome
AT aguadobegona resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome