Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome
BACKGROUND: Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of L...
Main authors: | Alonso, Graciela; Rastrojo, Alberto; López-Pérez, Sara; Requena, Jose M.; Aguado, Begoña |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4746890/ https://www.ncbi.nlm.nih.gov/pubmed/26857920 http://dx.doi.org/10.1186/s13071-016-1329-4 |
_version_ | 1782414887229587456 |
---|---|
author | Alonso, Graciela Rastrojo, Alberto López-Pérez, Sara Requena, Jose M. Aguado, Begoña |
author_facet | Alonso, Graciela Rastrojo, Alberto López-Pérez, Sara Requena, Jose M. Aguado, Begoña |
author_sort | Alonso, Graciela |
collection | PubMed |
description | BACKGROUND: Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of Leishmania major, and subsequently the genome of other related species, was paramount for highlighting these peculiar molecular aspects. Recently, we carried out an analysis of gene expression by massive sequencing of RNA in the L. major promastigote, and data derived from that analysis were suggestive of possible errors in the current genome assembly for this Leishmania species. RESULTS: During the analysis by RNA-Seq of the transcriptome for L. major Friedlin strain, 163,714 reads could not be aligned with the reference genome. Thus, de novo assembly with these reads was carried out and the resulting contigs were further analyzed. After detailed homology searches using available databases, it was postulated that 15 contigs might correspond to genomic sequences lost during the initial genome assembly of the L. major Friedlin strain. This was experimentally confirmed by PCR amplification, cloning and sequencing of the new genomic regions. As a result, we have identified seven regions of the L. major (Friedlin) genome that were lost during the sequence assembly. This led to the uncovering of six new genes (LmjF.15.1475, LmjF.15.0285, LmjF.24.0765, LmjF.14.0860, LmjF.19.0305, and LmjF.27.2035), and correction of the annotation for two others (LmjF.15.1480 and LmjF.27.2030). Our data suggest that these genomic regions probably collapsed during the genome assembly due to the existence of gene duplications and/or repeated regions surrounding the missed genes. CONCLUSION: RNA-seq data helped to reconstruct some genomic regions misassembled during the L. major Friedlin genome assembly, which is otherwise quite robust. 
On the other hand, this study shows that data derived from massive sequencing approaches, including RNA-Seq, should be carefully inspected to improve current genome definition and gene annotations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13071-016-1329-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4746890 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-47468902016-02-10 Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome Alonso, Graciela Rastrojo, Alberto López-Pérez, Sara Requena, Jose M. Aguado, Begoña Parasit Vectors Research BACKGROUND: Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of Leishmania major, and subsequently the genome of other related species, was paramount for highlighting these peculiar molecular aspects. Recently, we carried out an analysis of gene expression by massive sequencing of RNA in the L. major promastigote, and data derived from that analysis were suggestive of possible errors in the current genome assembly for this Leishmania species. RESULTS: During the analysis by RNA-Seq of the transcriptome for L. major Friedlin strain, 163,714 reads could not be aligned with the reference genome. Thus, de novo assembly with these reads was carried out and the resulting contigs were further analyzed. After detailed homology searches using available databases, it was postulated that 15 contigs might correspond to genomic sequences lost during the initial genome assembly of the L. major Friedlin strain. This was experimentally confirmed by PCR amplification, cloning and sequencing of the new genomic regions. As a result, we have identified seven regions of the L. major (Friedlin) genome that were lost during the sequence assembly. This led to the uncovering of six new genes (LmjF.15.1475, LmjF.15.0285, LmjF.24.0765, LmjF.14.0860, LmjF.19.0305, and LmjF.27.2035), and correction of the annotation for two others (LmjF.15.1480 and LmjF.27.2030). 
Our data suggest that these genomic regions probably collapsed during the genome assembly due to the existence of gene duplications and/or repeated regions surrounding the missed genes. CONCLUSION: RNA-seq data helped to reconstruct some genomic regions misassembled during the L. major Friedlin genome assembly, which is otherwise quite robust. On the other hand, this study shows that data derived from massive sequencing approaches, including RNA-Seq, should be carefully inspected to improve current genome definition and gene annotations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13071-016-1329-4) contains supplementary material, which is available to authorized users. BioMed Central 2016-02-08 /pmc/articles/PMC4746890/ /pubmed/26857920 http://dx.doi.org/10.1186/s13071-016-1329-4 Text en © Alonso et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Alonso, Graciela Rastrojo, Alberto López-Pérez, Sara Requena, Jose M. Aguado, Begoña Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome |
title | Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome |
title_full | Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome |
title_fullStr | Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome |
title_full_unstemmed | Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome |
title_short | Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome |
title_sort | resequencing and assembly of seven complex loci to improve the leishmania major (friedlin strain) reference genome |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4746890/ https://www.ncbi.nlm.nih.gov/pubmed/26857920 http://dx.doi.org/10.1186/s13071-016-1329-4 |
work_keys_str_mv | AT alonsograciela resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome AT rastrojoalberto resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome AT lopezperezsara resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome AT requenajosem resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome AT aguadobegona resequencingandassemblyofsevencomplexlocitoimprovetheleishmaniamajorfriedlinstrainreferencegenome |