Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum
BACKGROUND: Plasmodium falciparum, the deadliest malaria-causing parasite, has an extremely AT-rich (80.7 %) genome. Because of high AT-content, sequence-based annotation of genes and functional elements remains challenging. In order to better understand the regulatory network controlling gene expre...
Main Authors: | Lu, Xueqing Maggie; Bunnik, Evelien M.; Pokhriyal, Neeti; Nasseri, Sara; Lonardi, Stefano; Le Roch, Karine G. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4658763/ https://www.ncbi.nlm.nih.gov/pubmed/26607328 http://dx.doi.org/10.1186/s12864-015-2214-9 |
_version_ | 1782402563726901248 |
---|---|
author | Lu, Xueqing Maggie; Bunnik, Evelien M.; Pokhriyal, Neeti; Nasseri, Sara; Lonardi, Stefano; Le Roch, Karine G. |
author_facet | Lu, Xueqing Maggie; Bunnik, Evelien M.; Pokhriyal, Neeti; Nasseri, Sara; Lonardi, Stefano; Le Roch, Karine G. |
author_sort | Lu, Xueqing Maggie |
collection | PubMed |
description | BACKGROUND: Plasmodium falciparum, the deadliest malaria-causing parasite, has an extremely AT-rich (80.7 %) genome. Because of high AT-content, sequence-based annotation of genes and functional elements remains challenging. In order to better understand the regulatory network controlling gene expression in the parasite, a more complete genome annotation as well as analysis tools adapted for AT-rich genomes are needed. Recent studies on genome-wide nucleosome positioning in eukaryotes have shown that nucleosome landscapes exhibit regular characteristic patterns at the 5’- and 3’-end of protein and non-protein coding genes. In addition, nucleosome depleted regions can be found near transcription start sites. These unique nucleosome landscape patterns may be exploited for the identification of novel genes. In this paper, we propose a computational approach to discover novel putative genes based exclusively on nucleosome positioning data in the AT-rich genome of P. falciparum. RESULTS: Using binary classifiers trained on nucleosome landscapes at the gene boundaries from two independent nucleosome positioning data sets, we were able to detect a total of 231 regions containing putative genes in the genome of Plasmodium falciparum, of which 67 highly confident genes were found in both data sets. Eighty-eight of these 231 newly predicted genes exhibited transcription signal in RNA-Seq data, indicative of active transcription. In addition, 20 out of 21 selected gene candidates were further validated by RT-PCR, and 28 out of the 231 genes showed significant matches using BLASTN against an expressed sequence tag (EST) database. Furthermore, 108 (47 %) out of the 231 putative novel genes overlapped with previously identified but unannotated long non-coding RNAs. Collectively, these results provide experimental validation for 163 predicted genes (70.6 %). Finally, 73 out of 231 genes were found to be potentially translated based on their signal in polysome-associated RNA-Seq representing transcripts that are actively being translated. CONCLUSION: Our results clearly indicate that nucleosome positioning data contains sufficient information for novel gene discovery. As distinct nucleosome landscapes around genes are found in many other eukaryotic organisms, this methodology could be used to characterize the transcriptome of any organism, especially when coupled with other DNA-based gene finding and experimental methods (e.g., RNA-Seq). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-2214-9) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4658763 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4658763 2015-11-26 Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum. Lu, Xueqing Maggie; Bunnik, Evelien M.; Pokhriyal, Neeti; Nasseri, Sara; Lonardi, Stefano; Le Roch, Karine G. BMC Genomics. Methodology Article. BioMed Central 2015-11-25 /pmc/articles/PMC4658763/ /pubmed/26607328 http://dx.doi.org/10.1186/s12864-015-2214-9 Text en © Lu et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article; Lu, Xueqing Maggie; Bunnik, Evelien M.; Pokhriyal, Neeti; Nasseri, Sara; Lonardi, Stefano; Le Roch, Karine G.; Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum |
title | Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum |
title_full | Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum |
title_fullStr | Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum |
title_full_unstemmed | Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum |
title_short | Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum |
title_sort | analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite plasmodium falciparum |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4658763/ https://www.ncbi.nlm.nih.gov/pubmed/26607328 http://dx.doi.org/10.1186/s12864-015-2214-9 |
work_keys_str_mv | AT luxueqingmaggie analysisofnucleosomepositioninglandscapesenablesgenediscoveryinthehumanmalariaparasiteplasmodiumfalciparum AT bunnikevelienm analysisofnucleosomepositioninglandscapesenablesgenediscoveryinthehumanmalariaparasiteplasmodiumfalciparum AT pokhriyalneeti analysisofnucleosomepositioninglandscapesenablesgenediscoveryinthehumanmalariaparasiteplasmodiumfalciparum AT nasserisara analysisofnucleosomepositioninglandscapesenablesgenediscoveryinthehumanmalariaparasiteplasmodiumfalciparum AT lonardistefano analysisofnucleosomepositioninglandscapesenablesgenediscoveryinthehumanmalariaparasiteplasmodiumfalciparum AT lerochkarineg analysisofnucleosomepositioninglandscapesenablesgenediscoveryinthehumanmalariaparasiteplasmodiumfalciparum |