EXFI: Exon and splice graph prediction without a reference genome
For population genetic studies in nonmodel organisms, it is important to use every single source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences using an assembled transcriptome and raw whole‐genome sequencing reads. The main algo...
Main authors: | Langa, Jorge; Estonba, Andone; Conklin, Darrell |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2020 |
Subjects: | Original Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7452765/ https://www.ncbi.nlm.nih.gov/pubmed/32884664 http://dx.doi.org/10.1002/ece3.6587 |
_version_ | 1783575222563110912 |
---|---|
author | Langa, Jorge Estonba, Andone Conklin, Darrell |
author_sort | Langa, Jorge |
collection | PubMed |
description | For population genetic studies in nonmodel organisms, it is important to use every single source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences using an assembled transcriptome and raw whole‐genome sequencing reads. The main algorithm uses Bloom filters to remove reads that are not part of the transcriptome, to predict the intron–exon boundaries, to then proceed to call exons from the assembly, and to generate the underlying splice graph. The results are returned in GFA1 format, which encodes both the predicted exon sequences and how they are connected to form transcripts. EXFI is written in Python, tested on Linux platforms, and the source code is available under the MIT License at https://github.com/jlanga/exfi. |
format | Online Article Text |
id | pubmed-7452765 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-74527652020-09-02 EXFI: Exon and splice graph prediction without a reference genome Langa, Jorge Estonba, Andone Conklin, Darrell Ecol Evol Original Research For population genetic studies in nonmodel organisms, it is important to use every single source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences using an assembled transcriptome and raw whole‐genome sequencing reads. The main algorithm uses Bloom filters to remove reads that are not part of the transcriptome, to predict the intron–exon boundaries, to then proceed to call exons from the assembly, and to generate the underlying splice graph. The results are returned in GFA1 format, which encodes both the predicted exon sequences and how they are connected to form transcripts. EXFI is written in Python, tested on Linux platforms, and the source code is available under the MIT License at https://github.com/jlanga/exfi. John Wiley and Sons Inc. 2020-07-28 /pmc/articles/PMC7452765/ /pubmed/32884664 http://dx.doi.org/10.1002/ece3.6587 Text en © 2020 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Research Langa, Jorge Estonba, Andone Conklin, Darrell EXFI: Exon and splice graph prediction without a reference genome |
title | EXFI: Exon and splice graph prediction without a reference genome |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7452765/ https://www.ncbi.nlm.nih.gov/pubmed/32884664 http://dx.doi.org/10.1002/ece3.6587 |
work_keys_str_mv | AT langajorge exfiexonandsplicegraphpredictionwithoutareferencegenome AT estonbaandone exfiexonandsplicegraphpredictionwithoutareferencegenome AT conklindarrell exfiexonandsplicegraphpredictionwithoutareferencegenome |
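
The description above outlines the idea only briefly, so the following is a minimal sketch of it and not the EXFI implementation. It assumes a toy Bloom filter built with `hashlib`, a tiny k-mer size, made-up sequences, and a simplified rule: a transcript k-mer that spans an exon junction is absent from the genomic reads, so each maximal run of present k-mers is called as one exon. The read-filtering step mentioned in the description is skipped, and all names (`BloomFilter`, `call_exons`, `to_gfa1`) are hypothetical.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may give false positives, never false negatives."""

    def __init__(self, n_bits=1 << 16, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def call_exons(transcript, bloom, k):
    """Return (start, end) slices for maximal runs of k-mers found in bloom."""
    present = [kmer in bloom for kmer in kmers(transcript, k)]
    exons, run_start = [], None
    for i, hit in enumerate(present + [False]):  # sentinel closes the last run
        if hit and run_start is None:
            run_start = i
        elif not hit and run_start is not None:
            exons.append((run_start, i - 1 + k))  # last k-mer ends at i-1+k
            run_start = None
    return exons


def to_gfa1(transcript, exons):
    """Encode exons as GFA1 segments (S) joined by links (L) in transcript order."""
    names = [f"exon_{i}" for i in range(len(exons))]
    lines = ["H\tVN:Z:1.0"]
    for name, (start, end) in zip(names, exons):
        lines.append(f"S\t{name}\t{transcript[start:end]}")
    for left, right in zip(names, names[1:]):
        lines.append(f"L\t{left}\t+\t{right}\t+\t0M")
    return "\n".join(lines)


# Made-up example data: two exons separated by an intron in the genome.
K = 5
exon1, intron, exon2 = "ATGGCCTTAAGC", "GTAAGTTTTTTTCAG", "CTATCCGATTGA"
genome = exon1 + intron + exon2
reads = [genome[:25], genome[15:]]   # two overlapping "genomic reads"
transcript = exon1 + exon2           # the intron is spliced out

bloom = BloomFilter()
for read in reads:
    for kmer in kmers(read, K):
        bloom.add(kmer)

exons = call_exons(transcript, bloom, K)
gfa = to_gfa1(transcript, exons)
print(gfa)
```

On this toy input the two called segments should coincide with `exon1` and `exon2`, connected by a single link line. A real data set would need a much larger k and filter, and a Bloom filter false positive can hide a junction, which is one reason this sketch should not be read as the published method.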