
EXFI: Exon and splice graph prediction without a reference genome

For population genetic studies in nonmodel organisms, it is important to use every single source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences using an assembled transcriptome and raw whole‐genome sequencing reads. The main algo...


Bibliographic Details
Main authors: Langa, Jorge; Estonba, Andone; Conklin, Darrell
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7452765/
https://www.ncbi.nlm.nih.gov/pubmed/32884664
http://dx.doi.org/10.1002/ece3.6587
author Langa, Jorge
Estonba, Andone
Conklin, Darrell
collection PubMed
description For population genetic studies in nonmodel organisms, it is important to use every single source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences using an assembled transcriptome and raw whole‐genome sequencing reads. The main algorithm uses Bloom filters to remove reads that are not part of the transcriptome, to predict the intron–exon boundaries, to then proceed to call exons from the assembly, and to generate the underlying splice graph. The results are returned in GFA1 format, which encodes both the predicted exon sequences and how they are connected to form transcripts. EXFI is written in Python, tested on Linux platforms, and the source code is available under the MIT License at https://github.com/jlanga/exfi.
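The steps named in the description (a Bloom filter built from genomic reads, exon boundary prediction on the transcriptome, and GFA1 output) can be illustrated with a small sketch. This is not EXFI's code: the class and function names (BloomFilter, build_filter, predict_exons, to_gfa1), the hash scheme, the filter size, and the segment naming are all assumptions chosen for illustration. The idea shown is that k-mers lying inside an exon occur in genomic reads, while k-mers spanning an exon-exon junction of a transcript usually do not, so runs of "present" k-mers delimit putative exons.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter (illustrative sizes, not tuned for real data)."""

    def __init__(self, n_bits=1 << 16, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_filter(reads, k):
    """Insert all k-mers of the genomic reads into a Bloom filter."""
    bloom = BloomFilter()
    for read in reads:
        for kmer in kmers(read, k):
            bloom.add(kmer)
    return bloom


def predict_exons(transcript, genome_filter, k):
    """Return half-open (start, end) intervals covered by runs of
    consecutive transcript k-mers that are found in the filter."""
    exons, run_start, last = [], None, None
    for i, kmer in enumerate(kmers(transcript, k)):
        if kmer in genome_filter:
            if run_start is None:
                run_start = i
            last = i
        elif run_start is not None:
            exons.append((run_start, last + k))
            run_start = None
    if run_start is not None:
        exons.append((run_start, last + k))
    return exons


def to_gfa1(name, transcript, exons):
    """Encode exons as GFA1 segments (S), consecutive exons as links (L),
    and the transcript as a path (P) through the segments."""
    ids = [f"{name}:{start}-{end}" for start, end in exons]
    lines = ["H\tVN:Z:1.0"]
    for seg_id, (start, end) in zip(ids, exons):
        lines.append(f"S\t{seg_id}\t{transcript[start:end]}")
    for left, right in zip(ids, ids[1:]):
        lines.append(f"L\t{left}\t+\t{right}\t+\t0M")
    lines.append(f"P\t{name}\t{','.join(i + '+' for i in ids)}\t*")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy data: two exons separated by an intron in the "genome".
    exon1, intron, exon2 = "ATGGCCAAGT", "GTAAGTTTTTTTTCAG", "CCTGAAGCTGA"
    genome = exon1 + intron + exon2
    reads = [genome[:20], genome[12:]]  # overlapping toy reads
    transcript = exon1 + exon2
    bloom = build_filter(reads, k=5)
    print(to_gfa1("tx1", transcript, predict_exons(transcript, bloom, k=5)))
```

Because a Bloom filter can report false positives, a real pipeline would need additional checks around predicted boundaries; the sketch omits these, as well as reverse-complement handling and sequencing errors.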
format Online
Article
Text
id pubmed-7452765
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7452765 2020-09-02. EXFI: Exon and splice graph prediction without a reference genome. Langa, Jorge; Estonba, Andone; Conklin, Darrell. Ecol Evol. Original Research. John Wiley and Sons Inc. 2020-07-28. /pmc/articles/PMC7452765/ /pubmed/32884664 http://dx.doi.org/10.1002/ece3.6587 Text en © 2020 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title EXFI: Exon and splice graph prediction without a reference genome
topic Original Research