
FrangiPANe, a tool for creating a panreference using left behind reads

We present here FrangiPANe, a pipeline developed to build a panreference from short reads through a map-then-assemble strategy. Applying it to 248 African rice genomes using an improved CG14 reference genome, we identified an average of 8 Mb of new sequences and 5290 new contigs per individual. In to...


Bibliographic Details
Main authors: Christine, Tranchant-Dubreuil, Clothilde, Chenal, Mathieu, Blaison, Laurence, Albar, Valentin, Klein, Cédric, Mariac, Wing Rod, A, Yves, Vigouroux, Francois, Sabot
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940456/
https://www.ncbi.nlm.nih.gov/pubmed/36814455
http://dx.doi.org/10.1093/nargab/lqad013
author Christine, Tranchant-Dubreuil
Clothilde, Chenal
Mathieu, Blaison
Laurence, Albar
Valentin, Klein
Cédric, Mariac
Wing Rod, A
Yves, Vigouroux
Francois, Sabot
author_facet Christine, Tranchant-Dubreuil
Clothilde, Chenal
Mathieu, Blaison
Laurence, Albar
Valentin, Klein
Cédric, Mariac
Wing Rod, A
Yves, Vigouroux
Francois, Sabot
author_sort Christine, Tranchant-Dubreuil
collection PubMed
description We present here FrangiPANe, a pipeline developed to build a panreference from short reads through a map-then-assemble strategy. Applying it to 248 African rice genomes using an improved CG14 reference genome, we identified an average of 8 Mb of new sequences and 5290 new contigs per individual. In total, 1.4 Gb of new sequences, consisting of 1 306 676 contigs, were assembled. We validated 97.7% of the contigs from the short-read individual assembly of the TOG5681 cultivar against a new long-read genome assembly of the same cultivar. FrangiPANe also allowed the anchoring of 31.5% of the new contigs within the CG14 reference genome, with 92.5% accuracy at a 2 kb span. In addition, we annotated 3252 new genes absent from the reference. FrangiPANe was developed as a modular and interactive application to simplify the construction of a panreference using the map-then-assemble approach. It is available as a Docker image containing (i) a Jupyter notebook centralizing code, documentation and interactive visualization of results, (ii) Python scripts and (iii) all the software and libraries required for each step of the analysis. We foresee that our approach will help leverage large-scale Illumina datasets for pangenome studies in GWAS or detection of selection.
format Online
Article
Text
id pubmed-9940456
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9940456 2023-02-21 FrangiPANe, a tool for creating a panreference using left behind reads Christine, Tranchant-Dubreuil Clothilde, Chenal Mathieu, Blaison Laurence, Albar Valentin, Klein Cédric, Mariac Wing Rod, A Yves, Vigouroux Francois, Sabot NAR Genom Bioinform Application Notes We present here FrangiPANe, a pipeline developed to build a panreference from short reads through a map-then-assemble strategy. Applying it to 248 African rice genomes using an improved CG14 reference genome, we identified an average of 8 Mb of new sequences and 5290 new contigs per individual. In total, 1.4 Gb of new sequences, consisting of 1 306 676 contigs, were assembled. We validated 97.7% of the contigs from the short-read individual assembly of the TOG5681 cultivar against a new long-read genome assembly of the same cultivar. FrangiPANe also allowed the anchoring of 31.5% of the new contigs within the CG14 reference genome, with 92.5% accuracy at a 2 kb span. In addition, we annotated 3252 new genes absent from the reference. FrangiPANe was developed as a modular and interactive application to simplify the construction of a panreference using the map-then-assemble approach. It is available as a Docker image containing (i) a Jupyter notebook centralizing code, documentation and interactive visualization of results, (ii) Python scripts and (iii) all the software and libraries required for each step of the analysis. We foresee that our approach will help leverage large-scale Illumina datasets for pangenome studies in GWAS or detection of selection. Oxford University Press 2023-02-20 /pmc/articles/PMC9940456/ /pubmed/36814455 http://dx.doi.org/10.1093/nargab/lqad013 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Application Notes
Christine, Tranchant-Dubreuil
Clothilde, Chenal
Mathieu, Blaison
Laurence, Albar
Valentin, Klein
Cédric, Mariac
Wing Rod, A
Yves, Vigouroux
Francois, Sabot
FrangiPANe, a tool for creating a panreference using left behind reads
title FrangiPANe, a tool for creating a panreference using left behind reads
title_full FrangiPANe, a tool for creating a panreference using left behind reads
title_fullStr FrangiPANe, a tool for creating a panreference using left behind reads
title_full_unstemmed FrangiPANe, a tool for creating a panreference using left behind reads
title_short FrangiPANe, a tool for creating a panreference using left behind reads
title_sort frangipane, a tool for creating a panreference using left behind reads
topic Application Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940456/
https://www.ncbi.nlm.nih.gov/pubmed/36814455
http://dx.doi.org/10.1093/nargab/lqad013