
GET_PANGENES: calling pangenes from plant genome alignments confirms presence-absence variation

Crop pangenomes made from individual cultivar assemblies promise easy access to conserved genes, but genome content variability and inconsistent identifiers hamper their exploration. To address this, we define pangenes, which summarize a species' coding potential and link back to original annotations...


Bibliographic Details
Main Authors: Contreras-Moreira, Bruno, Saraf, Shradha, Naamati, Guy, Casas, Ana M., Amberkar, Sandeep S., Flicek, Paul, Jones, Andrew R., Dyer, Sarah
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10552430/
https://www.ncbi.nlm.nih.gov/pubmed/37798615
http://dx.doi.org/10.1186/s13059-023-03071-z
author Contreras-Moreira, Bruno
Saraf, Shradha
Naamati, Guy
Casas, Ana M.
Amberkar, Sandeep S.
Flicek, Paul
Jones, Andrew R.
Dyer, Sarah
collection PubMed
description Crop pangenomes made from individual cultivar assemblies promise easy access to conserved genes, but genome content variability and inconsistent identifiers hamper their exploration. To address this, we define pangenes, which summarize a species' coding potential and link back to original annotations. The protocol get_pangenes performs whole genome alignments (WGA) to call syntenic gene models based on coordinate overlaps. A benchmark with small and large plant genomes shows that pangenes recapitulate phylogeny-based orthologies and produce complete soft-core gene sets. Moreover, WGAs support lift-over and help confirm gene presence-absence variation. Source code and documentation: https://github.com/Ensembl/plant-scripts. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03071-z.
format Online
Article
Text
id pubmed-10552430
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10552430 2023-10-06 GET_PANGENES: calling pangenes from plant genome alignments confirms presence-absence variation Contreras-Moreira, Bruno Saraf, Shradha Naamati, Guy Casas, Ana M. Amberkar, Sandeep S. Flicek, Paul Jones, Andrew R. Dyer, Sarah Genome Biol Method Crop pangenomes made from individual cultivar assemblies promise easy access to conserved genes, but genome content variability and inconsistent identifiers hamper their exploration. To address this, we define pangenes, which summarize a species' coding potential and link back to original annotations. The protocol get_pangenes performs whole genome alignments (WGA) to call syntenic gene models based on coordinate overlaps. A benchmark with small and large plant genomes shows that pangenes recapitulate phylogeny-based orthologies and produce complete soft-core gene sets. Moreover, WGAs support lift-over and help confirm gene presence-absence variation. Source code and documentation: https://github.com/Ensembl/plant-scripts. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03071-z. BioMed Central 2023-10-05 /pmc/articles/PMC10552430/ /pubmed/37798615 http://dx.doi.org/10.1186/s13059-023-03071-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title GET_PANGENES: calling pangenes from plant genome alignments confirms presence-absence variation
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10552430/
https://www.ncbi.nlm.nih.gov/pubmed/37798615
http://dx.doi.org/10.1186/s13059-023-03071-z