Machine-learning predicts genomic determinants of meiosis-driven structural variation in a eukaryotic pathogen

Species harbor extensive structural variation underpinning recent adaptive evolution. However, the causality between genomic features and the induction of new rearrangements is poorly established. Here, we analyze a global set of telomere-to-telomere genome assemblies of a fungal pathogen of wheat t...

Bibliographic Details
Main Authors: Badet, Thomas; Fouché, Simone; Hartmann, Fanny E.; Zala, Marcello; Croll, Daniel
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8192914/
https://www.ncbi.nlm.nih.gov/pubmed/34112792
http://dx.doi.org/10.1038/s41467-021-23862-x
author Badet, Thomas
Fouché, Simone
Hartmann, Fanny E.
Zala, Marcello
Croll, Daniel
collection PubMed
description Species harbor extensive structural variation underpinning recent adaptive evolution. However, the causality between genomic features and the induction of new rearrangements is poorly established. Here, we analyze a global set of telomere-to-telomere genome assemblies of a fungal pathogen of wheat to establish a nucleotide-level map of structural variation. We show that the recent emergence of pesticide resistance has been disproportionally driven by rearrangements. We use machine learning to train a model on structural variation events based on 30 chromosomal sequence features. We show that base composition and gene density are the major determinants of structural variation. Retrotransposons explain most inversion, indel and duplication events. We apply our model to Arabidopsis thaliana and show that our approach extends to more complex genomes. Finally, we analyze complete genomes of haploid offspring in a four-generation pedigree. Meiotic crossover locations are enriched for new rearrangements consistent with crossovers being mutational hotspots. The model trained on species-wide structural variation accurately predicts the position of >74% of newly generated variants along the pedigree. The predictive power highlights causality between specific sequence features and the induction of chromosomal rearrangements. Our work demonstrates that training sequence-derived models can accurately identify regions of intrinsic DNA instability in eukaryotic genomes.
format Online
Article
Text
id pubmed-8192914
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8192914 2021-06-17 Nat Commun Article
Nature Publishing Group UK 2021-06-10 /pmc/articles/PMC8192914/ /pubmed/34112792 http://dx.doi.org/10.1038/s41467-021-23862-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Machine-learning predicts genomic determinants of meiosis-driven structural variation in a eukaryotic pathogen
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8192914/
https://www.ncbi.nlm.nih.gov/pubmed/34112792
http://dx.doi.org/10.1038/s41467-021-23862-x