Interpreting machine learning models to investigate circadian regulation and facilitate exploration of clock function

The circadian clock is an important adaptation to life on Earth. Here, we use machine learning to predict complex, temporal, and circadian gene expression patterns in Arabidopsis. Most significantly, we classify circadian genes using DNA sequence features generated de novo from public, genomic resou...

Bibliographic Details
Main authors: Gardiner, Laura-Jayne, Rusholme-Pilcher, Rachel, Colmer, Josh, Rees, Hannah, Crescente, Juan Manuel, Carrieri, Anna Paola, Duncan, Susan, Pyzer-Knapp, Edward O., Krishna, Ritesh, Hall, Anthony
Format: Online Article Text
Language: English
Published: National Academy of Sciences 2021
Subjects: Biological Sciences
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8364196/
https://www.ncbi.nlm.nih.gov/pubmed/34353905
http://dx.doi.org/10.1073/pnas.2103070118
author Gardiner, Laura-Jayne
Rusholme-Pilcher, Rachel
Colmer, Josh
Rees, Hannah
Crescente, Juan Manuel
Carrieri, Anna Paola
Duncan, Susan
Pyzer-Knapp, Edward O.
Krishna, Ritesh
Hall, Anthony
collection PubMed
description The circadian clock is an important adaptation to life on Earth. Here, we use machine learning to predict complex, temporal, and circadian gene expression patterns in Arabidopsis. Most significantly, we classify circadian genes using DNA sequence features generated de novo from public, genomic resources, facilitating downstream application of our methods with no experimental work or prior knowledge needed. We use local model explanation that is transcript specific to rank DNA sequence features, providing a detailed profile of the potential circadian regulatory mechanisms for each transcript. Furthermore, we can discriminate the temporal phase of transcript expression using the local, explanation-derived, and ranked DNA sequence features, revealing hidden subclasses within the circadian class. Model interpretation/explanation provides the backbone of our methodological advances, giving insight into biological processes and experimental design. Next, we use model interpretation to optimize sampling strategies when we predict circadian transcripts using reduced numbers of transcriptomic timepoints. Finally, we predict the circadian time from a single, transcriptomic timepoint, deriving marker transcripts that are most impactful for accurate prediction; this could facilitate the identification of altered clock function from existing datasets.
format Online
Article
Text
id pubmed-8364196
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher National Academy of Sciences
record_format MEDLINE/PubMed
spelling pubmed-8364196 2021-08-24 Proc Natl Acad Sci U S A Biological Sciences National Academy of Sciences 2021-08-10 2021-08-05 /pmc/articles/PMC8364196/ /pubmed/34353905 http://dx.doi.org/10.1073/pnas.2103070118 Text en Copyright © 2021 the Author(s). Published by PNAS.
This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Interpreting machine learning models to investigate circadian regulation and facilitate exploration of clock function
topic Biological Sciences