How Machine Learning and Statistical Models Advance Molecular Diagnostics of Rare Disorders Via Analysis of RNA Sequencing Data

Rare diseases, although individually rare, collectively affect approximately 350 million people worldwide. Currently, nearly 6,000 distinct rare disorders with a known molecular basis have been described, yet establishing a specific diagnosis based on the clinical phenotype is challenging. Increasin...

Bibliographic Details
Main Authors: Schlieben, Lea D., Prokisch, Holger, Yépez, Vicente A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8204083/
https://www.ncbi.nlm.nih.gov/pubmed/34141720
http://dx.doi.org/10.3389/fmolb.2021.647277
author Schlieben, Lea D.
Prokisch, Holger
Yépez, Vicente A.
collection PubMed
description Rare diseases, although individually rare, collectively affect approximately 350 million people worldwide. Currently, nearly 6,000 distinct rare disorders with a known molecular basis have been described, yet establishing a specific diagnosis based on the clinical phenotype is challenging. Increasing integration of whole exome sequencing into routine diagnostics of rare diseases is improving diagnostic rates. Nevertheless, about half of the patients do not receive a genetic diagnosis due to the challenges of variant detection and interpretation. In recent years, RNA sequencing has been increasingly used as a complementary diagnostic tool that provides functional data. Initially, arbitrary thresholds were applied to call aberrant expression, aberrant splicing, and mono-allelic expression. With the application of RNA sequencing to the search for a molecular diagnosis, the implementation of robust statistical models on normalized read counts allowed the detection of significant outliers, corrected for multiple testing. More recently, machine learning methods have been developed to improve the normalization of RNA sequencing read count data by taking confounders into account. Together, these methods have increased the power and sensitivity of detection and interpretation of pathogenic variants, leading to diagnostic rates of 10–35% in rare diseases. In this review, we provide an overview of the methods used for RNA sequencing and illustrate how these can improve the diagnostic yield of rare diseases.
format Online
Article
Text
id pubmed-8204083
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8204083 2021-06-16 How Machine Learning and Statistical Models Advance Molecular Diagnostics of Rare Disorders Via Analysis of RNA Sequencing Data Schlieben, Lea D. Prokisch, Holger Yépez, Vicente A. Front Mol Biosci Molecular Biosciences Rare diseases, although individually rare, collectively affect approximately 350 million people worldwide. Currently, nearly 6,000 distinct rare disorders with a known molecular basis have been described, yet establishing a specific diagnosis based on the clinical phenotype is challenging. Increasing integration of whole exome sequencing into routine diagnostics of rare diseases is improving diagnostic rates. Nevertheless, about half of the patients do not receive a genetic diagnosis due to the challenges of variant detection and interpretation. In recent years, RNA sequencing has been increasingly used as a complementary diagnostic tool that provides functional data. Initially, arbitrary thresholds were applied to call aberrant expression, aberrant splicing, and mono-allelic expression. With the application of RNA sequencing to the search for a molecular diagnosis, the implementation of robust statistical models on normalized read counts allowed the detection of significant outliers, corrected for multiple testing. More recently, machine learning methods have been developed to improve the normalization of RNA sequencing read count data by taking confounders into account. Together, these methods have increased the power and sensitivity of detection and interpretation of pathogenic variants, leading to diagnostic rates of 10–35% in rare diseases. In this review, we provide an overview of the methods used for RNA sequencing and illustrate how these can improve the diagnostic yield of rare diseases. Frontiers Media S.A. 2021-06-01 /pmc/articles/PMC8204083/ /pubmed/34141720 http://dx.doi.org/10.3389/fmolb.2021.647277 Text en Copyright © 2021 Schlieben, Prokisch and Yépez.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title How Machine Learning and Statistical Models Advance Molecular Diagnostics of Rare Disorders Via Analysis of RNA Sequencing Data
topic Molecular Biosciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8204083/
https://www.ncbi.nlm.nih.gov/pubmed/34141720
http://dx.doi.org/10.3389/fmolb.2021.647277