Integrative Data Mining Highlights Candidate Genes for Monogenic Myopathies

Inherited myopathies are a heterogeneous group of disabling disorders with still barely understood pathological mechanisms. Around 40% of afflicted patients remain without a molecular diagnosis after exclusion of known genes. The advent of high-throughput sequencing has opened avenues to the discove...

Bibliographic Details
Main Authors: Neto, Osorio Abath; Tassy, Olivier; Biancalana, Valérie; Zanoteli, Edmar; Pourquié, Olivier; Laporte, Jocelyn
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4213015/
https://www.ncbi.nlm.nih.gov/pubmed/25353622
http://dx.doi.org/10.1371/journal.pone.0110888
author Neto, Osorio Abath
Tassy, Olivier
Biancalana, Valérie
Zanoteli, Edmar
Pourquié, Olivier
Laporte, Jocelyn
collection PubMed
description Inherited myopathies are a heterogeneous group of disabling disorders with still barely understood pathological mechanisms. Around 40% of afflicted patients remain without a molecular diagnosis after exclusion of known genes. The advent of high-throughput sequencing has opened avenues to the discovery of new implicated genes, but a working list of prioritized candidate genes is necessary to deal with the complexity of analyzing large-scale sequencing data. Here we used an integrative data mining strategy to analyze the genetic network linked to myopathies, derive specific signatures for inherited myopathy and related disorders, and identify and rank candidate genes for these groups. Training sets of genes were selected after literature review and used in Manteia, a public web-based data mining system, to extract disease group signatures in the form of enriched descriptor terms, which include functional annotation, human and mouse phenotypes, as well as biological pathways and protein interactions. These specific signatures were then used as an input to mine and rank candidate genes, followed by filtration against skeletal muscle expression and association with known diseases. Signatures and identified candidate genes highlight both potential common pathological mechanisms and allelic disease groups. Recent discoveries of gene associations to diseases, like B3GALNT2, GMPPB and B3GNT1 to congenital muscular dystrophies, were prioritized in the ranked lists, suggesting a posteriori validation of our approach and predictions. We show an example of how the ranked lists can be used to help analyze high-throughput sequencing data to identify candidate genes, and highlight the best candidate genes matching genomic regions linked to myopathies without known causative genes. This strategy can be automatized to generate fresh candidate gene lists, which help cope with database annotation updates as new knowledge is incorporated.
format Online
Article
Text
id pubmed-4213015
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4213015 2014-11-05 Integrative Data Mining Highlights Candidate Genes for Monogenic Myopathies Neto, Osorio Abath Tassy, Olivier Biancalana, Valérie Zanoteli, Edmar Pourquié, Olivier Laporte, Jocelyn PLoS One Research Article Public Library of Science 2014-10-29 /pmc/articles/PMC4213015/ /pubmed/25353622 http://dx.doi.org/10.1371/journal.pone.0110888 Text en © 2014 Neto et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Integrative Data Mining Highlights Candidate Genes for Monogenic Myopathies
topic Research Article