
Machine learning and data mining in complex genomic data—a review on the lessons learned in Genetic Analysis Workshop 19

In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting point...


Bibliographic Details
Main authors: König, Inke R., Auerbach, Jonathan, Gola, Damian, Held, Elizabeth, Holzinger, Emily R., Legault, Marc-André, Sun, Rui, Tintle, Nathan, Yang, Hsin-Chou
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895282/
https://www.ncbi.nlm.nih.gov/pubmed/26866367
http://dx.doi.org/10.1186/s12863-015-0315-8
_version_ 1782435818043867136
author König, Inke R.
Auerbach, Jonathan
Gola, Damian
Held, Elizabeth
Holzinger, Emily R.
Legault, Marc-André
Sun, Rui
Tintle, Nathan
Yang, Hsin-Chou
author_sort König, Inke R.
collection PubMed
description In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to efficiently deal with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets.
format Online
Article
Text
id pubmed-4895282
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4895282 2016-06-10 Machine learning and data mining in complex genomic data—a review on the lessons learned in Genetic Analysis Workshop 19 König, Inke R. Auerbach, Jonathan Gola, Damian Held, Elizabeth Holzinger, Emily R. Legault, Marc-André Sun, Rui Tintle, Nathan Yang, Hsin-Chou BMC Genet Proceedings In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to efficiently deal with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets. BioMed Central 2016-02-03 /pmc/articles/PMC4895282/ /pubmed/26866367 http://dx.doi.org/10.1186/s12863-015-0315-8 Text en © König et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Machine learning and data mining in complex genomic data—a review on the lessons learned in Genetic Analysis Workshop 19
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895282/
https://www.ncbi.nlm.nih.gov/pubmed/26866367
http://dx.doi.org/10.1186/s12863-015-0315-8