Machine learning and data mining in complex genomic data—a review on the lessons learned in Genetic Analysis Workshop 19
In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting point...
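Main authors: | König, Inke R.; Auerbach, Jonathan; Gola, Damian; Held, Elizabeth; Holzinger, Emily R.; Legault, Marc-André; Sun, Rui; Tintle, Nathan; Yang, Hsin-Chou |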
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895282/ https://www.ncbi.nlm.nih.gov/pubmed/26866367 http://dx.doi.org/10.1186/s12863-015-0315-8 |
_version_ | 1782435818043867136 |
---|---|
author | König, Inke R. Auerbach, Jonathan Gola, Damian Held, Elizabeth Holzinger, Emily R. Legault, Marc-André Sun, Rui Tintle, Nathan Yang, Hsin-Chou |
author_sort | König, Inke R. |
collection | PubMed |
description | In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to efficiently deal with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets. |
format | Online Article Text |
id | pubmed-4895282 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-48952822016-06-10 Machine learning and data mining in complex genomic data—a review on the lessons learned in Genetic Analysis Workshop 19 König, Inke R. Auerbach, Jonathan Gola, Damian Held, Elizabeth Holzinger, Emily R. Legault, Marc-André Sun, Rui Tintle, Nathan Yang, Hsin-Chou BMC Genet Proceedings In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to efficiently deal with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets. BioMed Central 2016-02-03 /pmc/articles/PMC4895282/ /pubmed/26866367 http://dx.doi.org/10.1186/s12863-015-0315-8 Text en © König et al. 
2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Machine learning and data mining in complex genomic data—a review on the lessons learned in Genetic Analysis Workshop 19 |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895282/ https://www.ncbi.nlm.nih.gov/pubmed/26866367 http://dx.doi.org/10.1186/s12863-015-0315-8 |