
Machine learning patterns for neuroimaging-genetic studies in the cloud

Brain imaging is a natural intermediate phenotype for understanding the link between genetic information and behavior or risk factors for brain pathologies. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statis...


Bibliographic Details
Main authors: Da Mota, Benoit, Tudoran, Radu, Costan, Alexandru, Varoquaux, Gaël, Brasche, Goetz, Conrod, Patricia, Lemaitre, Herve, Paus, Tomas, Rietschel, Marcella, Frouin, Vincent, Poline, Jean-Baptiste, Antoniu, Gabriel, Thirion, Bertrand
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Neuroscience
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3986524/
https://www.ncbi.nlm.nih.gov/pubmed/24782753
http://dx.doi.org/10.3389/fninf.2014.00031
author Da Mota, Benoit
Tudoran, Radu
Costan, Alexandru
Varoquaux, Gaël
Brasche, Goetz
Conrod, Patricia
Lemaitre, Herve
Paus, Tomas
Rietschel, Marcella
Frouin, Vincent
Poline, Jean-Baptiste
Antoniu, Gabriel
Thirion, Bertrand
collection PubMed
description Brain imaging is a natural intermediate phenotype for understanding the link between genetic information and behavior or risk factors for brain pathologies. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data relies on increasingly sophisticated techniques and represents a great computational challenge. Fortunately, the increasing computational power of distributed architectures can be harnessed, provided that new neuroinformatics infrastructures are designed and that training in the use of these new tools is available. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and reliability of our framework in the cloud with a two-week deployment on hundreds of virtual machines.
format Online
Article
Text
id pubmed-3986524
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3986524 2014-04-29 Front Neuroinform Neuroscience Frontiers Media S.A. 2014-04-08 /pmc/articles/PMC3986524/ /pubmed/24782753 http://dx.doi.org/10.3389/fninf.2014.00031 Text en Copyright © 2014 Da Mota, Tudoran, Costan, Varoquaux, Brasche, Conrod, Lemaitre, Paus, Rietschel, Frouin, Poline, Antoniu, Thirion and IMAGEN Consortium.
http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Machine learning patterns for neuroimaging-genetic studies in the cloud
topic Neuroscience