
Machine learning and big scientific data

This paper reviews some of the challenges posed by the huge growth of experimental data generated by the new generation of large-scale experiments at UK national facilities at the Rutherford Appleton Laboratory (RAL) site at Harwell near Oxford. Such ‘Big Scientific Data’ comes from the Diamond Ligh...


Bibliographic Details
Main Authors:	Hey, Tony; Butler, Keith; Jackson, Sam; Thiyagalingam, Jeyarajan
Format:	Online Article Text
Language:	English
Published:	The Royal Society Publishing 2020
Subjects:	Articles
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7015290/ https://www.ncbi.nlm.nih.gov/pubmed/31955675 http://dx.doi.org/10.1098/rsta.2019.0054

_version_	1783496780853280768
author	Hey, Tony; Butler, Keith; Jackson, Sam; Thiyagalingam, Jeyarajan
author_facet	Hey, Tony; Butler, Keith; Jackson, Sam; Thiyagalingam, Jeyarajan
author_sort	Hey, Tony
collection	PubMed
description	This paper reviews some of the challenges posed by the huge growth of experimental data generated by the new generation of large-scale experiments at UK national facilities at the Rutherford Appleton Laboratory (RAL) site at Harwell near Oxford. Such ‘Big Scientific Data’ comes from the Diamond Light Source and Electron Microscopy Facilities, the ISIS Neutron and Muon Facility and the UK's Central Laser Facility. Increasingly, scientists are now required to use advanced machine learning and other AI technologies both to automate parts of the data pipeline and to help find new scientific discoveries in the analysis of their data. For commercially important applications, such as object recognition, natural language processing and automatic translation, deep learning has made dramatic breakthroughs. Google's DeepMind has now used the deep learning technology to develop their AlphaFold tool to make predictions for protein folding. Remarkably, it has been able to achieve some spectacular results for this specific scientific problem. Can deep learning be similarly transformative for other scientific problems? After a brief review of some initial applications of machine learning at the RAL, we focus on challenges and opportunities for AI in advancing materials science. Finally, we discuss the importance of developing some realistic machine learning benchmarks using Big Scientific Data coming from several different scientific domains. We conclude with some initial examples of our ‘scientific machine learning’ benchmark suite and of the research challenges these benchmarks will enable. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
format	Online Article Text
id	pubmed-7015290
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	The Royal Society Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-70152902020-02-18 Machine learning and big scientific data Hey, Tony Butler, Keith Jackson, Sam Thiyagalingam, Jeyarajan Philos Trans A Math Phys Eng Sci Articles This paper reviews some of the challenges posed by the huge growth of experimental data generated by the new generation of large-scale experiments at UK national facilities at the Rutherford Appleton Laboratory (RAL) site at Harwell near Oxford. Such ‘Big Scientific Data’ comes from the Diamond Light Source and Electron Microscopy Facilities, the ISIS Neutron and Muon Facility and the UK's Central Laser Facility. Increasingly, scientists are now required to use advanced machine learning and other AI technologies both to automate parts of the data pipeline and to help find new scientific discoveries in the analysis of their data. For commercially important applications, such as object recognition, natural language processing and automatic translation, deep learning has made dramatic breakthroughs. Google's DeepMind has now used the deep learning technology to develop their AlphaFold tool to make predictions for protein folding. Remarkably, it has been able to achieve some spectacular results for this specific scientific problem. Can deep learning be similarly transformative for other scientific problems? After a brief review of some initial applications of machine learning at the RAL, we focus on challenges and opportunities for AI in advancing materials science. Finally, we discuss the importance of developing some realistic machine learning benchmarks using Big Scientific Data coming from several different scientific domains. We conclude with some initial examples of our ‘scientific machine learning’ benchmark suite and of the research challenges these benchmarks will enable. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’. 
The Royal Society Publishing 2020-03-06 2020-01-20 /pmc/articles/PMC7015290/ /pubmed/31955675 http://dx.doi.org/10.1098/rsta.2019.0054 Text en © 2020 The Authors. http://creativecommons.org/licenses/by/4.0/ Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
spellingShingle	Articles Hey, Tony Butler, Keith Jackson, Sam Thiyagalingam, Jeyarajan Machine learning and big scientific data
title	Machine learning and big scientific data
title_full	Machine learning and big scientific data
title_fullStr	Machine learning and big scientific data
title_full_unstemmed	Machine learning and big scientific data
title_short	Machine learning and big scientific data
title_sort	machine learning and big scientific data
topic	Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7015290/ https://www.ncbi.nlm.nih.gov/pubmed/31955675 http://dx.doi.org/10.1098/rsta.2019.0054
work_keys_str_mv	AT heytony machinelearningandbigscientificdata AT butlerkeith machinelearningandbigscientificdata AT jacksonsam machinelearningandbigscientificdata AT thiyagalingamjeyarajan machinelearningandbigscientificdata
