
Completing the Results of the 2013 Boston Marathon

The 2013 Boston marathon was disrupted by two bombs placed near the finish line. The bombs resulted in three deaths and several hundred injuries. Of lesser concern, in the immediate aftermath, was the fact that nearly 6,000 runners failed to finish the race. We were approached by the marathon's...

Bibliographic Details
Main authors: Hammerling, Dorit, Cefalu, Matthew, Cisewski, Jessi, Dominici, Francesca, Parmigiani, Giovanni, Paulson, Charles, Smith, Richard L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984103/
https://www.ncbi.nlm.nih.gov/pubmed/24727904
http://dx.doi.org/10.1371/journal.pone.0093800
_version_ 1782311399258587136
author Hammerling, Dorit
Cefalu, Matthew
Cisewski, Jessi
Dominici, Francesca
Parmigiani, Giovanni
Paulson, Charles
Smith, Richard L.
author_sort Hammerling, Dorit
collection PubMed
description The 2013 Boston marathon was disrupted by two bombs placed near the finish line. The bombs resulted in three deaths and several hundred injuries. Of lesser concern, in the immediate aftermath, was the fact that nearly 6,000 runners failed to finish the race. We were approached by the marathon's organizers, the Boston Athletic Association (BAA), and asked to recommend a procedure for projecting finish times for the runners who could not complete the race. With assistance from the BAA, we created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, as well as all runners from the 2010 and 2011 Boston marathons. The data consist of split times from each of the 5 km sections of the course, as well as the final 2.2 km (from 40 km to the finish). The statistical objective is to predict the missing split times for the runners who failed to finish in 2013. We set this problem in the context of the matrix completion problem, examples of which include imputing missing data in DNA microarray experiments, and the Netflix prize problem. We propose five prediction methods and create a validation dataset to measure their performance by mean squared error and other measures. The best method used local regression based on a K-nearest-neighbors algorithm (KNN method), though several other methods produced results of similar quality. We show how the results were used to create projected times for the 2013 runners and discuss potential for future application of the same methodology. We present the whole project as an example of reproducible research, in that we are able to make the full data and all the algorithms we have used publicly available, which may facilitate future research extending the methods or proposing completely different approaches.
format Online
Article
Text
id pubmed-3984103
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3984103 2014-04-15 PLoS One Research Article Public Library of Science 2014-04-11 /pmc/articles/PMC3984103/ /pubmed/24727904 http://dx.doi.org/10.1371/journal.pone.0093800 Text en © 2014 Hammerling et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Completing the Results of the 2013 Boston Marathon
topic Research Article
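
The description in this record frames the task as completing a matrix of split times, with a K-nearest-neighbors approach performing best. The sketch below illustrates that general idea only: it finds the reference runners whose early splits are closest to those of a runner who stopped, then averages their later splits. This is plain neighbor averaging, a simplification, and not the local regression the authors describe. The function name `knn_predict_splits`, the parameter `k`, and all split times are hypothetical values chosen for illustration.

```python
import math


def knn_predict_splits(complete, observed, k=3):
    """Predict the missing trailing splits of one runner.

    complete: list of full split-time rows (minutes per section), equal length.
    observed: leading split times recorded for the runner who did not finish.
    Returns the predicted remaining splits as a list.
    """
    m = len(observed)
    if not complete or m == 0 or m >= len(complete[0]):
        raise ValueError("need reference rows longer than the observed prefix")
    k = min(k, len(complete))

    # Euclidean distance over the sections that both runners completed.
    def distance(row):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(row[:m], observed)))

    neighbors = sorted(complete, key=distance)[:k]

    # Average each missing section over the k nearest reference runners.
    n_missing = len(complete[0]) - m
    return [sum(row[m + j] for row in neighbors) / k for j in range(n_missing)]


# Hypothetical split times in minutes for four sections (made-up numbers,
# not taken from the Boston marathon dataset).
reference = [
    [25.0, 25.0, 26.0, 27.0],
    [30.0, 31.0, 33.0, 35.0],
    [26.0, 26.0, 27.0, 29.0],
    [40.0, 42.0, 45.0, 50.0],
]
partial = [25.5, 25.5]  # runner stopped after two sections
predicted = knn_predict_splits(reference, partial, k=2)
projected_finish = sum(partial) + sum(predicted)
```

A projected finish time is then the sum of the observed and predicted splits. In a setting like the one described, the reference rows would come from runners who finished, and the choice of `k` would be checked against a validation set using a measure such as mean squared error.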