Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a c...

Bibliographic Details
Main authors: McMurry, Julie A., Juty, Nick, Blomberg, Niklas, Burdett, Tony, Conlin, Tom, Conte, Nathalie, Courtot, Mélanie, Deck, John, Dumontier, Michel, Fellows, Donal K., Gonzalez-Beltran, Alejandra, Gormanns, Philipp, Grethe, Jeffrey, Hastings, Janna, Hériché, Jean-Karim, Hermjakob, Henning, Ison, Jon C., Jimenez, Rafael C., Jupp, Simon, Kunze, John, Laibe, Camille, Le Novère, Nicolas, Malone, James, Martin, Maria Jesus, McEntyre, Johanna R., Morris, Chris, Muilu, Juha, Müller, Wolfgang, Rocca-Serra, Philippe, Sansone, Susanna-Assunta, Sariyar, Murat, Snoep, Jacky L., Soiland-Reyes, Stian, Stanford, Natalie J., Swainston, Neil, Washington, Nicole, Williams, Alan R., Wimalaratne, Sarala M., Winfree, Lilly M., Wolstencroft, Katherine, Goble, Carole, Mungall, Christopher J., Haendel, Melissa A., Parkinson, Helen
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5490878/
https://www.ncbi.nlm.nih.gov/pubmed/28662064
http://dx.doi.org/10.1371/journal.pbio.2001414
_version_ 1783247054741438464
author McMurry, Julie A.
Juty, Nick
Blomberg, Niklas
Burdett, Tony
Conlin, Tom
Conte, Nathalie
Courtot, Mélanie
Deck, John
Dumontier, Michel
Fellows, Donal K.
Gonzalez-Beltran, Alejandra
Gormanns, Philipp
Grethe, Jeffrey
Hastings, Janna
Hériché, Jean-Karim
Hermjakob, Henning
Ison, Jon C.
Jimenez, Rafael C.
Jupp, Simon
Kunze, John
Laibe, Camille
Le Novère, Nicolas
Malone, James
Martin, Maria Jesus
McEntyre, Johanna R.
Morris, Chris
Muilu, Juha
Müller, Wolfgang
Rocca-Serra, Philippe
Sansone, Susanna-Assunta
Sariyar, Murat
Snoep, Jacky L.
Soiland-Reyes, Stian
Stanford, Natalie J.
Swainston, Neil
Washington, Nicole
Williams, Alan R.
Wimalaratne, Sarala M.
Winfree, Lilly M.
Wolstencroft, Katherine
Goble, Carole
Mungall, Christopher J.
Haendel, Melissa A.
Parkinson, Helen
author_sort McMurry, Julie A.
collection PubMed
description In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.
format Online
Article
Text
id pubmed-5490878
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-54908782017-07-18 Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data McMurry, Julie A. Juty, Nick Blomberg, Niklas Burdett, Tony Conlin, Tom Conte, Nathalie Courtot, Mélanie Deck, John Dumontier, Michel Fellows, Donal K. Gonzalez-Beltran, Alejandra Gormanns, Philipp Grethe, Jeffrey Hastings, Janna Hériché, Jean-Karim Hermjakob, Henning Ison, Jon C. Jimenez, Rafael C. Jupp, Simon Kunze, John Laibe, Camille Le Novère, Nicolas Malone, James Martin, Maria Jesus McEntyre, Johanna R. Morris, Chris Muilu, Juha Müller, Wolfgang Rocca-Serra, Philippe Sansone, Susanna-Assunta Sariyar, Murat Snoep, Jacky L. Soiland-Reyes, Stian Stanford, Natalie J. Swainston, Neil Washington, Nicole Williams, Alan R. Wimalaratne, Sarala M. Winfree, Lilly M. Wolstencroft, Katherine Goble, Carole Mungall, Christopher J. Haendel, Melissa A. Parkinson, Helen PLoS Biol Perspective In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. 
While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines. Public Library of Science 2017-06-29 /pmc/articles/PMC5490878/ /pubmed/28662064 http://dx.doi.org/10.1371/journal.pbio.2001414 Text en © 2017 McMurry et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data
topic Perspective