Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data
In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a c...
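Main authors: | McMurry, Julie A.; Juty, Nick; Blomberg, Niklas; Burdett, Tony; Conlin, Tom; Conte, Nathalie; Courtot, Mélanie; Deck, John; Dumontier, Michel; Fellows, Donal K.; Gonzalez-Beltran, Alejandra; Gormanns, Philipp; Grethe, Jeffrey; Hastings, Janna; Hériché, Jean-Karim; Hermjakob, Henning; Ison, Jon C.; Jimenez, Rafael C.; Jupp, Simon; Kunze, John; Laibe, Camille; Le Novère, Nicolas; Malone, James; Martin, Maria Jesus; McEntyre, Johanna R.; Morris, Chris; Muilu, Juha; Müller, Wolfgang; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Sariyar, Murat; Snoep, Jacky L.; Soiland-Reyes, Stian; Stanford, Natalie J.; Swainston, Neil; Washington, Nicole; Williams, Alan R.; Wimalaratne, Sarala M.; Winfree, Lilly M.; Wolstencroft, Katherine; Goble, Carole; Mungall, Christopher J.; Haendel, Melissa A.; Parkinson, Helen |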
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5490878/ https://www.ncbi.nlm.nih.gov/pubmed/28662064 http://dx.doi.org/10.1371/journal.pbio.2001414 |
_version_ | 1783247054741438464 |
---|---|
author | McMurry, Julie A. Juty, Nick Blomberg, Niklas Burdett, Tony Conlin, Tom Conte, Nathalie Courtot, Mélanie Deck, John Dumontier, Michel Fellows, Donal K. Gonzalez-Beltran, Alejandra Gormanns, Philipp Grethe, Jeffrey Hastings, Janna Hériché, Jean-Karim Hermjakob, Henning Ison, Jon C. Jimenez, Rafael C. Jupp, Simon Kunze, John Laibe, Camille Le Novère, Nicolas Malone, James Martin, Maria Jesus McEntyre, Johanna R. Morris, Chris Muilu, Juha Müller, Wolfgang Rocca-Serra, Philippe Sansone, Susanna-Assunta Sariyar, Murat Snoep, Jacky L. Soiland-Reyes, Stian Stanford, Natalie J. Swainston, Neil Washington, Nicole Williams, Alan R. Wimalaratne, Sarala M. Winfree, Lilly M. Wolstencroft, Katherine Goble, Carole Mungall, Christopher J. Haendel, Melissa A. Parkinson, Helen |
author_sort | McMurry, Julie A. |
collection | PubMed |
description | In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines. |
format | Online Article Text |
id | pubmed-5490878 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-54908782017-07-18 Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data McMurry, Julie A. Juty, Nick Blomberg, Niklas Burdett, Tony Conlin, Tom Conte, Nathalie Courtot, Mélanie Deck, John Dumontier, Michel Fellows, Donal K. Gonzalez-Beltran, Alejandra Gormanns, Philipp Grethe, Jeffrey Hastings, Janna Hériché, Jean-Karim Hermjakob, Henning Ison, Jon C. Jimenez, Rafael C. Jupp, Simon Kunze, John Laibe, Camille Le Novère, Nicolas Malone, James Martin, Maria Jesus McEntyre, Johanna R. Morris, Chris Muilu, Juha Müller, Wolfgang Rocca-Serra, Philippe Sansone, Susanna-Assunta Sariyar, Murat Snoep, Jacky L. Soiland-Reyes, Stian Stanford, Natalie J. Swainston, Neil Washington, Nicole Williams, Alan R. Wimalaratne, Sarala M. Winfree, Lilly M. Wolstencroft, Katherine Goble, Carole Mungall, Christopher J. Haendel, Melissa A. Parkinson, Helen PLoS Biol Perspective In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. 
While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines. Public Library of Science 2017-06-29 /pmc/articles/PMC5490878/ /pubmed/28662064 http://dx.doi.org/10.1371/journal.pbio.2001414 Text en © 2017 McMurry et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Perspective McMurry, Julie A. Juty, Nick Blomberg, Niklas Burdett, Tony Conlin, Tom Conte, Nathalie Courtot, Mélanie Deck, John Dumontier, Michel Fellows, Donal K. Gonzalez-Beltran, Alejandra Gormanns, Philipp Grethe, Jeffrey Hastings, Janna Hériché, Jean-Karim Hermjakob, Henning Ison, Jon C. Jimenez, Rafael C. Jupp, Simon Kunze, John Laibe, Camille Le Novère, Nicolas Malone, James Martin, Maria Jesus McEntyre, Johanna R. Morris, Chris Muilu, Juha Müller, Wolfgang Rocca-Serra, Philippe Sansone, Susanna-Assunta Sariyar, Murat Snoep, Jacky L. Soiland-Reyes, Stian Stanford, Natalie J. Swainston, Neil Washington, Nicole Williams, Alan R. Wimalaratne, Sarala M. Winfree, Lilly M. Wolstencroft, Katherine Goble, Carole Mungall, Christopher J. Haendel, Melissa A. Parkinson, Helen Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data |
title | Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data |
title_sort | identifiers for the 21st century: how to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data |
topic | Perspective |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5490878/ https://www.ncbi.nlm.nih.gov/pubmed/28662064 http://dx.doi.org/10.1371/journal.pbio.2001414 |