Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation
The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an internat...
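Main Authors: | Pujar, Shashikant; O’Leary, Nuala A; Farrell, Catherine M; Loveland, Jane E; Mudge, Jonathan M; Wallin, Craig; Girón, Carlos G; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; Martin, Fergal J; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Suner, Marie-Marthe; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bruford, Elspeth A; Bult, Carol J; Frankish, Adam; Murphy, Terence; Pruitt, Kim D |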
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | Database Issue |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753299/ https://www.ncbi.nlm.nih.gov/pubmed/29126148 http://dx.doi.org/10.1093/nar/gkx1031 |
_version_ | 1783290251870994432 |
---|---|
author | Pujar, Shashikant O’Leary, Nuala A Farrell, Catherine M Loveland, Jane E Mudge, Jonathan M Wallin, Craig Girón, Carlos G Diekhans, Mark Barnes, If Bennett, Ruth Berry, Andrew E Cox, Eric Davidson, Claire Goldfarb, Tamara Gonzalez, Jose M Hunt, Toby Jackson, John Joardar, Vinita Kay, Mike P Kodali, Vamsi K Martin, Fergal J McAndrews, Monica McGarvey, Kelly M Murphy, Michael Rajput, Bhanu Rangwala, Sanjida H Riddick, Lillian D Seal, Ruth L Suner, Marie-Marthe Webb, David Zhu, Sophia Aken, Bronwen L Bruford, Elspeth A Bult, Carol J Frankish, Adam Murphy, Terence Pruitt, Kim D |
author_facet | Pujar, Shashikant O’Leary, Nuala A Farrell, Catherine M Loveland, Jane E Mudge, Jonathan M Wallin, Craig Girón, Carlos G Diekhans, Mark Barnes, If Bennett, Ruth Berry, Andrew E Cox, Eric Davidson, Claire Goldfarb, Tamara Gonzalez, Jose M Hunt, Toby Jackson, John Joardar, Vinita Kay, Mike P Kodali, Vamsi K Martin, Fergal J McAndrews, Monica McGarvey, Kelly M Murphy, Michael Rajput, Bhanu Rangwala, Sanjida H Riddick, Lillian D Seal, Ruth L Suner, Marie-Marthe Webb, David Zhu, Sophia Aken, Bronwen L Bruford, Elspeth A Bult, Carol J Frankish, Adam Murphy, Terence Pruitt, Kim D |
author_sort | Pujar, Shashikant |
collection | PubMed |
description | The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. |
format | Online Article Text |
id | pubmed-5753299 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-57532992018-01-05 Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation Pujar, Shashikant O’Leary, Nuala A Farrell, Catherine M Loveland, Jane E Mudge, Jonathan M Wallin, Craig Girón, Carlos G Diekhans, Mark Barnes, If Bennett, Ruth Berry, Andrew E Cox, Eric Davidson, Claire Goldfarb, Tamara Gonzalez, Jose M Hunt, Toby Jackson, John Joardar, Vinita Kay, Mike P Kodali, Vamsi K Martin, Fergal J McAndrews, Monica McGarvey, Kelly M Murphy, Michael Rajput, Bhanu Rangwala, Sanjida H Riddick, Lillian D Seal, Ruth L Suner, Marie-Marthe Webb, David Zhu, Sophia Aken, Bronwen L Bruford, Elspeth A Bult, Carol J Frankish, Adam Murphy, Terence Pruitt, Kim D Nucleic Acids Res Database Issue The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. 
We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. Oxford University Press 2018-01-04 2017-11-06 /pmc/articles/PMC5753299/ /pubmed/29126148 http://dx.doi.org/10.1093/nar/gkx1031 Text en Published by Oxford University Press on behalf of Nucleic Acids Research 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US. |
spellingShingle | Database Issue Pujar, Shashikant O’Leary, Nuala A Farrell, Catherine M Loveland, Jane E Mudge, Jonathan M Wallin, Craig Girón, Carlos G Diekhans, Mark Barnes, If Bennett, Ruth Berry, Andrew E Cox, Eric Davidson, Claire Goldfarb, Tamara Gonzalez, Jose M Hunt, Toby Jackson, John Joardar, Vinita Kay, Mike P Kodali, Vamsi K Martin, Fergal J McAndrews, Monica McGarvey, Kelly M Murphy, Michael Rajput, Bhanu Rangwala, Sanjida H Riddick, Lillian D Seal, Ruth L Suner, Marie-Marthe Webb, David Zhu, Sophia Aken, Bronwen L Bruford, Elspeth A Bult, Carol J Frankish, Adam Murphy, Terence Pruitt, Kim D Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation |
title | Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation |
title_full | Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation |
title_fullStr | Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation |
title_full_unstemmed | Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation |
title_short | Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation |
title_sort | consensus coding sequence (ccds) database: a standardized set of human and mouse protein-coding regions supported by expert curation |
topic | Database Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753299/ https://www.ncbi.nlm.nih.gov/pubmed/29126148 http://dx.doi.org/10.1093/nar/gkx1031 |