NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy

The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of genomic, transcript and protein sequence records. These records are selected and curated from public sequence archives and represent a significant reduction in redundancy compared to the...

Bibliographic Details
Main Authors:	Pruitt, Kim D., Tatusova, Tatiana, Brown, Garth R., Maglott, Donna R.
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2012
Subjects:	Articles
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245008/ https://www.ncbi.nlm.nih.gov/pubmed/22121212 http://dx.doi.org/10.1093/nar/gkr1079

author	Pruitt, Kim D. Tatusova, Tatiana Brown, Garth R. Maglott, Donna R.
collection	PubMed
description	The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of genomic, transcript and protein sequence records. These records are selected and curated from public sequence archives and represent a significant reduction in redundancy compared to the volume of data archived by the International Nucleotide Sequence Database Collaboration. The database includes over 16 000 organisms, 2.4 × 10^6 genomic records, 13 × 10^6 proteins and 2 × 10^6 RNA records spanning prokaryotes, eukaryotes and viruses (RefSeq release 49, September 2011). The RefSeq database is maintained by a combined approach of automated analyses, collaboration and manual curation to generate an up-to-date representation of the sequence, its features, names and cross-links to related sources of information. We report here on recent growth, the status of curating the human RefSeq data set, more extensive feature annotation and current policy for eukaryotic genome annotation via the NCBI annotation pipeline. More information about the resource is available online (see http://www.ncbi.nlm.nih.gov/RefSeq/).
format	Online Article Text
id	pubmed-3245008
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-3245008 2012-01-10 Nucleic Acids Res Articles Oxford University Press 2012-01 2011-11-24 /pmc/articles/PMC3245008/ /pubmed/22121212 http://dx.doi.org/10.1093/nar/gkr1079 Text en Published by Oxford University Press, 2011. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy
topic	Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245008/ https://www.ncbi.nlm.nih.gov/pubmed/22121212 http://dx.doi.org/10.1093/nar/gkr1079

Similar Items