The Annotation-enriched non-redundant patent sequence databases

The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL...

Bibliographic Details
Main authors:	Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2013
Subjects:	Database Update
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568390/ https://www.ncbi.nlm.nih.gov/pubmed/23396323 http://dx.doi.org/10.1093/database/bat005

author	Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo
collection	PubMed
description	The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/
format	Online Article Text
id	pubmed-3568390
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-3568390 2013-02-11 Database (Oxford) Oxford University Press 2013-02-09 /pmc/articles/PMC3568390/ /pubmed/23396323 http://dx.doi.org/10.1093/database/bat005 Text en © The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	The Annotation-enriched non-redundant patent sequence databases
topic	Database Update
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568390/ https://www.ncbi.nlm.nih.gov/pubmed/23396323 http://dx.doi.org/10.1093/database/bat005
