
The Annotation-enriched non-redundant patent sequence databases

The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL...

Bibliographic Details
Main authors: Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568390/
https://www.ncbi.nlm.nih.gov/pubmed/23396323
http://dx.doi.org/10.1093/database/bat005
_version_ 1782258783173476352
author Li, Weizhong
Kondratowicz, Bartosz
McWilliam, Hamish
Nauche, Stephane
Lopez, Rodrigo
author_sort Li, Weizhong
collection PubMed
description The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/
format Online
Article
Text
id pubmed-3568390
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3568390 2013-02-11 The Annotation-enriched non-redundant patent sequence databases Li, Weizhong Kondratowicz, Bartosz McWilliam, Hamish Nauche, Stephane Lopez, Rodrigo Database (Oxford) Database Update Oxford University Press 2013-02-09 /pmc/articles/PMC3568390/ /pubmed/23396323 http://dx.doi.org/10.1093/database/bat005 Text en © The Author(s) 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The Annotation-enriched non-redundant patent sequence databases
topic Database Update
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568390/
https://www.ncbi.nlm.nih.gov/pubmed/23396323
http://dx.doi.org/10.1093/database/bat005