The Annotation-enriched non-redundant patent sequence databases
The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL...
Main authors: | Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2013 |
Subjects: | Database Update |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568390/ https://www.ncbi.nlm.nih.gov/pubmed/23396323 http://dx.doi.org/10.1093/database/bat005 |
_version_ | 1782258783173476352 |
---|---|
author | Li, Weizhong Kondratowicz, Bartosz McWilliam, Hamish Nauche, Stephane Lopez, Rodrigo |
author_facet | Li, Weizhong Kondratowicz, Bartosz McWilliam, Hamish Nauche, Stephane Lopez, Rodrigo |
author_sort | Li, Weizhong |
collection | PubMed |
description | The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/ |
format | Online Article Text |
id | pubmed-3568390 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3568390 2013-02-11 The Annotation-enriched non-redundant patent sequence databases Li, Weizhong Kondratowicz, Bartosz McWilliam, Hamish Nauche, Stephane Lopez, Rodrigo Database (Oxford) Database Update Oxford University Press 2013-02-09 /pmc/articles/PMC3568390/ /pubmed/23396323 http://dx.doi.org/10.1093/database/bat005 Text en © The Author(s) 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Database Update Li, Weizhong Kondratowicz, Bartosz McWilliam, Hamish Nauche, Stephane Lopez, Rodrigo The Annotation-enriched non-redundant patent sequence databases |
title | The Annotation-enriched non-redundant patent sequence databases |
title_full | The Annotation-enriched non-redundant patent sequence databases |
title_fullStr | The Annotation-enriched non-redundant patent sequence databases |
title_full_unstemmed | The Annotation-enriched non-redundant patent sequence databases |
title_short | The Annotation-enriched non-redundant patent sequence databases |
title_sort | annotation-enriched non-redundant patent sequence databases |
topic | Database Update |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568390/ https://www.ncbi.nlm.nih.gov/pubmed/23396323 http://dx.doi.org/10.1093/database/bat005 |