Data mining patented antibody sequences

Bibliographic Details
Main Authors:	Krawczyk, Konrad; Buchanan, Andrew; Marcatili, Paolo
Format:	Online Article Text
Language:	English
Published:	Taylor & Francis 2021
Subjects:	Report
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7971238/ https://www.ncbi.nlm.nih.gov/pubmed/33722161 http://dx.doi.org/10.1080/19420862.2021.1892366

collection	PubMed
description	The patent literature should reflect the past 30 years of engineering efforts directed toward developing monoclonal antibody therapeutics. Such information is potentially valuable for rational antibody design. Patents, however, are designed not to convey scientific knowledge, but to provide legal protection. It is not obvious whether antibody information from patent documents, such as antibody sequences, is useful in conveying engineering know-how, rather than as a legal reference only. To assess the utility of patent data for therapeutic antibody engineering, we quantified the amount of antibody sequences in patents destined for medicinal purposes and how well they reflect the primary sequences of therapeutic antibodies in clinical use. We identified 16,526 patent families covering major jurisdictions (e.g., US Patent and Trademark Office (USPTO) and World Intellectual Property Organization) that contained antibody sequences. These families held 245,109 unique antibody chains (135,397 heavy chains and 109,712 light chains) that we compiled in our Patented Antibody Database (PAD, http://naturalantibody.com/pad). We find that antibodies make up a non-trivial proportion of all patent amino acid sequence depositions (e.g., 11% of USPTO Full Text database). Our analysis of the 16,526 families demonstrates that the volume of patent documents with antibody sequences is growing, with the majority of documents classified as containing antibodies for medicinal purposes. We further studied the 245,109 antibody chains from patent literature to reveal that they very well reflect the primary sequences of antibody therapeutics in clinical use. This suggests that the patent literature could serve as a reference for previous engineering efforts to improve rational antibody design.
format	Online Article Text
id	pubmed-7971238
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Taylor & Francis
record_format	MEDLINE/PubMed
spelling	pubmed-7971238 2021-03-31 Data mining patented antibody sequences Krawczyk, Konrad; Buchanan, Andrew; Marcatili, Paolo MAbs Report
Taylor & Francis 2021-03-15 /pmc/articles/PMC7971238/ /pubmed/33722161 http://dx.doi.org/10.1080/19420862.2021.1892366 Text en © 2021 The Author(s). Published with license by Taylor & Francis Group, LLC. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Data mining patented antibody sequences
topic	Report
