
ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning

Antibodies are capable of potently and specifically binding individual antigens and, in some cases, disrupting their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating sequences of antibodies to their unique properties as inhibitors....


Bibliographic Details
Main Authors: Li, Xinmeng, Van Deventer, James A., Hassoun, Soha
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7205315/
https://www.ncbi.nlm.nih.gov/pubmed/32339164
http://dx.doi.org/10.1371/journal.pcbi.1007779
_version_ 1783530218190798848
author Li, Xinmeng
Van Deventer, James A.
Hassoun, Soha
author_facet Li, Xinmeng
Van Deventer, James A.
Hassoun, Soha
author_sort Li, Xinmeng
collection PubMed
description Antibodies are capable of potently and specifically binding individual antigens and, in some cases, disrupting their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating sequences of antibodies to their unique properties as inhibitors. We develop a pipeline, Antibody Sequence Analysis Pipeline using Statistical testing and Machine Learning (ASAP-SML), to identify features that distinguish one set of antibody sequences from antibody sequences in a reference set. The pipeline extracts feature fingerprints from sequences. The fingerprints represent germline, CDR canonical structure, isoelectric point and frequent positional motifs. Machine learning and statistical significance testing techniques are applied to antibody sequences and extracted feature fingerprints to identify distinguishing feature values and combinations thereof. To demonstrate how it works, we applied the pipeline on sets of antibody sequences known to bind or inhibit the activities of matrix metalloproteinases (MMPs), a family of zinc-dependent enzymes that promote cancer progression and undesired inflammation under pathological conditions, against reference datasets that do not bind or inhibit MMPs. ASAP-SML identifies features and combinations of feature values found in the MMP-targeting sets that are distinct from those in the reference sets.
format Online
Article
Text
id pubmed-7205315
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7205315 2020-05-12 ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning Li, Xinmeng Van Deventer, James A. Hassoun, Soha PLoS Comput Biol Research Article
Public Library of Science 2020-04-27 /pmc/articles/PMC7205315/ /pubmed/32339164 http://dx.doi.org/10.1371/journal.pcbi.1007779 Text en © 2020 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Li, Xinmeng
Van Deventer, James A.
Hassoun, Soha
ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
title ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
title_full ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
title_fullStr ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
title_full_unstemmed ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
title_short ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
title_sort asap-sml: an antibody sequence analysis pipeline using statistical testing and machine learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7205315/
https://www.ncbi.nlm.nih.gov/pubmed/32339164
http://dx.doi.org/10.1371/journal.pcbi.1007779