ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
Antibodies are capable of potently and specifically binding individual antigens and, in some cases, disrupting their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating sequences of antibodies to their unique properties as inhibitors....
Main Authors: | Li, Xinmeng; Van Deventer, James A.; Hassoun, Soha |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7205315/ https://www.ncbi.nlm.nih.gov/pubmed/32339164 http://dx.doi.org/10.1371/journal.pcbi.1007779 |
_version_ | 1783530218190798848 |
---|---|
author | Li, Xinmeng Van Deventer, James A. Hassoun, Soha |
author_sort | Li, Xinmeng |
collection | PubMed |
description | Antibodies are capable of potently and specifically binding individual antigens and, in some cases, disrupting their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating sequences of antibodies to their unique properties as inhibitors. We develop a pipeline, Antibody Sequence Analysis Pipeline using Statistical testing and Machine Learning (ASAP-SML), to identify features that distinguish one set of antibody sequences from antibody sequences in a reference set. The pipeline extracts feature fingerprints from sequences. The fingerprints represent germline, CDR canonical structure, isoelectric point and frequent positional motifs. Machine learning and statistical significance testing techniques are applied to antibody sequences and extracted feature fingerprints to identify distinguishing feature values and combinations thereof. To demonstrate how it works, we applied the pipeline on sets of antibody sequences known to bind or inhibit the activities of matrix metalloproteinases (MMPs), a family of zinc-dependent enzymes that promote cancer progression and undesired inflammation under pathological conditions, against reference datasets that do not bind or inhibit MMPs. ASAP-SML identifies features and combinations of feature values found in the MMP-targeting sets that are distinct from those in the reference sets. |
format | Online Article Text |
id | pubmed-7205315 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7205315 2020-05-12 ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning Li, Xinmeng Van Deventer, James A. Hassoun, Soha PLoS Comput Biol Research Article Public Library of Science 2020-04-27 /pmc/articles/PMC7205315/ /pubmed/32339164 http://dx.doi.org/10.1371/journal.pcbi.1007779 Text en © 2020 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7205315/ https://www.ncbi.nlm.nih.gov/pubmed/32339164 http://dx.doi.org/10.1371/journal.pcbi.1007779 |
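
The description field in this record says the pipeline applies statistical significance testing to feature fingerprints to find feature values that distinguish a target set from a reference set. The record does not say which test is used, so the following is only a minimal sketch under stated assumptions: fingerprints are modeled as sets of binary feature labels, the test is a one-sided Fisher's exact test (an assumption, not confirmed by the record), and the feature names, function names, and toy data are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a/b: target sequences with/without the feature.
    c/d: reference sequences with/without the feature.
    Returns P(count in target >= a) under the hypergeometric null.
    """
    row1 = a + b            # size of the target set
    col1 = a + c            # sequences carrying the feature overall
    n = a + b + c + d       # all sequences
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


def enriched_features(target, reference, alpha=0.05):
    """Return (feature, p_value) pairs over-represented in the target set.

    No multiple-testing correction is applied in this sketch.
    """
    features = sorted(set().union(*target, *reference))
    results = []
    for f in features:
        a = sum(f in s for s in target)
        c = sum(f in s for s in reference)
        p = fisher_exact_greater(a, len(target) - a, c, len(reference) - c)
        if p < alpha:
            results.append((f, p))
    return results


# Hypothetical toy fingerprints (labels are illustrative only).
target = ([{"motif:H3-YY", "germline:IGHV3"}] * 4
          + [{"motif:H3-YY"}] * 3
          + [set()])
reference = ([{"germline:IGHV3"}] * 4
             + [{"motif:H3-YY"}]
             + [set()] * 3)

for feature, p in enriched_features(target, reference):
    print(f"{feature}: p = {p:.4f}")
```

In this toy data the motif label appears in 7 of 8 target fingerprints but only 1 of 8 reference fingerprints, so it passes the threshold, while the germline label appears equally often in both sets and does not. A real analysis would test many features and would normally need a multiple-testing correction.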