Complete De Novo Assembly of Monoclonal Antibody Sequences

De novo protein sequencing is one of the key problems in mass spectrometry-based proteomics, especially for novel proteins such as monoclonal antibodies, for which genome information is often limited or unavailable. However, due to limitations in peptide fragmentation and coverage, as well as ambi...

Bibliographic Details
Main authors: Tran, Ngoc Hieu; Rahman, M. Ziaur; He, Lin; Xin, Lei; Shan, Baozhen; Li, Ming
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4999880/
https://www.ncbi.nlm.nih.gov/pubmed/27562653
http://dx.doi.org/10.1038/srep31730
author Tran, Ngoc Hieu
Rahman, M. Ziaur
He, Lin
Xin, Lei
Shan, Baozhen
Li, Ming
collection PubMed
description De novo protein sequencing is one of the key problems in mass spectrometry-based proteomics, especially for novel proteins such as monoclonal antibodies, for which genome information is often limited or unavailable. However, due to limitations in peptide fragmentation and coverage, as well as ambiguities in spectra interpretation, complete de novo assembly of unknown protein sequences remains challenging. To address this problem, we propose an integrated system, ALPS, which for the first time can automatically assemble full-length monoclonal antibody sequences. Our system integrates de novo sequencing peptides, their quality scores, and error-correction information from databases into a weighted de Bruijn graph to assemble protein sequences. We evaluated ALPS performance on two antibody data sets, each including a heavy chain and a light chain. The results show that ALPS was able to assemble three complete monoclonal antibody sequences of length 216–441 AA, at 100% coverage and 96.64–100% accuracy.
format Online
Article
Text
id pubmed-4999880
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4999880 2016-09-07 Sci Rep Article Nature Publishing Group 2016-08-26 /pmc/articles/PMC4999880/ /pubmed/27562653 http://dx.doi.org/10.1038/srep31730 Text en
Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Complete De Novo Assembly of Monoclonal Antibody Sequences
topic Article