Quick and clean: Cracking sentences encoded in E. coli by LC–MS/MS, de novo sequencing, and dictionary search

In this study, we faced the challenge of deciphering a protein that has been designed and expressed by E. coli in such a way that the amino acid sequence encodes two concatenated English sentences. The letters ‘O’ and ‘U’ in the sentence are both replaced by ‘K’ in the protein. The sequence cannot b...

Bibliographic Details
Main authors: Niu, Lili; Mann, Matthias
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6924291/
https://www.ncbi.nlm.nih.gov/pubmed/31890553
http://dx.doi.org/10.1016/j.euprot.2019.07.010
_version_ 1783481699661774848
author Niu, Lili
Mann, Matthias
author_facet Niu, Lili
Mann, Matthias
author_sort Niu, Lili
collection PubMed
description In this study, we faced the challenge of deciphering a protein that has been designed and expressed by E. coli in such a way that the amino acid sequence encodes two concatenated English sentences. The letters ‘O’ and ‘U’ in the sentence are both replaced by ‘K’ in the protein. The sequence cannot be found online and carried to-be-discovered modifications. With limited information in hand, to solve the challenge, we developed a workflow consisting of bottom-up proteomics, de novo sequencing and a bioinformatics pipeline for data processing and searching for frequently appearing words. We assembled a complete first question: “Have you ever wondered what the most fundamental limitations in life are?” and validated the result by sequence database search against a customized FASTA file. We also searched the spectra against an E. coli proteome database and found close to 600 endogenous, co-purified E. coli proteins and contaminants introduced during sample handling, which made the inference of the sentence very challenging. We conclude that E. coli can express English sentences, and that de novo sequencing combined with clever sequence database search strategies is a promising tool for the identification of uncharacterized proteins.
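The encoding rule and the word-frequency search mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `encode` and `count_encoded_words`, the example peptides, and the small word list are all hypothetical, and the sketch assumes that spaces and punctuation are simply dropped before encoding, which the record does not state.

```python
from collections import Counter


def encode(text: str) -> str:
    """Encode English text as described in the abstract: 'O' and 'U' become 'K'.

    Assumption (not stated in the record): non-letter characters are dropped.
    """
    letters = (c for c in text.upper() if c.isalpha())
    return "".join("K" if c in "OU" else c for c in letters)


def count_encoded_words(peptides, dictionary):
    """Count in how many peptides each encoded dictionary word occurs.

    Returns (counts, lookup), where lookup maps an encoded string back to
    every dictionary word that produces it, since the O/U -> K substitution
    is not reversible without context.
    """
    lookup = {}
    for word in dictionary:
        lookup.setdefault(encode(word), []).append(word.upper())

    counts = Counter()
    for peptide in peptides:
        for encoded_word in lookup:
            if encoded_word in peptide:
                counts[encoded_word] += 1
    return counts, lookup


if __name__ == "__main__":
    # Hypothetical de novo peptide strings, for illustration only.
    peptides = ["HAVEYKK", "YKKEVERWK", "WHATTHEMK"]
    counts, lookup = count_encoded_words(peptides, ["you", "the", "have"])
    for encoded_word, n in counts.most_common():
        print(encoded_word, lookup[encoded_word], n)
```

Because decoding is ambiguous (a 'K' may stand for 'K', 'O', or 'U'), the sketch encodes the dictionary rather than trying to decode the peptides, and keeps a reverse lookup so that candidate words can be reviewed by hand.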
format Online
Article
Text
id pubmed-6924291
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-69242912019-12-30 Quick and clean: Cracking sentences encoded in E. coli by LC–MS/MS, de novo sequencing, and dictionary search Niu, Lili Mann, Matthias EuPA Open Proteom Article In this study, we faced the challenge of deciphering a protein that has been designed and expressed by E. coli in such a way that the amino acid sequence encodes two concatenated English sentences. The letters ‘O’ and ‘U’ in the sentence are both replaced by ‘K’ in the protein. The sequence cannot be found online and carried to-be-discovered modifications. With limited information in hand, to solve the challenge, we developed a workflow consisting of bottom-up proteomics, de novo sequencing and a bioinformatics pipeline for data processing and searching for frequently appearing words. We assembled a complete first question: “Have you ever wondered what the most fundamental limitations in life are?” and validated the result by sequence database search against a customized FASTA file. We also searched the spectra against an E. coli proteome database and found close to 600 endogenous, co-purified E. coli proteins and contaminants introduced during sample handling, which made the inference of the sentence very challenging. We conclude that E. coli can express English sentences, and that de novo sequencing combined with clever sequence database search strategies is a promising tool for the identification of uncharacterized proteins. Elsevier 2019-07-29 /pmc/articles/PMC6924291/ /pubmed/31890553 http://dx.doi.org/10.1016/j.euprot.2019.07.010 Text en © 2019 Published by Elsevier B.V. on behalf of European Proteomics Association (EuPA). http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Article
Niu, Lili
Mann, Matthias
Quick and clean: Cracking sentences encoded in E. coli by LC–MS/MS, de novo sequencing, and dictionary search
title Quick and clean: Cracking sentences encoded in E. coli by LC–MS/MS, de novo sequencing, and dictionary search
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6924291/
https://www.ncbi.nlm.nih.gov/pubmed/31890553
http://dx.doi.org/10.1016/j.euprot.2019.07.010
work_keys_str_mv AT niulili quickandcleancrackingsentencesencodedinecolibylcmsmsdenovosequencinganddictionarysearch
AT mannmatthias quickandcleancrackingsentencesencodedinecolibylcmsmsdenovosequencinganddictionarysearch