
Quasi-prime peptides: identification of the shortest peptide sequences unique to a species

Determining the organisms present in a biosample has many important applications in agriculture, wildlife conservation, and healthcare. Here, we develop a universal fingerprint based on the identification of short peptides that are unique to a specific organism. We define quasi-prime peptides as seq...


Bibliographic Details
Main Authors: Mouratidis, Ioannis; Chan, Candace S Y; Chantzi, Nikol; Tsiatsianis, Georgios Christos; Hemberg, Martin; Ahituv, Nadav; Georgakopoulos-Soares, Ilias
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10124967/
https://www.ncbi.nlm.nih.gov/pubmed/37101657
http://dx.doi.org/10.1093/nargab/lqad039
_version_ 1785029939758628864
author Mouratidis, Ioannis
Chan, Candace S Y
Chantzi, Nikol
Tsiatsianis, Georgios Christos
Hemberg, Martin
Ahituv, Nadav
Georgakopoulos-Soares, Ilias
author_sort Mouratidis, Ioannis
collection PubMed
description Determining the organisms present in a biosample has many important applications in agriculture, wildlife conservation, and healthcare. Here, we develop a universal fingerprint based on the identification of short peptides that are unique to a specific organism. We define quasi-prime peptides as sequences that are found in only one species, and we analyzed proteomes from 21 875 species, from viruses to humans, and annotated the smallest peptide kmer sequences that are unique to a species and absent from all other proteomes. We also perform simulations across all reference proteomes and observe a lower than expected number of peptide kmers across species and taxonomies, indicating an enrichment for nullpeptides, sequences absent from a proteome. For humans, we find that quasi-primes are found in genes enriched for specific gene ontology terms, including proteasome and ATP and GTP catalysis. We also provide a set of quasi-prime peptides for a number of human pathogens and model organisms and further showcase its utility via two case studies for Mycobacterium tuberculosis and Vibrio cholerae, where we identify quasi-prime peptides in two transmembrane and extracellular proteins with relevance for pathogen detection. Our catalog of quasi-prime peptides provides the smallest unit of information that is specific to a single organism at the protein level, providing a versatile tool for species identification.
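The definition in the description above (a peptide k-mer found in exactly one species and absent from every other proteome) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the `max_k` cutoff, and the toy proteomes are all hypothetical, and a real analysis over thousands of proteomes would need a far more memory-efficient approach.

```python
from collections import defaultdict


def kmers(sequence, k):
    """Return the set of all length-k substrings of one protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def quasi_primes(proteomes, k):
    """Map each species to the k-mers present in its proteome and in no other."""
    owners = defaultdict(set)  # k-mer -> species whose proteome contains it
    for species, proteins in proteomes.items():
        for protein in proteins:
            for kmer in kmers(protein, k):
                owners[kmer].add(species)
    unique = {species: set() for species in proteomes}
    for kmer, species_set in owners.items():
        if len(species_set) == 1:
            unique[next(iter(species_set))].add(kmer)
    return unique


def shortest_quasi_primes(proteomes, species, max_k=7):
    """Return (k, peptides) for the smallest k with a species-unique k-mer.

    max_k is an arbitrary illustrative cutoff, not a value from the article.
    """
    for k in range(1, max_k + 1):
        found = quasi_primes(proteomes, k)[species]
        if found:
            return k, found
    return None, set()


# Toy proteomes (made-up sequences, for illustration only).
toy_proteomes = {
    "sp1": ["MKVL"],
    "sp2": ["LVKM"],
    "sp3": ["VMK"],
}

k, peptides = shortest_quasi_primes(toy_proteomes, "sp1")
print(k, sorted(peptides))  # 2 ['KV', 'VL']
```

In this toy set every single amino acid occurs in at least two species, so no 1-mer is unique. At k = 2, `MK` is shared between `sp1` and `sp3`, which leaves `KV` and `VL` as the shortest peptides unique to `sp1`.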
format Online
Article
Text
id pubmed-10124967
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10124967 2023-04-25 NAR Genom Bioinform Standard Article Oxford University Press 2023-04-24 /pmc/articles/PMC10124967/ /pubmed/37101657 http://dx.doi.org/10.1093/nargab/lqad039 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Quasi-prime peptides: identification of the shortest peptide sequences unique to a species
topic Standard Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10124967/
https://www.ncbi.nlm.nih.gov/pubmed/37101657
http://dx.doi.org/10.1093/nargab/lqad039