Detecting sequence signals in targeting peptides using deep learning
In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel stat...
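Main authors: | Jose Juan Almagro Armenteros, Marco Salvatore, Olof Emanuelsson, Ole Winther, Gunnar von Heijne, Arne Elofsson, Henrik Nielsen |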
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Life Science Alliance LLC, 2019 |
Subjects: | Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6769257/ https://www.ncbi.nlm.nih.gov/pubmed/31570514 http://dx.doi.org/10.26508/lsa.201900429 |
_version_ | 1783455209961291776 |
---|---|
author | Almagro Armenteros, Jose Juan Salvatore, Marco Emanuelsson, Olof Winther, Ole von Heijne, Gunnar Elofsson, Arne Nielsen, Henrik |
author_facet | Almagro Armenteros, Jose Juan Salvatore, Marco Emanuelsson, Olof Winther, Ole von Heijne, Gunnar Elofsson, Arne Nielsen, Henrik |
author_sort | Almagro Armenteros, Jose Juan |
collection | PubMed |
description | In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before. |
format | Online Article Text |
id | pubmed-6769257 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Life Science Alliance LLC |
record_format | MEDLINE/PubMed |
spelling | pubmed-6769257 2019-10-02 Detecting sequence signals in targeting peptides using deep learning Almagro Armenteros, Jose Juan Salvatore, Marco Emanuelsson, Olof Winther, Ole von Heijne, Gunnar Elofsson, Arne Nielsen, Henrik Life Sci Alliance Methods In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before. Life Science Alliance LLC 2019-09-30 /pmc/articles/PMC6769257/ /pubmed/31570514 http://dx.doi.org/10.26508/lsa.201900429 Text en © 2019 Armenteros et al. https://creativecommons.org/licenses/by/4.0/ This article is available under a Creative Commons License (Attribution 4.0 International, as described at https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Methods Almagro Armenteros, Jose Juan Salvatore, Marco Emanuelsson, Olof Winther, Ole von Heijne, Gunnar Elofsson, Arne Nielsen, Henrik Detecting sequence signals in targeting peptides using deep learning |
title | Detecting sequence signals in targeting peptides using deep learning |
title_full | Detecting sequence signals in targeting peptides using deep learning |
title_fullStr | Detecting sequence signals in targeting peptides using deep learning |
title_full_unstemmed | Detecting sequence signals in targeting peptides using deep learning |
title_short | Detecting sequence signals in targeting peptides using deep learning |
title_sort | detecting sequence signals in targeting peptides using deep learning |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6769257/ https://www.ncbi.nlm.nih.gov/pubmed/31570514 http://dx.doi.org/10.26508/lsa.201900429 |
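The abstract in this record reports statistics on the residue at position 2, that is, the one following the initial methionine. A minimal sketch of how such a tally could be computed is given here. It is not the TargetP 2.0 method or its attention analysis: the function names, the toy sequences, and the set of residues treated as permitting removal of the N-terminal methionine (small side chains: A, G, S, T, C, P, V) are assumptions made for illustration only.

```python
from collections import Counter

# Assumption: residues with small side chains are treated as permitting
# removal of the initiator methionine. This set is illustrative and is
# not taken from the record above.
MET_REMOVAL_PERMISSIVE = frozenset("AGSTCPV")


def second_residue_counts(sequences):
    """Count the residue at position 2 for sequences that start with M."""
    counts = Counter()
    for seq in sequences:
        seq = seq.strip().upper()
        # Skip sequences that are too short or lack an initial methionine.
        if len(seq) >= 2 and seq[0] == "M":
            counts[seq[1]] += 1
    return counts


def fraction_with(counts, residues):
    """Fraction of counted sequences whose position-2 residue is in `residues`."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[r] for r in residues) / total


# Hypothetical toy sequences, not real data.
toy_sequences = ["MASSML", "MAAT", "MKLV", "MSTA"]
toy_counts = second_residue_counts(toy_sequences)
alanine_fraction = fraction_with(toy_counts, "A")
permissive_fraction = fraction_with(toy_counts, MET_REMOVAL_PERMISSIVE)
print(alanine_fraction, permissive_fraction)
```

Running the same two functions separately on proteins with and without a targeting peptide would give the kind of side-by-side percentages quoted in the abstract, provided a labelled sequence set is available.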