
Bioinformatics methods for identification of amyloidogenic peptides show robustness to misannotated training data


Bibliographic Details
Main Authors: Szulc, Natalia, Burdukiewicz, Michał, Gąsior-Głogowska, Marlena, Wojciechowski, Jakub W., Chilimoniuk, Jarosław, Mackiewicz, Paweł, Šneideris, Tomas, Smirnovas, Vytautas, Kotulska, Malgorzata
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8076271/
https://www.ncbi.nlm.nih.gov/pubmed/33903613
http://dx.doi.org/10.1038/s41598-021-86530-6
author Szulc, Natalia
Burdukiewicz, Michał
Gąsior-Głogowska, Marlena
Wojciechowski, Jakub W.
Chilimoniuk, Jarosław
Mackiewicz, Paweł
Šneideris, Tomas
Smirnovas, Vytautas
Kotulska, Malgorzata
collection PubMed
description Several disorders are related to amyloid aggregation of proteins, for example Alzheimer’s and Parkinson’s diseases. Amyloid proteins form fibrils of aggregated beta structures. Fibril formation is preceded by the formation of oligomers, the most cytotoxic species. Determining amyloidogenicity is tedious and costly. The most reliable identification of amyloids is obtained with high-resolution microscopy, such as electron microscopy or atomic force microscopy (AFM). More frequently, less expensive and faster methods are used, especially infrared (IR) spectroscopy or Thioflavin T staining. Different experimental methods do not always agree, especially when amyloid peptides readily form oligomers rather than fibrils. This may lead to peptide misclassification and mislabeling. Several bioinformatics methods have been proposed for in-silico identification of amyloids, many of them based on machine learning. The effectiveness of these methods depends heavily on accurate annotation of the reference training data obtained from in-vitro experiments. We study how robust bioinformatics methods are to weak supervision, that is, to imperfect training data. AmyloGram and three other amyloid predictors were applied. The results showed that the bioinformatics tools can eliminate a certain degree of misannotation in the reference data, even when the misannotated peptides belonged to their training set. The computational results are supported by new experiments with IR and AFM methods.
format Online
Article
Text
id pubmed-8076271
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8076271 2021-04-27 Sci Rep Article
Nature Publishing Group UK 2021-04-26 /pmc/articles/PMC8076271/ /pubmed/33903613 http://dx.doi.org/10.1038/s41598-021-86530-6 Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Bioinformatics methods for identification of amyloidogenic peptides show robustness to misannotated training data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8076271/
https://www.ncbi.nlm.nih.gov/pubmed/33903613
http://dx.doi.org/10.1038/s41598-021-86530-6