The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis
The increasing availability of massive omics data requires improving the quality of reference databases and their annotations. The combination of full-length isoform sequencing (Iso-Seq) with short-read transcriptomics and proteomics has been successfully used for increasing proteoform characterizat...
Main authors: | García-Campa, Lara; Valledor, Luis; Pascual, Jesús |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920879/ https://www.ncbi.nlm.nih.gov/pubmed/36771596 http://dx.doi.org/10.3390/plants12030511 |
_version_ | 1784887178377035776 |
---|---|
author | García-Campa, Lara Valledor, Luis Pascual, Jesús |
author_facet | García-Campa, Lara Valledor, Luis Pascual, Jesús |
author_sort | García-Campa, Lara |
collection | PubMed |
description | The increasing availability of massive omics data requires improving the quality of reference databases and their annotations. The combination of full-length isoform sequencing (Iso-Seq) with short-read transcriptomics and proteomics has been successfully used for increasing proteoform characterization, which is a main ongoing goal in biology. However, the potential of including Oxford Nanopore Technologies Direct RNA Sequencing (ONT-DRS) data has not been explored. In this paper, we analyzed the impact of combining Iso-Seq- and ONT-DRS-derived data on the identification of proteoforms in Arabidopsis MS proteomics data. To this end, we selected a proteomics dataset corresponding to senescent leaves and we performed protein searches using three different protein databases: AtRTD2 and AtRTD3, built from the homonymous transcriptomes, regarded as the most complete and up-to-date available for the species; and a custom hybrid database combining AtRTD3 with publicly available ONT-DRS transcriptomics data generated from Arabidopsis leaves. Our results show that the inclusion and combination of long-read sequencing data from Iso-Seq and ONT-DRS into a proteogenomic workflow enhances proteoform characterization and discovery in bottom-up proteomics studies. This represents a great opportunity to further investigate biological systems at an unprecedented scale, although it brings challenges to current protein searching algorithms. |
format | Online Article Text |
id | pubmed-9920879 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-99208792023-02-12 The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis García-Campa, Lara Valledor, Luis Pascual, Jesús Plants (Basel) Article The increasing availability of massive omics data requires improving the quality of reference databases and their annotations. The combination of full-length isoform sequencing (Iso-Seq) with short-read transcriptomics and proteomics has been successfully used for increasing proteoform characterization, which is a main ongoing goal in biology. However, the potential of including Oxford Nanopore Technologies Direct RNA Sequencing (ONT-DRS) data has not been explored. In this paper, we analyzed the impact of combining Iso-Seq- and ONT-DRS-derived data on the identification of proteoforms in Arabidopsis MS proteomics data. To this end, we selected a proteomics dataset corresponding to senescent leaves and we performed protein searches using three different protein databases: AtRTD2 and AtRTD3, built from the homonymous transcriptomes, regarded as the most complete and up-to-date available for the species; and a custom hybrid database combining AtRTD3 with publicly available ONT-DRS transcriptomics data generated from Arabidopsis leaves. Our results show that the inclusion and combination of long-read sequencing data from Iso-Seq and ONT-DRS into a proteogenomic workflow enhances proteoform characterization and discovery in bottom-up proteomics studies. This represents a great opportunity to further investigate biological systems at an unprecedented scale, although it brings challenges to current protein searching algorithms. MDPI 2023-01-22 /pmc/articles/PMC9920879/ /pubmed/36771596 http://dx.doi.org/10.3390/plants12030511 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article García-Campa, Lara Valledor, Luis Pascual, Jesús The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis |
title | The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis |
title_full | The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis |
title_fullStr | The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis |
title_full_unstemmed | The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis |
title_short | The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis |
title_sort | integration of data from different long-read sequencing platforms enhances proteoform characterization in arabidopsis |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920879/ https://www.ncbi.nlm.nih.gov/pubmed/36771596 http://dx.doi.org/10.3390/plants12030511 |
work_keys_str_mv | AT garciacampalara theintegrationofdatafromdifferentlongreadsequencingplatformsenhancesproteoformcharacterizationinarabidopsis AT valledorluis theintegrationofdatafromdifferentlongreadsequencingplatformsenhancesproteoformcharacterizationinarabidopsis AT pascualjesus theintegrationofdatafromdifferentlongreadsequencingplatformsenhancesproteoformcharacterizationinarabidopsis AT garciacampalara integrationofdatafromdifferentlongreadsequencingplatformsenhancesproteoformcharacterizationinarabidopsis AT valledorluis integrationofdatafromdifferentlongreadsequencingplatformsenhancesproteoformcharacterizationinarabidopsis AT pascualjesus integrationofdatafromdifferentlongreadsequencingplatformsenhancesproteoformcharacterizationinarabidopsis |