Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications

We have carried out proteogenomic analysis of the breast cancer transcriptomic and proteomic data, available at The Clinical Proteomic Tumor Analysis Consortium resource, to identify novel peptides arising from alternatively spliced events as well as other noncanonical expressions. We used a pipelin...

Bibliographic Details
Main authors: Hari, P.S., Balakrishnan, Lavanya, Kotyada, Chaithanya, Everad John, Arivusudar, Tiwary, Shivani, Shah, Nameeta, Sirdeshmukh, Ravi
Format: Online Article Text
Language: English
Published: American Society for Biochemistry and Molecular Biology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9020135/
https://www.ncbi.nlm.nih.gov/pubmed/35227895
http://dx.doi.org/10.1016/j.mcpro.2022.100220
_version_ 1784689466384842752
author Hari, P.S.
Balakrishnan, Lavanya
Kotyada, Chaithanya
Everad John, Arivusudar
Tiwary, Shivani
Shah, Nameeta
Sirdeshmukh, Ravi
author_facet Hari, P.S.
Balakrishnan, Lavanya
Kotyada, Chaithanya
Everad John, Arivusudar
Tiwary, Shivani
Shah, Nameeta
Sirdeshmukh, Ravi
author_sort Hari, P.S.
collection PubMed
description We have carried out proteogenomic analysis of the breast cancer transcriptomic and proteomic data, available at The Clinical Proteomic Tumor Analysis Consortium resource, to identify novel peptides arising from alternatively spliced events as well as other noncanonical expressions. We used a pipeline that consisted of de novo transcript assembly, a six-frame-translated custom database, and a combination of search engines to identify novel peptides. A portfolio of 4,387 novel peptide sequences initially identified was further screened through the PepQuery validation tool (Clinical Proteomic Tumor Analysis Consortium), which yielded 1,558 novel peptides. We analyzed the 1,558 peptides validated through PepQuery to understand their functional and clinical significance, leaving the rest to be further verified using other validation tools and approaches. The novel peptides mapped to known gene sequences as well as to genomic regions yet undefined for translation: 580 novel peptides mapped to known protein-coding genes, 147 to non–protein-coding genes, and 831 belonged to novel translational sequences. The novel peptides belonging to protein-coding genes represented alternatively spliced events or 5′ or 3′ extensions, whereas others represented translation from pseudogenes, long noncoding RNAs, or novel peptides originating from uncharacterized protein-coding sequences, mostly from the intronic regions of known genes. Seventy-six of the 580 protein-coding genes were associated with cancer hallmark genes, which included key oncogenes, transcription factors, kinases, and cell surface receptors. Survival association analysis of the 76 novel peptide sequences revealed 10 of them to be significant, and we present a panel of six novel peptides whose high expression was found to be strongly associated with poor survival of patients with the human epidermal growth factor receptor 2–enriched subtype. Our analysis represents a landscape of novel peptides of different types that may be expressed in breast cancer tissues, whereas their presence in full-length functional proteins needs further investigation.
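The description mentions a six-frame-translated custom database built from assembled transcripts. As a rough illustration of what six-frame translation means, the sketch below translates a nucleotide sequence in three forward and three reverse-complement reading frames using the standard genetic code. It is not the authors' pipeline; the function names (`reverse_complement`, `translate`, `six_frame_translate`) and the example sequence are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq):
    """Translate a DNA sequence in all six reading frames.

    Keys are '+1'..'+3' for the forward strand and '-1'..'-3' for the
    reverse-complement strand (an illustrative naming convention).
    """
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the study.
    for frame, protein in six_frame_translate("ATGGCCTAA").items():
        print(frame, protein)
```

In a real proteogenomic workflow, each frame would typically be split at stop codons and the resulting stretches written to a FASTA file for the search engines; those steps are omitted here.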
format Online
Article
Text
id pubmed-9020135
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Biochemistry and Molecular Biology
record_format MEDLINE/PubMed
spelling pubmed-90201352022-04-22 Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications Hari, P.S. Balakrishnan, Lavanya Kotyada, Chaithanya Everad John, Arivusudar Tiwary, Shivani Shah, Nameeta Sirdeshmukh, Ravi Mol Cell Proteomics Research We have carried out proteogenomic analysis of the breast cancer transcriptomic and proteomic data, available at The Clinical Proteomic Tumor Analysis Consortium resource, to identify novel peptides arising from alternatively spliced events as well as other noncanonical expressions. We used a pipeline that consisted of de novo transcript assembly, six frame-translated custom database, and a combination of search engines to identify novel peptides. A portfolio of 4,387 novel peptide sequences initially identified was further screened through PepQuery validation tool (Clinical Proteomic Tumor Analysis Consortium), which yielded 1,558 novel peptides. We considered the dataset of 1,558 validated through PepQuery to understand their functional and clinical significance, leaving the rest to be further verified using other validation tools and approaches. The novel peptides mapped to the known gene sequences as well as to genomic regions yet undefined for translation, 580 novel peptides mapped to known protein-coding genes, 147 to non–protein-coding genes, and 831 belonged to novel translational sequences. The novel peptides belonging to protein-coding genes represented alternatively spliced events or 5′ or 3′ extensions, whereas others represented translation from pseudogenes, long noncoding RNAs, or novel peptides originating from uncharacterized protein-coding sequences—mostly from the intronic regions of known genes. Seventy-six of the 580 protein-coding genes were associated with cancer hallmark genes, which included key oncogenes, transcription factors, kinases, and cell surface receptors. 
Survival association analysis of the 76 novel peptide sequences revealed 10 of them to be significant, and we present a panel of six novel peptides, whose high expression was found to be strongly associated with poor survival of patients with human epidermal growth factor receptor 2–enriched subtype. Our analysis represents a landscape of novel peptides of different types that may be expressed in breast cancer tissues, whereas their presence in full-length functional proteins needs further investigations. American Society for Biochemistry and Molecular Biology 2022-02-26 /pmc/articles/PMC9020135/ /pubmed/35227895 http://dx.doi.org/10.1016/j.mcpro.2022.100220 Text en © 2022 The Authors https://creativecommons.org/licenses/by/4.0/This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Research
Hari, P.S.
Balakrishnan, Lavanya
Kotyada, Chaithanya
Everad John, Arivusudar
Tiwary, Shivani
Shah, Nameeta
Sirdeshmukh, Ravi
Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications
title Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications
title_full Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications
title_fullStr Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications
title_full_unstemmed Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications
title_short Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications
title_sort proteogenomic analysis of breast cancer transcriptomic and proteomic data, using de novo transcript assembly: genome-wide identification of novel peptides and clinical implications
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9020135/
https://www.ncbi.nlm.nih.gov/pubmed/35227895
http://dx.doi.org/10.1016/j.mcpro.2022.100220
work_keys_str_mv AT harips proteogenomicanalysisofbreastcancertranscriptomicandproteomicdatausingdenovotranscriptassemblygenomewideidentificationofnovelpeptidesandclinicalimplications
AT balakrishnanlavanya proteogenomicanalysisofbreastcancertranscriptomicandproteomicdatausingdenovotranscriptassemblygenomewideidentificationofnovelpeptidesandclinicalimplications
AT kotyadachaithanya proteogenomicanalysisofbreastcancertranscriptomicandproteomicdatausingdenovotranscriptassemblygenomewideidentificationofnovelpeptidesandclinicalimplications
AT everadjohnarivusudar proteogenomicanalysisofbreastcancertranscriptomicandproteomicdatausingdenovotranscriptassemblygenomewideidentificationofnovelpeptidesandclinicalimplications
AT tiwaryshivani proteogenomicanalysisofbreastcancertranscriptomicandproteomicdatausingdenovotranscriptassemblygenomewideidentificationofnovelpeptidesandclinicalimplications
AT shahnameeta proteogenomicanalysisofbreastcancertranscriptomicandproteomicdatausingdenovotranscriptassemblygenomewideidentificationofnovelpeptidesandclinicalimplications
AT sirdeshmukhravi proteogenomicanalysisofbreastcancertranscriptomicandproteomicdatausingdenovotranscriptassemblygenomewideidentificationofnovelpeptidesandclinicalimplications