
SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics



Bibliographic Details
Main Authors: Zhang, Fan; Drabier, Renee
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850988/
https://www.ncbi.nlm.nih.gov/pubmed/24267658
http://dx.doi.org/10.1186/1471-2105-14-S14-S13
_version_ 1782294205372039168
author Zhang, Fan
Drabier, Renee
author_facet Zhang, Fan
Drabier, Renee
author_sort Zhang, Fan
collection PubMed
description BACKGROUND: Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing at the protein level offers advantages over analysis at the mRNA level. Combining an alternative splicing database with tandem mass spectrometry provides a powerful technique for identifying, analyzing, and characterizing potential novel alternative splicing protein isoforms from proteomics data. Therefore, building on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database that 1) provides greater coverage of genes, transcripts, and alternative splicing, 2) focuses exclusively on alternative splicing, and 3) supports context-specific alternative splicing analysis. RESULTS: We used a three-step pipeline to create a synthetic alternative splicing database (SASD) for identifying novel alternative splicing isoforms and interpreting them in the context of pathway, disease, drug, and organ specificity or a custom gene set, with maximum coverage and an exclusive focus on alternative splicing. First, we extracted information on the gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 alternative splicing peptides, and it covers about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display, and download alternative splicing data for a single gene/transcript/protein, custom gene set, pathway, disease, drug, or organ.
Moreover, the quality of the database was validated by comparison with other known databases and by two case studies: 1) in liver cancer and 2) in breast cancer. CONCLUSIONS: The SASD provides the scientific community with an efficient means to identify, analyze, and characterize novel Exon Skipping and Intron Retention protein isoforms from mass spectrometry and to interpret them in the context of pathway, disease, drug, and organ specificity or a custom gene set, with maximum coverage and an exclusive focus on alternative splicing.
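The last two pipeline steps described above (compiling artificial splicing transcripts, then translating them into peptides) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy exon and intron sequences, and the choice to translate each artificial transcript in reading frame 0 up to the first stop codon are all assumptions made for illustration. The real database is built from Ensembl Genes 71 gene structures, and the abstract does not specify how reading frames or peptide boundaries are handled.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def exon_skipping_transcripts(exons):
    """Yield (label, sequence) for every pair of non-adjacent exons joined directly."""
    for i in range(len(exons)):
        for j in range(i + 2, len(exons)):
            yield f"skip_E{i + 1}_E{j + 1}", exons[i] + exons[j]


def intron_retention_transcripts(exons, introns):
    """Yield (label, sequence) for each intron retained between its flanking exons."""
    for i, intron in enumerate(introns):
        yield f"retain_I{i + 1}", exons[i] + intron + exons[i + 1]


def synthetic_peptides(exons, introns):
    """Map each artificial transcript label to its translated peptide."""
    transcripts = list(exon_skipping_transcripts(exons))
    transcripts += list(intron_retention_transcripts(exons, introns))
    return {label: translate(seq) for label, seq in transcripts}


# Hypothetical toy gene: three exons and the two introns between them.
toy_exons = ["ATGGCC", "AAAGGG", "TTTTGA"]
toy_introns = ["GTAAGT", "GTGAGC"]
toy_peptides = synthetic_peptides(toy_exons, toy_introns)
```

In a proteomics workflow, peptides generated this way would serve as the search space against which tandem mass spectra are matched; a spectrum that matches a junction-spanning peptide is evidence for the corresponding exon skipping or intron retention event.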
format Online
Article
Text
id pubmed-3850988
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-38509882013-12-13 SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics Zhang, Fan Drabier, Renee BMC Bioinformatics Proceedings BioMed Central 2013-10-09 /pmc/articles/PMC3850988/ /pubmed/24267658 http://dx.doi.org/10.1186/1471-2105-14-S14-S13 Text en Copyright © 2013 Zhang and Drabier; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Zhang, Fan
Drabier, Renee
SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
title SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
title_full SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
title_fullStr SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
title_full_unstemmed SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
title_short SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics
title_sort sasd: the synthetic alternative splicing database for identifying novel isoform from proteomics
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850988/
https://www.ncbi.nlm.nih.gov/pubmed/24267658
http://dx.doi.org/10.1186/1471-2105-14-S14-S13
work_keys_str_mv AT zhangfan sasdthesyntheticalternativesplicingdatabaseforidentifyingnovelisoformfromproteomics
AT drabierrenee sasdthesyntheticalternativesplicingdatabaseforidentifyingnovelisoformfromproteomics