
KAOS: a new automated computational method for the identification of overexpressed genes

BACKGROUND: Kinase over-expression and activation as a consequence of gene amplification or gene fusion events is a well-known mechanism of tumorigenesis. The search for novel rearrangements of kinases or other druggable genes may contribute to understanding the biology of cancerogenesis, as well as...


Bibliographic Details
Main authors: Nuzzo, Angelo; Carapezza, Giovanni; Di Bella, Sebastiano; Pulvirenti, Alfredo; Isacchi, Antonella; Bosotti, Roberta
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5123341/
https://www.ncbi.nlm.nih.gov/pubmed/28185541
http://dx.doi.org/10.1186/s12859-016-1188-1
_version_ 1782469715150503936
author Nuzzo, Angelo
Carapezza, Giovanni
Di Bella, Sebastiano
Pulvirenti, Alfredo
Isacchi, Antonella
Bosotti, Roberta
author_facet Nuzzo, Angelo
Carapezza, Giovanni
Di Bella, Sebastiano
Pulvirenti, Alfredo
Isacchi, Antonella
Bosotti, Roberta
author_sort Nuzzo, Angelo
collection PubMed
description BACKGROUND: Kinase over-expression and activation as a consequence of gene amplification or gene fusion events is a well-known mechanism of tumorigenesis. The search for novel rearrangements of kinases or other druggable genes may contribute to understanding the biology of cancerogenesis, as well as lead to the identification of new candidate targets for drug discovery. However, this requires the ability to query large datasets to identify rare events occurring in very small fractions (1–3%) of different tumor subtypes. This task differs from what conventional tools normally do, which is to find genes differentially expressed between two experimental conditions. RESULTS: We propose a computational method aimed at the automatic identification of genes that are selectively over-expressed in a very small fraction of samples within a specific tissue. The method does not require a healthy counterpart or a reference sample for the analysis and can therefore also be applied to transcriptional data generated from cell lines. In our implementation the tool can use gene-expression data from microarray experiments, as well as data generated by RNA-Seq technologies. CONCLUSIONS: The method was implemented as a publicly available, user-friendly tool called KAOS (Kinase Automatic Outliers Search). The tool enables the automatic execution of iterative searches for the identification of extreme outliers and for the graphical visualization of the results. Filters can be applied to select the most significant outliers. The performance of the tool was evaluated using a synthetic dataset and compared to state-of-the-art tools. KAOS performs particularly well in detecting genes that are overexpressed in few samples or when an extreme outlier stands out on a highly variable expression background. To validate the method on real case studies, we used publicly available tumor cell line microarray data, and we were able to identify genes that are known to be overexpressed in specific samples, as well as novel ones. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1188-1) contains supplementary material, which is available to authorized users.
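Note: the description above refers to an iterative search for extreme outliers, that is, genes over-expressed in a very small fraction of samples, without a reference sample. The following is a minimal Python sketch of that general idea using a generic median/MAD rule. It is not the published KAOS algorithm; the function names, the threshold `k`, the cap `max_fraction` (loosely based on the 1–3% range in the description), and the gene labels are illustrative assumptions.

```python
from statistics import median


def robust_outliers(values, k=5.0, max_fraction=0.03):
    """Return indices of samples that are extreme high outliers for one gene.

    Generic median/MAD rule for illustration only; NOT the KAOS algorithm.
    `k` and `max_fraction` are assumed, hypothetical parameters.
    """
    remaining = list(range(len(values)))
    flagged = []
    # Allow at least one flagged sample, even in small cohorts.
    limit = max(1, int(len(values) * max_fraction))
    while len(flagged) < limit and len(remaining) > 2:
        # Candidate: the highest remaining expression value.
        top = max(remaining, key=lambda i: values[i])
        rest = [values[i] for i in remaining if i != top]
        center = median(rest)
        mad = median([abs(v - center) for v in rest])
        scale = 1.4826 * mad  # MAD rescaled to match a normal standard deviation
        # Stop when the spread cannot be estimated or the candidate is not extreme.
        if scale == 0 or values[top] <= center + k * scale:
            break
        flagged.append(top)
        remaining.remove(top)
    return flagged


def scan_genes(matrix, **kwargs):
    """Map each gene name to its flagged sample indices, skipping genes with none."""
    hits = {}
    for gene, values in matrix.items():
        idx = robust_outliers(values, **kwargs)
        if idx:
            hits[gene] = idx
    return hits
```

Because each candidate is compared against the median and MAD of the other samples only, no healthy or reference sample is needed, which mirrors the property stated in the description. A production tool would also need to handle normalization, ties, and genes with zero spread.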
format Online
Article
Text
id pubmed-5123341
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5123341 2016-12-06 KAOS: a new automated computational method for the identification of overexpressed genes Nuzzo, Angelo Carapezza, Giovanni Di Bella, Sebastiano Pulvirenti, Alfredo Isacchi, Antonella Bosotti, Roberta BMC Bioinformatics Research BioMed Central 2016-11-08 /pmc/articles/PMC5123341/ /pubmed/28185541 http://dx.doi.org/10.1186/s12859-016-1188-1 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Nuzzo, Angelo
Carapezza, Giovanni
Di Bella, Sebastiano
Pulvirenti, Alfredo
Isacchi, Antonella
Bosotti, Roberta
KAOS: a new automated computational method for the identification of overexpressed genes
title KAOS: a new automated computational method for the identification of overexpressed genes
title_full KAOS: a new automated computational method for the identification of overexpressed genes
title_fullStr KAOS: a new automated computational method for the identification of overexpressed genes
title_full_unstemmed KAOS: a new automated computational method for the identification of overexpressed genes
title_short KAOS: a new automated computational method for the identification of overexpressed genes
title_sort kaos: a new automated computational method for the identification of overexpressed genes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5123341/
https://www.ncbi.nlm.nih.gov/pubmed/28185541
http://dx.doi.org/10.1186/s12859-016-1188-1
work_keys_str_mv AT nuzzoangelo kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes
AT carapezzagiovanni kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes
AT dibellasebastiano kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes
AT pulvirentialfredo kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes
AT isacchiantonella kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes
AT bosottiroberta kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes