KAOS: a new automated computational method for the identification of overexpressed genes
BACKGROUND: Kinase over-expression and activation as a consequence of gene amplification or gene fusion events is a well-known mechanism of tumorigenesis. The search for novel rearrangements of kinases or other druggable genes may contribute to understanding the biology of cancerogenesis, as well as...
Main authors: | Nuzzo, Angelo; Carapezza, Giovanni; Di Bella, Sebastiano; Pulvirenti, Alfredo; Isacchi, Antonella; Bosotti, Roberta |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5123341/ https://www.ncbi.nlm.nih.gov/pubmed/28185541 http://dx.doi.org/10.1186/s12859-016-1188-1 |
_version_ | 1782469715150503936 |
---|---|
author | Nuzzo, Angelo Carapezza, Giovanni Di Bella, Sebastiano Pulvirenti, Alfredo Isacchi, Antonella Bosotti, Roberta |
author_facet | Nuzzo, Angelo Carapezza, Giovanni Di Bella, Sebastiano Pulvirenti, Alfredo Isacchi, Antonella Bosotti, Roberta |
author_sort | Nuzzo, Angelo |
collection | PubMed |
description | BACKGROUND: Kinase over-expression and activation as a consequence of gene amplification or gene fusion events is a well-known mechanism of tumorigenesis. The search for novel rearrangements of kinases or other druggable genes may contribute to understanding the biology of cancerogenesis, as well as lead to the identification of new candidate targets for drug discovery. However, this requires the ability to query large datasets to identify rare events occurring in very small fractions (1–3%) of different tumor subtypes. This task is different from what is normally done by conventional tools, which find genes differentially expressed between two experimental conditions. RESULTS: We propose a computational method aimed at the automatic identification of genes which are selectively over-expressed in a very small fraction of samples within a specific tissue. The method does not require a healthy counterpart or a reference sample for the analysis and can therefore also be applied to transcriptional data generated from cell lines. In our implementation the tool can use gene-expression data from microarray experiments, as well as data generated by RNA-Seq technologies. CONCLUSIONS: The method was implemented as a publicly available, user-friendly tool called KAOS (Kinase Automatic Outliers Search). The tool enables the automatic execution of iterative searches for the identification of extreme outliers and for the graphical visualization of the results. Filters can be applied to select the most significant outliers. The performance of the tool was evaluated using a synthetic dataset and compared to state-of-the-art tools. KAOS performs particularly well in detecting genes that are overexpressed in few samples or when an extreme outlier stands out on a highly variable expression background. To validate the method on real case studies, we used publicly available tumor cell line microarray data, and we were able to identify genes which are known to be overexpressed in specific samples, as well as novel ones. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1188-1) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5123341 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5123341 2016-12-06 KAOS: a new automated computational method for the identification of overexpressed genes Nuzzo, Angelo Carapezza, Giovanni Di Bella, Sebastiano Pulvirenti, Alfredo Isacchi, Antonella Bosotti, Roberta BMC Bioinformatics Research BACKGROUND: Kinase over-expression and activation as a consequence of gene amplification or gene fusion events is a well-known mechanism of tumorigenesis. The search for novel rearrangements of kinases or other druggable genes may contribute to understanding the biology of cancerogenesis, as well as lead to the identification of new candidate targets for drug discovery. However, this requires the ability to query large datasets to identify rare events occurring in very small fractions (1–3%) of different tumor subtypes. This task is different from what is normally done by conventional tools, which find genes differentially expressed between two experimental conditions. RESULTS: We propose a computational method aimed at the automatic identification of genes which are selectively over-expressed in a very small fraction of samples within a specific tissue. The method does not require a healthy counterpart or a reference sample for the analysis and can therefore also be applied to transcriptional data generated from cell lines. In our implementation the tool can use gene-expression data from microarray experiments, as well as data generated by RNA-Seq technologies. CONCLUSIONS: The method was implemented as a publicly available, user-friendly tool called KAOS (Kinase Automatic Outliers Search). The tool enables the automatic execution of iterative searches for the identification of extreme outliers and for the graphical visualization of the results. Filters can be applied to select the most significant outliers. The performance of the tool was evaluated using a synthetic dataset and compared to state-of-the-art tools. KAOS performs particularly well in detecting genes that are overexpressed in few samples or when an extreme outlier stands out on a highly variable expression background. To validate the method on real case studies, we used publicly available tumor cell line microarray data, and we were able to identify genes which are known to be overexpressed in specific samples, as well as novel ones. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1188-1) contains supplementary material, which is available to authorized users. BioMed Central 2016-11-08 /pmc/articles/PMC5123341/ /pubmed/28185541 http://dx.doi.org/10.1186/s12859-016-1188-1 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Nuzzo, Angelo Carapezza, Giovanni Di Bella, Sebastiano Pulvirenti, Alfredo Isacchi, Antonella Bosotti, Roberta KAOS: a new automated computational method for the identification of overexpressed genes |
title | KAOS: a new automated computational method for the identification of overexpressed genes |
title_full | KAOS: a new automated computational method for the identification of overexpressed genes |
title_fullStr | KAOS: a new automated computational method for the identification of overexpressed genes |
title_full_unstemmed | KAOS: a new automated computational method for the identification of overexpressed genes |
title_short | KAOS: a new automated computational method for the identification of overexpressed genes |
title_sort | kaos: a new automated computational method for the identification of overexpressed genes |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5123341/ https://www.ncbi.nlm.nih.gov/pubmed/28185541 http://dx.doi.org/10.1186/s12859-016-1188-1 |
work_keys_str_mv | AT nuzzoangelo kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes AT carapezzagiovanni kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes AT dibellasebastiano kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes AT pulvirentialfredo kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes AT isacchiantonella kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes AT bosottiroberta kaosanewautomatedcomputationalmethodfortheidentificationofoverexpressedgenes |