
On detection and assessment of statistical significance of Genomic Islands

BACKGROUND: Many of the available methods for detecting Genomic Islands (GIs) in prokaryotic genomes use markers such as transposons, proximal tRNAs, flanking repeats etc., or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in...


Bibliographic Details
Main Authors: Chatterjee, Raghunath, Chaudhuri, Keya, Chaudhuri, Probal
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2362129/
https://www.ncbi.nlm.nih.gov/pubmed/18380895
http://dx.doi.org/10.1186/1471-2164-9-150
author Chatterjee, Raghunath
Chaudhuri, Keya
Chaudhuri, Probal
collection PubMed
description BACKGROUND: Many of the available methods for detecting Genomic Islands (GIs) in prokaryotic genomes use markers such as transposons, proximal tRNAs and flanking repeats, or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in GC content or codon and amino acid usage of the islands. However, these methods either do not use any formal statistical test of significance or use statistical tests for which the critical values and the P-values are not adequately justified. We propose a method that is unsupervised in nature and uses Monte-Carlo statistical tests based on randomly selected segments of a chromosome. Such tests are supported by precise statistical distribution theory, and consequently the resulting P-values are reliable for decision making. RESULTS: Our algorithm (named Design-Island, an acronym for Detection of Statistically Significant Genomic Island) runs in two phases. Some 'putative GIs' are identified in the first phase, and those are refined into smaller segments containing horizontally acquired genes in the refinement phase. The method is applied to the Salmonella typhi CT18 genome, leading to the discovery of several new pathogenicity, antibiotic resistance and metabolic islands that were missed by earlier methods. Many of these islands contain mobile genetic elements such as phage-mediated genes, transposons, integrases and IS elements, confirming their horizontal acquisition. CONCLUSION: The proposed method is based on statistical tests supported by precise distribution theory and reliable P-values, along with a technique for visualizing statistically significant islands. The method performs better than many other well-known methods in terms of sensitivity and accuracy, and it is comparable to other methods in terms of specificity.
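The general idea of a Monte-Carlo test based on randomly selected chromosome segments can be illustrated with a minimal Python sketch. This is not the Design-Island algorithm, whose test statistics and two-phase procedure are not given in this record. The sketch assumes GC content as the test statistic, a two-sided comparison against the whole-chromosome GC content, and reference segments of the same length as the candidate; the names `gc_content` and `monte_carlo_p_value` are hypothetical.

```python
import random


def gc_content(seq):
    """Return the fraction of G or C bases in seq (case-insensitive)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def monte_carlo_p_value(chromosome, start, length, n_samples=999, rng=None):
    """Empirical P-value for a candidate window (illustrative assumption only).

    The statistic is the absolute deviation of the window's GC content from
    the whole-chromosome GC content. The reference distribution comes from
    n_samples windows of the same length drawn at uniformly random positions.
    """
    if length <= 0 or start < 0 or start + length > len(chromosome):
        raise ValueError("window must lie inside the chromosome")
    rng = rng or random.Random()

    genome_gc = gc_content(chromosome)
    observed = abs(gc_content(chromosome[start:start + length]) - genome_gc)

    max_start = len(chromosome) - length
    at_least_as_extreme = 0
    for _ in range(n_samples):
        s = rng.randint(0, max_start)  # randint includes both endpoints
        deviation = abs(gc_content(chromosome[s:s + length]) - genome_gc)
        if deviation >= observed:
            at_least_as_extreme += 1

    # Count the observed window itself so the P-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_samples + 1)
```

A call such as `monte_carlo_p_value(chromosome, 4000, 200, rng=random.Random(0))` returns a value in (0, 1]; small values indicate that the window's GC content is unusual compared with randomly chosen segments of the same chromosome. Adding one to both numerator and denominator is a common convention for Monte-Carlo P-values and is an assumption here, not a detail taken from the article.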
format Text
id pubmed-2362129
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2362129 2008-05-01 On detection and assessment of statistical significance of Genomic Islands Chatterjee, Raghunath Chaudhuri, Keya Chaudhuri, Probal BMC Genomics Methodology Article BioMed Central 2008-04-01 /pmc/articles/PMC2362129/ /pubmed/18380895 http://dx.doi.org/10.1186/1471-2164-9-150 Text en Copyright © 2008 Chatterjee et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title On detection and assessment of statistical significance of Genomic Islands
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2362129/
https://www.ncbi.nlm.nih.gov/pubmed/18380895
http://dx.doi.org/10.1186/1471-2164-9-150