On detection and assessment of statistical significance of Genomic Islands
BACKGROUND: Many of the available methods for detecting Genomic Islands (GIs) in prokaryotic genomes use markers such as transposons, proximal tRNAs, flanking repeats etc., or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in...
Main authors: | Chatterjee, Raghunath; Chaudhuri, Keya; Chaudhuri, Probal |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2362129/ https://www.ncbi.nlm.nih.gov/pubmed/18380895 http://dx.doi.org/10.1186/1471-2164-9-150 |
_version_ | 1782153384536571904 |
---|---|
author | Chatterjee, Raghunath Chaudhuri, Keya Chaudhuri, Probal |
author_sort | Chatterjee, Raghunath |
collection | PubMed |
description | BACKGROUND: Many of the available methods for detecting Genomic Islands (GIs) in prokaryotic genomes use markers such as transposons, proximal tRNAs, flanking repeats etc., or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in GC content or codon and amino acid usage of the islands. However, these methods either do not use any formal statistical test of significance or use statistical tests for which the critical values and the P-values are not adequately justified. We propose a method that is unsupervised and uses Monte-Carlo statistical tests based on randomly selected segments of a chromosome. Such tests are supported by precise statistical distribution theory, and consequently, the resulting P-values are quite reliable for making the decision. RESULTS: Our algorithm (named Design-Island, an acronym for Detection of Statistically Significant Genomic Island) runs in two phases. Some 'putative GIs' are identified in the first phase, and those are refined into smaller segments containing horizontally acquired genes in the refinement phase. This method is applied to the Salmonella typhi CT18 genome, leading to the discovery of several new pathogenicity, antibiotic resistance and metabolic islands that were missed by earlier methods. Many of these islands contain mobile genetic elements such as phage-mediated genes, transposons, integrases and IS elements, confirming their horizontal acquisition. CONCLUSION: The proposed method is based on statistical tests supported by precise distribution theory and reliable P-values, along with a technique for visualizing statistically significant islands. The performance of our method is better than that of many other well-known methods in terms of sensitivity and accuracy, and its specificity is comparable to that of other methods. |
format | Text |
id | pubmed-2362129 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2362129 2008-05-01 On detection and assessment of statistical significance of Genomic Islands Chatterjee, Raghunath Chaudhuri, Keya Chaudhuri, Probal BMC Genomics Methodology Article BioMed Central 2008-04-01 /pmc/articles/PMC2362129/ /pubmed/18380895 http://dx.doi.org/10.1186/1471-2164-9-150 Text en Copyright © 2008 Chatterjee et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | On detection and assessment of statistical significance of Genomic Islands |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2362129/ https://www.ncbi.nlm.nih.gov/pubmed/18380895 http://dx.doi.org/10.1186/1471-2164-9-150 |
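
The abstract describes Monte-Carlo tests based on randomly selected segments of a chromosome but does not say which test statistic is used. As a rough illustration only, the sketch below assumes the statistic is the absolute deviation of a window's GC content from the genome-wide GC content. The function names, the choice of statistic, the sample count and the toy sequence are all hypothetical and are not taken from the Design-Island paper.

```python
import random


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GCgc" for base in seq) / len(seq)


def monte_carlo_p_value(genome, start, length, n_samples=999, seed=0):
    """Monte-Carlo P-value for the window genome[start:start + length].

    Assumed statistic (not from the paper): |GC(window) - GC(genome)|.
    The window is compared with n_samples segments of the same length
    drawn uniformly at random from the same sequence.
    """
    if length <= 0 or start < 0 or start + length > len(genome):
        raise ValueError("window must lie inside the genome")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    baseline = gc_fraction(genome)
    observed = abs(gc_fraction(genome[start:start + length]) - baseline)
    max_start = len(genome) - length
    hits = 0
    for _ in range(n_samples):
        s = rng.randint(0, max_start)  # inclusive on both ends
        deviation = abs(gc_fraction(genome[s:s + length]) - baseline)
        if deviation >= observed:
            hits += 1
    # Adding one to numerator and denominator counts the observed window
    # itself, so the estimated P-value is never exactly zero.
    return (1 + hits) / (1 + n_samples)


# Toy sequence: an AT-only background with a 100-base GC-only insert.
toy_genome = "AT" * 500 + "GC" * 50 + "AT" * 500
p_insert = monte_carlo_p_value(toy_genome, start=1000, length=100)
p_background = monte_carlo_p_value(toy_genome, start=0, length=100)
```

On this toy sequence the GC-only insert should receive a small P-value and a background window a large one. A real analysis would need a statistic and a segment-sampling scheme justified by the distribution theory that the abstract refers to.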