
Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm

Genomic islands are associated with microbial adaptation and have genomic characteristics that differ from those of the host. Many methods have therefore been proposed to distinguish genomic islands from the rest of the genome by evaluating sequence composition. Many sequence features have been proposed, but many...


Bibliographic Details
Main Authors: Onesime, Mbulayi, Yang, Zhenyu, Dai, Qi
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8169257/
https://www.ncbi.nlm.nih.gov/pubmed/34122622
http://dx.doi.org/10.1155/2021/9969751
author Onesime, Mbulayi
Yang, Zhenyu
Dai, Qi
author_facet Onesime, Mbulayi
Yang, Zhenyu
Dai, Qi
author_sort Onesime, Mbulayi
collection PubMed
description Genomic islands are associated with microbial adaptation and have genomic characteristics that differ from those of the host. Many methods have therefore been proposed to distinguish genomic islands from the rest of the genome by evaluating sequence composition. Many sequence features have been proposed, but many of them have not yet been applied to the identification of genomic islands. In this paper, we present a scheme to predict genomic islands using the chi-square test and the random forest algorithm. We extract seven kinds of sequence features and select the important ones with the chi-square test. The selected features are then input into the random forest to predict genomic islands. Three experiments and a comparison show that the proposed method achieves the best performance. This understanding can be useful for designing more powerful methods for genomic island prediction.
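The feature-selection step named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which seven feature types are used or how the chi-square statistic is set up, so the k-mer count features, the observed-versus-expected formulation, and the function names `kmer_counts`, `chi2_scores`, and `select_top` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=2):
    """Count overlapping k-mers over the alphabet ACGT (assumed feature type)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed column order: AA, AC, AG, AT, CA, ... for k=2.
    return [counts.get("".join(p), 0) for p in product("ACGT", repeat=k)]


def chi2_scores(X, y):
    """Chi-square statistic per feature for non-negative count features.

    For each feature, the observed value is the feature total within a class
    and the expected value is the class frequency times the overall feature
    total (an assumed formulation, not taken from the paper).
    """
    classes = sorted(set(y))
    n_samples = len(y)
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        total = sum(row[j] for row in X)
        stat = 0.0
        for c in classes:
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            class_prob = sum(1 for label in y if label == c) / n_samples
            expected = class_prob * total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
        scores.append(stat)
    return scores


def select_top(scores, m):
    """Return the indices of the m highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:m]


# Toy data: feature 0 differs between classes, feature 1 does not.
X_toy = [[10, 5], [12, 5], [2, 5], [0, 5]]
y_toy = [1, 1, 0, 0]
toy_scores = chi2_scores(X_toy, y_toy)
print(select_top(toy_scores, 1))
```

In a full pipeline along the lines the description outlines, the selected feature columns would then be passed to a random forest classifier (for example, one from an existing machine-learning library) trained on windows labelled as island or non-island.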
format Online
Article
Text
id pubmed-8169257
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8169257 2021-06-11 Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm Onesime, Mbulayi Yang, Zhenyu Dai, Qi Comput Math Methods Med Research Article Genomic islands are associated with microbial adaptation and have genomic characteristics that differ from those of the host. Many methods have therefore been proposed to distinguish genomic islands from the rest of the genome by evaluating sequence composition. Many sequence features have been proposed, but many of them have not yet been applied to the identification of genomic islands. In this paper, we present a scheme to predict genomic islands using the chi-square test and the random forest algorithm. We extract seven kinds of sequence features and select the important ones with the chi-square test. The selected features are then input into the random forest to predict genomic islands. Three experiments and a comparison show that the proposed method achieves the best performance. This understanding can be useful for designing more powerful methods for genomic island prediction. Hindawi 2021-05-24 /pmc/articles/PMC8169257/ /pubmed/34122622 http://dx.doi.org/10.1155/2021/9969751 Text en Copyright © 2021 Mbulayi Onesime et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Onesime, Mbulayi
Yang, Zhenyu
Dai, Qi
Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm
title Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm
title_full Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm
title_fullStr Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm
title_full_unstemmed Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm
title_short Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm
title_sort genomic island prediction via chi-square test and random forest algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8169257/
https://www.ncbi.nlm.nih.gov/pubmed/34122622
http://dx.doi.org/10.1155/2021/9969751
work_keys_str_mv AT onesimembulayi genomicislandpredictionviachisquaretestandrandomforestalgorithm
AT yangzhenyu genomicislandpredictionviachisquaretestandrandomforestalgorithm
AT daiqi genomicislandpredictionviachisquaretestandrandomforestalgorithm