Application of Machine Learning to Development of Copy Number Variation-based Prediction of Cancer Risk

In the present study, recurrent copy number variations (CNVs) from non-tumor blood cell DNAs of Caucasian non-cancer subjects and glioma, myeloma, and colorectal cancer patients, and Korean non-cancer subjects and hepatocellular carcinoma, gastric cancer, and colorectal cancer patients, were found t...

Bibliographic Details
Main authors: Ding, Xiaofan, Tsang, Shui-Ying, Ng, Siu-Kin, Xue, Hong
Format: Online Article Text
Language: English
Published: Libertas Academica 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4504076/
https://www.ncbi.nlm.nih.gov/pubmed/26203258
http://dx.doi.org/10.4137/GEI.S15002
author Ding, Xiaofan
Tsang, Shui-Ying
Ng, Siu-Kin
Xue, Hong
collection PubMed
description In the present study, recurrent copy number variations (CNVs) were examined in non-tumor blood cell DNAs from Caucasian non-cancer subjects and glioma, myeloma, and colorectal cancer patients, and from Korean non-cancer subjects and hepatocellular carcinoma, gastric cancer, and colorectal cancer patients. In each of the two ethnic cohorts, the CNVs revealed highly significant differences between cancer patients and controls with respect to the number of CN-losses and the size distribution of CN-gains, suggesting the existence of recurrent constitutional CNV-features useful for predicting predisposition to cancer. Once identified by machine learning, such CNV-features could extensively discriminate between cancer-patient and control DNAs. When the CNV-features selected from a learning group of Caucasian or Korean mixed DNAs, consisting of both cancer-patient and control DNAs, were used to predict the cancer predisposition of an unseen test group of mixed DNAs, the average prediction accuracy was 93.6% for the Caucasian cohort and 86.5% for the Korean cohort.
format Online
Article
Text
id pubmed-4504076
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-4504076 2015-07-22 Genomics Insights Original Research Libertas Academica 2014-06-26 /pmc/articles/PMC4504076/ /pubmed/26203258 http://dx.doi.org/10.4137/GEI.S15002 Text en © 2014 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
title Application of Machine Learning to Development of Copy Number Variation-based Prediction of Cancer Risk
topic Original Research