
Using Naïve Bayesian Analysis to Determine Imaging Characteristics of KRAS Mutations in Metastatic Colon Cancer

Genotype, particularly Ras status, greatly affects prognosis and treatment of liver metastasis in colon cancer patients. This pilot aimed to apply word frequency analysis and a naive Bayes classifier on radiology reports to extract distinguishing imaging descriptors of wild-type colon cancer patient...


Bibliographic Details
Main Authors: Pershad, Yash, Govindan, Siddharth, Hara, Amy K., Borad, Mitesh J., Bekaii-Saab, Tanios, Wallace, Alex, Albadawi, Hassan, Oklu, Rahmi
Format: Online Article Text
Language: English
Published: MDPI 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5617950/
https://www.ncbi.nlm.nih.gov/pubmed/28869500
http://dx.doi.org/10.3390/diagnostics7030050
author Pershad, Yash
Govindan, Siddharth
Hara, Amy K.
Borad, Mitesh J.
Bekaii-Saab, Tanios
Wallace, Alex
Albadawi, Hassan
Oklu, Rahmi
collection PubMed
description Genotype, particularly Ras status, greatly affects the prognosis and treatment of liver metastasis in colon cancer patients. This pilot study aimed to apply word frequency analysis and a naive Bayes classifier to radiology reports to extract imaging descriptors that distinguish wild-type colon cancer patients from those with v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations. In this institutional-review-board-approved study, we compiled a SNaPshot mutation analysis dataset from 457 colon adenocarcinoma patients. From this cohort, we analyzed the radiology reports of 299 patients (> 32,000 reports) who either were wild-type (147 patients) or had a KRAS mutation (152 patients). Our algorithm determined word frequency within the wild-type and mutant radiology reports and used a naive Bayes classifier to determine the probability of a given word belonging to either group. The classifier determined that words with a greater than 50% chance of being in the KRAS mutation group and with the highest absolute probability difference compared to the wild-type group included: “several”, “innumerable”, “confluent”, and “numerous” (p < 0.01). In contrast, words with a greater than 50% chance of being in the wild-type group and with the highest absolute probability difference included: “few”, “discrete”, and “[no] recurrent” (p = 0.03). Words used in radiology reports, which have direct implications for disease course, tumor burden, and therapy, appear with differing frequency in patients with KRAS mutations versus wild-type colon adenocarcinoma. Moreover, the likely characteristic imaging traits of mutant tumors make probabilistic word analysis useful in identifying unique characteristics and disease course, with applications ranging from radiology and pathology reports to clinical notes.
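The description above says that word frequencies were computed per group and that a naive Bayes classifier then gave the probability of a word belonging to either group. The record does not give the authors' formula or code, so the following is only a minimal sketch of one common formulation: Laplace-smoothed word likelihoods combined with a class prior through Bayes' rule. The function names, the smoothing constant `alpha`, and the toy report strings are all hypothetical; only the cohort sizes (152 KRAS, 147 wild-type) come from the description.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a report and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_counts(reports):
    """Count how often each word occurs across a group of reports."""
    counts = Counter()
    for report in reports:
        counts.update(tokenize(report))
    return counts


def posterior_mutant(word, mutant_counts, wild_counts, prior_mutant=0.5, alpha=1.0):
    """Estimate P(KRAS mutant | word) with Bayes' rule.

    Likelihoods P(word | group) use Laplace (add-alpha) smoothing, which is
    an assumption of this sketch, not something stated in the record.
    """
    vocab_size = len(set(mutant_counts) | set(wild_counts))
    n_mut = sum(mutant_counts.values())
    n_wild = sum(wild_counts.values())
    like_mut = (mutant_counts[word] + alpha) / (n_mut + alpha * vocab_size)
    like_wild = (wild_counts[word] + alpha) / (n_wild + alpha * vocab_size)
    numerator = like_mut * prior_mutant
    return numerator / (numerator + like_wild * (1.0 - prior_mutant))


# Toy, made-up report snippets for illustration only (not study data).
mutant_reports = [
    "innumerable confluent hepatic lesions",
    "numerous confluent liver lesions",
]
wild_reports = [
    "few discrete hepatic lesions",
    "discrete liver lesions no recurrent disease",
]

mutant_counts = word_counts(mutant_reports)
wild_counts = word_counts(wild_reports)

# Class prior taken from the cohort sizes given in the description.
prior = 152 / (152 + 147)

for w in ("confluent", "discrete"):
    p = posterior_mutant(w, mutant_counts, wild_counts, prior_mutant=prior)
    print(f"P(KRAS | {w!r}) = {p:.3f}")
```

On this toy input, "confluent" scores above 0.5 and "discrete" below it, which mirrors the direction of the reported findings but says nothing about their magnitude. The sketch does not attempt the significance testing behind the reported p-values.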
format Online
Article
Text
id pubmed-5617950
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-5617950 2017-09-29 Diagnostics (Basel) Brief Report MDPI 2017-09-02 /pmc/articles/PMC5617950/ /pubmed/28869500 http://dx.doi.org/10.3390/diagnostics7030050 Text en © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Using Naïve Bayesian Analysis to Determine Imaging Characteristics of KRAS Mutations in Metastatic Colon Cancer
topic Brief Report