Protecting the Privacy of Cancer Patients Using Fuzzy Association Rule Hiding

Bibliographic Details
Main Authors: Krishnamoorthy, Sathiyapriya, Murugesan, Kaviya
Format: Online Article Text
Language: English
Published: West Asia Organization for Cancer Prevention 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857861/
https://www.ncbi.nlm.nih.gov/pubmed/31127905
http://dx.doi.org/10.31557/APJCP.2019.20.5.1437
Description
Summary: OBJECTIVE: Privacy protection in the medical field means protecting individuals from being associated with undesirable conditions, diagnoses or treatments (sensitive attributes). Knowledge discovery from health care data through data mining algorithms is inversely related to the privacy of individuals. With the tremendous growth in the volume of data, there is a demand to protect the sensitive data accessible from medical datasets. METHODS: This paper considers the problem of building a privacy-preserving association rule mining algorithm using the notion of TF * IDF derived from the information retrieval domain. The highly sensitive transaction is chosen using the product of Relative Item Frequency and Condensed Frequency. Finally, sensitive fuzzy data is perturbed to hide these refined rules. RESULTS: The number of non-sensitive rules lost as a side effect of hiding a sensitive rule is 20% lower, and the number of ghost rules is 30% lower, in the proposed work than in previous work using the Transactional Impact Factor method. The execution time for hiding a rule is on average 26% lower in the proposed technique across various values of the minimum confidence threshold. The number of modifications to the original dataset after hiding three rules was reduced by 66% in the proposed method compared with previous work. Because fewer modifications are made to the original data, the chances of generating false associations are also reduced. CONCLUSION: This paper presents a novel method for hiding sensitive rules in quantitative data by decreasing the support of the right-hand side (RHS) of the rule. Experimental results demonstrate that the proposed approach is more efficient, as it facilitates better rule hiding and minimizes the number of lost rules and ghost rules. The approach also makes minimal modifications to the dataset.
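
The general idea of hiding a fuzzy association rule by lowering the support of its RHS can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not define Relative Item Frequency or Condensed Frequency, so the transaction ordering below is a simple placeholder, the use of `min` to combine fuzzy membership values is a common convention assumed here, and the function names (`fuzzy_support`, `fuzzy_confidence`, `hide_rule`) are hypothetical.

```python
from typing import Dict, List, Tuple

# A transaction maps each item to a fuzzy membership value in [0, 1].
Transaction = Dict[str, float]


def fuzzy_support(data: List[Transaction], items: List[str]) -> float:
    """Mean over transactions of the minimum membership across `items`."""
    if not data:
        return 0.0
    return sum(min(t.get(i, 0.0) for i in items) for t in data) / len(data)


def fuzzy_confidence(data: List[Transaction], lhs: List[str], rhs: List[str]) -> float:
    """Confidence of the rule lhs -> rhs: support(lhs and rhs) / support(lhs)."""
    s_lhs = fuzzy_support(data, lhs)
    return fuzzy_support(data, lhs + rhs) / s_lhs if s_lhs > 0 else 0.0


def hide_rule(
    data: List[Transaction], lhs: List[str], rhs: List[str], min_conf: float
) -> Tuple[List[Transaction], int]:
    """Perturb RHS values until the rule's confidence drops below min_conf.

    Returns a modified copy of the data and the number of changed values.
    """
    data = [dict(t) for t in data]  # work on a copy, leave the input intact
    changes = 0
    # Placeholder ordering (assumption): visit the transactions that support
    # the rule most strongly first. The paper instead ranks transactions by
    # the product of Relative Item Frequency and Condensed Frequency.
    order = sorted(
        range(len(data)),
        key=lambda k: min(data[k].get(i, 0.0) for i in lhs + rhs),
        reverse=True,
    )
    for k in order:
        if fuzzy_confidence(data, lhs, rhs) < min_conf:
            break  # rule is already hidden
        for item in rhs:
            if data[k].get(item, 0.0) > 0.0:
                data[k][item] = 0.0  # crude perturbation, for illustration only
                changes += 1
    return data, changes


data = [
    {"A": 0.9, "B": 0.8},
    {"A": 0.7, "B": 0.6},
    {"A": 0.2, "B": 0.9},
    {"A": 0.8, "B": 0.1},
]
hidden, changes = hide_rule(data, ["A"], ["B"], min_conf=0.5)
print(changes, fuzzy_confidence(hidden, ["A"], ["B"]))
```

Lowering only RHS values leaves the support of the LHS unchanged, so the rule's confidence falls with each modification. Stopping as soon as the confidence is below the threshold keeps the number of changed values small, which is the property the reported results measure.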