
Binary matrix factorization on special purpose hardware

Many fundamental problems in data mining can be reduced to one or more NP-hard combinatorial optimization problems. Recent advances in novel technologies such as quantum and quantum-inspired hardware promise a substantial speedup for solving these problems compared to when using general purpose comp...


Bibliographic Details
Main Authors: Malik, Osman Asif; Ushijima-Mwesigwa, Hayato; Roy, Arnab; Mandal, Avradip; Ghosh, Indradeep
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8675762/
https://www.ncbi.nlm.nih.gov/pubmed/34914786
http://dx.doi.org/10.1371/journal.pone.0261250
author Malik, Osman Asif
Ushijima-Mwesigwa, Hayato
Roy, Arnab
Mandal, Avradip
Ghosh, Indradeep
collection PubMed
description Many fundamental problems in data mining can be reduced to one or more NP-hard combinatorial optimization problems. Recent advances in novel technologies such as quantum and quantum-inspired hardware promise a substantial speedup in solving these problems compared with general-purpose computers, but these devices often require the problem to be modeled in a special form, such as an Ising or quadratic unconstrained binary optimization (QUBO) model. In this work, we focus on the important binary matrix factorization (BMF) problem, which has many applications in data mining. We propose two QUBO formulations for BMF and show how clustering constraints can easily be incorporated into them. The special-purpose hardware we consider is limited in the number of variables it can handle, which presents a challenge when factorizing large matrices. We propose a sampling-based approach to overcome this challenge, allowing us to factorize large rectangular matrices. In addition to these methods, we also propose a simple baseline algorithm that outperforms our more sophisticated methods in a few situations. We run experiments on the Fujitsu Digital Annealer, a quantum-inspired complementary metal-oxide-semiconductor (CMOS) annealer, on both synthetic and real data, including gene expression data. These experiments show that our approach is able to produce more accurate BMFs than competing methods.
format Online
Article
Text
id pubmed-8675762
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8675762 2021-12-17 Binary matrix factorization on special purpose hardware Malik, Osman Asif Ushijima-Mwesigwa, Hayato Roy, Arnab Mandal, Avradip Ghosh, Indradeep PLoS One Research Article
Public Library of Science 2021-12-16 /pmc/articles/PMC8675762/ /pubmed/34914786 http://dx.doi.org/10.1371/journal.pone.0261250 Text en © 2021 Malik et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Binary matrix factorization on special purpose hardware
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8675762/
https://www.ncbi.nlm.nih.gov/pubmed/34914786
http://dx.doi.org/10.1371/journal.pone.0261250