Cargando…

Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU

Despite decades of research, effective treatments for most cancers remain elusive. One reason is that different instances of cancer result from different combinations of multiple genetic mutations (hits). Therefore, treatments that may be effective in some cases are not effective in others. We previ...

Descripción completa

Detalles Bibliográficos
Autores principales: Al Hajri, Qais, Dash, Sajal, Feng, Wu-chun, Garner, Harold R., Anandakrishnan, Ramu
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Nature Publishing Group UK 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7005272/
https://www.ncbi.nlm.nih.gov/pubmed/32029803
http://dx.doi.org/10.1038/s41598-020-58785-y
_version_ 1783494900189233152
author Al Hajri, Qais
Dash, Sajal
Feng, Wu-chun
Garner, Harold R.
Anandakrishnan, Ramu
author_facet Al Hajri, Qais
Dash, Sajal
Feng, Wu-chun
Garner, Harold R.
Anandakrishnan, Ramu
author_sort Al Hajri, Qais
collection PubMed
description Despite decades of research, effective treatments for most cancers remain elusive. One reason is that different instances of cancer result from different combinations of multiple genetic mutations (hits). Therefore, treatments that may be effective in some cases are not effective in others. We previously developed an algorithm for identifying combinations of carcinogenic genes with mutations (multi-hit combinations), which could suggest a likely cause for individual instances of cancer. Most cancers are estimated to require three or more hits. However, the computational complexity of the algorithm scales exponentially with the number of hits, making it impractical for identifying combinations of more than two hits. To identify combinations of greater than two hits, we used a compressed binary matrix representation, and optimized the algorithm for parallel execution on an NVIDIA V100 graphics processing unit (GPU). With these enhancements, the optimized GPU implementation was on average an estimated 12,144 times faster than the original integer matrix based CPU implementation, for the 3-hit algorithm, allowing us to identify 3-hit combinations. The 3-hit combinations identified using a training set were able to differentiate between tumor and normal samples in a separate test set with 90% overall sensitivity and 93% overall specificity. We illustrate how the distribution of mutations in tumor and normal samples in the multi-hit gene combinations can suggest potential driver mutations for further investigation. With experimental validation, these combinations may provide insight into the etiology of cancer and a rational basis for targeted combination therapy.
format Online
Article
Text
id pubmed-7005272
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-70052722020-02-18 Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU Al Hajri, Qais Dash, Sajal Feng, Wu-chun Garner, Harold R. Anandakrishnan, Ramu Sci Rep Article Despite decades of research, effective treatments for most cancers remain elusive. One reason is that different instances of cancer result from different combinations of multiple genetic mutations (hits). Therefore, treatments that may be effective in some cases are not effective in others. We previously developed an algorithm for identifying combinations of carcinogenic genes with mutations (multi-hit combinations), which could suggest a likely cause for individual instances of cancer. Most cancers are estimated to require three or more hits. However, the computational complexity of the algorithm scales exponentially with the number of hits, making it impractical for identifying combinations of more than two hits. To identify combinations of greater than two hits, we used a compressed binary matrix representation, and optimized the algorithm for parallel execution on an NVIDIA V100 graphics processing unit (GPU). With these enhancements, the optimized GPU implementation was on average an estimated 12,144 times faster than the original integer matrix based CPU implementation, for the 3-hit algorithm, allowing us to identify 3-hit combinations. The 3-hit combinations identified using a training set were able to differentiate between tumor and normal samples in a separate test set with 90% overall sensitivity and 93% overall specificity. We illustrate how the distribution of mutations in tumor and normal samples in the multi-hit gene combinations can suggest potential driver mutations for further investigation. With experimental validation, these combinations may provide insight into the etiology of cancer and a rational basis for targeted combination therapy. Nature Publishing Group UK 2020-02-06 /pmc/articles/PMC7005272/ /pubmed/32029803 http://dx.doi.org/10.1038/s41598-020-58785-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Al Hajri, Qais
Dash, Sajal
Feng, Wu-chun
Garner, Harold R.
Anandakrishnan, Ramu
Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU
title Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU
title_full Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU
title_fullStr Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU
title_full_unstemmed Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU
title_short Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU
title_sort identifying multi-hit carcinogenic gene combinations: scaling up a weighted set cover algorithm using compressed binary matrix representation on a gpu
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7005272/
https://www.ncbi.nlm.nih.gov/pubmed/32029803
http://dx.doi.org/10.1038/s41598-020-58785-y
work_keys_str_mv AT alhajriqais identifyingmultihitcarcinogenicgenecombinationsscalingupaweightedsetcoveralgorithmusingcompressedbinarymatrixrepresentationonagpu
AT dashsajal identifyingmultihitcarcinogenicgenecombinationsscalingupaweightedsetcoveralgorithmusingcompressedbinarymatrixrepresentationonagpu
AT fengwuchun identifyingmultihitcarcinogenicgenecombinationsscalingupaweightedsetcoveralgorithmusingcompressedbinarymatrixrepresentationonagpu
AT garnerharoldr identifyingmultihitcarcinogenicgenecombinationsscalingupaweightedsetcoveralgorithmusingcompressedbinarymatrixrepresentationonagpu
AT anandakrishnanramu identifyingmultihitcarcinogenicgenecombinationsscalingupaweightedsetcoveralgorithmusingcompressedbinarymatrixrepresentationonagpu