
Unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations

Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that mutations identified by multiple variant callers are frequently discordant even when using the s...

Bibliographic Details
Main authors: Lee, HoJoon, Shuaibi, Ahmed, Bell, John M, Pavlichin, Dmitri S, Ji, Hanlee P
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Cancer Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7727745/
https://www.ncbi.nlm.nih.gov/pubmed/33345188
http://dx.doi.org/10.1093/narcan/zcaa034
author Lee, HoJoon
Shuaibi, Ahmed
Bell, John M
Pavlichin, Dmitri S
Ji, Hanlee P
collection PubMed
description Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that mutations identified by multiple variant callers are frequently discordant even when using the same genome sequencing data. For insertion and deletion mutations, oftentimes there is no agreement among different callers. Identifying somatic mutations involves read mapping and variant calling, a complicated process that uses many parameters and model tuning. To validate the identification of true mutations, we developed a method using k-mer sequences. First, we characterized the landscape of unique versus non-unique k-mers in the human genome. Second, we developed a software package, KmerVC, to validate the given somatic mutations from sequencing data. Our program validates the occurrence of a mutation based on a statistically significant difference in the frequency of k-mers with and without a mutation from matched normal and tumor sequences. Third, we tested our method on both simulated and cancer genome sequencing data. Counting k-mers involving mutations effectively validated true positive mutations, including insertions and deletions, across different individual samples in a reproducible manner. Thus, we demonstrated a straightforward approach for rapidly validating mutations from cancer genome sequencing data.
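The validation idea in the description can be illustrated with a small Python sketch. It is not KmerVC's code: the function names, the rounded-mean estimate of read support, and the choice of a one-sided Fisher exact test are all assumptions made for illustration, since the record only states that a statistically significant difference in k-mer frequency between matched normal and tumor sequences is used. The sketch also skips reverse complements and does not check that the k-mers are unique in the genome, which the description treats as a separate first step.

```python
from collections import Counter
from math import comb


def spanning_kmers(left, allele, right, k):
    """k-mers overlapping `allele` in left + allele + right; for an empty
    allele (a deletion), k-mers that straddle the left/right junction."""
    seq = left + allele + right
    first = max(0, len(left) - k + 1)
    last = min(len(left) + len(allele) - 1, len(seq) - k)
    return {seq[i:i + k] for i in range(first, last + 1)}


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def support(counts, kmer_set):
    """Assumed proxy for read support: rounded mean count over the k-mers."""
    if not kmer_set:
        return 0
    return round(sum(counts[km] for km in kmer_set) / len(kmer_set))


def fisher_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the table [[a, b], [c, d]]:
    probability of a top-left cell >= a with the margins held fixed."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, min(row1, col1) + 1))
    return tail / comb(n, row1)


def validate(left, ref, alt, right, tumor_reads, normal_reads, k):
    """Return (tumor_alt, tumor_ref, normal_alt, normal_ref) and a p-value."""
    ref_kmers = spanning_kmers(left, ref, right, k)
    alt_kmers = spanning_kmers(left, alt, right, k)
    ref_only, alt_only = ref_kmers - alt_kmers, alt_kmers - ref_kmers
    tumor, normal = count_kmers(tumor_reads, k), count_kmers(normal_reads, k)
    table = (support(tumor, alt_only), support(tumor, ref_only),
             support(normal, alt_only), support(normal, ref_only))
    return table, fisher_greater(*table)


# Toy data (made up): an A>T substitution present in half of the tumor reads.
left, right = "ACGTTGCA", "GGATCCTA"
ref_seq, alt_seq = left + "A" + right, left + "T" + right
table, p_value = validate(left, "A", "T", right,
                          tumor_reads=[alt_seq] * 6 + [ref_seq] * 6,
                          normal_reads=[ref_seq] * 12, k=5)
print(table, p_value)
```

Passing an empty string as `alt` models a deletion and an empty `ref` models an insertion, so the same counting path covers all three mutation types named in the title. A small p-value here means the mutant k-mers are enriched in the tumor relative to the matched normal.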
format Online
Article
Text
id pubmed-7727745
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7727745 2020-12-16 NAR Cancer; Cancer Computational Biology; Oxford University Press 2020-12-10 /pmc/articles/PMC7727745/ /pubmed/33345188 http://dx.doi.org/10.1093/narcan/zcaa034 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of NAR Cancer.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations
topic Cancer Computational Biology