Unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations
Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that mutations identified by multiple variant callers are frequently discordant even when using the s...
Main authors: | Lee, HoJoon; Shuaibi, Ahmed; Bell, John M; Pavlichin, Dmitri S; Ji, Hanlee P |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Cancer Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7727745/ https://www.ncbi.nlm.nih.gov/pubmed/33345188 http://dx.doi.org/10.1093/narcan/zcaa034 |
_version_ | 1783621123568566272 |
---|---|
author | Lee, HoJoon Shuaibi, Ahmed Bell, John M Pavlichin, Dmitri S Ji, Hanlee P |
author_sort | Lee, HoJoon |
collection | PubMed |
description | Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that mutations identified by multiple variant callers are frequently discordant even when using the same genome sequencing data. For insertion and deletion mutations, there is often no agreement among different callers. Identifying somatic mutations involves read mapping and variant calling, a complicated process that uses many parameters and model tuning. To validate the identification of true mutations, we developed a method using k-mer sequences. First, we characterized the landscape of unique versus non-unique k-mers in the human genome. Second, we developed a software package, KmerVC, to validate the given somatic mutations from sequencing data. Our program validates the occurrence of a mutation based on a statistically significant difference in the frequency of k-mers with and without a mutation from matched normal and tumor sequences. Third, we tested our method on both simulated and cancer genome sequencing data. Counting k-mers involving mutations effectively validated true positive mutations including insertions and deletions across different individual samples in a reproducible manner. Thus, we demonstrated a straightforward approach for rapidly validating mutations from cancer genome sequencing data. |
format | Online Article Text |
id | pubmed-7727745 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-77277452020-12-16 Unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations Lee, HoJoon Shuaibi, Ahmed Bell, John M Pavlichin, Dmitri S Ji, Hanlee P NAR Cancer Cancer Computational Biology Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that mutations identified by multiple variant callers are frequently discordant even when using the same genome sequencing data. For insertion and deletion mutations, oftentimes there is no agreement among different callers. Identifying somatic mutations involves read mapping and variant calling, a complicated process that uses many parameters and model tuning. To validate the identification of true mutations, we developed a method using k-mer sequences. First, we characterized the landscape of unique versus non-unique k-mers in the human genome. Second, we developed a software package, KmerVC, to validate the given somatic mutations from sequencing data. Our program validates the occurrence of a mutation based on statistically significant difference in frequency of k-mers with and without a mutation from matched normal and tumor sequences. Third, we tested our method on both simulated and cancer genome sequencing data. Counting k-mer involving mutations effectively validated true positive mutations including insertions and deletions across different individual samples in a reproducible manner. Thus, we demonstrated a straightforward approach for rapidly validating mutations from cancer genome sequencing data. Oxford University Press 2020-12-10 /pmc/articles/PMC7727745/ /pubmed/33345188 http://dx.doi.org/10.1093/narcan/zcaa034 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of NAR Cancer. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Cancer Computational Biology Lee, HoJoon Shuaibi, Ahmed Bell, John M Pavlichin, Dmitri S Ji, Hanlee P Unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations |
title | Unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations |
title_sort | unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations |
topic | Cancer Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7727745/ https://www.ncbi.nlm.nih.gov/pubmed/33345188 http://dx.doi.org/10.1093/narcan/zcaa034 |