Accurate Inference of Tumor Purity and Absolute Copy Numbers From High-Throughput Sequencing Data

Inference of absolute copy numbers in tumor genomes is one of the key points in the study of tumor genesis. However, the mixture of tumor and normal cells poses a big challenge to this task. Accurate estimation of tumor purity (i.e., the fraction of tumor cells) is a necessary step to solve this pro...

Bibliographic Details
Main Authors: Yuan, Xiguo; Li, Zhe; Zhao, Haiyong; Bai, Jun; Zhang, Junying
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7205152/
https://www.ncbi.nlm.nih.gov/pubmed/32425990
http://dx.doi.org/10.3389/fgene.2020.00458
author Yuan, Xiguo
Li, Zhe
Zhao, Haiyong
Bai, Jun
Zhang, Junying
author_sort Yuan, Xiguo
collection PubMed
description Inference of absolute copy numbers in tumor genomes is one of the key points in the study of tumor genesis. However, the mixture of tumor and normal cells poses a big challenge to this task. Accurate estimation of tumor purity (i.e., the fraction of tumor cells) is a necessary step to solve this problem. In this paper, we propose a new approach, AITAC, to accurately infer tumor purity and absolute copy numbers in a tumor sample by using high-throughput sequencing (HTS) data. In contrast to many existing algorithms for estimating tumor purity, which usually rely on pre-detected mutation genotypes (heterogeneity and homogeneity), AITAC just requires read depths (RDs) observed at the regions with copy number losses. AITAC creates a non-linear model to correlate tumor purity, observed and expected RDs. It adopts an exhaustive search strategy to scan tumor purity in a wide range, and chooses the tumor purity that minimizes the deviation between observed RDs and expected ones as the optimal solution. We apply the proposed approach to both simulation and real sequencing data sets and demonstrate its performance by comparing with two classical approaches. AITAC is freely available at https://github.com/BDanalysis/aitac and can be expected to become a useful approach for researchers to analyze copy numbers in cancer genome.
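The search strategy summarized in the description can be illustrated with a minimal sketch. This is not the AITAC implementation: the abstract does not give AITAC's non-linear model, so the sketch substitutes a simple linear mixture model as an assumption, and the names `expected_rd`, `estimate_purity`, `baseline_rd`, and `step` are hypothetical. It also assumes each loss region has a tumor copy number of 0 or 1 and that normal cells are diploid.

```python
def expected_rd(purity, tumor_cn, baseline_rd):
    """Expected read depth of a region in a tumor/normal mixture.

    Assumption: a linear mixture of tumor cells (copy number tumor_cn)
    and diploid normal cells, scaled so that copy number 2 gives baseline_rd.
    This stands in for AITAC's model, which is not given in the record.
    """
    mixed_cn = purity * tumor_cn + (1.0 - purity) * 2.0
    return baseline_rd * mixed_cn / 2.0


def estimate_purity(observed_rds, baseline_rd, step=0.01):
    """Scan purity over (0, 1] and keep the value with the smallest deviation.

    For each candidate purity, every loss region is assigned whichever
    copy number (0 or 1) brings its expected depth closest to the observed
    depth; the total absolute deviation is then compared across candidates.
    """
    best_purity, best_dev = None, float("inf")
    n_steps = int(round(1.0 / step))
    for i in range(1, n_steps + 1):
        purity = i * step
        dev = sum(
            min(abs(rd - expected_rd(purity, cn, baseline_rd)) for cn in (0, 1))
            for rd in observed_rds
        )
        if dev < best_dev:
            best_purity, best_dev = purity, dev
    return best_purity, best_dev


# Made-up depths for three loss regions with a diploid baseline depth of 100.
purity, deviation = estimate_purity([70.0, 40.0, 70.0], baseline_rd=100.0)
print(purity, deviation)
```

Under this simplified model, a sample containing only single-copy losses is ambiguous (a depth ratio of 0.7 fits purity 0.6 with one copy lost as well as purity 0.3 with both copies lost), so the illustrative input mixes both kinds of loss region.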
format Online
Article
Text
id pubmed-7205152
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-72051522020-05-18 Accurate Inference of Tumor Purity and Absolute Copy Numbers From High-Throughput Sequencing Data Yuan, Xiguo Li, Zhe Zhao, Haiyong Bai, Jun Zhang, Junying Front Genet Genetics Inference of absolute copy numbers in tumor genomes is one of the key points in the study of tumor genesis. However, the mixture of tumor and normal cells poses a big challenge to this task. Accurate estimation of tumor purity (i.e., the fraction of tumor cells) is a necessary step to solve this problem. In this paper, we propose a new approach, AITAC, to accurately infer tumor purity and absolute copy numbers in a tumor sample by using high-throughput sequencing (HTS) data. In contrast to many existing algorithms for estimating tumor purity, which usually rely on pre-detected mutation genotypes (heterogeneity and homogeneity), AITAC just requires read depths (RDs) observed at the regions with copy number losses. AITAC creates a non-linear model to correlate tumor purity, observed and expected RDs. It adopts an exhaustive search strategy to scan tumor purity in a wide range, and chooses the tumor purity that minimizes the deviation between observed RDs and expected ones as the optimal solution. We apply the proposed approach to both simulation and real sequencing data sets and demonstrate its performance by comparing with two classical approaches. AITAC is freely available at https://github.com/BDanalysis/aitac and can be expected to become a useful approach for researchers to analyze copy numbers in cancer genome. Frontiers Media S.A. 2020-04-30 /pmc/articles/PMC7205152/ /pubmed/32425990 http://dx.doi.org/10.3389/fgene.2020.00458 Text en Copyright © 2020 Yuan, Li, Zhao, Bai and Zhang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). 
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Accurate Inference of Tumor Purity and Absolute Copy Numbers From High-Throughput Sequencing Data
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7205152/
https://www.ncbi.nlm.nih.gov/pubmed/32425990
http://dx.doi.org/10.3389/fgene.2020.00458