
MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data

Copy number variation (CNV) is a very important phenomenon in tumor genomes and plays a significant role in tumor genesis. Accurate detection of CNVs has become a routine and necessary procedure for a deep investigation of tumor cells and diagnosis of tumor patients. Next-generation sequencing (NGS)...


Bibliographic Details
Main Authors: Zhao, Haiyong, Huang, Tihao, Li, Junqing, Liu, Guojun, Yuan, Xiguo
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7243272/
https://www.ncbi.nlm.nih.gov/pubmed/32499814
http://dx.doi.org/10.3389/fgene.2020.00434
author Zhao, Haiyong
Huang, Tihao
Li, Junqing
Liu, Guojun
Yuan, Xiguo
author_facet Zhao, Haiyong
Huang, Tihao
Li, Junqing
Liu, Guojun
Yuan, Xiguo
author_sort Zhao, Haiyong
collection PubMed
description Copy number variation (CNV) is an important phenomenon in tumor genomes and plays a significant role in tumorigenesis. Accurate detection of CNVs has become a routine and necessary procedure for in-depth investigation of tumor cells and diagnosis of tumor patients. Next-generation sequencing (NGS) has provided a wealth of data for the detection of CNVs at base-pair resolution. However, this task is usually influenced by a number of factors, including GC-content bias, sequencing errors, and correlations among adjacent positions within CNVs. Although many existing methods deal with some of these artifacts through their own strategies, comprehensive consideration of all the factors is still lacking. In this paper, we propose a new method, MFCNV, for accurate detection of CNVs from NGS data. Compared with existing methods, the proposed method has the following characteristics: (1) it fully considers the intrinsic correlations among adjacent positions in the genome being analyzed, (2) it calculates read depth, GC-content bias, base quality, and a correlation value for each genome bin and combines them as multiple features for evaluating genome bins, and (3) it addresses the joint effect among the factors by training a neural network to predict CNVs. We test the performance of MFCNV on simulated and real sequencing data and compare it with several peer methods. The results demonstrate that our method is superior to the other methods in terms of sensitivity, precision, and F1-score and can detect many CNVs that the other methods did not discover. MFCNV is expected to be a complementary tool in the analysis of mutations in tumor genomes and can be extended to the analysis of single-cell sequencing data.
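The per-bin feature idea in the description above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy input, and the bin size are all hypothetical, and because this record does not define the paper's correlation value, a simple neighbor-depth gap is used here as a stand-in. The neural network step is omitted.

```python
from statistics import mean


def bin_features(depths, bases, quals, bin_size):
    """Summarize each full bin as (mean depth, GC fraction, mean base quality)."""
    rows = []
    for start in range(0, len(depths) - bin_size + 1, bin_size):
        end = start + bin_size
        window = bases[start:end].upper()
        gc_fraction = (window.count("G") + window.count("C")) / bin_size
        rows.append((mean(depths[start:end]), gc_fraction, mean(quals[start:end])))
    return rows


def neighbor_gap(rows, i):
    """Hypothetical stand-in for the correlation feature (an assumption, not the
    paper's definition): depth of bin i minus the mean depth of its neighbors."""
    neighbors = [rows[j][0] for j in (i - 1, i + 1) if 0 <= j < len(rows)]
    return rows[i][0] - mean(neighbors) if neighbors else 0.0


# Toy per-position data (made up for illustration): the middle bin has doubled depth.
depths = [10, 10, 10, 10, 20, 20, 20, 20, 10, 10, 10, 10]
bases = "ACGTGGCCATAT"
quals = [30] * 12

rows = bin_features(depths, bases, quals, bin_size=4)
# One four-element feature vector per bin, ready to pass to a classifier.
vectors = [row + (neighbor_gap(rows, i),) for i, row in enumerate(rows)]
```

Each entry of `vectors` combines the four quantities for one bin; in a real pipeline these would come from aligned reads and a reference sequence rather than hand-written lists.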
format Online
Article
Text
id pubmed-7243272
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-72432722020-06-03 MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data Zhao, Haiyong Huang, Tihao Li, Junqing Liu, Guojun Yuan, Xiguo Front Genet Genetics Copy number variation (CNV) is a very important phenomenon in tumor genomes and plays a significant role in tumor genesis. Accurate detection of CNVs has become a routine and necessary procedure for a deep investigation of tumor cells and diagnosis of tumor patients. Next-generation sequencing (NGS) technique has provided a wealth of data for the detection of CNVs at base-pair resolution. However, such task is usually influenced by a number of factors, including GC-content bias, sequencing errors, and correlations among adjacent positions within CNVs. Although many existing methods have dealt with some of these artifacts by designing their own strategies, there is still a lack of comprehensive consideration of all the factors. In this paper, we propose a new method, MFCNV, for an accurate detection of CNVs from NGS data. Compared with existing methods, the characteristics of the proposed method include the following: (1) it makes a full consideration of the intrinsic correlations among adjacent positions in the genome to be analyzed, (2) it calculates read depth, GC-content bias, base quality, and correlation value for each genome bin and combines them as multiple features for the evaluation of genome bins, and (3) it addresses the joint effect among the factors via training a neural network algorithm for the prediction of CNVs. We test the performance of the MFCNV method by using simulation and real sequencing data and make comparisons with several peer methods. The results demonstrate that our method is superior to other methods in terms of sensitivity, precision, and F1-score and can detect many CNVs that other methods have not discovered. 
MFCNV is expected to be a complementary tool in the analysis of mutations in tumor genomes and can be extended to be applied to the analysis of single-cell sequencing data. Frontiers Media S.A. 2020-05-15 /pmc/articles/PMC7243272/ /pubmed/32499814 http://dx.doi.org/10.3389/fgene.2020.00434 Text en Copyright © 2020 Zhao, Huang, Li, Liu and Yuan. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Zhao, Haiyong
Huang, Tihao
Li, Junqing
Liu, Guojun
Yuan, Xiguo
MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data
title MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data
title_full MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data
title_fullStr MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data
title_full_unstemmed MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data
title_short MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data
title_sort mfcnv: a new method to detect copy number variations from next-generation sequencing data
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7243272/
https://www.ncbi.nlm.nih.gov/pubmed/32499814
http://dx.doi.org/10.3389/fgene.2020.00434