ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets

Copy-number variations (CNVs) are an essential component of genetic variation distributed across large parts of the human genome. CNV detection from next-generation sequencing data and artificial intelligence algorithms have progressed in recent years. However, only a few tools have taken advantage...

Bibliographic Details
Main Authors: Cabello-Aguilar, Simon, Vendrell, Julie A., Van Goethem, Charles, Brousse, Mehdi, Gozé, Catherine, Frantz, Laurent, Solassol, Jérôme
Format: Online Article Text
Language: English
Published: American Society of Gene & Cell Therapy 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9547229/
https://www.ncbi.nlm.nih.gov/pubmed/36250203
http://dx.doi.org/10.1016/j.omtn.2022.09.009
_version_ 1784805218554216448
author Cabello-Aguilar, Simon
Vendrell, Julie A.
Van Goethem, Charles
Brousse, Mehdi
Gozé, Catherine
Frantz, Laurent
Solassol, Jérôme
author_facet Cabello-Aguilar, Simon
Vendrell, Julie A.
Van Goethem, Charles
Brousse, Mehdi
Gozé, Catherine
Frantz, Laurent
Solassol, Jérôme
author_sort Cabello-Aguilar, Simon
collection PubMed
description Copy-number variations (CNVs) are an essential component of genetic variation distributed across large parts of the human genome. CNV detection from next-generation sequencing data and artificial intelligence algorithms have progressed in recent years. However, only a few tools have taken advantage of machine-learning algorithms for CNV detection, and none propose using artificial intelligence to automatically detect probable CNV-positive samples. The most developed approach is to use a reference or normal dataset to compare with the samples of interest, and it is well known that selecting appropriate normal samples represents a challenging task that dramatically influences the precision of results in all CNV-detecting tools. With careful consideration of these issues, we propose here ifCNV, a new software based on isolation forests that creates its own reference, available in R and Python with customizable parameters. ifCNV combines artificial intelligence using two isolation forests and a comprehensive scoring method to faithfully detect CNVs among various samples. It was validated using targeted next-generation sequencing (NGS) datasets from diverse origins (capture and amplicon, germline and somatic), and it exhibits high sensitivity, specificity, and accuracy. ifCNV is a publicly available open-source software (https://github.com/SimCab-CHU/ifCNV) that allows the detection of CNVs in many clinical situations.
format Online
Article
Text
id pubmed-9547229
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society of Gene & Cell Therapy
record_format MEDLINE/PubMed
spelling pubmed-95472292022-10-14 ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets Cabello-Aguilar, Simon Vendrell, Julie A. Van Goethem, Charles Brousse, Mehdi Gozé, Catherine Frantz, Laurent Solassol, Jérôme Mol Ther Nucleic Acids Original Article Copy-number variations (CNVs) are an essential component of genetic variation distributed across large parts of the human genome. CNV detection from next-generation sequencing data and artificial intelligence algorithms have progressed in recent years. However, only a few tools have taken advantage of machine-learning algorithms for CNV detection, and none propose using artificial intelligence to automatically detect probable CNV-positive samples. The most developed approach is to use a reference or normal dataset to compare with the samples of interest, and it is well known that selecting appropriate normal samples represents a challenging task that dramatically influences the precision of results in all CNV-detecting tools. With careful consideration of these issues, we propose here ifCNV, a new software based on isolation forests that creates its own reference, available in R and python with customizable parameters. ifCNV combines artificial intelligence using two isolation forests and a comprehensive scoring method to faithfully detect CNVs among various samples. It was validated using targeted next-generation sequencing (NGS) datasets from diverse origins (capture and amplicon, germline and somatic), and it exhibits high sensitivity, specificity, and accuracy. ifCNV is a publicly available open-source software (https://github.com/SimCab-CHU/ifCNV) that allows the detection of CNVs in many clinical situations. 
American Society of Gene & Cell Therapy 2022-09-22 /pmc/articles/PMC9547229/ /pubmed/36250203 http://dx.doi.org/10.1016/j.omtn.2022.09.009 Text en © 2022 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Original Article
Cabello-Aguilar, Simon
Vendrell, Julie A.
Van Goethem, Charles
Brousse, Mehdi
Gozé, Catherine
Frantz, Laurent
Solassol, Jérôme
ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets
title ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets
title_full ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets
title_fullStr ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets
title_full_unstemmed ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets
title_short ifCNV: A novel isolation-forest-based package to detect copy-number variations from various targeted NGS datasets
title_sort ifcnv: a novel isolation-forest-based package to detect copy-number variations from various targeted ngs datasets
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9547229/
https://www.ncbi.nlm.nih.gov/pubmed/36250203
http://dx.doi.org/10.1016/j.omtn.2022.09.009
work_keys_str_mv AT cabelloaguilarsimon ifcnvanovelisolationforestbasedpackagetodetectcopynumbervariationsfromvarioustargetedngsdatasets
AT vendrelljuliea ifcnvanovelisolationforestbasedpackagetodetectcopynumbervariationsfromvarioustargetedngsdatasets
AT vangoethemcharles ifcnvanovelisolationforestbasedpackagetodetectcopynumbervariationsfromvarioustargetedngsdatasets
AT broussemehdi ifcnvanovelisolationforestbasedpackagetodetectcopynumbervariationsfromvarioustargetedngsdatasets
AT gozecatherine ifcnvanovelisolationforestbasedpackagetodetectcopynumbervariationsfromvarioustargetedngsdatasets
AT frantzlaurent ifcnvanovelisolationforestbasedpackagetodetectcopynumbervariationsfromvarioustargetedngsdatasets
AT solassoljerome ifcnvanovelisolationforestbasedpackagetodetectcopynumbervariationsfromvarioustargetedngsdatasets