PeakCNV: A multi-feature ranking algorithm-based tool for genome-wide copy number variation-association study
Copy Number Variation (CNV) refers to a type of structural genomic alteration in which a segment of chromosome is duplicated or deleted. To date, many CNVs have been identified as causative genetic elements for several diseases and phenotypes. However, performing a CNV-based genome-wide association...
Main authors: | Labani, Mahdieh; Afrasiabi, Ali; Beheshti, Amin; Lovell, Nigel H.; Alinejad-Rokny, Hamid |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2022 |
Subjects: | Method Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9478359/ https://www.ncbi.nlm.nih.gov/pubmed/36147666 http://dx.doi.org/10.1016/j.csbj.2022.09.001 |
_version_ | 1784790553016139776 |
---|---|
author | Labani, Mahdieh Afrasiabi, Ali Beheshti, Amin Lovell, Nigel H. Alinejad-Rokny, Hamid |
author_facet | Labani, Mahdieh Afrasiabi, Ali Beheshti, Amin Lovell, Nigel H. Alinejad-Rokny, Hamid |
author_sort | Labani, Mahdieh |
collection | PubMed |
description | Copy number variation (CNV) is a type of structural genomic alteration in which a segment of a chromosome is duplicated or deleted. To date, many CNVs have been identified as causative genetic elements for several diseases and phenotypes. However, performing a CNV-based genome-wide association study is challenging because the length and occurrence of CNVs are inconsistent across the individuals under investigation. One of the most efficient strategies to address this issue is building CNV regions (CNVRs), genomic regions in which CNVs overlap. However, this approach is susceptible to a high false-positive rate because confounding CNVRs overlap and co-occur with true-positive CNVRs. Here, we develop PeakCNV, which differentiates false-positive CNVRs from true positives by calculating a new metric, the independence ranking score (IR-score), via a feature ranking approach. We compared the performance of PeakCNV with that of other existing tools in two case studies: one using CNV genotype data for individuals with prostate cancer (194 cases and 2,392 healthy individuals) and a second for individuals with neurodevelopmental disorders (19,642 cases and 6,451 healthy individuals). Our benchmarking analyses on the prostate cancer cohort indicated that PeakCNV identifies fewer risk candidate CNVRs, with shorter lengths, than other tools. Importantly, these CNVRs cover a greater proportion of cases relative to healthy individuals than those identified by other tools. The accuracy of PeakCNV in identifying relevant candidate CNVRs was reproduced in the case study on neurodevelopmental disorders. 
Using data from the FANTOM5 expression atlas and the Clinical Genomic Database, we show that the candidate CNVRs identified by PeakCNV for neurodevelopmental disorders overlap with a greater number of genes with brain-enriched expression, and a greater number of genes associated with neurological conditions, than candidate CNVRs identified by other tools. Taken together, PeakCNV outperformed existing CNV association study tools by identifying more biologically meaningful CNVRs relevant to the phenotype of interest. PeakCNV is publicly available for the analysis of CNV-associated diseases and is accessible from https://rdrr.io/github/mahdieh1/PeakCNV. |
format | Online Article Text |
id | pubmed-9478359 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-9478359 2022-09-21 PeakCNV: A multi-feature ranking algorithm-based tool for genome-wide copy number variation-association study Labani, Mahdieh Afrasiabi, Ali Beheshti, Amin Lovell, Nigel H. Alinejad-Rokny, Hamid Comput Struct Biotechnol J Method Article Research Network of Computational and Structural Biotechnology 2022-09-07 /pmc/articles/PMC9478359/ /pubmed/36147666 http://dx.doi.org/10.1016/j.csbj.2022.09.001 Text en Crown Copyright © 2022 Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Method Article Labani, Mahdieh Afrasiabi, Ali Beheshti, Amin Lovell, Nigel H. Alinejad-Rokny, Hamid PeakCNV: A multi-feature ranking algorithm-based tool for genome-wide copy number variation-association study |
title | PeakCNV: A multi-feature ranking algorithm-based tool for genome-wide copy number variation-association study |
title_sort | peakcnv: a multi-feature ranking algorithm-based tool for genome-wide copy number variation-association study |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9478359/ https://www.ncbi.nlm.nih.gov/pubmed/36147666 http://dx.doi.org/10.1016/j.csbj.2022.09.001 |
work_keys_str_mv | AT labanimahdieh peakcnvamultifeaturerankingalgorithmbasedtoolforgenomewidecopynumbervariationassociationstudy AT afrasiabiali peakcnvamultifeaturerankingalgorithmbasedtoolforgenomewidecopynumbervariationassociationstudy AT beheshtiamin peakcnvamultifeaturerankingalgorithmbasedtoolforgenomewidecopynumbervariationassociationstudy AT lovellnigelh peakcnvamultifeaturerankingalgorithmbasedtoolforgenomewidecopynumbervariationassociationstudy AT alinejadroknyhamid peakcnvamultifeaturerankingalgorithmbasedtoolforgenomewidecopynumbervariationassociationstudy |
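
The abstract describes building CNV regions (CNVRs) from overlapping CNVs and comparing how many cases versus healthy individuals each region covers. The sketch below illustrates only that generic preprocessing idea; it is not PeakCNV's implementation and does not compute the IR-score, whose definition is not given in this record. The `CNV` class, the function names, and the toy coordinates are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """One CNV call (hypothetical minimal schema, not PeakCNV's input format)."""
    sample: str
    chrom: str
    start: int
    end: int        # inclusive end coordinate
    is_case: bool   # True for a case, False for a healthy individual


def build_cnvrs(cnvs):
    """Merge CNVs that overlap on the same chromosome into regions (CNVRs)."""
    regions = []
    for cnv in sorted(cnvs, key=lambda c: (c.chrom, c.start, c.end)):
        last = regions[-1] if regions else None
        if last and last["chrom"] == cnv.chrom and cnv.start <= last["end"]:
            # Overlaps the current region: extend it and record the member.
            last["end"] = max(last["end"], cnv.end)
            last["members"].append(cnv)
        else:
            regions.append({"chrom": cnv.chrom, "start": cnv.start,
                            "end": cnv.end, "members": [cnv]})
    return regions


def carrier_fractions(region, n_cases, n_controls):
    """Fraction of cases and of controls carrying a CNV in the region."""
    cases = {c.sample for c in region["members"] if c.is_case}
    controls = {c.sample for c in region["members"] if not c.is_case}
    return len(cases) / n_cases, len(controls) / n_controls


# Toy data, invented for this example only.
toy_cnvs = [
    CNV("case1", "chr1", 100, 500, True),
    CNV("case2", "chr1", 400, 900, True),
    CNV("ctrl1", "chr1", 850, 1200, False),
    CNV("ctrl2", "chr1", 5000, 6000, False),
    CNV("case3", "chr2", 100, 300, True),
]
regions = build_cnvrs(toy_cnvs)
case_frac, control_frac = carrier_fractions(regions[0], n_cases=3, n_controls=2)
```

In this toy input, the first region spans chr1:100-1200 even though `case1` and `ctrl1` do not overlap each other directly: they are chained together through `case2`. Chaining of this kind is one way that simple overlap merging can produce long regions mixing case and control CNVs, which is the sort of confounding the abstract says PeakCNV is designed to address.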