
CNARA: reliability assessment for genomic copy number profiles

BACKGROUND: DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. Besides, large derivative log ratio spread (DLRS) of the probes corre...


Bibliographic Details
Main authors: Ai, Ni, Cai, Haoyang, Solovan, Caius, Baudis, Michael
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5062840/
https://www.ncbi.nlm.nih.gov/pubmed/27733115
http://dx.doi.org/10.1186/s12864-016-3074-7
author Ai, Ni
Cai, Haoyang
Solovan, Caius
Baudis, Michael
collection PubMed
description BACKGROUND: DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts, which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. In addition, a large derivative log ratio spread (DLRS) of the probes, which correlates with poor DNA quality, is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly. RESULTS: Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform, and may be adapted for sequencing platforms from which copy number estimates can be computed as a piecewise constant signal. Further details can be found at https://github.com/baudisgroup/CNARA. CONCLUSIONS: We have developed a method for the assessment of genomic copy number profiling data, and suggest applying it in addition to, and after, other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and could support the reliable functional attribution of copy number aberrations, especially in cancer research. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3074-7) contains supplementary material, which is available to authorized users.
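The abstract refers to the derivative log ratio spread (DLRS) without defining it. As a rough illustration only, the sketch below computes one commonly used robust estimate: the interquartile range of the differences between adjacent probes' log ratios, rescaled to a standard deviation. This is an assumption about the metric in general, not the CNARA implementation; the function names `percentile` and `dlrs` and the simulated profile are hypothetical.

```python
import math
import random


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 1]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def dlrs(log_ratios):
    """Robust DLRS estimate (assumed definition, not taken from CNARA).

    The IQR of adjacent-probe differences is divided by 1.349 to turn it
    into a standard deviation under Gaussian noise, and by sqrt(2) because
    each difference combines the noise of two probes.
    """
    if len(log_ratios) < 3:
        raise ValueError("need at least three probes")
    diffs = sorted(b - a for a, b in zip(log_ratios, log_ratios[1:]))
    iqr = percentile(diffs, 0.75) - percentile(diffs, 0.25)
    return iqr / (1.349 * math.sqrt(2))


# Hypothetical simulated data: a piecewise constant signal with one gained
# segment, plus Gaussian probe noise at two different levels.
rng = random.Random(0)
true_levels = [0.0] * 2000 + [0.58] * 1000 + [0.0] * 2000
clean = [level + rng.gauss(0, 0.1) for level in true_levels]
noisy = [level + rng.gauss(0, 0.4) for level in true_levels]

dlrs_clean = dlrs(clean)
dlrs_noisy = dlrs(noisy)
print(f"clean profile DLRS: {dlrs_clean:.3f}")
print(f"noisy profile DLRS: {dlrs_noisy:.3f}")
```

Because the estimate is built from adjacent differences, the segment breakpoints barely affect it, so it tracks probe-level noise rather than true copy number changes. Any cutoff for calling a DLRS "large" would be a separate choice that the record above does not specify.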
format Online
Article
Text
id pubmed-5062840
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5062840 2016-10-17 CNARA: reliability assessment for genomic copy number profiles Ai, Ni Cai, Haoyang Solovan, Caius Baudis, Michael BMC Genomics Original Paper BioMed Central 2016-10-12 /pmc/articles/PMC5062840/ /pubmed/27733115 http://dx.doi.org/10.1186/s12864-016-3074-7 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title CNARA: reliability assessment for genomic copy number profiles
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5062840/
https://www.ncbi.nlm.nih.gov/pubmed/27733115
http://dx.doi.org/10.1186/s12864-016-3074-7