Comparison of alternative mixture model methods to analyze bacterial CGH experiments with multi-genome arrays
BACKGROUND: Microarray-based comparative genomic hybridization (aCGH) is used for rapid comparison of genomes of different bacterial strains. The purpose is to evaluate the distribution of genes from sequenced bacterial strains (control) among unsequenced strains (test). We previously compared the u...
Main authors: | Cardoso, Liliana Sofia; Suissas, Cláudia Elvas; Ramirez, Mário; Antunes, Marília; Pinto, Francisco Rodrigues |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Short Report |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3995598/ https://www.ncbi.nlm.nih.gov/pubmed/24629208 http://dx.doi.org/10.1186/1756-0500-7-148 |
_version_ | 1782312896001212416 |
---|---|
author | Cardoso, Liliana Sofia; Suissas, Cláudia Elvas; Ramirez, Mário; Antunes, Marília; Pinto, Francisco Rodrigues |
author_sort | Cardoso, Liliana Sofia |
collection | PubMed |
description | BACKGROUND: Microarray-based comparative genomic hybridization (aCGH) is used for rapid comparison of genomes of different bacterial strains. The purpose is to evaluate the distribution of genes from sequenced bacterial strains (control) among unsequenced strains (test). We previously compared the use of single strain versus multiple strain control with arrays covering multiple genomes. The conclusion was that a multiple strain control promoted a better separation of signals between present and absent genes. FINDINGS: We now extend our previous study by applying the Expectation-Maximization (EM) algorithm to fit a mixture model to the signal distribution in order to classify each gene as present or absent and by comparing different methods for analyzing aCGH data, using combinations of different control strain choices, two different statistical mixture models, with or without normalization, with or without logarithm transformation and with test-over-control or inverse signal ratio calculation. We also assessed the impact of replication on classification accuracy. Higher values of accuracy have been achieved using the ratio of control-over-test intensities, without logarithmic transformation and with a strain mix control. Normalization and the type of mixture model fitted by the EM algorithm did not have a significant impact on classification accuracy. Similarly, using the average of replicate arrays to perform the classification does not significantly improve the results. CONCLUSIONS: Our work provides a guiding benchmark comparison of alternative methods to analyze aCGH results that can impact on the analysis of currently ongoing comparative genomic projects or in the re-analysis of published studies. |
format | Online Article Text |
id | pubmed-3995598 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3995598 2014-05-07 BMC Res Notes Short Report BioMed Central 2014-03-14 /pmc/articles/PMC3995598/ /pubmed/24629208 http://dx.doi.org/10.1186/1756-0500-7-148 Text en Copyright © 2014 Cardoso et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
title | Comparison of alternative mixture model methods to analyze bacterial CGH experiments with multi-genome arrays |
topic | Short Report |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3995598/ https://www.ncbi.nlm.nih.gov/pubmed/24629208 http://dx.doi.org/10.1186/1756-0500-7-148 |
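
The abstract in this record describes fitting a mixture model with the Expectation-Maximization (EM) algorithm to the signal distribution and then classifying each gene as present or absent. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a two-component Gaussian mixture (the record does not say which two mixture models were compared), untransformed control-over-test ratios, and that absent genes give the larger ratio because the test signal is weak. The function names, the initialisation from percentiles, and the synthetic ratios are all hypothetical and chosen only for illustration.

```python
import numpy as np


def e_step(x, w, mu, sd):
    """Return per-gene responsibilities and the log-likelihood."""
    dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    total = dens.sum(axis=1, keepdims=True) + 1e-300  # guard against log(0)
    return dens / total, float(np.log(total).sum())


def fit_two_gaussians(x, n_iter=500, tol=1e-9):
    """EM for a one-dimensional mixture of two Gaussians (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: means at the 10th and 90th percentiles.
    mu = np.percentile(x, [10, 90])
    sd = np.full(2, x.std() + 1e-6)
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        resp, ll = e_step(x, w, mu, sd)
        # M-step: re-estimate weights, means and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    resp, _ = e_step(x, w, mu, sd)  # responsibilities under the final parameters
    return w, mu, sd, resp


# Synthetic control-over-test ratios, invented for this sketch (not study data).
rng = np.random.default_rng(0)
present = rng.normal(1.0, 0.2, 300)  # gene present in test strain: ratio near 1
absent = rng.normal(4.0, 0.8, 100)   # gene absent in test strain: larger ratio
ratios = np.concatenate([present, absent])
truth_absent = np.concatenate([np.zeros(300, dtype=bool), np.ones(100, dtype=bool)])

w, mu, sd, resp = fit_two_gaussians(ratios)
absent_component = int(np.argmax(mu))            # component with the larger mean
called_absent = resp[:, absent_component] > 0.5  # posterior probability cut-off
accuracy = float((called_absent == truth_absent).mean())
print("estimated means:", mu, "accuracy:", accuracy)
```

A logarithm transformation or a test-over-control ratio, both of which the abstract lists as alternatives, could be tried in this sketch by transforming `ratios` before the fit and adjusting which component is treated as absent.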