Cargando…

Supervised learning for the automated transcription of spacer classification from spoligotype films

BACKGROUND: Molecular genotyping of bacteria has revolutionized the study of tuberculosis epidemiology, yet these established laboratory techniques typically require subjective and laborious interpretation by trained professionals. In the context of a Tuberculosis Case Contact study in The Gambia we...

Descripción completa

Detalles Bibliográficos
Autores principales: Jeffries, David J, Abernethy, Neil, de Jong, Bouke C
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2009
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2731757/
https://www.ncbi.nlm.nih.gov/pubmed/19674444
http://dx.doi.org/10.1186/1471-2105-10-248
Descripción
Sumario:BACKGROUND: Molecular genotyping of bacteria has revolutionized the study of tuberculosis epidemiology, yet these established laboratory techniques typically require subjective and laborious interpretation by trained professionals. In the context of a Tuberculosis Case Contact study in The Gambia we used a reverse hybridization laboratory assay called spoligotype analysis. To facilitate processing of spoligotype images we have developed tools and algorithms to automate the classification and transcription of these data directly to a database while allowing for manual editing. RESULTS: Features extracted from each of the 1849 spots on a spoligo film were classified using two supervised learning algorithms. A graphical user interface allows manual editing of the classification, before export to a database. The application was tested on ten films of differing quality and the results of the best classifier were compared to expert manual classification, giving a median correct classification rate of 98.1% (inter quartile range: 97.1% to 99.2%), with an automated processing time of less than 1 minute per film. CONCLUSION: The software implementation offers considerable time savings over manual processing whilst allowing expert editing of the automated classification. The automatic upload of the classification to a database reduces the chances of transcription errors.