Unsupervised Learning Approach for Comparing Multiple Transposon Insertion Sequencing Studies

Transposon insertion sequencing (TIS) is a widely used technique for conducting genome-scale forward genetic screens in bacteria. However, few methods enable comparison of TIS data across multiple replicates of a screen or across independent screens, including screens performed in different organism...

Bibliographic Details
Main Authors: Hubbard, Troy P., D’Gama, Jonathan D., Billings, Gabriel, Davis, Brigid M., Waldor, Matthew K.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6382967/
https://www.ncbi.nlm.nih.gov/pubmed/30787116
http://dx.doi.org/10.1128/mSphere.00031-19
author Hubbard, Troy P.
D’Gama, Jonathan D.
Billings, Gabriel
Davis, Brigid M.
Waldor, Matthew K.
collection PubMed
description Transposon insertion sequencing (TIS) is a widely used technique for conducting genome-scale forward genetic screens in bacteria. However, few methods enable comparison of TIS data across multiple replicates of a screen or across independent screens, including screens performed in different organisms. Here, we introduce a post hoc analytic framework, comparative TIS (CompTIS), which utilizes unsupervised learning to enable meta-analysis of multiple TIS data sets. CompTIS first implements screen-level principal-component analysis (PCA) and clustering to identify variation between the TIS screens. This initial screen-level analysis facilitates the selection of related screens for additional analyses, reveals the relatedness of complex environments based on growth phenotypes measured by TIS, and provides a useful quality control step. Subsequently, PCA is performed on genes to identify loci whose corresponding mutants lead to concordant/discordant phenotypes across all or in a subset of screens. We used CompTIS to analyze published intestinal colonization TIS data sets from two vibrio species. Gene-level analyses identified both pan-vibrio genes required for intestinal colonization and conserved genes that displayed species-specific requirements. CompTIS is applicable to virtually any combination of TIS screens and can be implemented without regard to either the number of screens or the methods used for upstream data analysis.
IMPORTANCE Forward genetic screens are powerful tools for functional genomics. The comparison of similar forward genetic screens performed in different organisms enables the identification of genes with similar or different phenotypes across organisms. Transposon insertion sequencing is a widely used method for conducting genome-scale forward genetic screens in bacteria, yet few bioinformatic approaches have been developed to compare the results of screen replicates and different screens conducted across species or strains. Here, we used principal-component analysis (PCA) and hierarchical clustering, two unsupervised learning approaches, to analyze the relatedness of multiple in vivo screens of pathogenic vibrios. This analytic framework reveals both shared pan-vibrio requirements for intestinal colonization and strain-specific dependencies. Our findings suggest that PCA-based analytics will be a straightforward widely applicable approach for comparing diverse transposon insertion sequencing screens.
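The two PCA steps named in the description can be illustrated with a minimal sketch. This is not the authors' CompTIS code: the toy matrix `lfc` (genes by screens, with values standing in for log2 fold changes), the screen labels, and the helper `pca_scores` are all hypothetical, and the clustering step is omitted. Screen-level PCA treats each screen as an observation; gene-level PCA treats each gene as an observation.

```python
import numpy as np


def pca_scores(matrix, n_components=2):
    """PCA via SVD. Rows are observations, columns are features.

    Returns the scores on the first `n_components` components and the
    fraction of total variance each of those components explains.
    """
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    u, s, _vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    variance = s ** 2
    return scores, variance[:n_components] / variance.sum()


# Hypothetical toy data: 6 genes (rows) x 4 screens (columns).
# Columns 0-1 mimic two screens in species A, columns 2-3 in species B.
screen_names = ["A_rep1", "A_rep2", "B_rep1", "B_rep2"]
lfc = np.array([
    [-4.0, -3.8, -4.1, -3.9],  # gene 0: depleted in every screen
    [-3.5, -3.6, -0.1,  0.2],  # gene 1: depleted only in species A
    [ 0.1, -0.2, -3.7, -3.4],  # gene 2: depleted only in species B
    [ 0.0,  0.1,  0.1, -0.1],  # genes 3-5: no phenotype
    [ 0.2, -0.1,  0.0,  0.1],
    [-0.1,  0.0, -0.2,  0.0],
])

# Screen-level PCA: each screen is one observation described by all genes.
screen_scores, screen_ratio = pca_scores(lfc.T)

# Gene-level PCA: each gene is one observation described by all screens.
gene_scores, gene_ratio = pca_scores(lfc)

for name, row in zip(screen_names, screen_scores):
    print(name, np.round(row, 2))
for gene_index, row in enumerate(gene_scores):
    print("gene", gene_index, np.round(row, 2))
```

In this toy matrix the first screen-level component separates the two species, while at the gene level the first component tracks the shared phenotype and the second tracks the species-specific ones. Real data sets need not split this cleanly, and the sign of each component is arbitrary.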
format Online
Article
Text
id pubmed-6382967
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-6382967 2019-02-22 Unsupervised Learning Approach for Comparing Multiple Transposon Insertion Sequencing Studies Hubbard, Troy P. D’Gama, Jonathan D. Billings, Gabriel Davis, Brigid M. Waldor, Matthew K. mSphere Research Article American Society for Microbiology 2019-02-20 /pmc/articles/PMC6382967/ /pubmed/30787116 http://dx.doi.org/10.1128/mSphere.00031-19 Text en Copyright © 2019 Hubbard et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Unsupervised Learning Approach for Comparing Multiple Transposon Insertion Sequencing Studies
topic Research Article