
Nested Machine Learning Facilitates Increased Sequence Content for Large-Scale Automated High Resolution Melt Genotyping


Bibliographic Details
Main Authors: Fraley, Stephanie I., Athamanolap, Pornpat, Masek, Billie J., Hardick, Justin, Carroll, Karen C., Hsieh, Yu-Hsiang, Rothman, Richard E., Gaydos, Charlotte A., Wang, Tza-Huei, Yang, Samuel
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4726007/
https://www.ncbi.nlm.nih.gov/pubmed/26778280
http://dx.doi.org/10.1038/srep19218
author Fraley, Stephanie I.
Athamanolap, Pornpat
Masek, Billie J.
Hardick, Justin
Carroll, Karen C.
Hsieh, Yu-Hsiang
Rothman, Richard E.
Gaydos, Charlotte A.
Wang, Tza-Huei
Yang, Samuel
collection PubMed
description High Resolution Melt (HRM) is a versatile and rapid post-PCR DNA analysis technique primarily used to differentiate sequence variants among only a few short amplicons. We recently developed a one-vs-one support vector machine algorithm (OVO SVM) that enables the use of HRM for identifying numerous short amplicon sequences automatically and reliably. Herein, we set out to maximize the discriminating power of HRM + SVM for a single genetic locus by testing longer amplicons harboring significantly more sequence information. Using universal primers that amplify the hypervariable bacterial 16S rRNA gene as a model system, we found that long amplicons yield more complex HRM curve shapes. We developed a novel nested OVO SVM approach to take advantage of this feature and achieved 100% accuracy in the identification of 37 clinically relevant bacteria in leave-one-out cross-validation. A subset of organisms was independently tested. Those from pure culture were identified with high accuracy, while those tested directly from clinical blood bottles displayed more technical variability and reduced accuracy. Our findings demonstrate that long sequences can be accurately and automatically profiled by HRM with a novel nested SVM approach and suggest that clinical sample testing is feasible with further optimization.
format Online
Article
Text
id pubmed-4726007
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4726007 2016-01-28 Sci Rep Article Nature Publishing Group 2016-01-18 /pmc/articles/PMC4726007/ /pubmed/26778280 http://dx.doi.org/10.1038/srep19218 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Nested Machine Learning Facilitates Increased Sequence Content for Large-Scale Automated High Resolution Melt Genotyping
topic Article