Cargando…

Sensory Descriptor Analysis of Whisky Lexicons through the Use of Deep Learning

This paper is concerned with extracting relevant terms from a text corpus on whisk(e)y. “Relevant” terms are usually contextually defined in their domain of use. Arguably, every domain has a specialized vocabulary used for describing things. For example, the field of Sensory Science, a sub-field of...

Descripción completa

Detalles Bibliográficos
Autores principales: Miller, Chreston, Hamilton, Leah, Lahne, Jacob
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8303687/
https://www.ncbi.nlm.nih.gov/pubmed/34359502
http://dx.doi.org/10.3390/foods10071633
Descripción
Sumario:This paper is concerned with extracting relevant terms from a text corpus on whisk(e)y. “Relevant” terms are usually contextually defined in their domain of use. Arguably, every domain has a specialized vocabulary used for describing things. For example, the field of Sensory Science, a sub-field of Food Science, investigates human responses to food products and differentiates “descriptive” terms for flavors from “ordinary”, non-descriptive language. Within the field, descriptors are generated through Descriptive Analysis, a method wherein a human panel of experts tastes multiple food products and defines descriptors. This process is both time-consuming and expensive. However, one could leverage existing data to identify and build a flavor language automatically. For example, there are thousands of professional and semi-professional reviews of whisk(e)y published on the internet, providing abundant descriptors interspersed with non-descriptive language. The aim, then, is to be able to automatically identify descriptive terms in unstructured reviews for later use in product flavor characterization. We created two systems to perform this task. The first is an interactive visual tool that can be used to tag examples of descriptive terms from thousands of whisky reviews. This creates a training dataset that we use to perform transfer learning using GloVe word embeddings and a Long Short-Term Memory deep learning model architecture. The result is a model that can accurately identify descriptors within a corpus of whisky review texts with a train/test accuracy of 99% and precision, recall, and F1-scores of 0.99. We tested for overfitting by comparing the training and validation loss for divergence. Our results show that the language structure for descriptive terms can be programmatically learned.