Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection
Voice biometrics is a kind of physiological characteristic: the voice differs from one individual to another. Due to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), emotional state, identity verificati...
Main authors: | Fong, Simon; Lan, Kun; Wong, Raymond |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation, 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3830839/ https://www.ncbi.nlm.nih.gov/pubmed/24288684 http://dx.doi.org/10.1155/2013/720834 |
_version_ | 1782291533274284032 |
---|---|
author | Fong, Simon; Lan, Kun; Wong, Raymond |
author_sort | Fong, Simon |
collection | PubMed |
description | Voice biometrics is a kind of physiological characteristic: the voice differs from one individual to another. Due to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), emotional state, identity verification, verbal command control, and so forth. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features for training a classification model; it is based on a piecewise transformation that treats an audio waveform as a time series. Using SFX, we can faithfully remodel the statistical characteristics of the time series; combined with spectral analysis, this yields a substantial number of features. An ensemble is used to select only the influential features for classification model induction. We focus on comparing the effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely, Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results, supporting the view that heuristically choosing significant features from both the time and frequency domains produces better voice classification performance than traditional signal processing techniques alone, such as wavelets and LPC-to-CC. |
format | Online Article Text |
id | pubmed-3830839 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-3830839 2013-11-28 Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection. Fong, Simon; Lan, Kun; Wong, Raymond. Biomed Res Int. Research Article. Hindawi Publishing Corporation 2013. 2013-10-29. /pmc/articles/PMC3830839/ /pubmed/24288684 http://dx.doi.org/10.1155/2013/720834 Text en. Copyright © 2013 Simon Fong et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3830839/ https://www.ncbi.nlm.nih.gov/pubmed/24288684 http://dx.doi.org/10.1155/2013/720834 |
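
The description above outlines the idea of treating a waveform as a time series, summarizing it piecewise with statistics, and combining those with spectral information. The following Python fragment is only a minimal sketch of that general idea, not the authors' SFX method: the function names (`piecewise_stats`, `dominant_bin`, `extract_features`), the choice of statistics (mean, standard deviation, minimum, maximum), the equal-length segmentation, and the single spectral feature are all assumptions made for illustration.

```python
import cmath
import math
import statistics


def piecewise_stats(signal, n_segments):
    """Split `signal` into `n_segments` equal pieces and summarize each one.

    Hypothetical helper: the statistics chosen here are an assumption,
    not the feature set used in the paper. Trailing samples that do not
    fill a whole segment are ignored.
    """
    seg_len = len(signal) // n_segments
    if seg_len < 2:
        raise ValueError("each segment needs at least two samples")
    features = []
    for i in range(n_segments):
        seg = signal[i * seg_len:(i + 1) * seg_len]
        features.extend([
            statistics.fmean(seg),    # segment mean
            statistics.pstdev(seg),   # population standard deviation
            min(seg),
            max(seg),
        ])
    return features


def dominant_bin(signal):
    """Return the index of the strongest non-DC bin of a naive O(n^2) DFT."""
    n = len(signal)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k


def extract_features(signal, n_segments=4):
    """Concatenate time-domain segment statistics with one spectral feature."""
    return piecewise_stats(signal, n_segments) + [dominant_bin(signal)]
```

With four segments, `extract_features` returns 17 values per recording (four statistics per segment plus one spectral feature). A feature selection step and a classifier, as mentioned in the description, would then operate on such vectors; neither is sketched here.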