
Collaborative Research on Mouth Shape and Lyrics in Singing Practice Based on Image Processing



Bibliographic Details
Main Authors: Xu, Lujia, Chen, Chen
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8816571/
https://www.ncbi.nlm.nih.gov/pubmed/35126493
http://dx.doi.org/10.1155/2022/5138442
Description
Summary: Image processing is a mainstream processing method. When people watch videos of artists singing, the lyric subtitles are often out of sync with the singer's mouth shape. This problem can be addressed with image processing technology: the computer performs lip-reading recognition and then corrects the mouth shape and the lyric subtitles in the image according to the extracted lip-reading data, so that mouth shape and lyrics in singing practice are synchronized. Lip-reading information can effectively improve the accuracy of language cognition, save some capital and manpower, and give viewers a good audio-visual interactive experience. The results show the following: (1) UI testing indicates that the system's user interface functions are reasonably designed and free of serious bugs. The average processing time per frame is 628 ms, the system performance evaluation is good, and the success rate is as high as 98.80%. The average time for each step when the system processes an image is 0.36724 s. (2) The system can generally identify the portrait area and the lip area in images of people taken from various angles. (3) Compared with DCT and DWT, the two cascaded lip-region feature extraction methods improve the recognition rate by nearly 10% and reduce the feature vector dimension by nearly 65%. (4) Mouth shapes are classified more finely, and the image of the tester's mouth shape is optimized to bring it closer to the standard mouth shape. (5) After the system corrects the mouth shape and subtitles, the success rate is higher than 90%. Overall, the system runs well and the method achieves good results, which provides a basis for more detailed optimization in future work.
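The summary names DCT as one of the baseline feature extraction methods but does not describe how it is applied. As a rough illustration only, the sketch below shows one common way DCT features are taken from an image patch: compute a 2-D DCT and keep the low-frequency coefficients as a shorter feature vector. It is not the authors' cascaded method. The function names, the 8x8 patch, and the `keep=3` setting are assumptions made for this example.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values)
        )
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT-II: transform each row, then each column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][frequency]; transpose back to [u][v].
    return [list(row) for row in zip(*cols)]


def low_frequency_features(block, keep):
    """Keep the top-left keep x keep coefficients as a feature vector."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


# Hypothetical 8x8 grayscale lip patch (synthetic values, not real data).
lip_patch = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
features = low_frequency_features(lip_patch, keep=3)
print(len(features))  # 64 pixel values are summarized by 9 coefficients
```

Keeping only low-frequency coefficients is what shortens the feature vector. How many coefficients to keep, and how the lip patch is located and normalized beforehand, are choices this record does not specify.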