AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network
Unlike able-bodied persons, it is difficult for visually impaired people, especially those of educational age, to build a full perception of the world due to the lack of normal vision. The rapid development of AI and sensing technologies has provided new solutions for assisting visually impaired people...
Field | Value
---|---
Main Authors | Li, Xinrong; Huang, Meiyu; Xu, Yao; Cao, Yingze; Lu, Yamei; Wang, Pengfei; Xiang, Xueshuang
Format | Online Article Text
Language | English
Published | Springer Nature Singapore, 2022
Subjects | Regular Paper
Online Access | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9245372/ http://dx.doi.org/10.1007/s42486-022-00108-3
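The abstract describes a vision-tactile fusion classifier that embeds temporal, channel-wise, and spatial attention. As a rough illustration of how such a network can be wired together, here is a minimal PyTorch sketch; all module names, tensor shapes, and the late-fusion design are assumptions made for illustration, not the authors' released implementation (see the open-sourced code linked in this record for that).

```python
# Hypothetical sketch of a visual-tactile classifier with temporal,
# channel-wise, and spatial attention, loosely following the abstract.
# Shapes and fusion strategy are assumptions, NOT the AviPer code.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention: globally reweight feature channels."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                          # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))            # squeeze -> (B, C)
        return x * w[:, :, None, None]             # excite


class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: highlight informative regions."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                          # x: (B, C, H, W)
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))


class TemporalAttention(nn.Module):
    """Score each tactile frame, then pool the sequence by those weights."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                          # x: (B, T, D)
        w = torch.softmax(self.score(x), dim=1)    # per-frame weights
        return (w * x).sum(dim=1)                  # -> (B, D)


class VisualTactileNet(nn.Module):
    """Toy late-fusion classifier: webcam image + tactile frame sequence."""

    def __init__(self, num_classes: int = 10, feat_dim: int = 64):
        super().__init__()
        # Visual branch: conv features refined by channel + spatial attention.
        self.vis = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            ChannelAttention(feat_dim), SpatialAttention(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # Tactile branch: per-frame embedding, then temporal attention pooling.
        self.tac_frame = nn.Sequential(
            nn.Conv2d(1, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.tac_time = TemporalAttention(feat_dim)
        self.head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, image, tactile):
        # image: (B, 3, H, W); tactile: (B, T, 1, h, w) pressure-map frames.
        v = self.vis(image)
        b, t = tactile.shape[:2]
        f = self.tac_frame(tactile.flatten(0, 1)).view(b, t, -1)
        return self.head(torch.cat([v, self.tac_time(f)], dim=-1))


# Smoke test on random data (shapes illustrative only).
model = VisualTactileNet()
logits = model(torch.randn(2, 3, 64, 64), torch.randn(2, 8, 1, 16, 16))
assert logits.shape == (2, 10)
```

In this sketch the two branches are fused by simple feature concatenation before a linear head; the actual AviPer model may fuse the modalities differently.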
Field | Value
---|---
_version_ | 1784738728255684608
author | Li, Xinrong; Huang, Meiyu; Xu, Yao; Cao, Yingze; Lu, Yamei; Wang, Pengfei; Xiang, Xueshuang
author_facet | Li, Xinrong; Huang, Meiyu; Xu, Yao; Cao, Yingze; Lu, Yamei; Wang, Pengfei; Xiang, Xueshuang
author_sort | Li, Xinrong |
collection | PubMed |
description | Unlike able-bodied persons, it is difficult for visually impaired people, especially those of educational age, to build a full perception of the world due to the lack of normal vision. The rapid development of AI and sensing technologies has provided new solutions for assisting visually impaired people. However, to our knowledge, most previous studies focused on obstacle avoidance and environmental perception but paid less attention to educational assistance for visually impaired people. In this paper, we propose AviPer, a system that aims to assist visually impaired people to perceive the world by creating a continuous, immersive, and educational assisting pattern. Equipped with a self-developed flexible tactile glove and a webcam, AviPer can simultaneously predict the grasped object and provide voice feedback using a vision-tactile fusion classification model while a visually impaired person is perceiving the object with a gloved hand. To achieve accurate multimodal classification, we creatively embed three attention mechanisms, namely temporal, channel-wise, and spatial attention, in the model. Experimental results show that AviPer achieves an accuracy of 99.75% in classifying 10 daily objects. We evaluated the system in a variety of extreme cases, which verified its robustness and demonstrated the necessity of visual and tactile modal fusion. We also conducted tests in actual use scenarios and confirmed the usability and user-friendliness of the system. We open-sourced the code and self-collected datasets in the hope of promoting research development and bringing changes to the lives of visually impaired people.
format | Online Article Text |
id | pubmed-9245372 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Nature Singapore |
record_format | MEDLINE/PubMed |
spelling | pubmed-9245372 2022-07-01 AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network Li, Xinrong; Huang, Meiyu; Xu, Yao; Cao, Yingze; Lu, Yamei; Wang, Pengfei; Xiang, Xueshuang CCF Trans. Pervasive Comp. Interact. Regular Paper Unlike able-bodied persons, it is difficult for visually impaired people, especially those of educational age, to build a full perception of the world due to the lack of normal vision. The rapid development of AI and sensing technologies has provided new solutions for assisting visually impaired people. However, to our knowledge, most previous studies focused on obstacle avoidance and environmental perception but paid less attention to educational assistance for visually impaired people. In this paper, we propose AviPer, a system that aims to assist visually impaired people to perceive the world by creating a continuous, immersive, and educational assisting pattern. Equipped with a self-developed flexible tactile glove and a webcam, AviPer can simultaneously predict the grasped object and provide voice feedback using a vision-tactile fusion classification model while a visually impaired person is perceiving the object with a gloved hand. To achieve accurate multimodal classification, we creatively embed three attention mechanisms, namely temporal, channel-wise, and spatial attention, in the model. Experimental results show that AviPer achieves an accuracy of 99.75% in classifying 10 daily objects. We evaluated the system in a variety of extreme cases, which verified its robustness and demonstrated the necessity of visual and tactile modal fusion. We also conducted tests in actual use scenarios and confirmed the usability and user-friendliness of the system. We open-sourced the code and self-collected datasets in the hope of promoting research development and bringing changes to the lives of visually impaired people. Springer Nature Singapore 2022-06-30 2022 /pmc/articles/PMC9245372/ http://dx.doi.org/10.1007/s42486-022-00108-3 Text en © China Computer Federation (CCF) 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle | Regular Paper Li, Xinrong; Huang, Meiyu; Xu, Yao; Cao, Yingze; Lu, Yamei; Wang, Pengfei; Xiang, Xueshuang AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network
title | AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network |
title_full | AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network |
title_fullStr | AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network |
title_full_unstemmed | AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network |
title_short | AviPer: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network |
title_sort | aviper: assisting visually impaired people to perceive the world with visual-tactile multimodal attention network |
topic | Regular Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9245372/ http://dx.doi.org/10.1007/s42486-022-00108-3 |
work_keys_str_mv | AT lixinrong aviperassistingvisuallyimpairedpeopletoperceivetheworldwithvisualtactilemultimodalattentionnetwork AT huangmeiyu aviperassistingvisuallyimpairedpeopletoperceivetheworldwithvisualtactilemultimodalattentionnetwork AT xuyao aviperassistingvisuallyimpairedpeopletoperceivetheworldwithvisualtactilemultimodalattentionnetwork AT caoyingze aviperassistingvisuallyimpairedpeopletoperceivetheworldwithvisualtactilemultimodalattentionnetwork AT luyamei aviperassistingvisuallyimpairedpeopletoperceivetheworldwithvisualtactilemultimodalattentionnetwork AT wangpengfei aviperassistingvisuallyimpairedpeopletoperceivetheworldwithvisualtactilemultimodalattentionnetwork AT xiangxueshuang aviperassistingvisuallyimpairedpeopletoperceivetheworldwithvisualtactilemultimodalattentionnetwork |