This paper presents the design and evaluation of a jacket–helmet assistive system for visually impaired individuals in India. The system integrates a Raspberry Pi 4B with a USB web camera, USB microphone, vibration motor cluster, earphone, pushbuttons, and a rechargeable 7.4 V, 10,000 mAh battery. Two primary functions are implemented: (i) object detection and distance estimation using YOLO detectors combined with monocular (2D) depth estimation, and (ii) text recognition on posters and hoardings using optical character recognition (OCR). Comparative analysis of YOLOv5, YOLOv7, and YOLOv8 models demonstrated that YOLOv8 achieved the highest mean Average Precision (mAP) of 92.4%, outperforming YOLOv7 (89.6%) and YOLOv5 (87.3%). For monocular 2D depth estimation, MiDaS achieved the lowest mean absolute relative error (0.124) compared to Monodepth2 (0.156) and DPT (0.139). Speech-to-text accuracy was evaluated across Google Speech Recognition, Vosk, and CMU Sphinx, with Google achieving 94.1% accuracy, followed by Vosk (88.3%) and CMU Sphinx (81.6%). User trials were conducted with ten visually impaired individuals across diverse environments (bus stand, garden, bungalow, and home settings). System usability was measured using the System Usability Scale (SUS), yielding an overall average score of 84.6, indicating “excellent” usability. The proposed system demonstrates high accuracy, robustness, and practicality for real-world navigation and reading assistance, thus contributing to improved autonomy and quality of life for visually impaired users.
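The abstract describes the detection-plus-depth pipeline only at a summary level. As a rough illustration of how such a pipeline can be wired together, the sketch below pairs a YOLOv8 detector with the MiDaS small model on a single camera frame. It is a minimal sketch under assumed tooling (the Ultralytics package and the public intel-isl/MiDaS models on PyTorch Hub), not the authors' implementation; the model variants, file names, and per-box depth statistic are illustrative assumptions.

```python
# Illustrative sketch only -- not the authors' published code.
# Assumes: the Ultralytics YOLOv8 package ("pip install ultralytics")
# and the public intel-isl/MiDaS models on PyTorch Hub.
import cv2
import torch
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")  # assumed nano variant; small enough for a Pi-class device
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
midas.eval()

frame = cv2.imread("frame.jpg")  # stand-in for one USB-camera frame

# 1) Relative depth map. MiDaS outputs *relative inverse* depth,
#    so values rank nearness but are not metres without calibration.
with torch.no_grad():
    inp = transforms.small_transform(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    pred = midas(inp)
    depth = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=frame.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze().cpu().numpy()

# 2) Detect objects, then summarize depth inside each bounding box.
for box in detector(frame)[0].boxes:
    x1, y1, x2, y2 = (int(v) for v in box.xyxy[0])
    label = detector.names[int(box.cls)]
    nearness = float(depth[y1:y2, x1:x2].mean())  # higher = closer (inverse depth)
    print(f"{label}: conf={float(box.conf):.2f}, relative nearness={nearness:.1f}")
```

Note that because MiDaS returns relative inverse depth, a calibration or mapping step (not shown) is needed before announcing distances in metres. The SUS figure of 84.6 quoted above follows standard Brooke scoring: each odd-numbered item contributes (response − 1), each even-numbered item contributes (5 − response), and the sum is multiplied by 2.5 to yield a 0–100 score.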
| Published in | American Journal of Computer Science and Technology (Volume 8, Issue 4) |
| DOI | 10.11648/j.ajcst.20250804.13 |
| Page(s) | 189-205 |
| Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
| Copyright | Copyright © The Author(s), 2025. Published by Science Publishing Group |
Keywords: Assistive Technology, YOLO Object Detection, Depth Estimation, Speech-to-Text, OCR, Raspberry Pi, Visually Impaired, System Usability Scale (SUS)
APA Style
Ruparelia, K., Parikh, P., & Shah, P. A. (2025). An Integrated Jacket–Helmet Assistive System for Visually Impaired Individuals Using YOLO-Based Object Detection, Depth Estimation, and OCR. American Journal of Computer Science and Technology, 8(4), 189-205. https://doi.org/10.11648/j.ajcst.20250804.13
ACS Style
Ruparelia, K.; Parikh, P.; Shah, P. A. An Integrated Jacket–Helmet Assistive System for Visually Impaired Individuals Using YOLO-Based Object Detection, Depth Estimation, and OCR. Am. J. Comput. Sci. Technol. 2025, 8(4), 189-205. doi: 10.11648/j.ajcst.20250804.13
@article{10.11648/j.ajcst.20250804.13,
  author  = {Kashvi Ruparelia and Priyam Parikh and Parth Atulkumar Shah},
  title   = {An Integrated Jacket–Helmet Assistive System for Visually Impaired Individuals Using YOLO-Based Object Detection, Depth Estimation, and OCR},
  journal = {American Journal of Computer Science and Technology},
  volume  = {8},
  number  = {4},
  pages   = {189-205},
  year    = {2025},
  doi     = {10.11648/j.ajcst.20250804.13},
  url     = {https://doi.org/10.11648/j.ajcst.20250804.13},
  eprint  = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajcst.20250804.13}
}
TY  - JOUR
T1  - An Integrated Jacket–Helmet Assistive System for Visually Impaired Individuals Using YOLO-Based Object Detection, Depth Estimation, and OCR
AU  - Kashvi Ruparelia
AU  - Priyam Parikh
AU  - Parth Atulkumar Shah
Y1  - 2025/10/30
PY  - 2025
DO  - 10.11648/j.ajcst.20250804.13
T2  - American Journal of Computer Science and Technology
JF  - American Journal of Computer Science and Technology
JO  - American Journal of Computer Science and Technology
SP  - 189
EP  - 205
VL  - 8
IS  - 4
PB  - Science Publishing Group
SN  - 2640-012X
UR  - https://doi.org/10.11648/j.ajcst.20250804.13
ER  -