
Drishti : Real-Time Object Recognition for the Visually Impaired


Affiliations
1 Student, Department of Information Technology, St. Francis Institute of Technology, Sardar Vallabhbhai Patel Road, Mount Poinsur, Borivali West, Mumbai, Maharashtra - 400 103, India
2 Assistant Professor, Department of Information Technology, St. Francis Institute of Technology, Sardar Vallabhbhai Patel Road, Mount Poinsur, Borivali West, Mumbai, Maharashtra - 400 103, India



Abstract

In 2017, the World Health Organization (WHO) reported that nearly 284 million people worldwide had some degree of visual impairment, of whom approximately 39 million were totally blind. People with visual impairments often rely on assistance from others or on canes to move around and identify obstacles. Our proposed system aids visually impaired users by detecting and classifying common objects in real time and by recognizing text from sources such as documents and signs. It provides voice feedback to support understanding and navigation, and uses depth estimation to judge whether the distance between the user and a detected object is safe, promoting self-sufficiency and reducing dependence on others. We use the COCO image dataset, which contains everyday objects and people, and the MobileNet SSD model for real-time object detection. For real-time Optical Character Recognition (OCR) and text-to-speech, we use OpenCV, Python, and Tesseract for text detection and recognition, and the pyttsx3 library to convert the recognized text into audible speech. Our proposed system is dependable, affordable, realistic, and feasible.
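The abstract does not state which depth estimation method is used. As an illustration only, the sketch below assumes a single calibrated camera and the triangle-similarity (pinhole) approximation, in which distance = real width × focal length / bounding-box width in pixels. The focal length, the safe-distance threshold, the table of object widths, and all function names are hypothetical and are not taken from the paper. Detections are passed in as plain records, so the sketch is independent of the detector. Speech output uses pyttsx3 when it is installed and falls back to printing otherwise.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical values chosen for illustration; not taken from the paper.
FOCAL_LENGTH_PX = 700.0   # assumed camera focal length, in pixels
SAFE_DISTANCE_M = 1.0     # assumed alert threshold, in metres
KNOWN_WIDTHS_M = {"person": 0.45, "chair": 0.50, "bottle": 0.07}  # rough widths


@dataclass
class Detection:
    label: str           # class name reported by the detector (e.g. a COCO label)
    score: float         # detector confidence in [0, 1]
    box_width_px: float  # bounding-box width in pixels


def estimate_distance_m(det: Detection) -> Optional[float]:
    """Triangle similarity: distance = real_width * focal_length / pixel_width."""
    real_width = KNOWN_WIDTHS_M.get(det.label)
    if real_width is None or det.box_width_px <= 0:
        return None  # no reference width for this class, or a degenerate box
    return real_width * FOCAL_LENGTH_PX / det.box_width_px


def build_alerts(detections: List[Detection], min_score: float = 0.5) -> List[str]:
    """Turn confident detections into short messages suitable for speech."""
    alerts = []
    for det in detections:
        if det.score < min_score:
            continue  # skip low-confidence detections
        distance = estimate_distance_m(det)
        if distance is None:
            alerts.append(f"{det.label} ahead")
        elif distance < SAFE_DISTANCE_M:
            alerts.append(
                f"Warning: {det.label} very close, about {distance:.1f} metres"
            )
        else:
            alerts.append(f"{det.label} about {distance:.1f} metres ahead")
    return alerts


def speak(messages: List[str]) -> None:
    """Speak each message with pyttsx3 if available; otherwise print it."""
    try:
        import pyttsx3
    except ImportError:
        for message in messages:
            print(message)
        return
    engine = pyttsx3.init()
    for message in messages:
        engine.say(message)
    engine.runAndWait()
```

In a full pipeline, the `Detection` records would be filled from the object detector's output for each camera frame, and `speak(build_alerts(detections))` would be called at a throttled rate so that alerts do not overlap.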

Keywords

COCO Dataset, Depth Estimation, Machine Learning, Object Detection, Optical Character Recognition (OCR), SSD MobileNet, TensorFlow Object Detection API, Voice Alerts, Text-to-Speech, Visually Impaired People

Paper Submission Date : January 20, 2023 ; Paper sent back for Revision : February 10, 2023 ; Paper Acceptance Date : February 18, 2023 ; Paper Published Online : April 5, 2023




Authors

Hitanshu Parekh
Student, Department of Information Technology, St. Francis Institute of Technology, Sardar Vallabhbhai Patel Road, Mount Poinsur, Borivali West, Mumbai, Maharashtra - 400 103, India
Niyati Agarwal
Student, Department of Information Technology, St. Francis Institute of Technology, Sardar Vallabhbhai Patel Road, Mount Poinsur, Borivali West, Mumbai, Maharashtra - 400 103, India
Pranav Bangera
Student, Department of Information Technology, St. Francis Institute of Technology, Sardar Vallabhbhai Patel Road, Mount Poinsur, Borivali West, Mumbai, Maharashtra - 400 103, India
Roger D’souza
Student, Department of Information Technology, St. Francis Institute of Technology, Sardar Vallabhbhai Patel Road, Mount Poinsur, Borivali West, Mumbai, Maharashtra - 400 103, India
Grinal Tuscano
Assistant Professor, Department of Information Technology, St. Francis Institute of Technology, Sardar Vallabhbhai Patel Road, Mount Poinsur, Borivali West, Mumbai, Maharashtra - 400 103, India






DOI: https://doi.org/10.17010/ijcs%2F2023%2Fv8%2Fi2%2F172774