School of Science and Technology 科技學院
Computing Programmes 電腦學系

A Mobile Application for Assisting the Visually Impaired in Shopping Products Based on Machine Learning

LAM Chun Yip Chris, LAU Ka Chun Gordon, TANG Chun Man Tommy

 
Programme: Bachelor of Computing with Honours in Internet Technology
Supervisor: Dr. Vanessa Ng
Areas: Intelligent Applications
Year of Completion: 2018

Objectives

The aim of this project is to develop an application that assists the visually impaired in shopping for products in a supermarket. As a supermarket carries numerous products, this project focuses on canned food. Canned foods share a similar shape, so the visually impaired cannot distinguish them by touch alone. If a user is allergic to a food such as pork or beef, the consequence of misidentification may be fatal.

For this application, object recognition based on machine learning is used to classify different canned foods. Object recognition identifies the canned food from an image, and the machine learning approach allows the application to learn from a large amount of data and perform recognition after training.

To achieve this aim, the main objective of the project is to recognize different canned foods through a smartphone's camera. The project also defines several sub-objectives as follows:

  • Collect canned food images and information
  • Train the object recognition system to recognize canned food
  • Develop the image processing system for pre-processing before recognition
  • Implement Optical Character Recognition (OCR) to recognize the text in the image
  • Implement the system to provide the reference price of canned food
  • Implement the application to provide bilingual results (Cantonese and English)
  • Develop the Android application for users to capture a short video or photo
  • Implement gesture control for the visually impaired to interact with the Android application
  • Implement the guide module to guide users through the application
  • Design the evaluation plan and evaluate the application

Video Demonstration

Background and Methodology

To assist the visually impaired in shopping for products, the solution should be portable and provide them with relevant information. In this project, a mobile prototype solution based on machine learning is designed to assist the visually impaired in shopping for products.

The prototype solution is developed to classify canned foods. The visually impaired can operate it through simple gestures: the application recognizes the canned food in the photos they take, and the latest information about that canned food is read out to them.
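The read-out step can be built on Android's standard TextToSpeech engine. The following is a minimal sketch, assuming the recognition result arrives as a plain string; the ResultSpeaker class, speakResult method and the choice of the zh-HK locale for Cantonese are illustrative assumptions, not the project's actual code.

    import android.content.Context;
    import android.speech.tts.TextToSpeech;

    import java.util.Locale;

    // Minimal sketch: speak the recognition result aloud once the TTS engine is ready.
    // Only the TextToSpeech calls are part of the Android SDK; the class and method
    // names here are hypothetical.
    public class ResultSpeaker {

        private TextToSpeech tts;
        private boolean ready = false;

        public ResultSpeaker(Context context) {
            tts = new TextToSpeech(context, status -> {
                if (status == TextToSpeech.SUCCESS) {
                    // zh-HK is assumed to map to a Cantonese voice on Hong Kong devices;
                    // an English locale could be used for the English result instead.
                    tts.setLanguage(new Locale("zh", "HK"));
                    ready = true;
                }
            });
        }

        public void speakResult(String resultText) {
            if (ready) {
                tts.speak(resultText, TextToSpeech.QUEUE_FLUSH, null, "result");
            }
        }
    }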

Agile development is used throughout this project. Features such as Optical Character Recognition and the guide module are added to the application one by one, each followed by an integration test.

Figure 1: Information flow diagram

Figure 2: Chopped pork and ham: Highlights of text information

The text printed on canned food is useful for classifying it. To extract the printed text, we need Optical Character Recognition. As it is difficult to build our own OCR, we use the OCR service provided by Google: the Google Cloud Vision API offers text detection that can be accessed by sending HTTP requests.
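A minimal sketch of such a request is shown below, assuming the photo has already been Base64-encoded. The sendToVision helper and the apiKey parameter are placeholders for illustration; the endpoint and JSON layout follow the public images:annotate REST interface.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.Scanner;

    // Sketch of a TEXT_DETECTION request to the Google Cloud Vision API.
    // sendToVision() is a hypothetical helper, not part of the project's code.
    public class VisionOcrClient {

        public static String sendToVision(String base64Image, String apiKey) throws Exception {
            String body = "{\"requests\":[{"
                    + "\"image\":{\"content\":\"" + base64Image + "\"},"
                    + "\"features\":[{\"type\":\"TEXT_DETECTION\"}]}]}";

            URL url = new URL("https://vision.googleapis.com/v1/images:annotate?key=" + apiKey);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);

            try (OutputStream out = conn.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }

            // The response is a JSON object containing the recognized text annotations.
            try (Scanner scanner = new Scanner(conn.getInputStream(), StandardCharsets.UTF_8.name())) {
                return scanner.useDelimiter("\\A").next();
            }
        }
    }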

Once a photo is sent to Google Vision, it returns a JSON object containing the text recognized in the photo. Our server extracts the text from the JSON object and compares it with the words stored in the database. Each canned food is rated according to the matched words and their weightings; the score is then normalized and summed with the result from the machine learning model to predict the canned food.
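The matching and combination step could look like the sketch below. The keyword table, its weights and the equal weighting of the two scores are illustrative assumptions; the project only states that the normalized text score is summed with the model's output.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch of the weighted text-matching score and its combination
    // with the machine learning score. All keywords and weights are assumptions.
    public class CannedFoodScorer {

        // Hypothetical keyword weights for one product, e.g. chopped pork and ham.
        private static final Map<String, Double> KEYWORD_WEIGHTS = new HashMap<>();
        static {
            KEYWORD_WEIGHTS.put("chopped", 1.0);
            KEYWORD_WEIGHTS.put("pork", 2.0);
            KEYWORD_WEIGHTS.put("ham", 2.0);
        }

        // Sum the weights of keywords found in the OCR text, normalized to [0, 1].
        public static double textScore(String ocrText) {
            String lower = ocrText.toLowerCase();
            double total = 0.0, max = 0.0;
            for (Map.Entry<String, Double> e : KEYWORD_WEIGHTS.entrySet()) {
                max += e.getValue();
                if (lower.contains(e.getKey())) {
                    total += e.getValue();
                }
            }
            return max == 0.0 ? 0.0 : total / max;
        }

        // Combine the normalized text score with the model's confidence (also in [0, 1]).
        public static double combinedScore(String ocrText, double modelConfidence) {
            return 0.5 * textScore(ocrText) + 0.5 * modelConfidence;
        }
    }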

Although the text on canned food is informative, OCR has limitations. The photos taken by the visually impaired may not be clear enough to recognize, so we cannot rely on the OCR result alone. Also, the text on canned food is not always printed in a regular font. As shown in Figure 2, the Chinese words "火腿豬肉" ("pork and ham") cannot be recognized by OCR. The machine learning model is therefore essential for recognizing the canned foods.
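The report does not name the framework behind the machine learning model. The sketch below assumes a TensorFlow Lite image classifier bundled with the Android app; the class name, the number of classes and the pre-processed input buffer are placeholders.

    import org.tensorflow.lite.Interpreter;

    import java.nio.ByteBuffer;

    // Sketch of on-device classification, assuming a TensorFlow Lite model.
    // The model file, input format and class count are hypothetical.
    public class CannedFoodClassifier {

        private static final int NUM_CLASSES = 10;   // assumed number of canned foods
        private final Interpreter interpreter;

        public CannedFoodClassifier(ByteBuffer modelBuffer) {
            // modelBuffer holds the .tflite file loaded from the app's assets.
            interpreter = new Interpreter(modelBuffer);
        }

        // Run the model on a pre-processed image buffer and return per-class confidences.
        public float[] classify(ByteBuffer inputImage) {
            float[][] output = new float[1][NUM_CLASSES];
            interpreter.run(inputImage, output);
            return output[0];
        }
    }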

Conclusion and Future Development

After performing several evaluations of the prototype, our team found that iDetect can assist the visually impaired in shopping for products without help from others, giving them more independence in their daily life. This encourages them to go outside and participate more fully in the community.

However, our solution is not perfect; it still has some limitations.

Firstly, the information provided is limited. In this prototype, iDetect provides only the price, brand, name, volume and special offers of the canned food. This may not be enough for the visually impaired to enjoy shopping in supermarkets. For example, sighted shoppers often want to read the nutrition facts printed on the package to protect their health, and the visually impaired have the same need. It is, however, quite difficult for our team to collect nutrition facts for each product from a reliable source. If the solution could provide nutrition facts, the visually impaired could shop in supermarkets with more support.

Secondly, the solution cannot provide an offline service. Since the relevant canned food information is updated hourly, which is a power- and bandwidth-consuming operation, it is not suitable to store the information on the user's phone. However, in our team's experience, phones may have poor reception in some supermarkets, especially underground ones, and iDetect may be out of service in these situations. If iDetect could be used offline, users would be able to use it everywhere.

For future work, iDetect can be enhanced by increasing the number of supported products. Canned food is only one kind of product in supermarkets; there are many others, such as drinks, snacks and wines. Supporting them would give the visually impaired more choices when shopping in supermarkets.

Furthermore, iDetect can be enhanced by providing an iOS version. Not all visually impaired users have Android phones; some may use iPhones. Providing an iOS version would benefit more of the visually impaired.