School of Science and Technology 科技學院
Computing Programmes 電腦學系

CanChat – Cantonese Empathetic Chatbot for Secondary School Students

Cheng Tsz Chun, Li Chak Fung, WONG Chun Kiu Vincent

Programme: Bachelor of Computing with Honours in Internet Technology; Bachelor of Science with Honours in Computer Science
Supervisor: Dr. Keith Lee
Areas: Intelligent Applications
Year of Completion: 2023

Objectives

Pressure and depression are persistent problems among teenagers, often stemming from academic demands and from peer and family relationships. CanChat, a Cantonese chatbot for secondary school counseling services, aims to address these issues by giving students a platform to seek help.

Project Aim

The project aims to create a chatbot application to support student counseling services in secondary schools. The chatbot will provide basic counseling services and information on academic and family issues. It will also use NLP technologies to detect possible mental health problems and alert social workers in serious cases. The target audience is teenagers, for whom early treatment can prevent later consequences. Research shows high acceptance of automated conversational agents among users.

Project Objectives

The main goal of this project is to develop a Cantonese and English Deep Learning Chatbot that utilizes NLP technology, allowing the target audience to conduct basic consultations.

  • Design and develop an NLP model to identify user emotions and generate advice

  • Give the chatbot life experiences to create a realistic character for empathetic communication

  • Collect diverse Q&A data to train the machine learning model

  • Design a data flow architecture diagram

  • Improve response accuracy

  • Develop an application for the chatbot service with interactive features

  • Implement the chatbot system and application as a prototype system

Impact of Project

The project aims to develop a new platform that combines NLP chatbot technology with counseling services to fill a gap in the Hong Kong market. The chatbot can help identify students who may need support, ease the shortage of counseling staff, ensure fairness and privacy for all users, and add a layer of security to protect client information.

Videos

Demonstration Video

Presentation Video

Methodologies and Technologies used

Overview of Solution

The project focused mainly on developing the Cantonese empathy model, with a React Native application for the frontend. Telegram was initially planned as the user interface but was dropped because of its limitations. The Cantonese empathetic model was exposed as an API, using Node.js and Flask for the backend. Hardware was not a technical gap during development.

The architecture can be divided into three parts: Frontend, Backend, and CanChatGPT. When a user interacted with the bot, the message was sent to the CanChat LLM model and entered the backend processor, where it went through the model classification processes. Finally, the response was fetched by the frontend and returned to the user.
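The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the function names, labels, and keyword tables are hypothetical, and simple keyword matching stands in for the model-based classification.

```python
from dataclasses import dataclass

# Hypothetical keyword tables; the real system used model-based classification.
EMOTION_KEYWORDS = {
    "sad": ["sad", "unhappy", "cry"],
    "stressed": ["stress", "pressure", "exam"],
}
INTENT_KEYWORDS = {
    "academic": ["exam", "homework", "dse"],
    "family": ["parents", "mother", "father"],
}


@dataclass
class Reply:
    intent: str
    emotion: str
    response: str


def classify(message: str, table: dict[str, list[str]], default: str) -> str:
    """Return the first label whose keywords appear in the message."""
    text = message.lower()
    for label, keywords in table.items():
        if any(word in text for word in keywords):
            return label
    return default


def generate_response(intent: str, emotion: str) -> str:
    """Placeholder for the LLM call made by the backend."""
    return f"It sounds like you feel {emotion} about a {intent} matter. Tell me more."


def handle_message(message: str) -> Reply:
    """Backend step: classify the message, then build the reply for the frontend."""
    intent = classify(message, INTENT_KEYWORDS, "general")
    emotion = classify(message, EMOTION_KEYWORDS, "neutral")
    return Reply(intent, emotion, generate_response(intent, emotion))
```

The frontend would only need the returned `Reply` fields, which keeps the classification logic entirely on the backend side.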

Figure 1: Draft Demo

Technical Gap

The project aimed to develop an NLP model and React Native frontend, but limited data presented a challenge. Alternative sources of training data were needed. New libraries like LangChain and cloud storage like Firebase also posed difficulties. The team explored options to ensure success.

System Design

Figure 2: Whole Architecture diagram

Figure 3: Frontend diagram

Figure 4: CanChatGPT diagram

Figure 5: Node.js diagram

Figure 6: Use case diagram

Method on building

 

Implementation of initial testing for CanChat from BotFather

We utilized useState and useEffect hooks to manage state and side effects in the system, created a chatbot, and used React Native UI components. The chatbot was connected to a Node.js backend, and a fetch function retrieved data for a personalized user experience. The frontend system included six navigation screens, including login, home, informative, and chat rooms. The system was designed to be intuitive and user-friendly, providing a seamless user experience with consistent design language.

React Navigation

The application utilized the React Navigation library to handle screen navigation and create a stack of screens. Screen differentiation was used to distinguish between Auth and non-Auth screens, streamlining the user experience and ensuring authorized access. This approach created an intuitive and secure application with a seamless user experience.

Figma

To develop an easy-to-use and attractive interface, the team preferred to implement a UI design in Figma. Figma provided a plugin that converted Figma graphics to React Native CSS code, increasing efficiency for student developers.

Method on Backend

The team chose to utilize Node.js for handling Firebase and data processing in the application. They followed the official documentation to create a new Node.js project and implemented backend functionality for user authentication, data storage, and processing. Hosting the backend locally allowed for testing and debugging before deployment. Setting the host IP address to the local IP address enabled testing on multiple devices. The API call could be accessed using the local IP address and tested using tools like Postman. The Node.js backend played a critical role in the overall functionality and performance of the application.

Firebase

The developer started a Firebase project for data storage and authentication, configured Firestore and Authentication, and established a connection between Node.js and Firebase to handle data transfer. The configuration was written in a .env file to prevent API key leakage during Git push.
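The idea of keeping credentials out of the repository can be illustrated as below. The project's backend was written in Node.js; this Python sketch only shows the pattern, and the key names `FIREBASE_API_KEY` and `FIREBASE_PROJECT_ID` are assumptions, not the project's actual configuration.

```python
import os

# Hypothetical key names; the real .env contents are not given in the write-up.
REQUIRED_KEYS = ("FIREBASE_API_KEY", "FIREBASE_PROJECT_ID")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def load_firebase_config(env=None) -> dict[str, str]:
    """Read the required settings from the environment and fail early if any is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing configuration: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}
```

For this to prevent leakage, the `.env` file itself must be listed in `.gitignore` so that it is never committed.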

OpenAI

The team signed up for an account on the OpenAI website and generated an API key from the dashboard. The website documented how to use the API to build a chatbot service, but the team customized the data themselves. The API key gave access to the OpenAI API service, and the settings page tracked the number of requests used. OpenAI charged per token, with one Chinese character being about 4 tokens and every 2,000 tokens costing 0.012 USD.
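Using only the figures quoted above, a rough cost estimate can be worked out as follows. The function name is illustrative, and the numbers are the approximations from the text, not current pricing.

```python
TOKENS_PER_CHINESE_CHAR = 4   # approximate figure quoted in the text
USD_PER_2000_TOKENS = 0.012   # price quoted in the text


def estimate_cost_usd(num_chinese_chars: int) -> float:
    """Estimate the API cost of processing a given number of Chinese characters."""
    tokens = num_chinese_chars * TOKENS_PER_CHINESE_CHAR
    return tokens / 2000 * USD_PER_2000_TOKENS
```

By this estimate, 500 Chinese characters come to about 2,000 tokens, or roughly 0.012 USD.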

LangChain

LangChain provided instructions on how to use its library for high-level prompting and for searching through a database. After installing the library, the team separated the prompts into categories to build an empathetic persona that responded to the user. The prompts were created based on research and templates, including the Character Book from Character AI. Responses drew primarily on the team’s own data; when ChromaDB held no relevant data, OpenAI answered from the prompts alone.
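The combination of categorized persona prompts and a fallback when no data is found might look like the sketch below. It is written in plain Python rather than with LangChain, and the category names and prompt wording are hypothetical, since the actual prompts are not reproduced in this write-up.

```python
# Hypothetical prompt categories and wording, for illustration only.
PERSONA_PROMPTS = {
    "identity": "You are CanChat, a friendly companion for secondary school students.",
    "tone": "Reply with empathy and acknowledge the student's feelings first.",
    "language": "Answer in the language the student uses (Cantonese or English).",
}


def build_prompt(user_message: str, retrieved_notes: list[str]) -> str:
    """Assemble the persona prompts, any retrieved notes, and the user's message."""
    parts = list(PERSONA_PROMPTS.values())
    if retrieved_notes:
        # Relevant data was found: ask the model to ground its answer in it.
        parts.append("Base your answer on these notes:")
        parts.extend(f"- {note}" for note in retrieved_notes)
    else:
        # Nothing relevant was found: fall back to the persona prompts alone.
        parts.append("No reference notes were found; answer from the persona alone.")
    parts.append(f"Student: {user_message}")
    return "\n".join(parts)
```

Keeping the categories in a dictionary makes it easy to adjust one aspect of the persona, such as tone, without touching the others.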

ChromaDB

The team gathered data related to Hong Kong students for responses and imported it into ChromaDB for OpenAI to use. They added Chroma to their code and tested it in Jupyter Notebook to catch bugs early. ChromaDB read and processed the data and stored it as vector values in .chroma/index. These values were passed to OpenAI when the dosearch function was executed. The dataset “Brain” addressed problems faced by teenagers and was categorized into topics such as DSE, Academic, Family, and Personal.
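The principle behind a vector search over such a dataset can be shown with a toy example. This sketch does not use ChromaDB: word counts stand in for learned embeddings, the entries are invented for illustration, and `do_search` is a hypothetical stand-in for the project's search function. Only the topic labels come from the text.

```python
import math
from collections import Counter

# Invented entries; the topic labels follow the categories named in the text.
BRAIN = [
    {"topic": "DSE", "text": "Plan DSE revision early and take short breaks"},
    {"topic": "Family", "text": "Talk to your parents calmly about how you feel"},
    {"topic": "Personal", "text": "Sleep and exercise help when you feel low"},
]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def do_search(query: str, min_score: float = 0.1):
    """Return the best-matching entry, or None if nothing is similar enough."""
    q = embed(query)
    best = max(BRAIN, key=lambda entry: cosine(q, embed(entry["text"])))
    return best if cosine(q, embed(best["text"])) >= min_score else None
```

Returning `None` below a similarity threshold is one way to signal that no relevant data exists, so the caller can fall back to a prompt-only answer.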

Flask

The team relied mostly on Python code for the CanChatGPT backend and used an API library to serve its responses. Hosting the model as an API allowed it to be used on other messaging platforms and sped up development. They used Flask with jsonify to return each result as a JSON response separated into intent, emotion, and response fields. This allowed for more intelligent and user-friendly features. The API call was checked using Postman to ensure proper functionality.
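A minimal sketch of such an endpoint is shown below, assuming Flask is installed. The `/chat` route, the `message` request field, and the `analyse` placeholder are assumptions made for illustration; only the three response fields come from the description above.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def analyse(message: str) -> dict:
    """Placeholder for the model; returns fixed labels for illustration."""
    return {
        "intent": "academic",
        "emotion": "stressed",
        "response": f"I hear you. You said: {message}",
    }


@app.route("/chat", methods=["POST"])  # hypothetical endpoint name
def chat():
    # Accept a JSON body such as {"message": "..."}; treat anything else as empty.
    payload = request.get_json(silent=True) or {}
    message = payload.get("message", "")
    if not message:
        return jsonify({"error": "message is required"}), 400
    # jsonify serializes the dict and sets the JSON content type.
    return jsonify(analyse(message))
```

Calling `app.run()` starts Flask's development server, after which the endpoint can be exercised with a tool such as Postman.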

Method on testing

Expo

The team followed the Expo documentation to integrate Expo into their React Native app and hosted it on a localhost port for testing. They conducted two types of testing: on a mobile simulator running on a computer and on a real mobile device. The simulator allowed fast debugging of multiple bugs, while the real device caught bugs that did not appear in the simulator. They recommended testing on both to ensure all bugs were caught.

Real Life Testing

To test on a real device, the team downloaded the Expo app on their Android or iOS device and scanned the QR code that appeared after running ‘npx expo start’ in the terminal. The Expo bundle download process then initialized the app on the device, with the application data downloaded from the Expo database and streamed to the phone. Once downloaded, the app could be reopened quickly from the Expo application without scanning the QR code again. However, the Expo server in Visual Studio Code still needed to be started.

Results

Figure 7: CanChat result

Figure 8: CanChat result

Figure 9: CanChat result

Figure 10: CanChat result

Evaluation

During the testing and quality assurance process, 15 users were invited to test the CanChat GPT application. The user testing phase involved experiencing the UI and testing the accuracy of the application’s responses. The survey phase aimed to evaluate the effectiveness of the responses in terms of empathy and information provided, as well as the user’s feelings towards the UI avatar. The results of the survey were analyzed to identify areas for improvement and to ensure that the application met the desired standards of quality. The feedback from the users was used to refine the application and to ensure that it provided a positive user experience.

Conclusion

The project aimed to create an online chatbot application to support student counseling services in secondary schools. The team developed a Cantonese and English Deep Learning Chatbot that utilized NLP technology to provide basic counseling services and valuable information on academic and family issues. They designed and developed an LLM model capable of processing Cantonese and English sentences and identifying the user’s emotion and purpose through sentiment and entity detection. The team collected diverse Q&A data to train the machine learning model to select appropriate follow-up questions. They also designed a data flow architecture diagram for the development of the chatbot system and application.

The team implemented the chatbot system and application as a prototype system, achieving their primary objective. The system included various features such as a login system, Firebase Authentication, a signup system, and two chat modes. The user interface (UI) design of the app was centered on simplifying technology by displaying only the essential information on the screen, using intuitive navigation, simple language, and soothing colors. The system also included a User Content Recommendation System, an Avatar and emotion interaction system, and a Thumb Zone Designing for Mobile Users aspect.

The user evaluation of the Cantonese and English deep learning chatbot for secondary school student counseling provided valuable feedback from users on various aspects of the chatbot. The feedback received from users gave the team valuable insights into how they could improve the chatbot to make it more engaging, useful, and accessible to a wider range of users.

Future Development

The team behind CanChat GPT was committed to expanding the reach of the application beyond Hong Kong and to individuals of all ages who may be struggling with mental health issues. They planned to offer the application in multiple languages to cater to a diverse audience. By implementing the application on popular messaging platforms such as WhatsApp and Telegram, the team hoped to make it more accessible to users and increase its usage. The addition of social login would provide users with greater flexibility and ease of use. The team was dedicated to continuously improving the application and ensuring that it met the needs of its users.

