Chat bubble
Navigating university can be daunting for any student.
Student Advisors play a vital role in assisting students
throughout their time at university. When this role is
performed efficiently and effectively, both the students and
the university benefit. The University of Cape Town (UCT)
enrolls over 25000 students each year. Student Advisors assist
all of these students, directly or indirectly, over the course
of the year, for example during registration. The sheer number
of queries is a challenge for them to process.
Students can get assistance from Student Advisors through email,
virtual meetings, or physical meetings (when it is safe to do
so). Given the number of students enrolled each year, Student
Advisors face an overwhelming number of queries that have to be
attended to. Because of the volume, and sometimes the
complexity, of these queries, students do not get responses
immediately and Student Advisors are overworked. These delays
can have a detrimental effect on a student’s academic career.
Student Advisors also may not have enough time to attend to all
queries, as they have to complete their own academic duties,
such as lecturing.
We proposed a Virtual Student Advisor system as a solution to
the problem. Such a system would offer students an alternative:
rather than emailing a Student Advisor, a student could simply
visit the website and find the information they need.
This page focuses on the Chatbot aspect of the website.
Student Advisors cannot efficiently and effectively attend to
queries submitted by students. The number of Student Advisors
and the time they have available are not proportional to the
number of students in the university and the number of queries. The
current form of communication, email or virtual meetings, is not
sufficient for students. Depending on the availability of the
Student Advisor, response times vary regardless of how simple or
complex the query is. Student queries can range from simple
queries like what the prerequisites for a course are to complex
queries like add/drop protocols for modules.
The aim of the chatbot is to eliminate the unpredictability of
response times. It should also be available to provide answers
to students at all times. The chatbot would be able to answer
most of the simple queries, that is, short questions that most
students ask, with short answers. The chatbot should be as
accurate as possible. It would provide general information,
mostly the kind found in handbooks and on UCT’s various
websites. A chatbot that answers queries at all times would
offer relief for Student Advisors. Students would get answers
immediately and Student Advisors would have additional time to
attend to complex queries.
Requirement Analysis & Design
Requirements Analysis
The main functionalities of the chatbot are answering simple,
short queries, providing short explanations about courses,
providing contact details for relevant staff or departments,
providing links to websites, and storing previous chats.
The chatbot should be capable of various kinds of greetings and
goodbyes. One of the aims of the chatbot is to be human-like, to
a point. It should be able to respond to informal greetings and
goodbyes. Each course has a short description in the handbooks.
The chatbot should provide a short summary of all these
descriptions. Messages should be short and concise. This
information will be manually extracted from handbooks and
inserted into the training data for the chatbot.
Each registered student will have a history of their chats
stored in the database. Chats are stored in an array of strings
as one of the attributes of the Users relation. Each stored
string is prefixed with “usr_” or “bot_” to differentiate user
input from chatbot responses. These chats should be secured to
ensure data privacy.
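The prefix convention can be illustrated with a minimal Python sketch. The helper names encode_turn and decode_history are hypothetical and only show one way the “usr_” and “bot_” prefixes could be written and read back; the project's actual storage code is not shown on this page.

```python
USER_PREFIX = "usr_"
BOT_PREFIX = "bot_"


def encode_turn(user_text, bot_text):
    """Return the two prefixed strings stored for one exchange."""
    return [USER_PREFIX + user_text, BOT_PREFIX + bot_text]


def decode_history(chats):
    """Turn stored strings back into (speaker, text) pairs."""
    history = []
    for entry in chats:
        if entry.startswith(USER_PREFIX):
            history.append(("user", entry[len(USER_PREFIX):]))
        elif entry.startswith(BOT_PREFIX):
            history.append(("bot", entry[len(BOT_PREFIX):]))
        # Entries with neither prefix are skipped rather than guessed at.
    return history


# One exchange appended to a (hypothetical) chats array and read back.
chats = []
chats.extend(encode_turn("hi", "Hello! How can I help?"))
print(decode_history(chats))
```

Keeping the speaker inside the string means the chats attribute can stay a flat array of strings, at the cost of having to strip the prefix again before display.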
Non-Functional Requirements
- Quick response time. Users should get replies immediately
  rather than having to wait, as they must with email. Every
  response should take no more than 3 seconds. However, response
  time can also be affected by factors outside the system, such
  as internet speed or network coverage.
- Short responses. Users should not feel discouraged from
  reading responses sent back by the chatbot. Keeping responses
  to a reasonable length is imperative because it affects how
  often users engage with the bot.
- Grammatical correctness. Text input by the user does not have
  to be spelled correctly or be grammatically correct. The input
  also does not have to match sentences in the training data
  exactly in order for responses to be correct. The chatbot will
  use pattern matching (further described in the following
  section), so if the query has the expected keywords then a
  response will be generated.
- Reliability, maintainability, and availability. The bot will
  be available for users at all times (“24/7”), except when the
  website is undergoing maintenance or data is being updated.
  Maintaining the chatbot is a challenge because one must ensure
  that the dates and information the system uses are accurate
  and relevant to the current year. Where information is
  retrieved from sites, the link should be provided.
Class Diagram presenting architecture of the chatbot.
Architecture. The system follows a Model-View-Controller
architecture. The class diagram in the figure above shows the
objects that make up the back end of the chatbot system. The
actual view of the system is what is displayed on the website
UI. Within the back end, the Chatbot object fills the view role,
because it accepts user input from the front end and returns
responses. In the context of the website, the Chatbot object is
the Controller: it is only responsible for accepting texts and
returning the response generated by the model to the front end.
The Bot, PreProcessor and Trainer objects make up the Model
aspect of the system. Information is passed to the Bot, which
then uses the trained neural network model to generate a
response. PreProcessor cleans user input by removing punctuation
marks and replacing acronyms and derived words. The Trainer
object loads the trained neural network, processes the cleaned
input, and provides a response that is then sent to the Chatbot
object (Controller) to be sent to the view. DatabaseManager is
responsible for any queries made to the database, either to save
chats or to get information.
The user is abstracted from the logic of the system. All
information required by the view can be acquired through the
controller. Each object has one responsibility. Trainer is only
responsible for creating or loading the neural network for
language processing. PreProcessor is only responsible for
processing user input before it is passed to the neural network.
Using a layered architecture makes the system more maintainable,
as changing one layer (i.e. View, Model or Controller) has fewer
consequences for the whole system. For example, the view can be
changed with little to no consequence to the model.
System Development & Implementation
Development Software
Several factors had to be considered when choosing the
technologies for implementing the chatbot. The chatbot would
have to access the database, therefore the language had to work
well with databases. The chatbot would use Machine Learning, so
it would be beneficial to choose a language with good support
for Natural Language Processing models, that is, an established
community and various libraries. Python was chosen as the
language to implement the chatbot. It is an established
programming language with a large community and many libraries,
several of which were created for Natural Language Processing
(NLP) purposes. The main libraries used in this project are
NLTK, TFLearn, and NumPy. The Natural Language Toolkit (NLTK)
provides libraries and programs for processing human language.
NLTK also provides test data that can be used to test the
chatbot. Another library used is TFLearn, a deep learning
library built on top of TensorFlow. TensorFlow is a software
library for machine learning and artificial intelligence, mostly
used for implementing deep neural networks. TFLearn provides a
higher-level Application Programming Interface (API), which
makes it easier to use. These libraries make it easier to
implement the neural network model that will be used to match a
text to a pattern and provide the appropriate response. The
Python framework Flask will be used to build the chatbot into a
web application with an endpoint that the main website can make
POST requests to.
Training data is stored in JavaScript Object Notation (JSON)
format, which uses human-readable text to store and transmit
data. This provides an easy way to access the data, as Python
has a JSON library that offers an efficient and simple way of
loading and processing a JSON document. The format also suits
MongoDB, a document database that represents data as JSON-like
documents, so the training document could be stored there. The
files are lightweight and thus make it efficient to load and
process large amounts of data. This is favorable as the chatbot
needs thousands of sentences to learn questions and the
appropriate answers. Questions and answers are stored in a
pattern-response format. Each pattern has a tag, e.g.,
“greetings”, “registrationForm”, “uctLocation”, etc. There are
hundreds of tags and thousands of questions which the bot is
trained on. The JSON file is arranged as follows:
{
  "intents": [
    {
      "tag": "<intentName>",
      "patterns": ["<questions_pattern>"],
      "responses": ["<responses_pattern>"]
    }
  ]
}
JSON representation of the training data
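As a hedged illustration of how such a file might be read, the sketch below uses Python's standard json module to flatten the intents into (pattern, tag) pairs and a tag-to-responses lookup. The function name load_intents and the sample sentences are assumptions made for this example; only the "intents", "tag", "patterns" and "responses" keys and the two tag names come from the description above.

```python
import json

# Small inline sample in the documented layout; the sentences are made up.
SAMPLE = """
{
  "intents": [
    {
      "tag": "greetings",
      "patterns": ["Hi", "Hello there"],
      "responses": ["Hello! How can I help?"]
    },
    {
      "tag": "uctLocation",
      "patterns": ["Where is UCT?"],
      "responses": ["UCT is in Cape Town."]
    }
  ]
}
"""


def load_intents(text):
    """Parse the training JSON into training pairs and a response lookup."""
    data = json.loads(text)
    pairs = []       # (pattern sentence, tag) examples for training
    responses = {}   # tag -> list of candidate answers
    for intent in data["intents"]:
        tag = intent["tag"]
        responses[tag] = intent["responses"]
        for pattern in intent["patterns"]:
            pairs.append((pattern, tag))
    return pairs, responses


pairs, responses = load_intents(SAMPLE)
print(len(pairs), "patterns across", len(responses), "tags")
```

Separating the pairs from the responses reflects the pattern-response idea: a classifier only needs to predict the tag, and the answer is then looked up by tag.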
MongoDB, a NoSQL database, was chosen to store information. A
NoSQL database is preferred because data does not have to be
strictly structured as in SQL, which makes it easier to store,
update or query. It is highly efficient and enables easy updates
to the schema design if the need arises.
Potential risks in the development process included integrating
the chatbot into the overall virtual student advisor website and
the time needed to train the chatbot, which could be long enough
to affect the progress of the project. Changing requirements
could also have affected progress; however, an agile approach
made it possible to adjust to those changing requirements.
Classes
- ChatApp: the controller class that connects the View and the
  Model. In this paper, however, ChatApp also behaves as the
  display: information that would be sent to the View is
  displayed on the terminal. The ChatApp object accepts user
  input passed in from the View (Website User Interface). It
  sends the input to the Model and receives a response back. The
  response would then be passed on to the View to be displayed
  to the user. The main behaviour of the ChatApp class is to
  display the previous chats and the current chats. It also uses
  functions of the DatabaseManager class to get previous chats
  to be displayed and to save chats of the current session.
- Bot: uses the saved neural network model to create a response
  for a sentence. The saved model is fetched from memory when a
  chat session is started.
- PreProcessor: responsible for processing training data before
  it is used to train the model. Processing the training data
  refers to removing trailing spaces, removing some punctuation
  like question marks, replacing words with their root/stem
  words, and replacing any acronyms with the complete words.
  This removes inconsistencies between sentences and makes it
  easier for the model to recognize patterns and match user
  input to appropriate responses.
- Trainer: uses the TFLearn library to initialize the deep
  neural network, with its hidden layers and output layer. The
  hidden layer is tested with different numbers of neurons and
  batch sizes to see which yields the best results.
- DatabaseManager: responsible for establishing a connection
  with the database and querying the database for academic
  information or previous chats.
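The cleaning steps listed for PreProcessor can be sketched in plain Python as follows. This is a simplified stand-in and not the project's code: the acronym table is hypothetical, lower-casing is an added assumption, all punctuation is removed rather than only some, and a toy suffix stripper (naive_stem) takes the place of a real stemmer such as the ones NLTK provides.

```python
import string

# Hypothetical acronym table; the real list is not given in the text.
ACRONYMS = {"uct": "university of cape town"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

# Toy suffix list standing in for a proper stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Strip one common suffix, leaving at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def preprocess(sentence):
    """Trim, lower-case, drop punctuation, expand acronyms, then stem."""
    cleaned = sentence.strip().lower().translate(_PUNCT_TABLE)
    tokens = []
    for word in cleaned.split():
        expanded = ACRONYMS.get(word, word)  # may become several words
        tokens.extend(naive_stem(part) for part in expanded.split())
    return tokens


print(preprocess("  Where is UCT located? "))
```

Applying the same function to both the training patterns and the live user input keeps the two consistent, which is what lets slightly different phrasings map to the same pattern.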
Flow chart showing the processes of the chatbot system.
The figure above shows the flow of the program. When a user
opens the program, a new chat session is started (Appendix A.2).
The previous chats are fetched from the database and displayed
for the user to see. Previous chats can only be fetched if the
user is registered; otherwise the chats are not saved at all.
The user can then begin the current chat session and gets a
response after each entry. Inputs and responses are saved
immediately, which ensures that the chat history will still be
available if the program is unexpectedly stopped. To end the
chat session, the user enters “quit”. This applies to the
terminal version. On the website the user would simply close the
chat window or bubble.
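The terminal flow described above could look roughly like the following sketch. The function run_session and its parameters are hypothetical: read_input stands in for terminal input, respond for the model, and save_entry for the database write, which is left as None for users who are not registered. Matching “quit” case-insensitively is an assumption.

```python
def run_session(read_input, respond, history=(), save_entry=None):
    """Run one chat session until the user types "quit".

    history    -- previously stored chat strings to show first
    save_entry -- callable that stores one string, or None when the
                  user is not registered and nothing should be saved
    """
    for line in history:
        print(line)
    while True:
        text = read_input()
        if text.strip().lower() == "quit":  # assumed case-insensitive
            break
        reply = respond(text)
        if save_entry is not None:
            # Saved right after each exchange so that an unexpected stop
            # does not lose the history collected so far.
            save_entry("usr_" + text)
            save_entry("bot_" + reply)
        print(reply)


# Example wiring with canned input and a fixed reply instead of a model.
stored = []
inputs = iter(["hello", "quit"])
run_session(lambda: next(inputs), lambda text: "Hi there!",
            save_entry=stored.append)
print(stored)
```

Passing the input, response and storage steps in as callables keeps the loop itself independent of the terminal, the neural network and the database.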
Integration
The chatbot is meant to be used in a website, therefore it has
to be integrated into the website. For the front end to be able
to use the chatbot, POST requests are sent to the chatbot
endpoint (see figure Code snippet of how POST requests are
processed and what is returned.) using a unique Uniform Resource
Locator (URL). The chatbot app was deployed on Heroku, which
then generated a unique URL,
https://advicechatapp.herokuapp.com/chat, to which requests to
the chatbot can be made. The endpoint accepts a student number,
if there is one, and the user input in JSON format, and returns
the generated response together with the student number in JSON
format. Requests are transported over a secure connection
(Hypertext Transfer Protocol Secure, HTTPS), which offers more
data protection than regular Hypertext Transfer Protocol (HTTP).
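From the caller's side, such a request might be built as in the sketch below, which uses only Python's standard library. The JSON field names "message" and "student_number" and the placeholder student number are assumptions, since the exact request format is not spelled out here; only the URL comes from the text. The request is constructed but not sent, because the deployment may no longer be live.

```python
import json
import urllib.request

CHAT_URL = "https://advicechatapp.herokuapp.com/chat"


def build_chat_request(message, student_number=None):
    """Build (but do not send) a JSON POST request for the chatbot."""
    payload = {"message": message}  # hypothetical field name
    if student_number is not None:
        payload["student_number"] = student_number  # hypothetical field name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("When does registration open?", "ABCXYZ001")
print(request.get_method(), request.full_url)

# Sending is left commented out on purpose:
# with urllib.request.urlopen(request) as response:
#     reply = json.loads(response.read())
```

Omitting the student number for users who are not signed in matches the description that the endpoint accepts a student number only if there is one.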
Testing
The chatbot was tested together with the virtual student advisor
website by users (students and student advisors). Users were
asked to give feedback on the appearance, quality of responses
and ease of use of the system. The system was also tested for
the average response time per user input. As this is a chatbot
meant for people, whether or not the response time was
satisfactory depended on each individual. The Python time
library was used in automated testing, as it is more accurate
than a person with a timer. The chatbot was also tested for
accuracy with unseen questions.
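A response-time measurement of this kind could be sketched as follows. Which function of the time module the project used is not stated, so time.perf_counter is an assumption, and fake_respond is a stand-in that simply sleeps briefly instead of calling the real model.

```python
import time

MAX_SECONDS = 3.0  # limit given in the non-functional requirements


def average_response_time(respond, queries):
    """Return the mean wall-clock seconds respond() takes per query."""
    total = 0.0
    for query in queries:
        start = time.perf_counter()
        respond(query)
        total += time.perf_counter() - start
    return total / len(queries)


def fake_respond(query):
    """Stand-in for the chatbot: waits briefly, then answers."""
    time.sleep(0.01)
    return "ok"


mean = average_response_time(fake_respond, ["a", "b", "c"])
print("within limit:", mean <= MAX_SECONDS)
```

Timing only the call to respond() excludes network delay, so a measurement like this reflects the chatbot itself rather than the user's connection.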
Users completed the tasks and filled in the Google form or
answered questions based on the experience. The tasks and
questions that people had to complete are listed under the
Findings section along with the results.
Example of the working chatbot system on the site.
Above is a screenshot of the working chatbot on the website.
Each chat is displayed with a timestamp to assist users in
recalling when texts were exchanged. To fetch chat history from
the database, MongoDB is queried to search for the user’s chats
using the student number as the unique key. The query is shown
in Figure Example of query made to the Mongo database.
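For readers without access to that figure, a lookup of this shape might be written as below. This is a sketch under assumptions, not the project's query: the field names "student_number" and "chats" are hypothetical, and the users argument is expected to behave like a PyMongo collection, whose find_one method takes a filter document followed by a projection. A small in-memory stand-in is included so the example runs without a database.

```python
def fetch_chat_history(users, student_number):
    """Return the stored chat strings for one student, or [] if none."""
    document = users.find_one(
        {"student_number": student_number},  # filter on the unique key
        {"chats": 1, "_id": 0},              # projection: only the chats
    )
    if document is None:
        return []
    return document.get("chats", [])


class FakeUsers:
    """In-memory stand-in for a collection; it ignores the projection."""

    def __init__(self, documents):
        self._documents = documents

    def find_one(self, query, projection=None):
        for doc in self._documents:
            if all(doc.get(key) == value for key, value in query.items()):
                return doc
        return None


users = FakeUsers([
    {"student_number": "ABCXYZ001", "chats": ["usr_hi", "bot_Hello"]},
])
print(fetch_chat_history(users, "ABCXYZ001"))
print(fetch_chat_history(users, "UNKNOWN999"))
```

Returning an empty list for an unknown student number lets the caller treat a user with no history and a user who is not registered the same way.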
User testing and automated testing yielded a variety of
interesting results. The User Interface design of the chat
bubble was successful as users could easily identify the chatbot
icon and were able to use it. It is worth noting that making the
icon bigger could improve the experience as more users would be
drawn to it and utilize its functionalities. The chatbot’s
response time was excellent. All users were satisfied with the
speed at which the chatbot responded. Previous chats for
signed-in users were stored successfully whilst chats of users
not signed in were not stored, as expected. Users found the
chatbot useful, but it still had much room for improvement in
terms of answering a wide range of queries. The range of queries
that can be answered improves with more user testing, as more is
learned about what users want to know.
With a few more iterations and improvements this chatbot could
be a very effective feature of the virtual student advisor
website. Users would be able to ask any question and get instant
responses rather than searching for the information throughout
the site. Within the site, the chatbot could also serve as a
navigation tool that points users to information or sections of
the site. The chatbot has great potential, not only within the site,
but also as a stand-alone application that students can use. The
application could have a mobile version which would further
simplify interaction with the chatbot, especially for students.
Use of the chatbot could significantly decrease the number of
queries Student Advisors receive and thus give them additional
time for complex queries.
A chatbot is an automated conversational agent that can be
utilized to assist users instantly. In the beginning we
identified that student advisors are faced with the burden of
responding to too many queries and students do not get responses
immediately. The project aimed to develop a conversational agent
capable of immediately responding to simple student queries
about academics and potentially university as a whole. The
developed chatbot was successful in responding to simple queries
about academics effectively and efficiently. The chatbot could
not effectively respond to random queries about university as a
whole and thus needs much improvement in this area. It can best
be used for simple frequently asked queries. Using the chatbot
as per the suggestions would also greatly improve the
experience. Testing showed that the chatbot was relatively
successful. However, it is still in need of much improvement.
Suggestions for the future would be to split such a project into
two sections, a data gathering section and an agent development
section. This would allow for more time to be spent on each
section and thus yield better results. User testing should also
be replicated multiple times as it provides quality and
actionable feedback. More research on previous implementations
of chatbots and how to optimize them would also improve it. A
User Interface should also be developed that makes it easy for
administrators or Student Advisors to input requirements that
have changed from previous years. Finally, the system could be
extended to work for other faculties and departments at UCT, and
potentially other universities.