Honours Project 2012

Project Outline

Acknowledgements

Overview

The purpose of this project was to create and test a speech-based interface for people who are functionally illiterate. Android's speech synthesis was used for output, and Google's voice recognition was used to decipher what the user said. Pocket-Sphinx was also tested for speech recognition, and the recognition rates of the two recognizers were compared. During the creation of the interface, the Java Expert System Shell (Jess) was ported to Android.

A limited vocabulary of words was tested. The interface was also tested for usability, efficiency, effectiveness and satisfaction. Testers did not speak English as their first language. The aim was to provide a speech-based interface to the diabetes expert system using speech recognition and speech synthesis.


SYSTEM ARCHITECTURE




IMPLEMENTATION

The front end is just a blank screen, with a speech icon occasionally appearing to tell the user when to start speaking. Implementing Pocket-Sphinx on Android was not as easy as tutorials make it out to be. The Pocket-Sphinx implementation was packaged as an Android library for future use. Jess was said to be impossible to port to Android because it relies too heavily on java.beans, java.awt and java.applet, but this problem was worked around. The result was a fully functional Jess Android library.





CONCLUSION

The speech-based interface was successfully implemented and user evaluation was positive. Users found the system easy to use and satisfying.

Google's speech recognition was also tested. It had problems recognizing certain words. It had a Word Accuracy (WA) of 69.35% for all words and a WA of 82.69% when words that are difficult to recognize were ignored.

Pocket-Sphinx's recognition was very poor. It had a Word Accuracy (WA) of 17%.
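
The figures above can be reproduced from raw test logs with a small helper. The following is a minimal sketch only: it assumes that each test trial consists of one prompted word and one recognized result, and that WA is the percentage of trials in which the two match. The exact formula used in the project is not stated here, and the class name, method name and example words are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Sketch: word accuracy over isolated-word recognition trials (assumed definition). */
class WordAccuracy {

    /** One trial: the word the tester was asked to say and what the recognizer returned. */
    static final class Trial {
        final String expected;
        final String recognized;

        Trial(String expected, String recognized) {
            this.expected = expected;
            this.recognized = recognized;
        }
    }

    /**
     * Returns the percentage of trials recognized correctly, skipping every trial
     * whose expected word is in {@code excluded} (entries must be lower case).
     * Returns NaN if no trials remain after exclusion.
     */
    static double percentCorrect(List<Trial> trials, Set<String> excluded) {
        int counted = 0;
        int correct = 0;
        for (Trial t : trials) {
            String expected = normalize(t.expected);
            if (excluded.contains(expected)) {
                continue; // word marked as difficult to recognize, so ignore it
            }
            counted++;
            if (expected.equals(normalize(t.recognized))) {
                correct++;
            }
        }
        return counted == 0 ? Double.NaN : 100.0 * correct / counted;
    }

    /** Compares words case-insensitively and without surrounding whitespace. */
    private static String normalize(String word) {
        return word == null ? "" : word.trim().toLowerCase(Locale.ROOT);
    }
}
```

Calling percentCorrect with an empty exclusion set corresponds to the "all words" figure, while passing the set of difficult words corresponds to the second figure.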



FUTURE WORK

One possibility is to extend the Pocket-Sphinx implementation so that it can handle accents. The user would first say a set phrase, from which the system would identify the user's accent and load the appropriately trained model for that user. This could increase recognition accuracy and make Sphinx usable.

The words that Google voice recognition recognizes well can be recorded to further increase the reliability of the application. Another cloud-based recognition tool, such as Bing, could also be investigated.

Because Jess now works on Android, applications that use expert systems can be built or 'refitted' for Android. The modular design of the program makes replacing the speech interface with a GUI easy.
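
One way such a modular design could look is sketched below. This is an illustration only and not the project's actual code: the interface DialogueFrontEnd and the classes Consultation and ScriptedFrontEnd are hypothetical names. The point is that the dialogue logic depends only on an interface, so a speech front end, a GUI, or a scripted test double can be swapped in without changing it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical front-end contract: how the dialogue logic talks to the user. */
interface DialogueFrontEnd {
    void say(String message); // speech synthesis, or a text view in a GUI
    String listen();          // speech recognition, or a text field in a GUI
}

/** Dialogue logic that depends only on the interface, never on a concrete front end. */
class Consultation {
    private final DialogueFrontEnd frontEnd;

    Consultation(DialogueFrontEnd frontEnd) {
        this.frontEnd = frontEnd;
    }

    /** Asks a yes/no question, re-prompting up to maxAttempts times in total. */
    boolean askYesNo(String question, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            frontEnd.say(question);
            String answer = frontEnd.listen();
            if ("yes".equalsIgnoreCase(answer)) {
                return true;
            }
            if ("no".equalsIgnoreCase(answer)) {
                return false;
            }
            frontEnd.say("Please answer yes or no.");
        }
        throw new IllegalStateException("No usable answer after " + maxAttempts + " attempts");
    }
}

/** Stand-in front end fed from a script, useful for testing without audio. */
class ScriptedFrontEnd implements DialogueFrontEnd {
    final List<String> spoken = new ArrayList<>();
    private final Deque<String> replies = new ArrayDeque<>();

    ScriptedFrontEnd(String... scriptedReplies) {
        for (String reply : scriptedReplies) {
            replies.add(reply);
        }
    }

    @Override
    public void say(String message) {
        spoken.add(message);
    }

    @Override
    public String listen() {
        return replies.isEmpty() ? "" : replies.poll();
    }
}
```

For example, new Consultation(new ScriptedFrontEnd("maybe", "Yes")).askYesNo("Do you feel thirsty?", 3) re-prompts once and then returns true. A speech or GUI front end would implement the same two methods.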

One of the most relevant things that could be done is to create a language model for South African languages (other than English) for use in speech recognition. This could give the many users whose first language is not English a more natural and comfortable experience.