Background

Screen readers are a form of assistive technology that uses a text-to-speech (TTS) system to convert electronic text into synthesised speech. There are efforts to make this technology available for most of South Africa's 11 official languages, and the South African Centre for Digital Language Resources (SADiLaR) supports research on all aspects of natural language processing. Despite this, the TTS systems in SADiLaR's repositories cannot produce easily understood audio renderings of mathematical formulae. This leaves visually impaired users unable to listen to mathematics efficiently.

Project Aims

This study aims to improve access to mathematics for isiZulu speakers with visual impairments by investigating natural language generation (NLG) techniques that translate mathematical expressions into textual descriptions. This math-to-text process is called math verbalisation.

Research Question

How do we build a Content MathML-to-Text NLG system whose output is perceived as understandable by isiZulu speakers?

Math Verbaliser

Template-based Math Verbaliser

We designed the math verbaliser as an NLG pipeline that combines a template-based realisation method with a few word-level rules to generate textual descriptions. An example template is: U-<operand1> simsusa ku-<operand2>.

Templates were designed for addition, subtraction, multiplication, division, exponents, integration, square roots and equations. Complex sentences can be constructed by nesting the templates inside one another.
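The nesting of templates described above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the names TEMPLATES, Apply and verbalise, as well as the key "example_op", are hypothetical, and the only template used is the one quoted in the text. The sketch also leaves out the word-level rules, so nested output would still need those adjustments before it reads naturally.

```python
from dataclasses import dataclass
from typing import Union

# The only template quoted in the text. The key "example_op" is a neutral,
# hypothetical label; the real system's template names are not given.
TEMPLATES = {
    "example_op": "U-<operand1> simsusa ku-<operand2>",
}


@dataclass
class Apply:
    """A hypothetical node for an operator applied to two operands."""
    operator: str
    operand1: "Expr"
    operand2: "Expr"


# An expression is either a leaf (already a word or symbol) or an Apply node.
Expr = Union[str, Apply]


def verbalise(expr: Expr) -> str:
    """Fill a template recursively so that templates can nest."""
    if isinstance(expr, str):
        return expr  # leaves are returned unchanged
    template = TEMPLATES[expr.operator]
    # Each operand is verbalised first, then slotted into the parent template.
    return (
        template
        .replace("<operand1>", verbalise(expr.operand1))
        .replace("<operand2>", verbalise(expr.operand2))
    )


print(verbalise(Apply("example_op", "x", "y")))  # U-x simsusa ku-y
```

Because an operand may itself be an Apply node, a nested expression is handled by the same function without any extra cases.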

The math verbaliser is a Python program that runs from the terminal.

The Text Evaluation

To evaluate the perceived understandability of the generated text, we administered a Google Forms questionnaire to isiZulu speakers.

For each description, a participant rated its understandability on a Likert scale and then typed out the formula that the description expresses. Each typed formula was classified as either an exact match or a mismatch.
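The text does not say how typed formulae were compared with the originals. As one possible reading, the exact-match check could be sketched as follows, assuming that differences in whitespace are ignored and everything else must be identical. The function names and the sample formulae are hypothetical.

```python
def normalise(formula: str) -> str:
    """Remove all whitespace so that 'x + y' and 'x+y' compare equal.

    Assumption: only spacing is ignored; case and symbols must be identical.
    """
    return "".join(formula.split())


def is_exact_match(typed: str, expected: str) -> bool:
    """Classify one typed formula as an exact match (True) or mismatch (False)."""
    return normalise(typed) == normalise(expected)


def count_exact_matches(typed_formulae: list[str], expected: str) -> int:
    """Count how many participants reproduced the expected formula exactly."""
    return sum(is_exact_match(typed, expected) for typed in typed_formulae)


# Hypothetical answers from five participants for one description.
answers = ["x+y", "x + y", "y-x", "x +y", "x*y"]
print(count_exact_matches(answers, "x + y"))  # 3
```

A stricter or looser rule (for example, accepting mathematically equivalent forms) would change the counts, so the normalisation step is the main design choice in such a check.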

The Results

We present the findings from the human evaluation. Five isiZulu speakers with different levels of language proficiency (first, second and third language speakers) were recruited to complete the questionnaire, which resulted in a total of 50 responses across 10 descriptions.

A stacked chart comparing the proportion of ratings and the total exact matches (out of 5) for each formula. It also displays the overall percentage of each rating and the total exact matches across the 50 responses.

68% of all responses rated a description as understandable.
26% of all responses were unsure of a description’s understandability.
6% of all responses disagreed that a description was understandable.

The expressions that received disagreeing or unsure responses, and few exact matches, contained minus, integral or square root operators.

The total number of exact matches across all descriptions is 25 out of 50 (50%).
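As a quick check of the arithmetic, the reported percentages can be converted back into response counts. The sketch below uses only the figures stated above; the rating labels and variable names are hypothetical shorthand.

```python
TOTAL_RESPONSES = 50  # 5 participants x 10 descriptions

# Percentages reported above; the label names are shorthand, not from the study.
rating_percentages = {"understandable": 68, "unsure": 26, "disagree": 6}


def percentage_to_count(percentage: int, total: int) -> int:
    """Convert a whole-number percentage of `total` into a response count."""
    return percentage * total // 100


rating_counts = {
    label: percentage_to_count(percentage, TOTAL_RESPONSES)
    for label, percentage in rating_percentages.items()
}
print(rating_counts)  # {'understandable': 34, 'unsure': 13, 'disagree': 3}

# Exact matches: 25 of the 50 typed formulae.
exact_match_rate = 25 / TOTAL_RESPONSES * 100
print(exact_match_rate)  # 50.0
```

The three counts sum to 50, which agrees with the total number of responses.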

Conclusions

Based on a human evaluation of the generated text, we conclude that our template-based realisation method produced text that was, on average, perceived as understandable by isiZulu speakers, and that it is therefore an appropriate technique for building a Content MathML-to-Text NLG system.

Although the descriptions were perceived as understandable on average, the results also show that a formula could be typed out accurately only 50% of the time. This was due to a consistent problem among participants with the minus, square root and integral templates. Hence, the terminology in these three templates needs improvement.

Generating Natural Language IsiZulu Text From Mathematical Expressions

Downloadable Documents

Research Paper

Literature Review

Project Proposal

Poster