Screen readers are a form of assistive technology that use a text-to-speech (TTS) system to convert electronic text into synthesised speech. There are efforts to make TTS available for most of South Africa's 11 official languages, and the South African Centre for Digital Language Resources (SADiLaR) supports research on all aspects of natural language processing. Despite this, the TTS systems in its repositories cannot produce easily understood audio renderings of mathematical formulae. This leaves visually impaired users unable to listen to mathematics efficiently.
This study aims to improve access to mathematics for isiZulu speakers with visual impairments by investigating natural language generation (NLG) techniques that translate mathematical expressions into textual descriptions. This math-to-text process is called math verbalisation.
How do we build a Content MathML-to-Text NLG system whose output is perceived as understandable by isiZulu speakers?
We designed the math verbaliser as an NLG pipeline that combines a template-based realisation method with a few word-level rules to generate textual descriptions. An example template is: U-<operand1> simsusa ku-<operand2>.
Templates were designed for addition, subtraction, multiplication, division, exponents, integration, square roots and equations. Complex sentences can be constructed by nesting templates inside one another.
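As an illustration of how nested templates can produce a complex sentence, the following is a minimal sketch and not the study's implementation. It assumes Content MathML input without namespaces and uses hypothetical English placeholder templates in place of the isiZulu ones; the names `TEMPLATES` and `verbalise` are invented for this example, and integration and the word-level rules are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical English placeholder templates, keyed by Content MathML
# operator element names. The study's isiZulu templates are not reproduced here.
TEMPLATES = {
    "plus": "{0} plus {1}",
    "minus": "{0} minus {1}",
    "times": "{0} times {1}",
    "divide": "{0} divided by {1}",
    "power": "{0} to the power of {1}",
    "root": "the square root of {0}",
    "eq": "{0} equals {1}",
}


def verbalise(node):
    """Recursively turn a Content MathML element into a textual description."""
    if node.tag in ("ci", "cn"):
        # Identifiers and numbers are read out as they are written.
        return node.text.strip()
    if node.tag == "apply":
        # The first child names the operator; the remaining children are operands.
        operator, *operands = list(node)
        template = TEMPLATES[operator.tag]
        # Nesting: each operand is verbalised first, then slotted into the template.
        return template.format(*(verbalise(child) for child in operands))
    raise ValueError(f"unsupported element: {node.tag}")


mathml = "<apply><eq/><ci>y</ci><apply><plus/><ci>x</ci><cn>2</cn></apply></apply>"
print(verbalise(ET.fromstring(mathml)))  # y equals x plus 2
```

Because each operand is verbalised before it is placed in its parent's template, a template for one operator can contain the output of any other, which is the nesting behaviour described above.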
The math verbaliser is a Python program that runs from the terminal.
To evaluate the perceived understandability of the generated text, we administered a Google Forms questionnaire to isiZulu speakers.
For each description, a participant rated its understandability on a Likert scale and then typed out the formula that the description expressed. Each typed formula was classified as either an exact match or a mismatch.
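The exact-match classification can be sketched as follows. This is an illustrative assumption rather than the study's procedure: it treats formulae that differ only in whitespace as matching, and the function names and sample responses are hypothetical.

```python
def normalise(formula: str) -> str:
    # Assumption: whitespace differences do not count as a mismatch.
    return "".join(formula.split())


def classify(expected: str, typed: str) -> str:
    """Label a typed formula as an exact match or a mismatch."""
    return "exact match" if normalise(expected) == normalise(typed) else "mismatch"


def count_exact_matches(expected: str, responses: list[str]) -> int:
    """Count how many responses exactly match the expected formula."""
    return sum(classify(expected, typed) == "exact match" for typed in responses)


# Hypothetical responses from five participants for one description.
responses = ["x-2", "x - 2", "x + 2", "2 - x", "x-2"]
print(count_exact_matches("x - 2", responses))  # 3
```

A stricter or looser notion of equivalence (for example, accepting reordered operands for commutative operators) would change the counts, so the normalisation rule is a design choice that should be stated alongside the results.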
We present the findings from the human evaluation. Five isiZulu speakers with different levels of language proficiency (first, second and third language speakers) were recruited to complete the questionnaire, which resulted in a total of 50 responses over 10 descriptions.
Figure: Stacked chart comparing the proportion of ratings and the total exact matches (out of 5) for each formula, together with the overall percentage of each rating and the total exact matches across the 50 responses.
68% of all responses rated a description as understandable.
26% of all responses were unsure of a description’s understandability.
6% of all responses rated a description as not understandable.
The expressions that received disagreeing or unsure responses, and few exact matches, contained minus, integral or square root operators.
The total number of exact matches across all descriptions is 25 out of 50 (50%).
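As a quick consistency check, the reported percentages can be converted back into response counts. The sketch below uses only the figures stated above; the label names are shorthand chosen for this example.

```python
TOTAL_RESPONSES = 50  # 5 participants x 10 descriptions

# Reported share of responses per rating, in percent.
rating_percentages = {"understandable": 68, "unsure": 26, "not understandable": 6}

# Convert each percentage into a number of responses.
rating_counts = {
    label: percent * TOTAL_RESPONSES // 100
    for label, percent in rating_percentages.items()
}

print(rating_counts)  # {'understandable': 34, 'unsure': 13, 'not understandable': 3}
print(sum(rating_counts.values()))  # 50
```

The three counts sum to the 50 responses collected, so the reported percentages account for every response.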
Based on a human evaluation of the generated text, we conclude that our template-based realisation method produced text that isiZulu speakers perceived, on average, as understandable, and that it is therefore an appropriate technique for building a Content MathML-to-Text NLG system.
Although the descriptions were perceived as understandable on average, the results also show that a formula could only be accurately typed out 50% of the time. This was due to a consistent problem among participants with the minus, square root and integral templates. Hence, these three templates will require improvements to their terminology.