VerbOWL
Verbalisation of Afrikaans OWL 2 DL Ontologies

Overview


Figure 1: High-level overview of the project.


Background


As the use of ontologies increases, verbalising them becomes more and more valuable. An ontology cannot be verbalised fluently as is; linguistic knowledge about the specific language has to be added. This means that functionality to add this linguistic knowledge either needs to be included in future OWL releases, or the information needs to be provided externally and combined with an OWL file during the verbalisation process, as is done with the lemon model.

Controlled natural languages have been created for a similar purpose; they aim to make verbalisation more robust. A Controlled Natural Language is a language which is unambiguous and well defined, in contrast to natural language, which is arguably neither.

Research on verbalisation using Controlled Natural Languages has been carried out since the mid-1990s; it has, however, become even more pressing since the adoption of OWL as a standard and the development of the Semantic Web. The area is still being explored as researchers attempt to map OWL ontologies to controlled English. An example of this is Attempto Controlled English (ACE), which has a well-defined syntax in order to curb ambiguity.

Ontologies are generally created by logicians and domain experts working together. The domain experts have the knowledge which needs to be expressed, and the logicians can express this knowledge correctly in an ontology. Because domain experts are very seldom logicians, this process is highly collaborative. The problem with collaborating in this manner is that knowledge is often lost along the way. To make sure that the final ontology makes sense, the domain expert needs to be able to check and reason through the axioms which are created.

Most work on ontologies is done in English, and while various efforts have been made to include other languages, the languages which are represented are mainly Indo-European. This is a side effect of the fact that most of the work in ontology engineering is done in Europe and North America; however, there has been some work locally. Keet & Khumalo have put together some initial steps towards a template which verbalises in isiZulu. The lack of support for African languages is troubling, and the area remains largely untapped for research.

Ontologies are expressed in a manner which is hard for non-logicians to understand. Tools exist, such as Protégé (a tool which helps with the creation of ontologies); however, the representation they produce is still only in Manchester Syntax at best. Manchester Syntax was created to be less like Description Logic, but it is still significantly less understandable than normal language. An example of Manchester Syntax is given below:

Class: VegetarianPizza
EquivalentTo: Pizza
    and not (hasTopping some FishTopping)
    and not (hasTopping some MeatTopping)

This expresses the fact that a vegetarian pizza has no meat or fish toppings, but in a somewhat confusing manner. Controlled natural languages are a common solution to this problem and are often closely tied to ontology verbalisation.
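To make the gap between the axiom and a readable sentence concrete, the following is a minimal sketch in Python of how the VegetarianPizza definition above could be turned into controlled English. It is illustrative only: the nested-tuple encoding, the LABELS and VERBS lexicon and the function names are assumptions made for this example, and are not part of OWL, the OWL API or this project's implementation.

```python
# Hypothetical lexicon: surface forms for each ontology name. OWL itself does
# not carry this linguistic information, so it is supplied by hand here.
LABELS = {
    "VegetarianPizza": "vegetarian pizza",
    "Pizza": "pizza",
    "FishTopping": "fish topping",
    "MeatTopping": "meat topping",
}
VERBS = {"hasTopping": ("has", "does not have")}  # (positive, negated) forms


def restriction(expr):
    """Verbalise ('some', prop, cls) or ('not', ('some', prop, cls))."""
    negated = expr[0] == "not"
    if negated:
        expr = expr[1]  # unwrap the negation
    kind, prop, filler = expr
    if kind != "some":
        raise ValueError(f"unsupported constructor: {kind}")
    verb = VERBS[prop][1 if negated else 0]
    return f"{verb} a {LABELS[filler]}"


def verbalise_equivalence(name, definition):
    """Verbalise 'name EquivalentTo: head and r1 and r2 ...'."""
    op, head, *restrictions = definition
    if op != "and":
        raise ValueError("expected an intersection")
    body = " and ".join(restriction(r) for r in restrictions)
    return f"A {LABELS[name]} is defined as a {LABELS[head]} that {body}."


# Assumed tuple encoding of the Manchester Syntax example above.
AXIOM = (
    "and",
    "Pizza",
    ("not", ("some", "hasTopping", "FishTopping")),
    ("not", ("some", "hasTopping", "MeatTopping")),
)

sentence = verbalise_equivalence("VegetarianPizza", AXIOM)
print(sentence)
```

Note that the readable wording comes entirely from the hand-written lexicon; without it, the ontology names alone would not be enough to produce a fluent sentence.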

In order to improve the collaboration between domain experts and logicians, this project aims to create and compare two approaches to ontology verbalisation in Afrikaans.

The two approaches are the grammar-based approach (discussed in this paper) and the template-based approach (discussed in Lauren Sanby’s paper). The scope of each project is to read in an Afrikaans OWL 2 DL ontology and verbalise it into understandable Afrikaans sentences using one of the two approaches.

These two approaches will then be compared on a variety of criteria, such as axiom coverage, comprehensibility and time taken. The results are split up into grammar-based and template-based.

The grammar-based approach uses the OWL API (a Java API for interacting with ontologies) to extract the axioms and Grammatical Framework (GF) to verbalise them, with a pre-processing layer in between. The template-based approach, in contrast, uses a template to provide the verbalisation on an axiom-by-axiom basis.
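The axiom-by-axiom idea behind a template-based verbaliser can be sketched as follows. This is a minimal illustration in Python, not the project's implementation: the axioms are hand-written tuples rather than objects read through the OWL API, and the TEMPLATES table, its Afrikaans wording and the example vocabulary are assumptions made for this sketch.

```python
# Hypothetical templates keyed by axiom type. The Afrikaans wording is
# illustrative and is not taken from the project's actual template set.
TEMPLATES = {
    "SubClassOf": "Elke {sub} is 'n {sup}.",        # "Every X is a Y."
    "DisjointClasses": "Geen {a} is 'n {b} nie.",   # "No X is a Y."
}


def verbalise(axiom):
    """Fill the template for one axiom; return None if no template exists."""
    axiom_type, args = axiom
    template = TEMPLATES.get(axiom_type)
    if template is None:
        return None  # this axiom type is not covered
    return template.format(**args)


# Assumed, simplified stand-ins for axioms extracted from an ontology.
AXIOMS = [
    ("SubClassOf", {"sub": "leeu", "sup": "dier"}),
    ("DisjointClasses", {"a": "leeu", "b": "plant"}),
    ("ObjectPropertyDomain", {"prop": "eet", "domain": "dier"}),
]

sentences = [verbalise(axiom) for axiom in AXIOMS]
covered = [s for s in sentences if s is not None]
coverage = len(covered) / len(AXIOMS)  # fraction of axioms with a template

for s in covered:
    print(s)
print(f"coverage: {coverage:.2f}")
```

The third axiom has no template and is skipped, which is one simple way a criterion such as axiom coverage could be measured. Fixed templates of this kind also take no account of agreement or word order, which is the sort of linguistic detail a grammar-based approach is meant to handle.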

This project is important because it aims to increase the quality of the ontologies produced by making the logical representation easier to check. It is also the first attempt to do this verbalisation in Afrikaans. Because most work on ontologies is done in English, speakers of other languages can be excluded.