TRENDy
Temporal database tooling for your Research.
Background
Temporal Databases are gaining increased attention in recent years, in both industry and academia . Temporal data refers to any data that includes one or more time period for which that data is considered to be valid. This contrasts atemporal data, in which the data stored is assumed to be currently valid.
The standard conceptual data modelling language for atemporal databases is the Entity Relationship (ER Diagram) along with The unified modelling language (UML), however these do not allow for effective modelling for temporal data. The structure of a temporal database can be modelled using a Temporal Conceptual Data Modelling Language (TCDML), however there isn't currently a standard language for modelling temporal databases.
TREND
TREND (Temporal information Representation in ENtity-relationship Diagrams) is a specialized conceptual data modeling language based of the ER Diagram and is designed to effectively represent and manage temporal data within databases. TREND implements its temporal extensions through the core constraints of dynamic extension and dynamic evolution. These constraints outline the primary ways in which temporal entities in a model can change over time.
- Dynamic Extension (EXT/ext): The extended entity remains an instance of the original entity. For example, an Employee may get promoted to Manager but still stays an Employee.
- Dynamic Evolution (CHG/chg): The extended entity is no longer the same instance as the entity it has extended from. For example, in a law firm, when an Article Clerk becomes a lawyer, it stops being an Article Clerk.
- Icons: Temporal entities are labeled with a clock icon, while atemporal entities (legacy ER/EER entities) are labeled without a clock.
- Mandatory or Optional: Mandatory temporal constraints (that must have occurred in the past or must occur in the future) are denoted by a solid line, and optional constraints (that may happen in the future or could’ve happened in the past) are denoted by a dashed line.
- Past or Future: If a temporal constraint is on a past condition, lowercase letters are used (i.e., chg, ext), and capital letters are used (i.e., CHG, EXT) for future constraints.
- Quantitative: If a constraint has a quantitative value attached to it (for example, an employee must have been an assistant-manager for 2 years before becoming a manager), that value is displayed in time chronon next to the constraint type.
Richard Taylor
Graphical Tool
Problem
There exists a notable gap in specialized graphical modeling tools that effectively support temporal conceptual data modeling languages like TREND. Current tools, including ERD Tool, Crowd, and Draw.io, primarily cater to atemporal or general-purpose modeling needs and lack comprehensive features for integrating temporal elements and reasoning capabilities essential for accurate and efficient temporal model design. This deficiency hinders modelers' productivity, increases the likelihood of errors, and elevates the barrier to entry for both novice and experienced users, thereby impeding the adoption and advancement of temporal data modeling practices.
Aim
The primary aim of this project is to design and develop TRENDy, a dedicated graphical modeling tool tailored specifically for the TREND Temporal Conceptual Data Modelling Language. The completed tool can be found here.
Features
Infinitely Expanding Canvas
As a space to build your models, an infinitely expanding canvas has been added, where nodes and edges can be dropped, moved, and edited. Basic zooming and panning functionalities have also been added to navigate the canvas.
Dragging and Dropping of TREND Compatible Objects
As part of implementing the conceptual data modelling functionality, a shape node palette has been added. This contains all the necessary entities, attributes, and relationships you need to model in TREND. An advantage of this palette over the palettes of alternative tools is that there is no way to add objects that are not part of TREND. This restriction prevents many user errors.
Resizing and Renaming of Nodes
Once a node has been added to the canvas, you can change the label of the node to give it your desired name. Additionally, if your label is too long for the node, you can resize the node to your liking too.
Ability to Connect Objects
Each object added to the screen has 4 drag handles (top, bottom, left, and right). When dragging from one of these drag handles to another, an edge is made to connect these objects together.
When selecting an edge, an edge option menu is displayed. This menu has all the necessary options to customize the edge to any edge the user desires in the TREND language. There are three main edge types: Atemporal (default), temporal, and subsumption. After selecting an edge type, the user then has access to other options to further customize their line (e.g., cardinality and persistence).
Schema Generation Tool
This feature needed to be able to convert the models we have developed in the user interface directly into a database schema. This feature was actually developed by my project partner, Stephan Maree, so for this tool, integration with the schema generation tool was necessary. A button called "Schema Generation" is added to the UI, and when pressed, it sends a JSON object with the nodes and edges serialization (described in the design section) to the schema generation tool, which then controls the conversion and downloads the schema for the user. Since our tool used almost identical serializations, the JSON object can be used by the schema generation tool with little processing done on the receiver side.
Saving and Loading of Models
Due to the Model-View-Controller design of this tool, the model information is independent from the rest of the code. This allows the program to add the ability to load and save models. When clicking the Save button, the tool creates a JSON object and saves it to the user’s computer. When clicking the Browse button, the user is given a prompt to select the file they would like to upload, and then the tool loads in the JSON object to the canvas. If the model that is uploaded to the program has an incorrect format (i.e., it was edited outside of the graphical tool), then the tool will do its best to display as many correctly formatted objects on-screen as possible, and displays a relevant error to the user.
State Management
To implement the undo and redo functionalities requirements, a state management system has been implemented. This gives the user the ability to undo and redo actions done in your model, including but not limited to: resizing, adding elements, editing labels, and changing features about an edge. The main strategy in implementing this surrounds the takeSnapshot() function in the UndoRedo class. After the model is changed (e.g., label name is changed, edge is added, edge is changed), the model state is saved as a snapshot, and using the CTRL+Z and CTRL+Y shortcuts navigates us between these snapshots. This strategy works as the serialization has a very small file size, thus multiple snapshots can be easily stored without any storage or performance concerns.
Full Keyboard Shortcut Support
Shortcut support is not natively supported in the ReactFlow library, so they had to be manually implemented. Shortcuts for features of deleting elements (DEL), saving files (CTRL+S), selecting all elements (CTRL+A), copying selected elements (CTRL+C), cutting selected elements (CTRL+X), pasting copied elements (CTRL+V), undo (CTRL+Z), and redo (CTRL+Y) were implemented. These shortcuts were purposely chosen as they are commonplace amongst most computer tools, thus there would be an expectation from the user for them to be supported.
Additionally, this meant the user would not have to learn the shortcuts, and instead discover them naturally, lowering the time taken to learn the tool.
Exporting Models
As requested by our experienced modelers, the ability to export our models as screenshots was added to the tool. An "Export" button triggers this feature.
Model Validation
The added temporal constraints can be confusing for beginners to understand, thus a model validation feature has been added to help correct users if they make an error in their model. The first way the validation feature works is by checking the user interface after every change made by the user, and using edge information to restrict what options the user has to change. This means that, rather than trying to throw an error if the user makes a mistake, the feature prevents the user from making a mistake in the first place. This can be seen in the way that the edge menu displays variables the user can change. These changeable variables are only available if they are relevant to the specific source and target nodes and type of lines.
Edge options and their requirements are summarized in the table below. Additionally, the validation feature highlights invalid edge connections by displaying them in red and providing specific error messages when selected.
Edge Options and Their Requirements
| Edge Option | Requirements for Rendering |
|---|---|
| Edge Type | Always displayed |
| Optional | Edge is type temporal or atemporal |
| Cover | Edge is type subsumption |
| Edge Label | Edge is type temporal |
| Quantitative | Edge is type temporal and source and target nodes are both entities (Quantitative) |
| Value | Quantitative is set to True |
| Persistent | Edge is type temporal, target not is type relationship and source node is type entity |
| Cardinality (target node side) | Edge Type is atemporal |
| Cardinality (target node side) | Edge Type is atemporal, target not is type entity and source node is type relationship |
Verbalization
When determining the preferred representation of temporal constraints in conceptual data models, it was concluded that there was no clear favorite representation (when comparing graphical and textual representations), thus a multi-modal interface was suggested, to allow the switching between graphical notation and natural language. Since our tool is already graphical, this necessitates a feature which can represent our temporal constraints using natural language.
To implement this feature, inspiration has been drawn from NORMA. NORMA is a software tool that implements the Object-Role Modelling (ORM) language. The ORM schemas are represented in a graphical notation; however, an additional feature called the ORM verbalization browser has been added, which represents the ORM schema in natural language to aid the user’s understanding about the schema. This feature has inspired the verbalization feature for this tool.
When a temporal edge is selected, the bottom left of the screen displays the natural language representation of the edge. Additionally, when selecting a node, a natural language description of that node is also displayed. The feature works by using the type of edge, the source and target node, the variables attached to the edge, and the time quanta global variable. The specific natural language representation was taken from the supplementary material from [14]. For experienced modelers who find the feature distracting, this feature can also be disabled in the Leva control panel.
Other UI View-port Features
Other UI features were added to the canvas to assist modelers. This includes zoom controls, the ability to lock the canvas, fit-to-screen (this button will center the screen on your model), a mini-map in the bottom right-hand corner to keep modelers oriented on the canvas. Additionally, the ability to lock the canvas, thus allowing the user to pan over it without accidentally moving something has been added too. Lastly, some global UI movement options have been added to the Leva control panel; snap-to-grid, pan-on-scroll, and zoom-on-double-click.
User Testing Results
Evaluating the effectiveness and usability of TRENDy compared to Draw.io through structured user testing. A total of 16 user tests were conducted, with 8 participants using the control tool (Draw.io) and 8 participants using TRENDy. The results indicate significant improvements in both efficiency and correctness, as using TRENDy resulted in a 21.43% improvement in model correctness and a 16.01% reduction in completion time compared to Draw.io.
- Draw.io: Average Mark = 12.63/21 (60.12%), Average Time = 25 minutes 10 seconds
- TRENDy: Average Mark = 17.13/21 (81.55%), Average Time = 21 minutes 7 seconds
System Usability Scale (SUS)
The System Usability Scale (SUS) was employed to evaluate the usability of both tools. Participants rated their experience on a scale of 0 to 100, with higher scores indicating better usability.
- TRENDy: Average SUS Score = 73.4%
- Draw.io: Average SUS Score = 43.13%
TRENDy significantly outperformed Draw.io in usability, demonstrating that users found TRENDy easier to use, better integrated, and less cumbersome.
Methods and Materials
The user testing aimed to assess TRENDy's performance and usability against the widely-used Draw.io tool. The study was conducted as follows:
- Hosting and Access: TRENDy was publicly hosted on the Vercel platform and accessible here.
- Control Tool: Draw.io (formerly diagrams.net) was selected as the control tool due to its popularity in conceptual data modeling.
- Participants: 16 user testers were recruited, comprising 8 participants for each tool. Participants were primarily computer science students familiar with ER modeling.
- Incentives: Participants received pizza upon completing the test to encourage participation.
- Materials Provided:
- Consent Form: Ensured informed participation and maintained user privacy by storing consent forms separately.
- Task-Based Questionnaire: Included three progressively challenging tasks based on controlled natural language specifications of TREND models.
- Feedback Form: Comprising the System Usability Scale (SUS) and additional feedback questions for TRENDy users.
- Resources on TREND: Provided explanations of TREND's concepts, including an appendix table of temporal ER elements and example models.
- Procedure:
- Participants signed the consent form.
- They were randomly assigned to use either TRENDy or Draw.io.
- Participants reviewed the introduction to TREND and the provided resources.
- They completed three modeling tasks within a timed session.
- Upon task completion, participants submitted their models and filled out the SUS questionnaire.
- TRENDy users additionally provided qualitative feedback on their experience.
- Evaluation: Models were marked for correctness based on predefined criteria, and SUS scores were calculated to compare usability between the two tools.
Results Visualized
Usability Comparison: TRENDy vs. Draw.io
User Testing Results: Marks vs. Completion Time
Stephan Maree
Schema Generation
Problem
Support for temporal data in Database Management Systems (DBMSs) is still quite limited. Although temporal data became part of the SQL standard in 2011, many DBMSs, such as PostgreSQL, have not yet implemented it. This means that developers need to manually implement temporal data constraints in the application logic, leading to an increase in development time and cost.
Aim
This project addresses these problems by developing an algorithm that converts a TREND model directly into a SQL file. The SQL file will contain all the table definitions and triggers, ready for use in a DBMS and enforcing the temporal transitions of TREND. This was implemented in MariaDB, as it is open source and free. MariaDB also implements the majority of SQL:2011's temporal features and gives the user more expressive queries for interacting with temporal data. The development of the project can be broken into two parts: developing the algorithm and designing the temporal triggers.
Algorithm
In order to convert from a TREND model to SQL code, the serialisation needs to be algorithmically processed into data structures representing the final tables. The input of the program is a list of nodes and a list of edges, and these would need to be efficiently converted into objects which are used for writing SQL code.
The algorithm is divided into five components:
First, the nodes are processed and entity, attribute, relationship, and inheritance objects are made. The interactions between these nodes are processed with the edges. The TREND temporal transitions are also identified in this step.
After this the relationships are processed. Foreign keys are added to tables in one-to-one and one-to-many relationships, while many-to-many relationships gain their own tables.
Inheritance constraints are then considered, passing along the primary keys from the parents to the children. If the inheritance is disjoint and complete, the parent entity does not get its own table, instead passing all of its attributes to its children.
Finally, the entities are sorted to prevent tables with foreign keys being created before the table who the foreign key belongs to.
Each component of the algorithm, except for entity sorting, scales linearly with the size of the input. This means that the time taken for node processing, edge processing, relationship handling, and inheritance constraints grows directly in proportion to the number of nodes, edges, and relationships in the model. Entity sorting is done using a merge sort algorithm, which scales log linearly with input size.
After all the objects are created, the SQL code for each of these is written to a file which is downloaded to the user's device, ready to be used instantly in a MariaDB database.
Temporal Tables/Triggers
When an entity is labelled as temporal, two columns are added at the end of the table. These are named name_start and name_end, where name is the name of the table. They use the DATE datatype, and are included as part of the primary key. This implementation follows the SQL:2011 standard for temporal data. The period between the start and the end is labelled as unique and overlaps are disabled. The following table is an example of a temporal entity named "Student".
| StudentID | Name | Student_Start | Student_End |
|---|---|---|---|
| MRXSTE013 | Stephan Maree | 2020-01-01 | 2024-01-01 |
| TYLRIC007 | Richard Taylor | 2019-01-01 | 2024-01-01 |
Enforcing the temporal constraints of TREND was done by designing unique SQL triggers that prevent the insertion of any data that violates the semantics of TREND. Given two entities, A and B with identifiers AID and BID, a mandatory past evolution from A to B will have the following SQL triggers written for it:
CREATE TRIGGER A_chg_to_B_1
BEFORE INSERT ON B
FOR EACH ROW
BEGIN
DECLARE initial_exists INT;
DECLARE overlap_count INT;
SELECT COUNT (*)
INTO initial_exists
FROM A
WHERE NEW.BID = AID
AND NEW.B_start >= A_end;
SELECT COUNT (*)
INTO overlap_count
FROM A
WHERE NEW.BID = AID
AND (
(NEW.B_end > A_start AND NEW.B_start <= A_start)
OR
(NEW.B_end >= A_end AND NEW.B_start < A_end)
OR
(NEW.B_start >= A_start AND NEW.B_end <= A_end)
);
IF overlap_count > 0 THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'B cannot overlap with A.';
END IF;
IF initial_exists = 0 THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'B must have been in A before.';
END IF;
END;
//
CREATE TRIGGER A_chg_to_B_2
BEFORE INSERT ON A
FOR EACH ROW
BEGIN
DECLARE initial_exists INT;
SELECT COUNT (*)
INTO initial_exists
FROM B
WHERE NEW.AID = BID
AND (
(NEW.A_end > B_start AND NEW.A_start <= B_start)
OR
(NEW.A_end >= B_end AND NEW.A_start < B_end)
OR
(NEW.A_start >= B_start AND NEW.A_end <= B_end)
);
IF initial_exists > 0 THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'A has already changed into B.';
END IF;
END;
This can be understood as enforcing that "Each object in B must have been in A, but is no longer in A". Temporal referential integrity is also enforced using triggers, meaning that you can't have a temporal relationship existing during a period when one of its entities don't. This is applied whenever two temporal elements reference each other directly.
Results and Conclusion
The software produced is able to efficiently and correctly generate SQL code for databases from a TREND model. The testing of the schema generation algorithm shows great promise in terms of robustness, and the comparisons with the TREND semantics indicate that the temporal transition constraints are correctly implemented. Using a planned and iterative development approach, this software project produced a tool that can generate complex temporal databases from a TREND model. This removes the need for developers to implement advanced temporal features in the application logic, saving people and organisations development costs and time. The software is expected to reduce the barrier to entry of temporal databases and helping to realise Trend as the preferred Temporal Conceptual Data Model in research and industry.