Metanorma datamodel directive

Ronald Tse

Peter Tam on 09 Oct 2019

Standard authors often have to define and describe relationships among data within systems in our scope. This gives rise to the need of defining "data models".

Typically, one would draw diagrams to visualize those relationships, and add definitions and descriptions to the elements in those diagrams.

It works for the process but it is tedious because we have to identify all the elements within the diagrams that need to be documented. Sometimes we may also have to include an overview of all the diagrams, e.g. a top-down diagram, in the standard. It makes elements between the actual diagram and the overview repetitive. The manual process of the identification of all the elements to be documented and management of the repetitive elements among diagrams will gradually become unmanageable, especially if we have to update or review a single element within the standard, we will have to search for each occurrences within the diagrams and definitions. Errata will appear if we miss any one of the occurrence.

The datamodel directive of Metanorma is a tool to ease the situation. It is a tool to unify the creation of datamodel diagrams and the rendering of elements' description and definitions. Its purpose is to reduce the manual process involving the management of data models.

Build our datamodel

Assumptions

For the ease of elaboration, let’s assume we are going to document a datamodel of forests that is planted with apple trees and orange trees. We build the datamodel such that:

A forest has many trees
Trees can only be apple or orange trees
Trees can have fruits

Define the required models

To build Metanorma’s datamodels, we have to build two major things, models and views. Let’s build our models first to start things over. We have the following models for forests:

Forest
Tree
1. AppleTree
2. OrangeTree
Fruit

And each model will have its attributes:

Forest(id, name)
Tree(id, fruitKind, position)
AppleTree
OrangeTree
Fruit(kind)

We know that apples are only on apple trees and oranges are only on orange trees. We have to define such constraints for the corresponding models:

AppleTree's constraints:
1. fruitKind="apple"
OrangeTree's constraints:
1. fruitKind="orange"

Finally, we can actually start defining our models. In Metanorma, the file path of our models is <project_root>/sources/models/models/**(<model_path>)/<model_name>.yml.

sources/models/models/Forest.yml

name: Forest
modelType: class
definition: An area planted with many trees.
attributes:
  id:
    definition: The unique identifier of the Forest.
    type: String
  name:
    definition: The name of the Forest.
    type: String
relations:
  - target: Tree
    relationship:
      source:
        type: aggregation
        attribute:
          forest:
      target:
        attribute:
          tree:
            cardinality:
              min: 0
              max: "*"

In this model of a Forest, we can see some important attributes of a model.

modelType of the model here is a class. We define it as a class because we want to define the structure of a Forest instance by its attributes and relations.

definition of the model will become the text to define it in Metanorma. The definition can be written as Metanorma format while we only wrote it as simple text here.

attributes is the place to define the attributes of a model. In every attribute, we define its definition to be rendered in our Metanorma document. Similar to the definition of the model, it can be written as Metanorma format. type is the data type of the attribute.

We know that a forest can have many trees in our requirements. relations is where we place associations. In the association between a forest and its trees, source is the Forest model (the model we are defining in the YAML file) and target is the Tree model. The relationship between the source and target model can be specified through the relationship attribute.

We can specify the type of association by the type attribute, and here we specify it to be aggregation. Besides the type of association, we specify also the attribute of the source model. Here the attribute of the source model is forest. It is to indicate the additional attribute given from the source model in this association.

In the target of the relationship, we specify the cardinality for the tree attribute. It is to represent the one-to-many relationship given by the requirement of the datamodel.

sources/models/models/Tree.yml

name: Tree
modelType: class
definition: A woody plant that regularly renews its growth.
attributes:
  id:
    definition: The unique identifier of the Tree.
    type: String
  fruitKind:
    definition: The kind of Fruit fruited by the Tree.
    type: FruitKind
  position:
    definition: The 2D geographic coordinates relative to the Forest.
    type: Position
relations:
  - target: Fruit
    action:
      verb: produces
      direction: target

The attributes of the Tree model is quite similar to that of the Forest model. The only new thing we are seeing is the action attribute under relations attribute. The action attribute is to specify the label of the association in the UML diagram. We know that a tree produces fruits, so produces is the verb, and target is the direction to specify the action is from the source model to the target model.

sources/models/models/AppleTree.yml

name: AppleTree
modelType: class
definition: A tree that produces apples.
constraints:
  - fruitKind="apple"
relations:
  - target: Tree
    relationship:
      target:
        type: inheritance

Here we can see a new attribute constraints of the AppleTree model. We know that an apple tree only produces apples but not orange or any other fruits, so we set an addition constraint of fruitKind that is must be apple. The constraints set will be rendered in the UML diagram.

We can also see a new association type inheritance here. It is to express the AppleTree model is inherited from the Tree model. Other than aggregation and inheritance, we can also specify association type as composition or direct.

sources/models/models/OrangeTree.yml

name: OrangeTree
modelType: class
definition: A tree that produces oranges.
constraints:
  - fruitKind="orange"
relations:
  - target: Tree
    relationship:
      target:
        type: inheritance

sources/models/models/Fruit.yml

name: Fruit
modelType: class
definition: A product grown by tree and consumable by animals.
attributes:
  kind:
    definition: The kind of Fruit grown by a tree.
    type: FruitKind

sources/models/models/Position.yml

name: Position
modelType: class
definition: The relative position to a Forest.
attributes:
  x:
    definition: The horizontal coordinate of the Position.
    type: Float
  y:
    definition: The vertical coordinate of the Position.
    type: Float

sources/models/models/FruitKind.yml

name: FruitKind
modelType: enum
definition: The enumeration value of a kind of Fruit.
values:
  apple:
    definition: A fruit produces by an AppleTree.
  orange:
    definition: A fruit grown by an OrangeTree.

The models we have seen are all classes (modelType is class). Actually we can specify model as enumeration as well by setting the modelType as enum. The value options of an enumeration can be defined through the values attribute.

Coordinate and render models with views

Now we have models with their defined attributes and associations but we haven’t defined how they are rendered in our document. To get models rendered in our documents, we have to create views as the coordinator.

In Metanorma, the file path of our views is <project_root>/sources/models/views/<view_name>.yml.

sources/models/views/Overview.yml

name: Overview
title: Overview of Forest datamodel
caption: Forest datamodel overview in UML
imports:
  Forest:
  Tree:
  Fruit:
fidelity:
  hideMembers: true
  hideOtherClasses: true

We have defined the models we need to build the datamodel of forests, but we haven’t specified how they are organized into UML diagrams and sections in our document. For our datamodel to be easily understood by readers, it’s beneficial to include an overview of it. In the UML diagram for the overview, we don’t want to include all the details, e.g. attributes of the models and models other than the core ones. We want to express only the core models Forest, Tree and Fruit and their associations in the overview.

The imports attribute for a view is the place where we import the necessary models. Let’s recall the file path of our models <project_root>/sources/models/models/**(<model_path>)/<model_name>.yml. The keys Forest, Tree and Fruit we see in the imports attribute actually come from the partial (<model_path>/<model_name>) of the file path.

The fidelity attribute for a view controls how much details to be seen in the UML diagram and whether the definitions of models to be rendered in the document. The hideMembers boolean attribute means not to include the model attributes in the UML diagram and not to render the definitions of models in the document. The reason not to include them here because we want to include an overview of the datamodel only but not to define them in details. The hideOtherClasses boolean attribute means not to include the associated models from the imported ones because we aim to focus only to the core models in the overview.

sources/models/views/Forest.yml

name: Forest
title: Forest datamodel
caption: Forest datamodel
imports:
  Forest:
  Tree:
    skipDefinition: true
fidelity:
  hideOtherClasses: true

In the Forest view of the datamodel, we want to describe the attributes of the Forest model but not the Tree model. The skipDefinition boolean attribute serves this purpose by not rendering the definitions of Tree’s attributes in our Metanorma document.

sources/models/views/Tree.yml

name: Tree
title: Tree datamodel
caption: Forest datamodel
imports:
  Tree:
  AppleTree:
  OrangeTree:
  Fruit:
  Position:
  FruitKind:
fidelity:
  hideOtherClasses: true

Include views in our document

At this point we have defined the complete datamodel of forests. The final step is to include it in our document. As usual, we define our Metanorma sections under <project_root>/sources/sections. But this time we include also the corresponding views in them.

sources/sections/01-overview.adoc

== Forest datamodel overview

Some introduction of the Forest overview...

[datamodel]
....
include::../models/views/Overview.yml[]
....

Here we can see how to include a view of the datamodel in our Metanorma document by using the [datamodel] directive.

sources/sections/02-forest.adoc

== Forest datamodel

Some introduction of the Forest datamodel...

[datamodel]
....
include::../models/views/Forest.yml[]
....

sources/sections/03-tree.adoc

== Tree datamodel

Some introduction of the Tree datamodel...

[datamodel]
....
include::../models/views/Tree.yml[]
....

Conclusion

After compiling our document, we should see the UML diagrams and sections of models being rendered properly. By using the datamodel directive, we can ensure missing definitions of models and their attributes to be easily searchable (by grepping definition). Also, the models can be maintained and updated easily. Without the datamodel directive, we had to somehow draw the diagrams of the datamodels and describe them in the document. In case we want to update or review some of the definitions, we would have to update our drawings of the diagrams and the description of them separately, not to mention some models and their definitions may occur multiple times. It made it easy to miss the update of some of the occurences.

We believe the new syntax of the datamodel directive of Metanorma can help define datamodel more easily in a maintainable way, and readers will be happy to read documents with logical presentation of datamodels.