Metadata and Predefined text
The front material of documents generated by Metanorma routinely involves templated text, including both the front page, and “predefined text” about legal and other obligations surrounding the document. Those text templates in turn are routinely populated using metadata extracted from the document.
Metadata
The bibdata element in a Metanorma document contains various metadata elements about the document, as a bibliographic description.
These elements are populated either from the document attributes in the Metanorma AsciiDoc input, or with default values.
Specifically, the bibdata element is populated through the Metanorma::Standoc::Converter.metadata method, and its inheritors.
The bibdata element is not rendered directly as the document front page. Instead, the document front page, and other templated texts, are populated with elements extracted from the bibdata element. That extraction takes place using the Isodoc.info method and its inheritors, which invoke the Isodoc::Metadata class and its inheritors. The extraction results in a Hash of metadata keys and values, which is used to populate any templated text.
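To make the idea concrete, here is a minimal sketch of turning a bibdata element into a Hash of display values. It is not the IsoDoc implementation: it uses REXML, ignores XML namespaces (real Metanorma XML is namespaced), and the method name `extract_metadata` and the cut-down XML are assumptions made purely for illustration.

```ruby
require "rexml/document"

# Hypothetical helper: pulls a few values out of <bibdata> into a Hash.
# Assumes un-namespaced XML, unlike real Metanorma output.
def extract_metadata(xml)
  doc = REXML::Document.new(xml)
  # Returns the text of the first node matching the path, or nil.
  text = ->(path) { REXML::XPath.first(doc, path)&.text }
  {
    docnumber: text.call("//bibdata/docidentifier"), # first identifier only
    docnumeric: text.call("//bibdata/docnumber"),
    edition: text.call("//bibdata/edition"),
    docyear: text.call("//bibdata/copyright/from"),
  }
end

sample = <<~XML
  <bibdata type="standard">
    <docidentifier type="iso">ISO/NWIP 33032</docidentifier>
    <docidentifier type="iso-with-lang">ISO/NWIP 33032 (E)</docidentifier>
    <docnumber>33032</docnumber>
    <edition>1</edition>
    <copyright><from>2020</from></copyright>
  </bibdata>
XML

metadata = extract_metadata(sample)
puts metadata.inspect
```

The sketch only shows the shape of the mapping from XPath locations to Hash keys; flavour-specific normalisation is left out.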
For example, in the Metanorma ISO flavour, the document header
= This title is overridden by :title-main-en:
:docnumber: 33032
:edition: 1
:technical-committee: TC
:technical-committee-number: 399
:technical-committee-type: TC
:docstage: 10
:docsubstage: 20
:title-intro-en: Cybernetics
:title-main-en: Neuro-information interchange interface
generates the following bibdata
element:
<bibdata type="standard">
<title language="en" format="text/plain" type="main">Cybernetics — Neuro-information interchange interface</title>
<title language="en" format="text/plain" type="title-intro">Cybernetics</title>
<title language="en" format="text/plain" type="title-main">Neuro-information interchange interface</title>
<docidentifier type="iso">ISO/NWIP 33032</docidentifier>
<docidentifier type="iso-with-lang">ISO/NWIP 33032 (E)</docidentifier>
<docnumber>33032</docnumber>
<contributor>
<role type="author"/>
<organization>
<name>International Organization for Standardization</name>
<abbreviation>ISO</abbreviation>
</organization>
</contributor>
<contributor>
<role type="publisher"/>
<organization>
<name>International Organization for Standardization</name>
<abbreviation>ISO</abbreviation>
</organization>
</contributor>
<edition>1</edition>
<language>en</language>
<script>Latn</script>
<status>
<stage>10</stage>
<substage>20</substage>
</status>
<copyright>
<from>2020</from>
<owner>
<organization>
<name>International Organization for Standardization</name>
<abbreviation>ISO</abbreviation>
</organization>
</owner>
</copyright>
<ext>
<doctype>article</doctype>
<editorialgroup>
<technical-committee number="399" type="TC">TC</technical-committee>
<subcommittee/>
<workgroup/>
</editorialgroup>
<structuredidentifier>
<project-number>ISO 33032</project-number>
</structuredidentifier>
</ext>
</bibdata>
In turn, that generates the following metadata Hash:
{
:agency => "ISO",
:authors => [],
:authors_affiliations => {},
:docnumber => "ISO/NWIP 33032",
:docnumeric => "33032",
:docsubtitle => "",
:docsubtitlemain => "",
:docsubtitlepartlabel => "Partie ",
:doctitle => "Cybernetics — Neuro-information interchange interface",
:doctitlemain => "Neuro-information interchange interface",
:doctitlepartlabel => "Part ",
:doctype => "Article",
:docyear => "2020",
:draft => nil,
:draftinfo => "",
:edition => "1",
:editorialgroup => ["TC 399"],
:ics => "XXX",
:obsoletes => nil,
:obsoletes_part => nil,
:revdate => nil,
:sc => "XXXX",
:secretariat => "XXXX",
:stage => "10",
:stage_int => 10,
:statusabbr => "NWIP",
:tc => "TC 399",
:tc_docnumber => [],
:unpublished => true,
:wg => "XXXX"
}
Some metadata hash values are normalized, especially as the contents of the hash are intended for display; dates, for example, are often resolved from the ISO 8601-1 and ISO 8601-2 formats to formats with the month spelled out.
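As an illustration of this kind of normalisation, the following sketch spells out the month of an ISO 8601 date. The helper name `display_date` and the English output format are assumptions; the formats actually produced depend on the flavour and language.

```ruby
require "date"

# Hypothetical helper: renders YYYY-MM-DD or YYYY-MM with the month
# spelled out in English. Any other input is returned unchanged.
def display_date(iso)
  case iso
  when /\A(\d{4})-(\d{2})-(\d{2})\z/
    "#{Date::MONTHNAMES[$2.to_i]} #{$3.to_i}, #{$1}"
  when /\A(\d{4})-(\d{2})\z/
    "#{Date::MONTHNAMES[$2.to_i]} #{$1}"
  else
    iso
  end
end

puts display_date("2020-03-05")
puts display_date("2020-03")
```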
Default metadata values
Each gem can customise its own metadata values.
These are the default metadata values extracted by the base Isodoc::Metadata class, and the corresponding Metanorma XML locations they are populated from:
authors
-
An array of personal author names, each name extracted from //bibdata/contributor[role/@type = 'author' or xmlns:role/@type = 'editor']/person, and being either ./name/completename or ./name/forename + " " + ./name/surname.
authors_affiliations
-
A hash of affiliations that personal authors have, each affiliation mapping to the array of personal names of authors working there. The affiliations are extracted from the personal author entries (see above) as ./affiliation/organization/name plus ./affiliation/organization/address/formattedAddress, comma-delimited, or else either the name or the address. So for example, { "CSIRO" => ["Fred Nerk", "Joe Bloggs"], "University of Auckland" => ["John Doe"] }.
{type}date
-
The date at which the {type} event occurred. The date is extracted from //bibdata/date[@type = {type}]. The {type} is the name of the lifecycle event modelled by Relaton, including:
  -
  published
  -
  accessed
  -
  created
  -
  implemented
  -
  obsoleted
  -
  confirmed
  -
  updated
  -
  issued
  -
  received
  -
  unchanged
  -
  circulated
  -
  announced
  -
  vote-started
  -
  vote-ended
doctype
-
Flavour-specific document type, from //bibdata/ext/doctype.
doctype_display
-
Flavour-specific localised document type, from //local_bibdata/ext/doctype [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.5].
agency
-
A concatenation of all the agency abbreviations (or, if that is unavailable, agency names) responsible for publishing the document. Extracted from //bibdata/contributor[xmlns:role/@type = 'publisher']/organization, using either ./abbreviation or ./name. E.g. “ISO/IEC”.
publisher
-
A concatenation of all the agency names responsible for publishing the document. Extracted from //bibdata/contributor[xmlns:role/@type = 'publisher']/organization/name [added in https://github.com/metanorma/isodoc/releases/tag/v1.0.23].
subdivision
-
Subdivision of the first agency responsible for publishing the document, extracted from organization/subdivision [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].
pub_address
-
Address of the first agency responsible for publishing the document, extracted from organization/address/formattedAddress [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].
pub_phone
-
Phone number of the first agency responsible for publishing the document, extracted from organization/phone[not(@type = 'fax')] [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].
pub_fax
-
Fax number of the first agency responsible for publishing the document, extracted from organization/phone[@type = 'fax'] [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].
pub_email
-
Email of the first agency responsible for publishing the document, extracted from organization/email [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].
pub_uri
-
URI of the first agency responsible for publishing the document, extracted from organization/uri [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].
unpublished
-
Boolean indicating whether the document is considered an unpublished draft rather than a published document, based on the status of the document.
keywords
-
An array of the keywords of the document.
stage
-
The stage of the document, extracted from //bibdata/status/stage.
stage_display
-
The localised stage of the document, extracted from //local_bibdata/status/stage [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.5].
stageabbr
-
The abbreviation of the stage of the document, as extracted from //bibdata/status/stage. By default, this is the initials of the stage if the document is unpublished, and nil if the document is published.
substage
-
The substage of the document, extracted from //bibdata/status/substage.
substage_display
-
The localised substage of the document, extracted from //local_bibdata/status/substage [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.5].
iteration
-
The iteration of the document stage, extracted from //bibdata/status/iteration.
docnumber
-
The first document identifier given in the XML for the document, extracted from //bibdata/docidentifier.
docnumeric
-
The numeric identifier for the document, extracted from //bibdata/docnumber. The canonical document identifier in docnumber is typically the docnumeric value, preceded by an agency abbreviation and/or a document type.
edition
-
The document edition, extracted from //bibdata/edition.
docyear
-
The document copyright year, extracted from //bibdata/copyright/from.
draft
-
The document draft number, extracted from //bibdata/version/draft.
revdate
-
The document revision date, extracted from //bibdata/version/revision-date.
revdate_monthyear
-
The document revision date, extracted from //bibdata/version/revision-date, given as month name and year (internationalised where defined).
draftinfo
-
The draft number and revision date, preceded by the local label for DRAFT.
doctitle
-
The document title, extracted from the first //bibdata/title[@language='en'] found in the document.
partof
-
The identifier of the document this document is part of, extracted from //bibdata/relation[@type = 'partOf']//docidentifier.
obsoletes
-
The identifier of the document this document obsoletes, extracted from //bibdata/relation[@type = 'obsoletes']//docidentifier.
obsoletes_part
-
The part of this document that has been obsoleted, extracted from //bibdata/relation[@type = 'obsoletes']//locality.
html
-
The URL for an HTML version of this document, extracted from //bibdata/uri[@type = 'html'].
xml
-
The URL for an XML version of this document, extracted from //bibdata/uri[@type = 'xml'].
pdf
-
The URL for a PDF version of this document, extracted from //bibdata/uri[@type = 'pdf'].
doc
-
The URL for a DOC version of this document, extracted from //bibdata/uri[@type = 'doc'].
url
-
The URL for an unspecified version of this document, extracted from //bibdata/uri[not(@type)].
keywords
-
The keywords of the document, extracted from //bibdata/keywords.
title_footnote
-
Footnotes belonging to the document title, extracted from //bibdata/note[@type = 'title-footnote'] [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].
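The shape of the authors_affiliations value can be sketched as follows. The input format (one Hash per author with :name, :organization and :address keys), the method name, and the decision to skip authors with no affiliation are all assumptions for illustration; they are not taken from the Isodoc::Metadata source.

```ruby
# Hypothetical helper: groups author names by affiliation.
# The key is "organization, address", or whichever of the two is present.
def authors_affiliations(authors)
  authors.each_with_object({}) do |author, result|
    parts = [author[:organization], author[:address]].compact.reject(&:empty?)
    next if parts.empty? # assumption: unaffiliated authors are skipped
    (result[parts.join(", ")] ||= []) << author[:name]
  end
end

authors = [
  { name: "Fred Nerk", organization: "CSIRO" },
  { name: "John Doe", organization: "University of Auckland" },
  { name: "Joe Bloggs", organization: "CSIRO" },
]
affiliations = authors_affiliations(authors)
puts affiliations.inspect
```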
Predefined text processing
The metadata hash is used by the Isodoc::Convert.populate method to populate all templated text. Templated text is expected to be in the Liquid template language.
The keys of the metadata hash are the variable names passed into Liquid.
Given the metadata Hash above, the following templated text:
<div class="doctitle-en">
<div>
<span class="title">{{ doctitleintro }}{% if doctitleintro and doctitlemain %} — {% endif %}</span><span class="subtitle">{{ doctitlemain }}{% if doctitlemain and doctitlepart %} —{% endif %}</span>
{% if doctitlepart %}
</div>
<div class="doctitle-part">
{% if doctitlepartlabel %}
<span class="partlabel">{{ doctitlepartlabel }}:</span>
{% endif %}
<span class="part">{{ doctitlepart }}</span>
{% endif %}
</div>
</div>
is populated as:
<div class="doctitle-en">
<div>
<span class="title"></span><span class="subtitle">Neuro-information interchange interface</span>
</div>
</div>
All the conditional output is omitted, because the metadata Hash contains neither a doctitlepart nor a doctitleintro value: only {{ doctitlemain }} ends up populated.
The Isodoc::Convert.populate method merges the metadata Hash with the @labels hash used for internationalisation, taken from the i18n YAML files in each flavour (see Localization how-to guide).
This is so that any templated text can also access localised labels defined for the current language. Those labels are accessed through a labels object; e.g. {{ labels["table_of_contents"] }} for the table of contents title in the current flavour and language [added in https://github.com/metanorma/isodoc/releases/tag/v1.5.0] (previously they were accessible as top-level variables).
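A minimal sketch of that merge, assuming the metadata Hash uses symbol keys and that the template engine expects string keys (as Liquid does). The helper name `template_variables` and the label value "Contents" are hypothetical, and this is not the body of the populate method itself.

```ruby
# Hypothetical helper: builds the variable set handed to the template engine.
# Metadata keys become top-level variables; labels sit under "labels".
def template_variables(metadata, labels)
  metadata.transform_keys(&:to_s).merge("labels" => labels)
end

metadata = { doctitlemain: "Neuro-information interchange interface", docyear: "2020" }
labels = { "table_of_contents" => "Contents" } # illustrative label value
vars = template_variables(metadata, labels)
puts vars["labels"]["table_of_contents"]
```

Keeping the labels under a single key avoids any clash between a label name and a metadata key.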
The metadata hash for a flavour is also populated with the absolute file locations of the gem’s copy of any logo images. That means that any logos are populated in templated text using the metadata hash.
For example, the HTML and Word logo images for the Metanorma M3AAWG flavour
are defined in IsoDoc::M3d::Metadata.initialize
as:
def initialize(lang, script, labels)
  super
  here = File.dirname(__FILE__)
  set(:logo_html,
      File.expand_path(File.join(here, "html", "m3-logo.png")))
  set(:logo_word,
      File.expand_path(File.join(here, "html", "logo.jpg")))
end
That means that the HTML logo image is populated in the HTML cover page for M3AAWG through a Liquid variable:
<img src="{{ logo_html }}" alt="m3 logo"/>
Note: Although the absolute file location of the image inside the gem is used, postprocessing replaces this with either a local copy or a Data URI, in the case of HTML, and a MIME embedded attachment containing the image, in the case of Word.
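The Data URI case can be sketched as below. This is only an illustration of the substitution, not Metanorma's postprocessing code: the helper names, the example path, and the two-byte stand-in for image data are assumptions.

```ruby
# Hypothetical helper: encodes raw image bytes as a Data URI.
# pack("m0") is Base64 without line breaks.
def data_uri(bytes, mime_type)
  "data:#{mime_type};base64,#{[bytes].pack('m0')}"
end

# Hypothetical helper: swaps an absolute image path in the HTML for a Data URI.
def inline_image(html, path, bytes, mime_type)
  html.gsub(%(src="#{path}"), %(src="#{data_uri(bytes, mime_type)}"))
end

html = '<img src="/gem/html/m3-logo.png" alt="m3 logo"/>'
# In practice the bytes would come from File.binread(path); "hi" stands in here.
inlined = inline_image(html, "/gem/html/m3-logo.png", "hi", "image/png")
puts inlined
```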
The templated text populated through metadata can include:
-
Under the isodoc/*/html directory of the gem:
  -
  The HTML cover page (html_*_titlepage.html) and Word cover page (word_*_titlepage.html), which are the main destination for bibdata metadata.
  -
  The introductory page for HTML and Word (html_*_intro.html, word_*_intro.html), although this is usually populated instead via Metanorma predefined text (see below).
  -
  The Word header (header.html).
  -
  The HTML and Word stylesheets (*.scss). This is in case any variables are used to either populate the stylesheet, or to conditionally include text; NIST and IEC use the current document status to turn line numbering on or off in the Word stylesheet. (Draft documents are line-numbered, and whether a document is in draft or not depends on the value of bibdata/status.)
-
Under the metanorma/* directory of the gem:
  -
  The Metanorma predefined text file (boilerplate.xml, boilerplate.adoc)
Predefined text
XML
The boilerplate element in Metanorma XML follows after bibdata, and contains text that is repeatedly included in each instance of the document class, and that outlines the rules under which the document may be used.
By default, the boilerplate element contains up to four elements:
-
copyright-statement,
-
license-statement,
-
legal-statement, and
-
feedback-statement.
Each of those statements is a Metanorma clause, which can contain a title, multiple paragraphs, and subclauses.
Because the predefined text is repeated for each document in its class, it is not expected to be supplied by the user (although the user can supply their own predefined text file using the :boilerplate-authority: document attribute).
Instead, the predefined text is included as a Metanorma XML file within the gem; by default, it is called boilerplate.xml.
Some of the predefined text may be populated with metadata specific to the current document, so the predefined text file is a Liquid template, populated with variables from the current flavour metadata Hash as with other templated text.
The content in the boilerplate element is processed as part of the document preface, and converted to HTML or Word like the rest of the Metanorma XML. However, predefined text usually ends up in the cover page or introductory page of the document instead. The following are the default conventions in Metanorma, although they can be overridden in the IsoDoc::*::Converter.authority_cleanup method (as is currently done in NIST):
-
Content in the copyright-statement element is rendered in a <div class="boilerplate-copyright"> container.
-
The authority_cleanup method, defined in postprocessing for both the HTML and the Word converters, looks for a single element with id attribute boilerplate-copyright-destination.
-
If it finds such an element, it moves the <div class="boilerplate-copyright"> container and its contents to replace that element. This is how predefined text can populate the cover page or introductory page, instead of occurring within the document body.
-
This is repeated for each of license-statement, legal-statement, and feedback-statement.
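The move described in these steps can be sketched as follows. This is a hypothetical stand-in for authority_cleanup written against REXML, not the IsoDoc code; the method name `move_boilerplate` and the sample markup are assumptions, and the class and id names for the legal and feedback statements are inferred by analogy with the copyright and license ones.

```ruby
require "rexml/document"

# Hypothetical stand-in for authority_cleanup: moves the container with the
# given class to replace the placeholder with the given id, if both exist.
def move_boilerplate(doc, css_class, destination_id)
  source = REXML::XPath.first(doc, "//div[@class='#{css_class}']")
  destination = REXML::XPath.first(doc, "//*[@id='#{destination_id}']")
  return doc unless source && destination

  source.remove                    # detach the container from the body
  destination.replace_with(source) # put it where the placeholder was
  doc
end

html = <<~HTML
  <html><body>
    <div class="cover"><div id="boilerplate-copyright-destination"/></div>
    <div class="main">
      <div class="boilerplate-copyright"><p>Copyright 2020 Example Org</p></div>
      <p>Body text</p>
    </div>
  </body></html>
HTML

doc = REXML::Document.new(html)
# The same move is attempted for each of the four statement types.
%w[copyright license legal feedback].each do |type|
  move_boilerplate(doc, "boilerplate-#{type}", "boilerplate-#{type}-destination")
end
puts doc
```

If a template has no placeholder for a statement, the sketch leaves that container where it was rendered.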
For example, in Metanorma ISO:
-
the copyright statement for ISO occurs on the second page:
  -
  <div id="boilerplate-copyright-destination"/> appears accordingly in the introductory page template;
-
the license statement is the warning shown if the document is a draft:
  -
  <div id="boilerplate-license-destination"/> appears in the title page template for the flavour;
  -
  the front page draft warning is styled through the CSS class boilerplate-license.
A user-supplied boilerplate file need not provide all four statements [added in https://github.com/metanorma/metanorma-standoc/releases/tag/v2.8.2]. If the user-supplied file is missing a statement that is present in the default for the flavour, the default is retained. To delete a statement, provide it empty (an empty element in XML, or a title with no content in AsciiDoc).
The following predefined text from metanorma-csa exemplifies all four statements in a predefined text, and its processing as a Liquid template.
<boilerplate>
<copyright-statement>
<clause>
<p>© {{ docyear }} Cloud Security Alliance, LLC.</p>
</clause>
</copyright-statement>
{% if unpublished %}
<license-statement>
<clause id="draft-warning">
<title>Warning for Drafts</title>
<p>This document is not a CSA Standard. It is distributed for review and
comment, and is subject to change without notice and may not be referred to as
a Standard. Recipients of this draft are invited to submit, with their
comments, notification of any relevant patent rights of which they are aware
and to provide supporting documentation.
</p>
</clause>
</license-statement>
{% endif %}
<legal-statement>
<clause>
<p>All rights reserved. Unless otherwise specified, no part of this
publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the
internet or an intranet, without prior written permission. Permission can
be requested from the address below.
</p>
</clause>
</legal-statement>
<feedback-statement>
<clause>
<p>Cloud Security Alliance</p>
<p align="left">
2212 Queen Anne Ave N<br />
Seattle<br />
WA 98109<br />
United States of America<br />
<br />
<link target="mailto:copyright@cloudsecurityalliance.com">copyright@cloudsecurityalliance.com</link><br />
<link target="www.cloudsecurityalliance.com">www.cloudsecurityalliance.com</link>
</p>
</clause>
</feedback-statement>
</boilerplate>
The following user-provided predefined text will delete the license statement, and override the legal statement, leaving the copyright statement and feedback statement of the flavour alone:
<boilerplate>
<license-statement/>
<legal-statement>
<clause>
<p>All rights reserved. Unless otherwise specified, no part of this
publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the
internet or an intranet, without prior written permission. Permission can
be requested from the address below.
</p>
</clause>
</legal-statement>
</boilerplate>
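The override rules can be modelled with plain Hashes, as in the sketch below. The representation (statement name mapped to content, with an empty string meaning "delete") and the method name `merge_boilerplate` are assumptions for illustration; Metanorma itself works on the XML elements.

```ruby
STATEMENTS = %w[copyright-statement license-statement
                legal-statement feedback-statement].freeze

# Hypothetical model: a missing key means "not supplied by the user",
# an empty string means "delete this statement".
def merge_boilerplate(default, user)
  STATEMENTS.each_with_object({}) do |name, merged|
    content = user.key?(name) ? user[name] : default[name]
    merged[name] = content unless content.nil? || content.empty?
  end
end

default = {
  "copyright-statement" => "Default copyright",
  "license-statement" => "Default draft warning",
  "legal-statement" => "Default legal text",
  "feedback-statement" => "Default contact details",
}
user = { "license-statement" => "", "legal-statement" => "User legal text" }
merged = merge_boilerplate(default, user)
puts merged.inspect
```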
ADOC
Predefined text can also be specified in Metanorma AsciiDoc, with the file suffix .adoc [added in https://github.com/metanorma/metanorma-standoc/releases/tag/v2.4.6]. The following special processing rules apply:
-
Top-level clauses ending in -statement are converted into the equivalent boilerplate tags; so == copyright-statement corresponds to Metanorma XML <copyright-statement>.
-
The AsciiDoc file is a Liquid template, just as for XML, so {{ … }} is reserved for Liquid, and cannot be used for Metanorma AsciiDoc concepts. The AsciiDoc is processed by Liquid before it is passed on for processing.
-
Clauses in predefined text often do not have clause titles; as usual for AsciiDoc, introduce such clauses with === {blank}.
-
Clauses with no title and no user-assigned anchor are automatically assigned an anchor by Asciidoctor of the form _{n}, where {n} is an integer. In order to prevent those anchors colliding in the boilerplate and the main document, we overwrite any such anchors in the predefined text with the normal Metanorma default, _ followed by a GUID. Do not use _{n} as an anchor in any of your predefined text.
-
Also be on the lookout for any clauses with identical titles in your predefined text and your main document; if no user-defined anchor is supplied, they will end up with the same anchor. To prevent that, the simplest thing to do is to provide user-defined anchors for all titled clauses in the predefined text.
-
The values that populate Liquid templates in Metanorma are in Metanorma XML, if they contain any formatting; Metanorma automatically treats AsciiDoc Liquid variables as Metanorma XML passthrough values. For example, the pub-address document variable may be specified as a document attribute as:
:pub-address: 1 John St + \
London
But its value in Liquid will be 1 John St<br/>London, since Liquid interpolation is developed for Metanorma XML. Metanorma will treat any instance of {{ pub_address }} in AsciiDoc predefined text as a passthrough (i.e. when converting the AsciiDoc predefined text to Metanorma XML, the contents of {{ pub_address }} will be left alone).
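The first rule, mapping top-level clause titles onto boilerplate tags, can be sketched with a small pattern match. The helper name `boilerplate_tag` and the exact regular expression are assumptions; the real conversion happens inside Metanorma's AsciiDoc processing.

```ruby
# Hypothetical helper: maps a top-level AsciiDoc heading such as
# "== copyright-statement" to the boilerplate element name, or nil.
def boilerplate_tag(line)
  match = line.match(/\A==\s+([a-z]+-statement)\s*\z/)
  match && match[1]
end

puts boilerplate_tag("== copyright-statement").inspect
puts boilerplate_tag("=== Warning for Drafts").inspect
```

Only level-one headings ending in -statement match, so nested clause titles inside a statement are left as ordinary clauses.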
The following is the AsciiDoc equivalent of the boilerplate Metanorma XML just given. This is for a complete boilerplate document:
== copyright-statement
=== {blank}
© {{ docyear }} Cloud Security Alliance, LLC.
{% if unpublished %}
== license-statement
[[draft-warning]]
=== Warning for Drafts
This document is not a CSA Standard. It is distributed for review and
comment, and is subject to change without notice and may not be referred to as
a Standard. Recipients of this draft are invited to submit, with their
comments, notification of any relevant patent rights of which they are aware
and to provide supporting documentation.
{% endif %}
== legal-statement
=== {blank}
All rights reserved. Unless otherwise specified, no part of this
publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the
internet or an intranet, without prior written permission. Permission can
be requested from the address below.
== feedback-statement
=== {blank}
Cloud Security Alliance
[align="left"]
2212 Queen Anne Ave N +
Seattle +
WA 98109 +
United States of America +
+
mailto:copyright@cloudsecurityalliance.com[copyright@cloudsecurityalliance.com] +
https://www.cloudsecurityalliance.com[www.cloudsecurityalliance.com]
And this will do a partial update, as above:
== license-statement
== legal-statement
=== {blank}
All rights reserved. Unless otherwise specified, no part of this
publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the
internet or an intranet, without prior written permission. Permission can
be requested from the address below.
Cover page notes
Metanorma provides a mechanism for notes and admonitions to appear on the cover page of a document [added in https://github.com/metanorma/isodoc/releases/tag/v2.0.8].
This is rendered in a similar fashion to predefined text:
In Word and HTML output, <div id="coverpage-note-destination"/> is a reserved element in the document template. If the element is present, then any notes and admonitions flagged as coverpage=true are moved to that location in postprocessing.