====== RELAX NG with Content Classes ======
Version: 2006-11-20 \\
Author: [[http://www.oliverh.com/|Oliver Horn]]
This document describes ERNC, a syntactic extension of [[http://www.oasis-open.org/committees/relax-ng/|RELAX NG]]'s Compact Syntax (RNC). ERNC introduces a lightweight syntax to make the definition of so-called "content classes" more convenient. Note that all extensions are just syntactic sugar. Every ERNC grammar can be transformed into a pure RELAX NG grammar; a simple processor is available for [[#ERNC Processor|download]].
This document assumes knowledge of the [[http://www.oasis-open.org/committees/relax-ng/spec-20011203.html|RELAX NG]] specification and the [[http://www.oasis-open.org/committees/relax-ng/compact-20021121.html|RELAX NG Compact Syntax]] specification.
===== Introduction =====
Content classes are named groups of element types of the same class. Such content classes occur frequently in documentation-oriented schemata like DocBook, TEI or XHTML. For example, in XHTML the group of list element types and the group of heading element types are "classes". Many other examples of content classes can be found in the current RELAX NG grammars for XHTML 2 or DocBook.
In ordinary RELAX NG schemata, content classes are simply named patterns that combine all member element types as a choice.
Organizing element types in such a way makes a schema more modular and flexible. However, the current solution for handling content classes in RELAX NG is
===== Membership Declaration =====
A named pattern can be declared as a member of a content class by using the ''<:'' operator in its definition. A content class is just a named pattern representing the combination of all members by means of the choice operator ''|''.
The following ERNC grammar looks very similar to a pure RNC grammar. It defines two patterns, ''UnorderedList'' and ''OrderedList''. The new element is the ''<: List'' following the pattern name on the left-hand side of the definition. With this declaration, each pattern becomes a member of the ''List'' content class.
  UnorderedList <: List =
    element ul { ... }
  OrderedList <: List =
    element ol { ... }
An equivalent pure RNC grammar is:
  UnorderedList =
    element ul { ... }
  List |= UnorderedList
  OrderedList =
    element ol { ... }
  List |= OrderedList
It is even possible to associate more than one content class with a pattern. For example, the following definition puts ''UnorderedList'' into both content classes ''List'' and ''Block''.
  UnorderedList <: List, Block =
    element ul { ... }
The schema is equivalent to:
  UnorderedList =
    element ul { ... }
  List |= UnorderedList
  Block |= UnorderedList
**Note**: Only regular definitions can be associated with a content class. Definitions using the combining operators ''&='' or ''|='' cannot be associated with a content class.
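The rewriting described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the actual ERNC processor: the helper name ''expand_membership'' and the regular expression are hypothetical, and the sketch handles only a definition header written on a single line.

```python
import re

# Hypothetical helper, NOT the real ERNC processor. It only understands a
# single-line header of the form "Name <: ClassA, ClassB =".
IDENT = r"[A-Za-z_][\w.-]*"
HEADER = re.compile(rf"^\s*({IDENT})\s*<:\s*({IDENT}(?:\s*,\s*{IDENT})*)\s*=\s*$")


def expand_membership(header):
    """Return (plain RNC header, list of '|=' lines) for one ERNC header."""
    match = HEADER.match(header)
    if match is None:
        # Not a membership declaration (this includes headers using the
        # combining operators |= and &=): pass the line through unchanged.
        return header, []
    name = match.group(1)
    classes = [cls.strip() for cls in match.group(2).split(",")]
    return f"{name} =", [f"{cls} |= {name}" for cls in classes]


plain, memberships = expand_membership("UnorderedList <: List, Block =")
print(plain)
for line in memberships:
    print(line)
```

In a complete translation the ''|='' lines would be written after the body of the definition, as in the equivalent grammars shown above; the sketch returns them separately and leaves that placement to the caller.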
===== Content Class Declarations =====
A content class can be declared by the (new) keyword ''class'' followed by the name of the class. Note that there is no requirement to declare a content class explicitly. If it is not declared, the content class is created implicitly by means of the class associations.
However, an explicit declaration of a content class may be useful, e.g. for documentation purposes. Like other constructs, a content class declaration can be decorated with documentation or annotations. For example, the following declaration defines the content class ''List'' and attaches a short documentation comment to it:
  ## Class for list elements
  class List
Furthermore, the declaration can also be used to make the content class itself a member of other content classes. In the following example, the content class ''List'' is declared and also becomes a member of the content class ''Block'':
  ## Class for list elements
  class List <: Block
This example is equivalent to the following pure RNC grammar:
  ## Class for list elements
  List = notAllowed
  Block |= List
**Note**: The default content of a content class is ''notAllowed''. This is correct because the members are combined by means of the choice operator ''|''. The ''notAllowed'' alternative is removed when the grammar is processed (see RELAX NG specification, [[http://www.oasis-open.org/committees/relax-ng/spec-20011203.html#notAllowed|section 4.20]]).
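The effect of this simplification on a content class can be illustrated with a small Python sketch. It models a choice as a flat list of alternative names, which is an assumption made for illustration only; the function name ''simplify_choice'' is hypothetical and the sketch does not implement the full simplification rules of the specification.

```python
def simplify_choice(alternatives):
    """Drop 'notAllowed' alternatives from a flat choice of pattern names."""
    kept = [alt for alt in alternatives if alt != "notAllowed"]
    # A choice that consists only of notAllowed remains notAllowed.
    return kept if kept else ["notAllowed"]


# List = notAllowed, followed by List |= UnorderedList and List |= OrderedList
print(" | ".join(simplify_choice(["notAllowed", "UnorderedList", "OrderedList"])))

# A declared class that never gains a member stays notAllowed.
print(" | ".join(simplify_choice(["notAllowed"])))
```

The second call shows the one case where the default content matters: a class that is declared but has no members matches nothing.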
===== Content Classes with Text =====
In documentation formats it is common for some content classes to have mixed content, i.e. they contain text as part of their content models. For this purpose, ERNC allows the ''text'' pattern to be declared as a member of a content class by means of the membership operator.
  text <: Inline
**Note**: The ''text'' pattern is treated as a normal member of a content class, i.e. it is combined by means of choice. It is //not// equivalent to the ''mixed'' pattern, which combines text by interleaving. However, this is a common way to deal with mixed content; see e.g. the RELAX NG schemata for DocBook or XHTML 2.
===== ERNC Processor =====
The ERNC processor is a small [[http://www.python.org|Python]] script which takes an ERNC grammar as input and generates a pure RNC grammar as output.
Download: [[http://www.oliverh.com/ernc-0.1.zip|ernc-0.1.zip]] (License: MIT License, see file LICENSE in the distribution)
==== Annotations ====
The ERNC processor preserves additional information about the content classes as annotations: Content class declarations are annotated with ''ernc:class'' and class memberships are annotated with ''ernc:member''.
For example, the simple declaration ''class List <: Block'' is transformed to something like:
  [ ernc:class ]
  List |= notAllowed
  [ ernc:member ]
  Block |= List
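As a rough sketch of how such output could be produced, the following Python function builds the lines for one class declaration. It is not taken from the ERNC processor: the function name ''emit_class_declaration'' and its parameters are hypothetical, and it only reproduces the illustrative form shown above (it does not emit a namespace declaration for the ''ernc'' prefix).

```python
def emit_class_declaration(name, parents=(), annotate=True):
    """Build output lines for 'class name <: parents' in the form shown above."""
    lines = []
    if annotate:
        lines.append("[ ernc:class ]")
    # The class itself starts out as notAllowed.
    lines.append(f"{name} |= notAllowed")
    for parent in parents:
        if annotate:
            lines.append("[ ernc:member ]")
        # Each parent class gains this class as one more alternative.
        lines.append(f"{parent} |= {name}")
    return lines


print("\n".join(emit_class_declaration("List", ["Block"])))
```

Calling the function with ''annotate=False'' yields only the two ''|='' definitions, without the annotation lines.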
==== Known Issues ====
The preprocessor generates some additional definitions suffixed with ''_ernc''. You can ignore them! The reason is simply that the current version of the preprocessor is just a better tokenizer. It does not parse a grammar file and hence cannot detect the end of a definition.
However, the generated definitions have to be inserted after the original definition, otherwise they would break documentation and annotations. The workaround used by the preprocessor is to introduce an otherwise useless interim definition each time (those suffixed with ''_ernc'').