and unique values, whether the graph element contains only vertex elements, the vertex
elements contain only edges and so on, can be verified with an XML schema. Semantic
properties, such as whether the graph is connected, loop free and respects referential
integrity, cannot be tested using W3C XML schemas. W3C XML schemas lack constructs
powerful enough to define required relationships between arbitrary components of the
XML document. Roberts [10] describes this as missing cross-field checks such as "if
element x contains value y then element z should be mandatory". W3C schema grammars
only allow defining which element x can appear as a child of element z; conditions
other than the parent-child relationship cannot be expressed. Hence a new kind of
schema that is not grammar based has to be introduced: rule based schemas. Section 3
gives a short introduction to the most outstanding rule based schema language,
Constraint Language in XML (CLiXML). Compared with grammar based schema languages,
rule based schema languages have a much simpler structure. A rule based schema consists
of a list of rules. Each rule has two parts: an output part, which the schema validator
emits when the rule succeeds or fails, and the rule itself, which contains the
constraints and restrictions on the XML document and its content, i.e. the required
relationships between parts of its content.
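
The idea of a schema as a list of rules, each pairing a constraint with an output part, can be sketched in a few lines of Python. This is only an illustration of the concept, not CLiXML itself: the names Rule, validate and z_required_when_x_is_y, the message texts and the tiny test documents are all assumptions made for this example. The constraint implements the cross-field check quoted above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """Hypothetical rule: a constraint plus the output for success and failure."""
    check: Callable[[ET.Element], bool]  # the rule itself
    on_success: str                      # output part when the rule holds
    on_failure: str                      # output part when the rule is violated


def validate(root: ET.Element, rules: List[Rule]) -> List[str]:
    """Evaluate every rule and collect the output part that applies."""
    return [r.on_success if r.check(root) else r.on_failure for r in rules]


def z_required_when_x_is_y(root: ET.Element) -> bool:
    """Cross-field check: if <x> contains the value "y", <z> is mandatory."""
    x = root.find("x")
    if x is not None and x.text == "y":
        return root.find("z") is not None
    return True  # the condition does not apply, so the rule holds


rules = [
    Rule(z_required_when_x_is_y,
         on_success="z present where required",
         on_failure="x is 'y' but z is missing"),
]

ok_doc = ET.fromstring("<r><x>y</x><z/></r>")
bad_doc = ET.fromstring("<r><x>y</x></r>")

print(validate(ok_doc, rules))
print(validate(bad_doc, rules))
```

A grammar based schema could state that z is optional or mandatory below r, but not that its presence depends on the value of x; here that dependency is an ordinary condition inside the rule.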
3 Testing of Semantical Properties in XML Documents
W3C XML schemas basically cover syntactical properties. Only semantical properties
therefore have to be tested by the newly introduced language, which does not claim to
be an all-in-one language suitable for every purpose and does not repeat work already
done by W3C XML schema. The template-like structure of grammar based rules is
cumbersome and a hindrance when describing semantical properties, since such rules can
only define parent-child relationships. The language described here, CLiXML, takes
another approach. Nodes (elements, attributes, text, comments) are selected by an XPath
[3] expression, and constraints are set up on or between the selected nodes. XPath is
the language defined by the W3C to select nodes of an XML document. Selections are made
using a language inspired by regular expressions and adapted to XML's tree structure.
XPath requests result in lists of nodes. Hence first order logic expressions seem to be
quite a natural approach.
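
As a rough illustration of why node lists and first order logic fit together, the following Python sketch selects nodes with the XPath subset supported by the standard library's xml.etree.ElementTree and states two semantic properties with all and any. The document layout (a graph element with vertex children that hold edge children) follows the example discussed earlier; the attribute names id and to, and the reading of "loop free" as "no edge from a vertex to itself", are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical graph document; the edge to "c" refers to a missing vertex.
GRAPH = """
<graph>
  <vertex id="a"><edge to="b"/></vertex>
  <vertex id="b"><edge to="a"/><edge to="c"/></vertex>
</graph>
"""


def referential_integrity(root: ET.Element) -> bool:
    """forall e in edges: exists v in vertices: v/@id = e/@to"""
    edges = root.findall("./vertex/edge")  # XPath selection yields a list
    vertices = root.findall("./vertex")
    return all(
        any(v.get("id") == e.get("to") for v in vertices)
        for e in edges
    )


def no_self_loops(root: ET.Element) -> bool:
    """forall v in vertices: forall e in v/edge: not (e/@to = v/@id)"""
    return all(
        e.get("to") != v.get("id")
        for v in root.findall("./vertex")
        for e in v.findall("./edge")  # selection relative to the vertex
    )


root = ET.fromstring(GRAPH)
print(referential_integrity(root))  # False: no vertex has id "c"
print(no_self_loops(root))          # True: no edge points to its own vertex
```

Neither property is a parent-child relationship, which is why a grammar based schema cannot express them, while each is a one-line quantified expression over selected node lists.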
The quantifiers forall and exists contain a predicate variable (a variable name and an
XPath expression that selects a list of nodes). A quantifier contains a further
quantifier, a logical predicator (equal, less, bigger, ...) or a logical operator (and,
or, not, ...). Predicators are binary: a predicator element has two attributes, op1 and
op2, which the predicator relates to each other. Logical operators are elements that
group operators, predicators or quantifiers. Nested quantifiers and predicators can use
the previously defined variable to specify the required relationships between the
selected nodes; i.e. inner quantifiers can define XPath expressions relative to the
selected node, and comparators can compare the selected node with another one. The
quantifiers act like iterators when evaluated. In each iteration step, the predicate
variable takes one value from the list. A forall test succeeds only if the nested test
succeeds for all iteration steps. An exists test succeeds if the nested test succeeds
for at least one iteration step.
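
The evaluation scheme described above can be sketched as a small recursive interpreter in Python. It is a simplified illustration, not the CLiXML implementation: the element names forall, exists, and, or, not and equal and the attributes op1 and op2 follow the description in the text, whereas the attribute names var and in, the operand form $name/@attribute, the graph layout and the helper names resolve, select and evaluate are assumptions made for this example. Only the XPath subset of xml.etree.ElementTree is used.

```python
import xml.etree.ElementTree as ET


def resolve(operand, env):
    """Resolve '$var/@attr' against a bound node; anything else is a literal."""
    if operand.startswith("$"):
        var, _, attr = operand[1:].partition("/@")
        return env[var].get(attr)
    return operand


def select(path, env, root):
    """'$var/rel' selects relative to a bound node, other paths from the root."""
    if path.startswith("$"):
        var, _, rel = path[1:].partition("/")
        return env[var].findall("./" + rel)
    return root.findall(path)


def evaluate(node, root, env=None):
    """Evaluate a quantifier, logical operator or predicator element."""
    env = env or {}
    tag = node.tag
    if tag in ("forall", "exists"):
        nodes = select(node.get("in"), env, root)
        # Each iteration step binds the variable to one selected node.
        results = (evaluate(node[0], root, {**env, node.get("var"): n})
                   for n in nodes)
        return all(results) if tag == "forall" else any(results)
    if tag == "and":
        return all(evaluate(child, root, env) for child in node)
    if tag == "or":
        return any(evaluate(child, root, env) for child in node)
    if tag == "not":
        return not evaluate(node[0], root, env)
    if tag == "equal":  # binary predicator relating op1 and op2
        return resolve(node.get("op1"), env) == resolve(node.get("op2"), env)
    raise ValueError(f"unsupported rule element: {tag}")


# Hypothetical rule: every edge must point to an existing vertex.
INTEGRITY = ET.fromstring("""
<forall var="e" in="./vertex/edge">
  <exists var="v" in="./vertex">
    <equal op1="$e/@to" op2="$v/@id"/>
  </exists>
</forall>
""")

# Hypothetical rule with a selection relative to the outer variable.
NO_SELF_LOOP = ET.fromstring("""
<forall var="v" in="./vertex">
  <forall var="e" in="$v/edge">
    <not><equal op1="$e/@to" op2="$v/@id"/></not>
  </forall>
</forall>
""")

DOC = ET.fromstring("""
<graph>
  <vertex id="a"><edge to="b"/></vertex>
  <vertex id="b"><edge to="a"/><edge to="c"/></vertex>
</graph>
""")

print(evaluate(INTEGRITY, DOC))     # False: the edge to "c" has no target
print(evaluate(NO_SELF_LOOP, DOC))  # True
```

In this sketch a forall over an empty selection succeeds and an exists over an empty selection fails, which matches the usual first order logic reading of the two quantifiers.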