the Non-Truman model, a query that violates access
control specifications is rejected, rather than modi-
fied. Only valid queries are answered.
Our model is an authorization-transparent Truman
model. We allow users to pose XPath queries against
the original labeled XML document. The evaluation
of an XPath query over a labeled XML document has
two parts. First, we change the usual XPath query
semantics as follows. If a child axis occurs, the eval-
uation follows a parent-child path; if a descendant-or-
self axis occurs, the evaluation follows an ancestor-
descendant path; if a preceding-sibling axis occurs,
the evaluation follows a preceding-sibling path; if a
following-sibling axis occurs, the evaluation follows
a following-sibling path.
Second, we need to make sure that for each path ac-
cessed, a user is allowed access to that path based on
the path label and the user’s access label. Suppose a
path P has a path label L
1
and a user Mike has a read
access label L
2
. According to the XML-FGAC pol-
icy, (1) if L
2
is “EXISTENCE”, Mike could read the
path P regardless of the value of label L
1
; (2) if L
2
is “VALUE”, Mike could read the path P if L
1
is not
“EXISTENCE”; (3) if L
2
is “NULL”, Mike can only
access paths with “NULL” labels; if the DEFAULT
policy coexists, Mike could ask queries to existence
access the path P if L
1
is “VALUE”. The discussion
for Write Access is similar. The above logic is in-
serted into the query access plan. When the access
plan is executed, the access rules from the label ac-
cess policy associated with the labeled XML docu-
ment are evaluated for each path accessed in the doc-
ument. This approach allows the cached access plan
to be reused because the access labels of the user who
issued the query are acquired during runtime.
For an XML document, there is an ordering, docu-
ment order (Clark and DeRose, 1999), defined on all
the nodes in the document corresponding to the order
in which the first character of the XML representa-
tion of each node occurs in the XML representation
of the document. This ordering information may leak
information as shown in the following example.
Example 6.1 Let us look at Figure 1 again. Sup-
pose one security policy wants to block public ac-
cess to the sibling relationships between the Customer
Nodes and their Order Nodes. Suppose the following
queries are allowed to return their answers in docu-
ment order: //Customer and //Order. Then the order
of Customer output might match the order of Order
output, hence leaks secret information. The situa-
tion becomes worse if the document has a registered
schema and the schema shows publicly that each cus-
tomer has a fixed number, say 2, of orders. In this
case, the association between a Customer and his Or-
ders is completely leaked.
To prevent an information leak based on document
order, we shuffle the output as follows. Each node
in the output will receive a random number. And the
nodes will be output based on the order of their as-
signed random numbers.
In sum, the processing algorithm to be inserted in
the access plan for a labeled XML document with
XML-FGAC and DEFAULT policies is as follows.
Algorithm: Insert Read and Write Access logic into
a query access plan for a labeled XML document.
1. Fetch the user’s Access Labels for Read and Write
actions (e.g., from a system catalog table).
2. For all paths accessed, do the following.
(a) If it is a Read Access and READ Access rules
do not permit access, skip the path unless (1) the
Read Access Label is “NULL”, (2) the Path La-
bel is “VALUE”, and (3) it is an existence access.
(b) If it is a Write Access and Write Access rules do
not permit access, skip the path.
3. Shuffle output.
Example 6.2 Suppose the document in Figure 1 has
two labels attached to its paths as specified in Exam-
ple 5.4 and the label access policies are XML-FGAC
and DEFAULT. Suppose a database user Mike with a
read access label “EXISTENCE” asks the query Q
1
:
//Account[Customer/Name]. The query access plan
checks the following paths:
1. the paths P
1
from the root of the document to Ac-
count Nodes, i.e., //Account,
2. the paths P
2
from Account Nodes to their descen-
dant Name Nodes via Customer Nodes, i.e.,
ANCS //Account DESC /Customer/Name,
3. the paths P
3
from Customer Nodes to their children
Name Nodes, i.e., Customer/Name.
Paths P
1
and P
3
have “NULL” labels, hence, access is
allowed. Paths P
2
have “EXISTENCE” labels. Mike
could read them since his read access label is “EXIS-
TENCE”. Read access to P
2
is denied for any other
labels and the authorized answer set is empty.
Next, suppose another user John with a read access
label “VALUE” asks the query Q
2
: //Item//Cost. The
query access plan checks the following paths:
1. the paths P
1
from the root of the document to the
Item Nodes, i.e., //Item,
2. the paths P
2
from the Item Nodes to their descen-
dant Cost Nodes, i.e., ANCS //Item DESC //Cost.
Paths P
1
have “NULL” labels, hence, access is al-
lowed. For P
2
, one path P
21
has a “NULL” la-
bel; the other path P
22
has a “VALUE” label as it is
ANCS //Item[Name=“IPOD”] DESC //Cost. John
could read P
2
if his read access label is “VALUE”.
John could read P
21
but not P
22
if his read access la-
bel is “NULL”. Hence, the authorized answer set is
“450$”. However, even if John’s read access label is
SECRYPT 2006 - INTERNATIONAL CONFERENCE ON SECURITY AND CRYPTOGRAPHY
376