Wednesday, May 20, 2009

XPath 1.0 nodes and axes

XPath 1.0 partitions the nodes in an XML document into 4 types (ignoring comments, processing instructions and namespaces):

  • root - a document has exactly one of these; no parent; exactly one 'element' child
  • element - parent must be 'element' or the 'root'; zero or more children of types 'attribute', 'element' or 'text'
  • text - parent must be 'element'; no children
  • attribute - parent must be 'element'; no children
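
The parent/child constraints above can be written down as a small data structure. This is only a sketch: the `Node` class, its `add` method and the `ALLOWED_PARENTS` table are hypothetical names made up for illustration, not part of XPath or of any library.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Which kinds of node may be the parent of each kind, per the list above.
ALLOWED_PARENTS = {
    "root": set(),                    # the root has no parent
    "element": {"root", "element"},
    "text": {"element"},
    "attribute": {"element"},
}


@dataclass(eq=False)  # eq=False keeps identity-based equality and hashing
class Node:
    kind: str                                   # one of the four keys above
    label: str = ""                             # element/attribute name or text content
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach child under self, enforcing the typing rules."""
        if self.kind not in ALLOWED_PARENTS[child.kind]:
            raise ValueError(f"a {child.kind} node cannot have a {self.kind} parent")
        if self.kind == "root" and self.children:
            raise ValueError("the root has exactly one element child")
        child.parent = self
        self.children.append(child)
        return child


# Build <a x="1">hello</a> and check that illegal shapes are rejected.
root = Node("root")
a = root.add(Node("element", "a"))
x = a.add(Node("attribute", "x"))
hello = a.add(Node("text", "hello"))
```

Text and attribute nodes end up childless automatically, because no node kind lists 'text' or 'attribute' among its allowed parents.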

XPath 1.0 specifies a vocabulary of 13 'axes'. Setting aside the namespace axis leaves 12: ancestor, descendant, preceding, following, preceding-sibling, following-sibling, self, ancestor-or-self, descendant-or-self, parent, child and attribute. There is a denotation function from axes and (context) nodes in a document to sets of nodes in the same document, defined as follows:

  • [[self]]D,n = {n}
  • [[child]]D,n = the set of non-attribute nodes in D which n immediately dominates
  • [[attribute]]D,n = the set of attribute nodes in D which n immediately dominates
  • [[descendant]]D,n = M ∪ ⋃_{m∈M} [[descendant]]D,m, where M = [[child]]D,n
  • [[descendant-or-self]]D,n = [[descendant]]D,n ∪ [[self]]D,n
  • [[parent]]D,n = {m} where m is the parent of n, or ∅ if n is the root (which has no parent)
  • [[ancestor]]D,n = M ∪ ⋃_{m∈M} [[ancestor]]D,m, where M = [[parent]]D,n
  • [[ancestor-or-self]]D,n = [[ancestor]]D,n ∪ [[self]]D,n
  • [[preceding]]D,n = the set of non-attribute nodes in D which precede n (i.e. finish before n starts)
  • [[following]]D,n = the set of non-attribute nodes in D which follow n (i.e. that start after n finishes)
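  • [[preceding-sibling]]D,n = [[preceding]]D,n ∩ {n' | ∃ m∈[[parent]]D,n, n'∈[[child]]D,m} if n is not an attribute node, and ∅ if it is
  • [[following-sibling]]D,n = [[following]]D,n ∩ {n' | ∃ m∈[[parent]]D,n, n'∈[[child]]D,m} if n is not an attribute node, and ∅ if it is (the recommendation makes both sibling axes empty for an attribute context node)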
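
The definitions above translate fairly directly into code. The following is a minimal sketch, not a real XPath engine: the `Node` class, the `spans` helper and the function names are all hypothetical, and results are returned as lists in document order rather than as sets. "Finishes before n starts" is modelled with start/end ticks from a depth-first walk, with attributes visited before children on the assumption that they sit inside the element's start tag. The sibling axes return nothing for an attribute context node, as the XPath 1.0 recommendation requires.

```python
class Node:
    """Hypothetical node: kind is 'root', 'element', 'text' or 'attribute'."""
    def __init__(self, kind, label="", parent=None):
        self.kind, self.label, self.parent = kind, label, parent
        self.kids = []  # every node this one immediately dominates
        if parent is not None:
            parent.kids.append(self)

def self_(n):
    return [n]

def child(n):
    return [k for k in n.kids if k.kind != "attribute"]

def attribute(n):
    return [k for k in n.kids if k.kind == "attribute"]

def descendant(n):
    out = []
    for m in child(n):
        out += [m] + descendant(m)
    return out

def descendant_or_self(n):
    return self_(n) + descendant(n)

def parent(n):
    return [] if n.parent is None else [n.parent]

def ancestor(n):
    out = []
    for m in parent(n):
        out += [m] + ancestor(m)
    return out

def ancestor_or_self(n):
    return self_(n) + ancestor(n)

def spans(n):
    """Map each node in n's document to (start, end) ticks of a depth-first walk."""
    root = ancestor_or_self(n)[-1]
    table, clock = {}, 0
    def walk(m):
        nonlocal clock
        start = clock
        clock += 1
        for k in attribute(m) + child(m):  # attributes come first
            walk(k)
        table[m] = (start, clock)
        clock += 1
    walk(root)
    return table

def preceding(n):
    t = spans(n)
    hits = [m for m in t if m.kind != "attribute" and t[m][1] < t[n][0]]
    return sorted(hits, key=lambda m: t[m][0])

def following(n):
    t = spans(n)
    hits = [m for m in t if m.kind != "attribute" and t[m][0] > t[n][1]]
    return sorted(hits, key=lambda m: t[m][0])

def _siblings(n):
    # Children of n's parent; empty for an attribute context node.
    if n.kind == "attribute":
        return []
    return [k for m in parent(n) for k in child(m)]

def preceding_sibling(n):
    sibs = _siblings(n)
    return [m for m in preceding(n) if m in sibs]

def following_sibling(n):
    sibs = _siblings(n)
    return [m for m in following(n) if m in sibs]

# Example document: <a x="1"><b>t</b><c/></a>
root = Node("root")
a = Node("element", "a", root)
x = Node("attribute", "x", a)
b = Node("element", "b", a)
t = Node("text", "t", b)
c = Node("element", "c", a)

def labels(nodes):
    return [m.label for m in nodes]

print(labels(preceding(c)))          # → ['b', 't']
print(labels(following(x)))          # → ['b', 't', 'c']
print(labels(following_sibling(b)))  # → ['c']
```

Recomputing `spans` on every call keeps the sketch short; a real implementation would number the nodes once per document.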

You can read the XPath 1.0 recommendation here.
