Who’s On Phirst

Official blog of Phurnace Software.

Tag >> Xpath Expressions

Posted by: Pete Pickerill on

In my previous post, "Automatically generate XPath Expressions in Java,” I showed you how you could use Java to automatically generate XPath expressions from a single xml document or a group of XML. So now you have your XML files and the XPath expressions to validate them…but how do you do it?

Below is a sample Java class for verifying your XML using XPath expressions. Enjoy!

In Xpath ExpressionsXML
Comment (0) Read More...


Posted by: Pete Pickerill on

Anyone who’s worked with J2EE application servers has more than a passing familiarity with XML. It’s the weapon of choice for configuration in the 3 platforms I’ve spent most of my time working with; JBoss, WebLogic, and WebSphere. When I first started working at Phurnace, I would verify these files by hand using a diff tool that could handle single documents or a directory tree. I would compare a ‘known good’ configuration to the configuration produced by Phurnace Deliver. This worked great while I was getting my feet wet and finding my way around each platform’s configuration peculiarities.


For obvious reasons, this method sucked a lot after I got a handle on the different configuration conventions used by each platform. The process was incredibly monotonous and error prone. My mind would start to wander as I robotically clicked through the differences in dozens of XML files. Most of the time the differences were expected due to system generated unique IDs or environmental differences. I decided I needed to automate this process simply because I dreaded doing it manually. Because we were dealing with XML, XPath seemed to be the obvious choice for programmatic validation. So I just needed to whip up some XPath expressions and a simple Java class to resolve them and I was done. No bubbles, no troubles. Right?


Well I was forgetting that a WebSphere profile with one node and one server contains more than 100 individual XML files ranging in size from <1K to 600K+. If I thought using the diff tools was tedious, it was nothing compared to generating thousands of XPath expressions by hand. Plus, every time the product expanded into new areas of configuration, I was looking at hours of updating the automation data to keep it current. I needed a tool that would generate these XPath expressions for me. I had some specific needs that weren’t addressed by the sample classes and open source solutions I was finding on the web so I decided to write my own. Keep in mind:

 

  • Coding isn’t my forte. You may find bone-headed logic or hacky work-arounds. But for my purposes it works and it’s quick. Consider it your chance to test the tester and file a bug by leaving a comment. I’d love to get feedback and improve it!
  • This is a simplified version of the class I use and it works fine on simple XML. This sample class has some limitations. It doesn’t handle namespace designations and I haven’t added handling for special characters. I encourage you to play around with it and add methods that suit your purposes.

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.apache.commons.lang.StringUtils;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XPathGenerator {

/*
* fileList - a list of files to generate XPath expressions from xpath - an
* xpath instance used to verify generated xpaths
*/
private static List fileList = new ArrayList();
private static XPath xpath = XPathFactory.newInstance().newXPath();

/*
* Simple utility method that checks for whitespace at the beginning a text
* node
*/
private static boolean isWhiteSpace(String nodeText) {
if (nodeText.startsWith("\r") || nodeText.startsWith("\t")
|| nodeText.startsWith("\n") || nodeText.startsWith(" "))
return true;
else
return false;
}

/*
* Simple utility method to verify brutishly assembled xpath expressions
*/
private static void checkXPath(String xpathExpression, Node node) {
Object xpathCheck;
try {
xpathCheck = xpath.evaluate(xpathExpression, node
.getOwnerDocument(), XPathConstants.BOOLEAN);
if (xpathCheck.toString() == "true") {
/*
* print the file/xpath combo to use for future verification For
* Example:
* file:/C:/tmp/sample.xml=/rootNode/firstChild/firstGrandChild
* [@attrib="value"][text()="some text"]
*/
System.out.println(node.getOwnerDocument().getDocumentURI()
+ "=" + xpathExpression);
}
} catch (XPathExpressionException xpe) {
System.out.println("\n\n" + xpathExpression);
xpe.printStackTrace();
}
}

/*
* Simple utility method to check for a text node on the currenrt XML
* element
*/
private static boolean hasValidText(Node node) {
String textValue = node.getTextContent();

return (textValue != null && textValue != ""
&& isWhiteSpace(textValue) == false
&& !StringUtils.isWhitespace(textValue) && node.hasChildNodes());
}

/*
* Simple utility to check for attributes on the current element
*/
private static boolean hasValidAttributes(Node node) {
return (node.getAttributes().getLength() > 0);

}

/*
* Build XPath attribute list for individual XML Element Nodes Iterate
* through all attribute nodes and build a string that can be included in an
* XPath expression to validate an XML document. Resulting string will look
* like: [@attrib1="value1" and @attrib2="value2"] Skip this if the
* attribute list is empty.
*/

private static String buildAttribString(Node node, String pathExpr) {
NamedNodeMap nnlist = node.getAttributes();
pathExpr = pathExpr + "[";

int attribCount = 0;
// iterate over attributes
for (int i = 0; i < nnlist.getLength(); i++) {
// grab attribute name and value
String attribName = nnlist.item(i).getNodeName();
String attribValue = nnlist.item(i).getNodeValue();

// if we've already added attributes to the path expression append
// the current one
if (attribCount > 0) {
pathExpr = pathExpr + " and ";
}

pathExpr = pathExpr + "@" + attribName + "=\"" + attribValue + "\"";
attribCount++;
}

pathExpr = pathExpr + "]";

return pathExpr;
}

/*
* processNode checks for attributes and text nodes on an xml node and
* process them accordingly
*/
public static NodeList processNode(Node node) {

if (hasValidAttributes(node) || hasValidText(node)) {
String pathExpr = "/" + node.getNodeName();

// check for attributes
if (hasValidAttributes(node)) {
pathExpr = buildAttribString(node, pathExpr);
}

// Make a copy of node to preserve it's state
Node tmpNode = node;

// Build pathExpr for XPath Expression by working backward through
// the XML until we hit the document node.
while (tmpNode.getParentNode() != null
&& tmpNode.getParentNode().getNodeType() != Node.DOCUMENT_NODE) {

tmpNode = tmpNode.getParentNode();
String nodeName = tmpNode.getNodeName();

if (hasValidAttributes(tmpNode)) {
String attribString = buildAttribString(tmpNode, nodeName);
pathExpr = "/" + attribString + pathExpr;
} else {
pathExpr = "/" + nodeName + pathExpr;
}
}

if (hasValidText(node)) {

/*
* Build XPath text value expression to verify text values for
* node. This is a little tricky because linebreaks, tabs and
* spaces included in the XML document are considered text
* nodes.
*/

// Copy original node to a textNode to preserve state of
// original node for future operations
Node textNode = node;

/*
* This check iterates through child nodes of the original node
* until it reaches the next element node. As long as the node
* is not null, starts with white space and is not an element
* node, move on to the next node
*/
while (textNode != null
&& isWhiteSpace(textNode.getTextContent()) == true
&& StringUtils.isWhitespace(textNode.getTextContent())
&& textNode.getNodeType() != Node.ELEMENT_NODE) {
textNode = textNode.getFirstChild();
}

/*
* If the text content of the current node is not null, doesn't
* start with whitespace and has child nodes, we grab the text
* and frame it into an XPath text argument. A sample element
* and it's resulting text argument:
*
* some text here YIELDS
* element1[text()="some text here"]
*
* NOTE: The check for child nodes prevents us from creating a
* blank text argument for self contained elements (example:
* )
*/
pathExpr = pathExpr + "[text()=\""
+ StringUtils.strip(textNode.getTextContent()) + "\"]";

}

checkXPath(pathExpr, node);
}
// gather children and return
return node.getChildNodes();
}

// This function takes a group of nodes and generates XPath expressions for
// them until it runs out of nodes
public static void processNodeList(NodeList nodelist) {
for (int i = 0; i < nodelist.getLength(); i++) {
if (nodelist.item(i).getNodeType() == Node.ELEMENT_NODE
&& (hasValidAttributes(nodelist.item(i)) || hasValidText(nodelist
.item(i)))) {
processNode(nodelist.item(i));
}
processNodeList(nodelist.item(i).getChildNodes());
}
}

// This function takes the cmd line argument and adds to fileList all files
// in the directory and it's subdirectory
private static void processDir(File dir) {
File[] fileArray = dir.listFiles();

for (int i = 0; i < fileArray.length; i++) {
if (fileArray[i].isFile()) {
if (fileArray[i].toString().endsWith(".xml")) {
fileList.add(fileArray[i]);
}
} else if (fileArray[i].isDirectory()) {
processDir(fileArray[i]);
}
}
}

// This starts the process of an individual XML file
private static void processXML(File xmlFile) throws SAXException,
IOException, ParserConfigurationException, XPathExpressionException {
try {
DocumentBuilderFactory factory = DocumentBuilderFactory
.newInstance();
factory.setIgnoringComments(true);

DocumentBuilder parser = factory.newDocumentBuilder();
Document doc = parser.parse(xmlFile);

NodeList startlist = doc.getChildNodes();

processNodeList(startlist);
} catch (Exception e) {

}
}

/*
* Entry point. Pass in the xml file or top level directory of a group of
* files to generate XPath expressions for them.
*/

public static void main(String[] args) {
File startDir = new File(args[0]);
try {
if (startDir.isFile()) {
processXML(startDir);
} else if (startDir.isDirectory()) {
processDir(startDir);
for (int i = 0; i < fileList.size(); i++) {
processXML(fileList.get(i));
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

In Xpath ExpressionsXML
Comment (3) Read More...