I have an XML of the form:

<?xml version="1.0" encoding="UTF-8"?>
<semseg:Envelope xmlns:semseg="http://a-random-URL" xmlns="http://another-random-URL">
    <semseg:subject>Subject</semseg:subject>
    <semseg:Sender>
        <semseg:name>Me</semseg:name>
    </semseg:Sender>
    <Triangle>
        <Triangle time='2017-11-29'>
            <Triangle key='a' value='b'/>
            <Triangle key='c' value='d'/>
            <Triangle key='e' value='f'/>
            <Triangle key='g' value='h'/>
        </Triangle>
    </Triangle>
</semseg:Envelope>

And I am trying to retrieve the element <Triangle> (not <Triangle time='2017-11-29'> - element names are a bit repetitive in this XML) using XPath. Part of the code is the following:

DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document doc = documentBuilder.parse("file.xml");

XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression xpr = xPath.compile("/semseg:Envelope/Triangle");
NodeList nodes = (NodeList)xpr.evaluate(doc, XPathConstants.NODESET);

I have tried many variations of the XPath without any luck: no elements are selected. Nevertheless, testing the same XPath against the same XML file with an online XPath checker yields exactly the results I am looking for. It even works for attribute retrieval using XPaths like

/semseg:Envelope/Triangle/Triangle/@time

It seems the problem lies with the namespace prefixes: parsing XML without any namespace prefixes works just fine with XPath.

  • Have you had a look at stackoverflow.com/questions/7020638/… and stackoverflow.com/questions/3939636/…? Commented Sep 19, 2018 at 13:35
  • I think you can try this, " //Triangle[not(@*)]" as your xpath Commented Sep 19, 2018 at 13:36
  • @reflexdemon this does not work unfortunately Commented Sep 19, 2018 at 13:45
  • @GPI I cannot really use a namespace context since the prefix is only applicable to some of the elements in the XML. You can see that Envelope does have such a prefix but Triangle does not. Commented Sep 19, 2018 at 13:48

2 Answers

Your XML input actually has two namespaces.

Default namespace

The first is the default namespace, declared like this:

<semseg:Envelope ... xmlns="http://another-random-URL" ...

Because it is the default, any XML element written without a prefix belongs to this namespace.

semseg namespace

The second is declared like this:

<semseg:Envelope xmlns:semseg="http://a-random-URL" ...

This means every XML element prefixed with semseg belongs to this namespace.
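
To check this for yourself, here is a minimal sketch that parses a trimmed-down copy of the XML from the question with a namespace-aware parser and prints the namespace URI the DOM reports for a few elements. The class name NamespaceInspection and the helper namespaceOfChild are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question or answer.
class NamespaceInspection {

    // Trimmed-down copy of the XML from the question.
    static final String XML =
          "<semseg:Envelope xmlns:semseg='http://a-random-URL' xmlns='http://another-random-URL'>"
        + "<semseg:subject>Subject</semseg:subject>"
        + "<Triangle><Triangle time='2017-11-29'/></Triangle>"
        + "</semseg:Envelope>";

    // Parses the XML with namespace awareness switched on (it is off by default).
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Namespace URI of the first child element of the root with the given local name, or null if none.
    static String namespaceOfChild(String xml, String localName) {
        Node root = parse(xml).getDocumentElement();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && localName.equals(n.getLocalName())) {
                return n.getNamespaceURI();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parse(XML).getDocumentElement().getNamespaceURI()); // http://a-random-URL
        System.out.println(namespaceOfChild(XML, "subject"));                  // http://a-random-URL
        System.out.println(namespaceOfChild(XML, "Triangle"));                 // http://another-random-URL
    }
}
```

The unprefixed Triangle reports the default namespace URI, not an empty namespace, which is the detail that matters for the XPath.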

Translating your requirements

So you're aiming at an XPath expression that will target

  • any Triangle element (no prefix, so this actually means a Triangle element in the http://another-random-URL namespace)
  • that is a direct child of the root semseg:Envelope element (which actually means a root element with the local name Envelope in the http://a-random-URL namespace).

Programming this in XPath

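XPath 1.0 has no notion of a default namespace: an unprefixed name such as Triangle in an expression always means "no namespace", so it can never match the Triangle elements here. We therefore create a NamespaceContext that describes the namespaces we are working with. I define the prefixes I wish to use and map them to the namespace URIs. The XPath engine resolves these prefixes, and they do not have to match the prefixes used in the document. I map: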

  • The main prefix to the http://a-random-URL namespace
  • The secondary prefix to the http://another-random-URL namespace

With this mapping, your requirement translates to this XPath:

/main:Envelope/secondary:Triangle

And this works:

XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
xPath.setNamespaceContext(new NamespaceContext() {
    @Override
    public String getNamespaceURI(String prefix) {
        if ("main".equals(prefix)) {
            return "http://a-random-URL";
        }
        if ("secondary".equals(prefix)) {
            return "http://another-random-URL";
        }
        return null;
    }
    @Override
    public String getPrefix(String namespaceURI) {
        // This should be implemented but I'm lazy and this sample works without it
        return null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        // This should be implemented but I'm lazy and this sample works without it
        return null;
    }
});
XPathExpression xpr = xPath.compile("/main:Envelope/secondary:Triangle");
NodeList nodes = (NodeList)xpr.evaluate(doc, XPathConstants.NODESET);
System.out.println(nodes.getLength());

Output:

1

Here I have implemented a really dumb namespace context, but if you have the Spring Framework, CXF, Guava (I think), or other frameworks within reach, you will often find something like SimpleNamespaceContext or MapBasedNamespaceContext, which are probably better options.
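
If none of those libraries is available, a small map-backed context is not much work. The following is only a sketch under a few assumptions: PrefixMapNamespaceContext, countMatches and sampleMapping are names invented here, and the class skips parts of the NamespaceContext contract (such as the reserved xml and xmlns prefixes) that this example does not need.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not taken from any library: a NamespaceContext backed by a prefix -> URI map.
class PrefixMapNamespaceContext implements NamespaceContext {

    // Trimmed-down copy of the XML from the question.
    static final String XML =
          "<semseg:Envelope xmlns:semseg='http://a-random-URL' xmlns='http://another-random-URL'>"
        + "<Triangle><Triangle time='2017-11-29'><Triangle key='a' value='b'/></Triangle></Triangle>"
        + "</semseg:Envelope>";

    private final Map<String, String> uriByPrefix;

    PrefixMapNamespaceContext(Map<String, String> uriByPrefix) {
        this.uriByPrefix = new HashMap<>(uriByPrefix);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) throw new IllegalArgumentException("prefix must not be null");
        // The interface contract asks for NULL_NS_URI ("") when a prefix is unbound.
        String uri = uriByPrefix.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) throw new IllegalArgumentException("namespaceURI must not be null");
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : uriByPrefix.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) prefixes.add(entry.getKey());
        }
        return prefixes.iterator();
    }

    // The same prefix choice as in the answer: main and secondary.
    static Map<String, String> sampleMapping() {
        Map<String, String> mapping = new HashMap<>();
        mapping.put("main", "http://a-random-URL");
        mapping.put("secondary", "http://another-random-URL");
        return mapping;
    }

    // Counts the nodes an XPath expression selects in the XML, resolving prefixes through the mapping.
    static int countMatches(String xml, String expression, Map<String, String> mapping) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new PrefixMapNamespaceContext(mapping));
            NodeList nodes = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(XML, "/main:Envelope/secondary:Triangle", sampleMapping())); // 1
        System.out.println(countMatches(XML, "/main:Envelope/Triangle", sampleMapping()));           // 0
    }
}
```

The second call shows why the original expression selected nothing: even with a context in place, an unprefixed Triangle in the XPath does not pick up the document's default namespace.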

This works for me:

/*[local-name()='Envelope']/*[local-name()='Triangle']/*[local-name()='Triangle']/@time
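
For completeness, here is a minimal sketch of how the local-name() expression above could be evaluated from Java without any NamespaceContext. The class name LocalNameXPath and the helper evaluate are invented for this example, and the XML is a trimmed-down copy of the one in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
class LocalNameXPath {

    // Trimmed-down copy of the XML from the question.
    static final String XML =
          "<semseg:Envelope xmlns:semseg='http://a-random-URL' xmlns='http://another-random-URL'>"
        + "<Triangle><Triangle time='2017-11-29'><Triangle key='a' value='b'/></Triangle></Triangle>"
        + "</semseg:Envelope>";

    // Matches on local names only, so no prefix has to be resolved.
    static final String TIME_EXPRESSION =
          "/*[local-name()='Envelope']/*[local-name()='Triangle']/*[local-name()='Triangle']/@time";

    // Evaluates the expression against the XML and returns the result as a string.
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate(XML, TIME_EXPRESSION)); // 2017-11-29
    }
}
```

Keep in mind that local-name() ignores namespaces entirely, so the expression would also match an Envelope or Triangle element from any other namespace.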

1 Comment

This does work, but I was looking for something more elegant and concise, which is why I upvoted it but did not mark it as the accepted answer. Thanks a lot though!
