XZ Backdoor: “That was a close one”

Backdooring SSH

A nefarious or compromised maintainer inserted malicious behavior in a library named liblzma, part of the xz compression tools and libraries, resulting in a backdoor in SSH. This is an advanced software supply chain attack as the library was intentionally modified for the backdoor, with obfuscation and stealth techniques for hiding the attack payload from reviewers.

It was discovered and disclosed on March 29, 2024, and the handling of the attack is still ongoing. It was quickly contained, however, as it seems to affect only pre-release versions of a limited set of environments (DEB and RPM packages, for the x86_64 architecture, built with GCC). Even so, the CVE was given a CVSS base score of 10, which is reserved for the most critical cybersecurity vulnerabilities. Had it entered stable distributions, the impact would have been overwhelming.

The technical details of the attack, including in-depth explanations of the xz backdoor, have been analyzed elsewhere. This post focuses on the timeline of the attack, how it could be detected, how the incident has been handled to date, and what lessons may be drawn from it.

How the XZ backdoor was injected

Note: The git repository is at git.tukaani.org. However, there was also a GitHub-hosted repository (currently blocked) where the GitHub account posted the changes that were later integrated into the Git repository.

One portion of the backdoor seems to be only in the distributed tarballs for the 5.6.0 and 5.6.1 versions, not in the git repositories, and relies on a single line in the build-to-host.m4 macro file used by autoconf. The other portion was in two supposed test files, bad-3-corrupt_lzma2.xz and good-large_compressed.lzma, that were committed by the GitHub account “Jia Tan” (JiaT75) in the xz repository on 23 Feb. It looked like an innocuous change adding test files (supposedly .lzma and .xz compressed blocks). Interestingly enough, the test files were not used by the tests!

The line in the .m4 file injects an obfuscated script (included in the tarball) that is executed at the end of configure if some conditions match. It modifies the Makefile for the liblzma library to contain code that extracts data from the .xz file, which after deobfuscation yields a second script that is also invoked at the end of configure. That script decides whether to modify the build process to inject code: only under GCC and the GNU linker, under Debian or rpm builds, and only for x86_64 Linux. When the conditions match, the injected code intercepts execution by replacing two ifunc resolvers so that certain calls are redirected. This causes the symbol tables to be parsed in memory (this takes time, which led to the detection, as explained later).
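The gating described above can be pictured as a simple predicate. The sketch below is only a simplified model of the conditions listed in this post, not the attacker's script: the BuildEnv type, its field names, and the string values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildEnv:
    """Hypothetical description of a build environment (illustration only)."""
    compiler: str    # e.g. "gcc" or "clang"
    linker: str      # e.g. "gnu-ld" or "lld"
    packaging: str   # e.g. "deb", "rpm", or "none"
    arch: str        # e.g. "x86_64"
    os_name: str     # e.g. "linux"


def injection_would_trigger(env: BuildEnv) -> bool:
    """Return True only when every condition listed in the text holds."""
    return (
        env.compiler == "gcc"
        and env.linker == "gnu-ld"
        and env.packaging in {"deb", "rpm"}
        and env.arch == "x86_64"
        and env.os_name == "linux"
    )
```

Seen this way, it is easier to understand why the backdoor stayed quiet on many developer machines: any single mismatch (another compiler, another architecture, a build outside a packaging system) leaves the build untouched.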

Then things get interesting: the backdoor installs an audit hook into the dynamic linker and waits for the RSA_public_decrypt function symbol to be resolved. The symbol is redirected to a point in the backdoor code, which in turn calls back into libcrypto, presumably to perform normal authentication. The payload activates only if the running program has the process name /usr/sbin/sshd, so it was clear that SSH servers were the target. Traditionally, sshd servers like OpenSSH were not linked with liblzma, but sshd is often patched to support systemd-notify so other services can start when sshd is running. liblzma is then loaded indirectly through libsystemd, closing the circle.
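Whether a given sshd binary pulls in liblzma at all can be checked from its resolved shared-library dependencies. The following is a minimal sketch that assumes the standard ldd tool is available; the function names are made up for this example, and the parsing assumes the usual "name => path (address)" layout of ldd output.

```python
import subprocess


def links_liblzma(ldd_output: str) -> bool:
    """Return True if any line of `ldd` output names a liblzma shared object."""
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("liblzma.so"):
            return True
    return False


def sshd_links_liblzma(sshd_path: str = "/usr/sbin/sshd") -> bool:
    """Run `ldd` on the sshd binary and check its resolved dependencies."""
    result = subprocess.run(
        ["ldd", sshd_path], capture_output=True, text=True, check=True
    )
    return links_liblzma(result.stdout)
```

Since ldd lists transitive dependencies, a liblzma entry shows up even when sshd only reaches it through libsystemd. A positive result only says the library is loaded, not that the loaded version is a backdoored one.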

The backdoor is not yet fully analyzed, but it seems to allow remote command execution (RCE) with the privileges of the sshd daemon, running in a pre-authentication context. Data from the remote certificate, when matched by the backdoor, is decrypted with ChaCha20 and, if it decrypts successfully, passed to system(). So this is essentially a gated RCE, much worse than a mere public key bypass.

A later 5.6.1 tarball showed additional efforts to hide the traces, adding further obfuscation of symbol names and trying to fix the errors seen. An extension mechanism was also put in place: additional test files were searched for certain signatures, so that their content could be added to the backdoor.

This fairly sophisticated attack could have gone unnoticed until it reached stable Linux distributions. Fortunately, some people like to check why abnormal things happen.

The Discovery

Injected malicious behavior is often unearthed by chance or accident. A good example was a deprecation warning (“Who cares about warnings?”) that led to the discovery of the event-stream attack in Oct 2018. Another is the user who warned Codecov in April 2021 that their bash uploader script did not pass the checksum (“Who verifies the integrity of artifacts with checksums?”).
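For reference, the kind of check that caught the Codecov incident takes only a few lines. This is a minimal sketch assuming the publisher provides a SHA-256 digest in hexadecimal; the function names are illustrative.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_published_checksum(data: bytes, published_hex: str) -> bool:
    """Compare the artifact's digest with the published one, ignoring case and whitespace."""
    return sha256_hex(data) == published_hex.strip().lower()
```

A checksum only proves that the artifact is the one the publisher released. In the XZ case the tarballs were created and signed by the malicious maintainer, so a check like this would have passed.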

Anomalies and odd symptoms with ssh logins (logins taking a lot of CPU and increased elapsed time, valgrind errors) aroused the curiosity of Andres Freund, a vigilant PostgreSQL developer but not a security analyst (as he stated). After some investigation with OpenSSH on Debian Sid, he concluded that the response time problem was caused by a library, liblzma, part of the xz-utils compression package. The reason: “the upstream xz repository and the xz tarballs have been backdoored”. The diagnosis was spot on!

On Mar 29 2024 Andres posted the first analysis on Openwall: “backdoor in upstream xz/liblzma leading to ssh server compromise”. The fact: XZ Utils 5.6.0 and 5.6.1 tarballs contain a backdoor. These tarballs were created and signed by the aforementioned Jia Tan account.

He posted on Mastodon later that day, acknowledging that the discovery was accidental and required a lot of coincidences. The comments from other users are worth reading.

GitHub user thesamesam (aka Sam James) published a nice Gist FAQ on the xz-utils backdoor that summarizes the attack and links to more in-depth analyses of the attack payload.

These analyses were technically juicy and helped us better understand the injection, which was highly elaborate.

This nice poster from Thomas Roccia shows part of the activity of JiaT75 on the GitHub repository and how the injection script inserts the binary backdoor.

How the incident was handled

Disclosure by Andres Freund was cautious because, in his own words:

“Given the apparent upstream involvement, I have not reported an upstream bug. As I initially thought it was a debian specific issue, I sent a more preliminary report to security@...ian.org. Subsequently, I reported the issue to distros@. CISA was notified by a distribution.”

Red Hat assigned this issue CVE-2024-3094. Then the word circulated like wildfire.

Lasse Collin, the other maintainer of XZ, added a new commit on Sat 30 Mar titled “CMake: Fix sabotaged Landlock sandbox check”. One of the library's Landlock sandboxing checks had been sabotaged, at least when building with CMake. He promptly disclosed the issue on his XZ Utils backdoor page.

CVE-2024-3094 (see also the CVE, NVD, and Ubuntu entries) was assigned a whopping CVSS Base Score of 10. Such scores always take the Internet by storm.

On the same day, Mar 29, CISA released an alert, perhaps too simplistic due to the urgency, recommending that users downgrade to the 5.4.6 stable version.
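A first triage step on a host is to check which xz version is installed. Here is a minimal sketch that parses text in the style printed by xz --version; the function names are illustrative, and the affected set contains only the two releases named in this post.

```python
import re
from typing import Optional

AFFECTED_VERSIONS = {"5.6.0", "5.6.1"}  # the releases named in this post


def parse_xz_version(version_output: str) -> Optional[str]:
    """Extract the first dotted x.y.z version found in the text, if any."""
    match = re.search(r"\b(\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def is_affected(version_output: str) -> bool:
    """Return True when the reported upstream version is one of the affected ones."""
    return parse_xz_version(version_output) in AFFECTED_VERSIONS
```

The upstream version string is only a hint. Distribution packages may carry their own version suffixes or reverted code under a newer number, so the vendor advisory remains the authority for a given system.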

GitHub repositories under the Tukaani organization were disabled. Is this good or bad? I think good: many distros and organizations were still linking to the GitHub releases to source the infected tarballs for building, and disabling the repo prevents that. There is in any case a copy of the repos at git.tukaani.org. The GitHub accounts of JiaT75 and Lasse Collin (Larhzu) were suspended as well. This is part of the containment, even when it may affect innocent people. JiaT75's activity in non-disabled repositories cannot be seen yet.

The industry reacted promptly. Many vendors published rules for detecting vulnerable systems, like Yara rules, or support in commercial tools from Sysdig, PAN, and others. Security specialists like James Berthoty posted about reviewing how we approach open-source software. 

We are now in the Eradication and Recovery phase of the incident. Other projects where JiaT75 was active are under close review, notably libarchive/libarchive (where JiaT75 was a regular contributor) and the fuzzer oss-fuzz (where a commit made by JiaT75 tried to keep the fuzzer away from the affected code; oss-fuzz was in fact not able to detect the backdoor). These concealment attempts add further evidence.

Who is behind the attack?

Either the GitHub JiaT75 account was compromised (remember that GitHub mandated 2FA recently) or the physical user owning the account went to the dark side. But there are compelling reasons to think of an advanced persistent threat (APT), perhaps state-backed, due to the technical sophistication of the attack. Further investigation by cybersecurity agencies and law enforcement will tell …

This entry in Y Combinator's Hacker News about Jia Tan sheds some light on the “who” and his activity. Recommended! It gives much information about how the bad guys use social engineering to deceive other users.

“Very annoying - the apparent author of the backdoor was in communication with me (rwmj) over several weeks trying to get xz 5.6.x added to Fedora 40 & 41 because of it's "great new features". We even worked with him to fix the valgrind issue (which it turns out now was caused by the backdoor he had added). We had to race last night to fix the problem after an inadvertent break of the embargo. He has been part of the xz project for 2 years, adding all sorts of binary test files, and to be honest with this level of sophistication I would be suspicious of even older versions of xz until proven otherwise.”

Jia Tan took measures to avoid being tracked: he seems to have used a VPN (vpn.singapore.witopia.net) to connect, which is ok per se. And many changes seem to be backed by temporary, single-use email addresses (from ProtonMail in this case) urging that the changes be merged.

The actor might have intended to go even deeper, up to the Linux kernel, as a contributor to the xz-embedded project. An initial analysis has found no evidence of wrongdoing there, as of today.

Note: another low-profile XZ contributor, “Hans Jansen” (GitHub user “hansjans162”), is under scrutiny. His account at Debian is now blocked. He made many updates to Debian Games to conceal the one he wanted on debian/xz-utils: an update to upstream 5.6.1, meant to hurry the distribution of the backdoor to debian/unstable.

All we can say, for now, is that this is a (yet unidentified) APT using different accounts, working for at least two years on this campaign, and patiently working to implant an RCE in SSH.

Could this be prevented?

Rather difficult. 

First, part of the injected backdoor came in compressed test files that were not used by any test. In retrospect that could have raised some (noisy) alarms, but who in the real world checks that every test file is used by an actual test? Second, part of the injected backdoor came in macro files inside the release tarballs, and it is difficult to check manually for differences from the expected tarballs. Automation is also complex: as anyone who knows how automake/autoconf works can tell, the expected output of the build itself is difficult to model, so it is hard to analyze whether the real tarball matches expectations. Some posed it as “The tarballs mismatching from the git tree is a feature, not a bug”. Provenance of tarballs from their source code is an unresolved problem.
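To make the tarball-versus-git problem concrete, here is a minimal sketch that lists the files present in a release tarball but absent from the git tree. It assumes the tarball keeps everything under a single top-level directory and that the set of git-tracked paths was obtained separately (for instance from git ls-files); the function name is made up for this example.

```python
import io
import tarfile


def tarball_only_files(tar_bytes: bytes, git_tracked: set[str]) -> list[str]:
    """List regular files in a release tarball that are absent from the git tree.

    Assumes every member sits under one top-level directory, which is
    stripped before comparing against the git-tracked paths.
    """
    extras = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            # Drop the leading "package-version/" component of the member name.
            _, _, relative = member.name.partition("/")
            if relative and relative not in git_tracked:
                extras.append(relative)
    return sorted(extras)
```

For an autotools project the resulting list is long and mostly legitimate (generated configure scripts, Makefile.in files, bundled macros), which is exactly the noise the paragraph above complains about. The output is a starting point for review, not a verdict, and it does not catch files that exist in both places with different content.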

User reputation? Well, judging by its past commits, the JiaT75 GitHub account was not doing rogue things. It was suspended only after the evidence accumulated, but up to 29 Mar this was a regular user going about normal business. Well, not so normal. Several later commits, one of which adjusted the exploit code, tried to fix the valgrind errors and crashes in some configurations, which were due to differences from the stack layout expected by the backdoor. Commit reviews could detect this, but who has the patience to analyze changes in a binary test file or the real motivation for a change in GCC attributes in C source code?

Should one raise alarms when an SSH login takes 800 ms instead of 300 ms? Probably only hyper-prudent people would take note. Cicero said, “Rashness belongs to youth; prudence to old age.”  
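For those who do want to take note, a crude check against a baseline is easy to write. The sketch below flags a login time that exceeds the median of earlier samples by a chosen factor; the function name and the default ratio of 2.0 are arbitrary assumptions, not a recommended threshold.

```python
from statistics import median


def is_latency_anomalous(baseline_ms: list[float], observed_ms: float,
                         ratio_threshold: float = 2.0) -> bool:
    """Flag an observation that exceeds the baseline median by a given ratio."""
    if not baseline_ms:
        raise ValueError("baseline_ms must contain at least one sample")
    return observed_ms > ratio_threshold * median(baseline_ms)
```

With a baseline around 300 ms, an 800 ms login would be flagged and a 350 ms one would not. In practice network jitter and host load make such a rule noisy, which is part of why this kind of signal rarely gets attention.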

The ifunc infrastructure was added in Jun 2023 by “Hans Jansen” and “Jia Tan”. The first commit added ifunc support to crc64_fast.c (later used to inject the backdoor), months before the backdoor binaries were injected in the test files!

Note: Author and committer differ here, but this is normal: Lasse Collin is the project maintainer, and he merged the changes. He even thanks “Hans Jansen” …

Nobody raised concerns before Andres Freund’s post and the CVE created by Red Hat. If you see a cascade of tools that claim they would catch this, note that they detect the affected component now, ex post facto.

Probably the best prevention came from the nature of Linux distributions, where unstable, bleeding-edge versions only pass to stable distributions downstream following a paced process.

Lessons Learned

We have noted how difficult it is to detect intentional backdoors. Backdoors should be considered an internal threat, as they are planted by internal staff or via compromised internal accounts, and those people are mostly trusted. When the backdoor is implanted in the distributed artifact, it is even harder to detect.

Some authors like Kevin Beaumont pointed at systemd, which opens a large attack surface of third-party services to backdoor. This is what the bad actor abused here. Systemd has a lot of eyeballs, but XZ is an obscure library up the chain. “When the upstream is tainted, everyone drinks poisoned water downstream”.

An unrelated change request in systemd for dynamically loading compression libraries, which would have closed this path for the backdoor, was already merged into systemd but not delivered yet. The extra dependencies introduced by libsystemd may be a source of vulnerabilities, and yesterday this request was opened.

A comment in the “xz: Disable ifunc to fix Issue” commit gave a sharp insight on where to put the focus if we want to prevent such activity (emphasis is mine):

“The lesson that we should learn as a community is more to secure software supply chain security holistically, auditing build systems beyond just source code. Like the SolarWinds breach where attackers modified software updates for SolarWinds closed-source monitoring software offering.”

The early discovery and prompt reaction greatly limited the impact. Remember the ending scene from Men in Black III: “That was a close one”. Once again, K did not forget to leave the tip. And no Boglodite entered Linux stable distributions.

1.  “I am *not* a security researcher, nor a reverse engineer.”
2.  Jia is a common Chinese given name. Tan is also a common family name meaning “magnificent”. Many unrelated people share this name, so please do not condemn anyone because of it!
