
Protecting Against Open Source Malicious Packages: What Does (Not) Work

This is the third episode in a series of articles about the most prevalent kind of software supply chain attacks: those that abuse a public registry of open-source software components. After analyzing in the previous episode “Anatomy of Malicious Packages: What Are the Trends?” how the bad actors inject malicious behavior into new or existing published components, we are ready to put on our firefighting jackets and examine how we can successfully block malicious software delivered this way, or alternatively, deal with a potentially serious cyber incident because we took the wrong approach.

Most security-aware professionals have ideas about how to handle this threat. We have heard security managers say without hesitation that SCA tools already tell you when a package version is malware. Or that they depend on well-known, highly reviewed software components, where any malware would be promptly detected and removed. Or that they use open minor/patch versions to get vulnerability fixes automatically, which is the proper, recommended way to lower the risk of open source dependencies, following the “patch early, patch often” principle.

In this episode, we will review why these ideas are wrong, and how such misconceptions contribute to the popularity of this attack mechanism and to the overwhelming risk that organizations face. We will end with what does work, and the effort and resources involved.

Common Misconceptions

During our journey with software security, we saw the attack techniques evolving and a wide range of ideas from security-conscious people. Organizations often misunderstand what works against this threat, so first we will examine what does not work, condensed in the following, not exhaustive, list of misconceptions.

Misconception #1: SCA tools already report malicious components

Indeed! But only after the fact, when it is probably too late: if the component was used in a software build, the bad actors may have already gained a foothold in a developer or CI/CD host. Secrets might have been exfiltrated, additional malware downloaded and installed, and perhaps the adversary has moved laterally and gained access elsewhere.

Software Composition Analysis (SCA) tools were designed to identify known vulnerabilities. Modern tools do a great job of improving the signal-to-noise ratio by determining whether a vulnerability is actually reachable or exploitable. But they are useless against new malware. Think of a malicious component as a zero-day vulnerability: only when its malicious behavior is detected is the component reported to the registry that holds it, which, after a review by a security team, confirms it as malicious and removes it [1].

At that point, the world (including SCA tools) knows that installing or using the component (or some versions of an existing component) is not a good thing. But by then the component is no longer available from the registry. Knowing that you have vulnerabilities in third-party components, or even components that the registry has categorized as malicious, is good, but unfortunately SCA or common audit tools do not help in this context, unless the tool can really know that a component is malicious before it is used at your organization.

Remember, any solution against malicious open-source components must detect them on-the-fly, between when the component is published in the registry and when the component (version) is first used at your organization. And that includes transitive components.  

Misconception #2: Controlling installation scripts at build time prevents malicious behavior from open-source components

Various package managers offer the ability to run scripts (included in the component tarball [2]) for legitimate reasons, such as compiling required items on different platforms, generating code, or running tests. We should all know that these scripts can be abused by bad actors if malicious scripts are included in the tarball, or if the attacker can make a malicious script run instead of the good one.

Knowing this, we can configure the package manager to ignore scripts. For example, with NPM the --ignore-scripts flag (or the equivalent configuration property in the .npmrc file) skips the scripts during installation. This may cause some issues, because running scripts is common in many ecosystems, and some package managers do not even permit disabling script execution (hint: prompt “Which package managers do not permit disabling the execution of install scripts?” in your favorite AI). Nor does it protect in general: the setting must be enforced everywhere components are installed.

And when the malicious behavior is located not in install scripts but in the code that executes at runtime, this option alone does not protect us.
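As a rough illustration of what "enforced everywhere" could look like for NPM, the sketch below checks whether the text of a .npmrc file enables the ignore-scripts setting. The function name is hypothetical, and the parsing is deliberately simplified (key=value lines, with the last occurrence winning, which is an assumption of this sketch); a real check would also have to cover command-line flags and environment overrides.

```python
def ignore_scripts_enabled(npmrc_text: str) -> bool:
    """Return True if the given .npmrc content sets ignore-scripts=true.

    Hypothetical helper: simplified key=value parsing, last occurrence wins.
    """
    enabled = False
    for raw_line in npmrc_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments (lines starting with ';' or '#').
        if not line or line.startswith((";", "#")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ignore-scripts":
            enabled = value.strip().strip('"').lower() == "true"
    return enabled
```

A check like this could run on developer machines and CI hosts to report where the setting is missing. It says nothing about malicious code that runs at runtime rather than at install time.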

Misconception #3: Version pinning prevents malicious components from being installed

There is a tradeoff between patching early and often with open versions (letting the package manager automatically install new updates when they become available, to get security fixes) and version pinning (keeping all the direct and transitive dependencies of a piece of software at a fixed version). Security principles are stubborn and sometimes contradictory, as happens with “patch early, patch often” and “upgrading should not be taken lightly”. Some package managers make automatic updates with semver ranges the recommended way. Great if you also want to receive the malicious updates! Yes, components must be updated to receive security fixes that close vulnerabilities as soon as possible, but … never let the package manager do this automatically.
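To make the difference between open ranges and pinned versions concrete, here is a minimal sketch that lists the direct dependencies of a package.json-style manifest whose version spec is not an exact version. The function name and the regular expression are assumptions of this example; the pattern only recognizes plain MAJOR.MINOR.PATCH specs with optional pre-release and build suffixes.

```python
import re

# Exact versions such as 1.2.3, 1.2.3-beta.1 or 1.2.3+build.5 (no range operators).
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)

def unpinned_dependencies(manifest: dict) -> dict:
    """Return {name: spec} for each direct dependency not pinned to an exact version."""
    unpinned = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec.strip()):
                unpinned[name] = spec
    return unpinned
```

For a manifest with "lodash": "^4.17.21" and "left-pad": "1.3.0", only lodash would be reported. The check covers direct dependencies only; transitive dependencies are fixed by a lockfile, not by the manifest.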

Misconception #4: Using trusted components is safe. Any malicious version would be promptly found, disclosed and removed.

Why is a component trusted? Possibly because it is highly popular, with many eyeballs looking for vulnerabilities, a large number of contributors for maintenance, with multiple core maintainers who diligently review all pull requests. The reality is quite different. Some essential components are maintained by a single, unpaid developer. Widely used frameworks have a few regular contributors, with a quickly decreasing number of commits per maintainer (popular projects have a long tail of contributors that perform some drive-by commit and never come back). And popular projects with a single maintainer abound.

Imagine yourself saying “Oh, we are using Spring Boot / Angular / React / PyTorch / official base Docker images, so the risk you are talking about is pretty low.” Perhaps that is true, we security vendors are scaremongering all the time, and interfering with development teams to mitigate a debatable risk is nonsense. You might be tempted to jump to the risk acceptance paragraph (in the next section) and be done. Unfortunately, the most popular components are targets for bad actors; for example, the popular PyTorch library was attacked in the past.

“Promptly found, disclosed, and removed”. It takes days for a new malicious component to be removed from the public registry. Registries are cautious about removing a component version, and for good reason. Our experience is that, once we report a version, the median time for the registry to remove it is 39 hours, more than a day and a half. Some malicious components remain in the registry for a week after our initial report before they are removed. And in some cases, the component is removed only after a victim or an incident response company reports an incident involving it.

What Does NOT Work Against Malicious Components

Any unspecific approach will fail miserably. This is a certainty: without specific controls, you are not providing effective countermeasures for the risk associated with this threat.

Traditional SCA tools tell you about known malware but have a large exposure window. Unless they are proactively performing malware detection with enforced blocking of malicious components, they do not work against this threat. 

Disabling installation scripts could help but needs to be enforced everywhere a component is installed. The same goes for version pinning, as versions cannot stay pinned at a safe initial state forever.

Assuming that popular components get enough attention that they cannot be injected with unintended behavior in a supply chain attack without an almost instantaneous detection to prevent any damage is naive and risky. You don’t want to live on the edge, do you?

If you stop at this point, then risk acceptance is the only thing you can do. This is a decision that needs to be documented in your threat model/risk assessment, including the rationale for accepting the risk and its potential implications. Raise awareness by communicating it to management and other relevant parties. Some contingency could be planned for when a malicious component is installed or included in your software, but this is hard because attackers have many paths to follow. The details of a supply chain attack based on the use of a malicious component will drastically change the public disclosure of the incident, which is probably mandatory under your organization’s regulatory framework. You may also consider compensating controls or transferring the risk, e.g., with insurance.

However, there are controls that address the threat and should be considered if you are not satisfied with risk acceptance. Please read on.

What Does Work Against Attacks Using Malicious Components

Solid Version Handling

Version pinning with controlled and informed version bumps is the way to go, balancing the need to remove vulnerabilities against the risk of receiving malware. But remember misconception #3: version pinning alone does not suffice to block malicious code coming from new versions, because sooner or later you will need to update the versions of direct or indirect dependencies. At that moment you need strong enough evidence that none of the updated versions contains malware.

Early Warning

One approach to the problem of malicious components is an early warning system (named here as Malware Early Warning or MEW), where new versions published (for new or existing components) are analyzed by a detection engine, which when enough evidence is found may classify the new version as potentially malicious. 

Automation is essential here, as it is impossible to manually review all the new components at the current publishing rate. So the detection engine needs to combine a variety of techniques, perhaps including static, dynamic, and capability analysis, user reputation, and evidence coming from discrepancies between the component metadata and the tarball contents, or between tarball and the source repository where the component supposedly comes from.

There is a dark zone between the publishing time and the moment the engine analyzes the component contents, but it should not exceed a few minutes. The scheme can be modified, for example by waiting for new components to be analyzed before allowing them to be installed and used in the software build pipelines, or by analyzing them on demand when needed. A component at a given version is immutable [3], so it needs to be analyzed only once.
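The "analyze only once" idea, together with the digest check mentioned in footnote [3], can be sketched as a small verdict cache keyed by component name and version. Everything here is hypothetical (class and method names, the verdict strings); the only real API used is SHA-256 from Python's hashlib.

```python
import hashlib

def tarball_digest(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the tarball bytes."""
    return hashlib.sha256(data).hexdigest()

class VerdictCache:
    """Stores one analysis verdict per (name, version), bound to the tarball digest."""

    def __init__(self):
        self._verdicts = {}  # (name, version) -> (digest, verdict)

    def record(self, name: str, version: str, data: bytes, verdict: str) -> None:
        self._verdicts[(name, version)] = (tarball_digest(data), verdict)

    def lookup(self, name: str, version: str, data: bytes):
        """Return the cached verdict, or None if the version was never analyzed
        or the tarball no longer matches the digest seen at analysis time."""
        entry = self._verdicts.get((name, version))
        if entry is None:
            return None
        digest, verdict = entry
        return verdict if digest == tarball_digest(data) else None
```

A None result means the component must be (re)analyzed before use: either it is new, or its contents changed after the analysis, which should never happen for an immutable version and is suspicious in itself.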

Full automation is not possible, and a security review of potentially malicious components is needed. Beware of digital panacea proponents: AI and Machine Learning are not developed enough to have the last word when it comes to confirming whether a suspect component contains malware. Sure, machine learning plays a key role in the detection engine, classifying the input component from the raw evidence captured, but once the component is “quarantined” the final word belongs to a manual review by a security team with experience in malicious components. This review confirms the potential malware or re-classifies the component as safe, and it takes a time in the range of hours.

The malicious version/component is then reported to the registry, which performs its own review to confirm it and proceeds to public disclosure and removal from the registry. Some registries keep a security holding package. The time range here is days or weeks since publication, which is the ‘dwell time’ or ‘exposure window’ for most malicious components.

Is it possible to know if a component version is malicious?

So for early warning, we need to give a satisfactory answer to this question: How can I know that a library or package is (not) malicious? How to gather enough evidence of malicious behavior? Possible, but difficult, as the adversaries use much ingenuity to avoid detection. There are different approaches, each with pros and cons.

Static analysis can examine all execution paths, check for techniques used by attackers without running the component, and perform preprocessing tasks like de-obfuscation or deciphering. As attackers try to hide their mischief, obfuscation attempts are indeed evidence of malware (but note that legit components also obfuscate code to preserve intellectual property, contradicting “open source”). Only a minority of highly sophisticated attacks with strong obfuscation need sandboxing, but such strong obfuscation is a tell-tale sign of maliciousness. Please note that conventional SAST tools were designed for unintentional vulnerabilities, not for malicious intent like backdoors.
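As a toy example of one static signal, the sketch below flags long string literals with high Shannon entropy, a common hint of encoded or encrypted payloads. It is not the detection engine described in this article: the function names, the simplistic string-literal pattern, and both thresholds are assumptions chosen for illustration, and a hit is weak evidence on its own, since legitimate code may also embed encoded data.

```python
import math
import re
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Entropy of the character distribution of text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Simplistic matcher for single- or double-quoted literals (assumption of this sketch).
STRING_LITERAL = re.compile(r"""(["'])((?:\\.|(?!\1).)*)\1""")

def suspicious_literals(source: str, min_length: int = 40,
                        min_entropy: float = 4.5) -> list:
    """Return string literals that are both long and high-entropy."""
    hits = []
    for match in STRING_LITERAL.finditer(source):
        literal = match.group(2)
        if len(literal) >= min_length and shannon_entropy(literal) >= min_entropy:
            hits.append(literal)
    return hits
```

In practice, a signal like this would be one input among many to a classifier, combined with the dynamic, capability, and context evidence discussed in this section.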

Dynamic analysis runs the component and examines its behavior by instrumenting the runtime, typically in a sandboxed environment. Malicious behavior triggered only under certain conditions may pass undetected: malware may use evasion techniques like Virtualization/Sandbox Evasion to activate only when not under scrutiny. Such evasion code, however, is itself a tell-tale sign of malicious activity for any static analysis engine.

Capabilities analysis considers what the component does: where it connects to, which files it accesses, which commands or programs are run, the terminal or device I/O performed, or which system calls are invoked. This fingerprinting of behavior could be compared (for an existing component) across versions, so when unexpected behavior is detected, that evidence could raise suspicion of potential malicious activity injected in the new version. This approach follows the triage steps that security analysts follow when faced with potential malware: an inspection using strings or similar tools. This approach detects malicious behavior regardless of triggering conditions and works when no source code is available.
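The comparison of capability fingerprints across versions can be sketched as a set difference per category. The fingerprint format here (a dict mapping a category name to a set of observed items) is a hypothetical simplification; how the capabilities are actually extracted is outside the scope of this example.

```python
def capability_drift(previous: dict, current: dict) -> dict:
    """Return, per category, the capabilities seen in `current` but not in `previous`."""
    drift = {}
    for category, observed in current.items():
        added = set(observed) - set(previous.get(category, ()))
        if added:
            drift[category] = sorted(added)
    return drift
```

For example, if version 1.4.2 of a component only contacted its registry and version 1.4.3 also contacts an unknown IP address and runs curl, the drift would contain both new items, which is the kind of unexpected behavior that raises suspicion.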

Context analysis collects information about how the component was published and by whom. Bad actors’ campaigns often use new user accounts that are not subject to any strict vetting process. Tracking past activity may give insights into the underlying user, mostly through anomalies that may hint at a potential compromise. Reputation is so hard to earn and so easy to lose! A user with no past activity is neutral, but karma pursues the malevolent. Hacktivists, and normal users who have had their publishing credentials stolen, should be tracked carefully.

Another piece of contextual information is any discrepancy between the source repository supposedly used to create the component tarball and the contents of the tarball itself. Another is whether good practices are followed, like creating tags or releases in the source repository that match the versions of the component published in the public registry. When every release has been tagged at a particular commit in the source repository, and then suddenly one version fails to follow the pattern, that alone is strong evidence that the component could be tainted (the bad actor might have compromised the account used for publishing the component, but has no write permissions in the source code repository). Many attacks are routinely detected using these rules: for example, the Ledger attack could be easily detected along these lines. Context analysis, therefore, identifies such anomalies in the publishing process.
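The tag-matching rule described above can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that tags are named either "X.Y.Z" or "vX.Y.Z"; projects with other tagging conventions would need a different normalization.

```python
def versions_without_tag(published_versions, repo_tags):
    """Return published versions that have no matching tag in the source repository.

    Assumes tags named 'X.Y.Z' or 'vX.Y.Z' (an assumption of this sketch).
    """
    normalized = {tag[1:] if tag.startswith("v") else tag for tag in repo_tags}
    return [version for version in published_versions if version not in normalized]
```

A project that has always tagged its releases and suddenly publishes an untagged version would show up here. The result is a signal to investigate, not proof of compromise.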

Dependency Firewalling

A different approach is to have a comprehensive whitelist of components for all dependency graphs used in your software, so in any build pipeline run in your organization only approved component versions can be installed and used. The “firewall” is enforced using an internal registry where the tarballs for the allowed component versions are served (cached or proxied). Please note that any whitelist will not work unless you have the technology for classifying any new version as reasonably safe so it can be added to the whitelist. 

Please note that early warning (detection as soon as possible after the new version is published) needs to be combined with some way to use that information proactively, blocking the component before it affects the build pipelines or the developers’ machines [4]. We call this “dependency firewalling”: a quarantine mechanism for protecting automated builds from malicious packages. Internal package and image registries are good for insulating organizations from outer evil, but strong enough evidence is necessary to make the quarantine effective.
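The whitelist check at the heart of a dependency firewall can be sketched as follows. The function names and the (name, version) tuple representation are hypothetical; in a real setup the enforcement would typically live in the internal registry or proxy rather than in a script.

```python
def blocked_components(resolved, allowlist):
    """Return the resolved (name, version) pairs that are not approved, sorted."""
    return sorted(set(resolved) - set(allowlist))

def enforce_firewall(resolved, allowlist):
    """Raise PermissionError if any resolved component version is not approved."""
    blocked = blocked_components(resolved, allowlist)
    if blocked:
        names = ", ".join(f"{name}@{version}" for name, version in blocked)
        raise PermissionError(f"Build blocked; unapproved components: {names}")
```

The check is done per version, not per component name: an approved component with a new, not yet vetted version is blocked until that version is classified as reasonably safe and added to the list.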

Runtime Sandboxing

An alternative to detection at publishing time is to analyze behavior at runtime. The idea is to capture the expected behavior of the software and detect (or block) any anomalies that are found. This line of action has the drawback of requiring the runtime to be instrumented for monitoring or blocking, but it is a promising idea that will be added to the arsenal of protection mechanisms against the malicious component pest.

Setting a Comprehensive Strategy

The recommended strategy combines different techniques in the software development process, taking control of version updates to block incoming malicious components. We must reconcile version pinning, which avoids automatic infection, with updating versions to get fixes for the vulnerabilities that matter. This requires a quick and efficient assessment of direct and indirect dependencies during version updates, to have enough evidence that they are not malware-ridden. Builds of software that depend on known malicious components must be blocked. And all of this must be enforced.

Use version pinning when possible, as it also makes builds more reproducible. Version bumps should be controlled and manually approved, assisted by helper technology that assesses whether the update brings malware or breaks the software, thus reconciling updates that fix vulnerabilities with avoiding malware infection. Tooling can help here by (1) prioritizing which vulnerabilities really matter (reachable and exploitable, with a high risk of being targeted by attackers), (2) selecting target versions that are compatible with the current component usages and do not break the software, (3) choosing target versions that do not contain malicious behavior, and (4) making the version update for direct and indirect dependencies a snap, by suggesting changes in the manifest files that can be quickly approved. Step (3) needs specific information about malicious components as close to their publication time as possible.

This process of updating dependencies must be enforced and verified in all places. The process must be documented, and all parties involved should be trained, as development and software build/deployment are often externalized. The CI/CD pipelines should be modified accordingly, so that automation does not allow a malicious indirect dependency to slip into the build: guardrails that block the build when there is enough evidence of potential malware in a dependency are the recommended way to go.
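A guardrail that also covers indirect dependencies has to walk the whole resolved dependency tree. The sketch below assumes a nested dictionary in which each node has a "version" and an optional "dependencies" mapping; this shape, the function name, and the set of flagged "name@version" strings are all assumptions of the example, standing in for whatever resolved-tree format and malware intelligence feed your tooling provides.

```python
def find_flagged(tree: dict, flagged: set, path: tuple = ()) -> list:
    """Depth-first walk of a nested dependency tree.

    Returns the dependency path leading to every component whose
    'name@version' appears in `flagged`.
    """
    hits = []
    for name, node in tree.get("dependencies", {}).items():
        component = f"{name}@{node.get('version', '?')}"
        here = path + (component,)
        if component in flagged:
            hits.append(" > ".join(here))
        # Recurse so that indirect (transitive) dependencies are checked too.
        hits.extend(find_flagged(node, flagged, here))
    return hits
```

A pipeline step could call this function and fail the build with a non-zero exit code whenever the returned list is not empty, printing the paths so that developers can see which direct dependency pulled in the flagged version.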

If your organization has an internal registry acting as a security proxy for holding the allowed component versions, you must obtain intelligence on malicious components (besides other criteria) for vetting a requested component before adding it to the allowance list. 

Consuming open-source software safely is not easy, and the malware factor must be fully taken into account, with an effort similar to that put into vulnerability handling.

One final note: source provenance, in the form of software attestations generated at the build time of the component, is another key piece in the effort to trace the artifact (component tarball) back to the sources and build process that produced it. Note that this link between the source snapshot plus build environment and the associated software artifact (signed by the trusted build system) does not by itself guarantee that the component is free of malicious behavior, but it makes it harder for the bad guys to inject malware. Making provenance validation a common requirement for consuming open source components will take a long time, as support was only recently added to NPM. Making those trusted build and deploy systems tamper-proof, or enabling detection of any tampering in the build, is a different story, out of the scope of this post.

Further reading

The next episode, “Open Source Malicious Packages: The Xygeni Approach”, will present the strategy we follow at Xygeni for our Malware Early Warning (MEW) system. New package versions in the public package and image registries are scanned, and evidence is obtained using a combination of static, dynamic, capabilities, and contextual analysis. The evidence, combined with user reputation and the history of changes in source code repositories, allows for a fully automated classification of a component into high-risk and probably malicious categories. The system learns from past evidence gathered from packages to reduce false positives to a minimum.

Subscribed organizations receive a warning notification for components they are using, directly or indirectly, when a malicious version is categorized. Then a manual analysis is done by our analysts, which confirms or rejects the classification. For confirmed malware, the public registry is notified so that it can perform its own analysis and typically remove the malicious version or take additional action, such as blocking or removing the user account in question.

We will explain how we are helping NPM, PyPI, GitHub, and other key infrastructures in the open source ecosystem to reduce the dwell time, that is, the time a newly published malicious component remains active until it is confirmed as malware and removed from the registry. We will also explain how organizations may benefit from the MEW system to get much better protection against software supply chain attacks involving open source components.

  • [1] Anyway, users of the component need to check if the component tarball is cached or registered somewhere, for example in an internal registry, so the malady is eradicated.
  • [2] The packaged component includes a manifest that declares its contents and metadata, source or compiled code, installation scripts, and additional items such as test suites, according to a packaging format and typically in compressed form. This is called the “component tarball”.
  • [3] Even if the malicious actor can modify a published component due to a breach in the registry itself, a plain-old cryptographic digest can detect any change in the tarball after the analysis is done.
  • [4] Remember that some malicious components run at install time, so they can affect developer machines that unwittingly run “npm install X”, with X a malicious component.

