Lack of Version Pinning and Dependency Confusion

In software development we depend on both our own and third-party components or artifacts. Flexible dependency management is essential for modern software. Package managers like NPM, Maven, pip or NuGet are often used to specify software dependencies. These tools were designed with convenience and ease of use in mind, not security.

 

The problem

The problem is that flexibility and ease of use for developers also attract bad actors, who see software dependencies as an irresistible target for their business. The result: bad actors have followed all the possible attack paths catalogued in the dependency attack tree of “Backstabber’s Knife Collection: A Review of Open Source Software Supply Chain Attacks”.

In this post we will focus on the use of open version declarations, in the sense that the downloaded version is not fixed but must belong to a certain range. At build time, the highest existing version compatible with the version range specified is chosen and downloaded / installed by the package manager.

Let’s illustrate open version declarations for different package managers:

      • NPM: package.json

    {
       …
       "dependencies": {
          …
          "accepts": ">=1.3.8",
          "lodash": "~4.16.0",
          …
       },
       …
    }

    The highest existing version not below 1.3.8 will be installed for the accepts package, as well as the highest ‘patch’ update for lodash in the 4.16.x range.

        • Maven: pom.xml

      ...
      <dependencies>
      ...
      <dependency>
           <groupId>commons-io</groupId>
           <artifactId>commons-io</artifactId>
           <version>RELEASE</version>
      </dependency>
      ...
      </dependencies>
      ...

      The latest available release of commons-io (a jar file) will be added as a dependency.

          • Pip: setup.py

        ...
        setup(
            ...
            install_requires=['peppercorn', 'launchpadlib'],
            ...
        )
        ...

        With no version specifier at all, pip will install the latest available release of peppercorn and launchpadlib.

        Such open version schemes have a good and a bad side. The good side is that newer versions usually contain functional and quality improvements, bug fixes, and security patches, which are picked up automatically. Please note that for most real-world projects, fixes are not backported to previous minor releases, except perhaps for catastrophic security vulnerabilities. Open versions are also useful in libraries, to reduce the number of versions that need to be installed when all dependencies are resolved.

        But open version ranges have a bad side. You don’t know exactly which versions will be installed at build time, so builds are not repeatable. There is also a dark side. If a bad actor manages to publish a malicious component in the public repository with a high version that is compatible with your open range, your next build will include the malicious component, perhaps even running malware from installation scripts that are executed automatically. Obfuscating the attack payload is something of an art.

        This is known as the Lack of Version Pinning issue.
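The resolution rule behind open ranges (the highest published version inside the range wins) can be sketched in a few lines of Python. This is a simplified illustration, not how npm or any other package manager is actually implemented: it only understands plain MAJOR.MINOR.PATCH versions and the `>=` and `~` operators used in the examples above, and the lists of published versions are made up for the demonstration.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints that compares correctly."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a '>=X.Y.Z' or '~X.Y.Z' range (simplified)."""
    candidate = parse(version)
    if spec.startswith(">="):
        return candidate >= parse(spec[2:])
    if spec.startswith("~"):
        base = parse(spec[1:])
        # Tilde range: same major and minor, patch at or above the base.
        return candidate[:2] == base[:2] and candidate >= base
    # Anything else is treated as an exact pin.
    return candidate == parse(spec)


def resolve(available: List[str], spec: str) -> Optional[str]:
    """Return the highest available version inside the range, or None."""
    matching = [v for v in available if satisfies(v, spec)]
    return max(matching, key=parse) if matching else None
```

With the hypothetical list `["1.3.7", "1.3.8", "2.0.1"]`, the range `>=1.3.8` resolves to `2.0.1`. If someone later publishes `99.0.0`, the same declaration resolves to `99.0.0` on the next build, with no change on our side. An exact pin such as `1.3.8` keeps resolving to `1.3.8`.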

        Bad actors are always trying to publish malicious versions of popular open-source packages. They may gain access to the keys for package repositories through a secret leak; they often use social engineering or hide a nested malicious dependency in an apparently useful pull request. A few authors have even decided one day that the world is not fair and bitten their own users with protestware in their own packages!

        Now imagine that you are working for an organization that uses internal components plus open-source ones.
        If a bad actor knows the name of such internal components, they can publish a component with the same name in the public repository. Many package managers check the public repository first, and if the version is properly chosen and the version in your declared dependency is open, boom! This issue is named Dependency Confusion.

        Let’s show an example. Suppose that in our NPM project we have a dependency on a private component:

            • NPM: package.json

          {
            "name": "my-project",
            …
            "dependencies": {
              …
              "my-private-dep": ">=1.0.0",
              …
            }
            …
          }

           

          The attacker may create a high major version of my-private-dep (like 99.0.0) and publish it in the public npm repository from their own throwaway account (the attacker does not need any access to my organization). The NPM package manager will install the malicious dependency, often with devastating outcomes.
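To make the mechanism concrete, here is a minimal Python sketch of a resolver that looks up a package name in both a private and a public index and simply takes the highest version that satisfies a `>=` range. The function `naive_resolve`, the index contents and the version numbers are all hypothetical, and real package managers differ in how they combine registries; the sketch only models the behaviour described above.

```python
from typing import Dict, List, Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints that compares correctly."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(
    private_index: Dict[str, List[str]],
    public_index: Dict[str, List[str]],
    name: str,
    minimum: str,
) -> Optional[Tuple[str, str]]:
    """Pick the highest version >= minimum across BOTH indexes.

    Returns (version, source) where source is 'private' or 'public'.
    """
    candidates = []
    for source, index in (("private", private_index), ("public", public_index)):
        for version in index.get(name, []):
            if parse(version) >= parse(minimum):
                candidates.append((parse(version), version, source))
    if not candidates:
        return None
    # The origin of the package plays no role: only the version number counts.
    _, version, source = max(candidates)
    return version, source
```

If the private index holds `my-private-dep` at `1.0.0` and `1.2.0` and the public index is empty for that name, the call returns `("1.2.0", "private")`. As soon as an attacker publishes `99.0.0` publicly, the same call returns `("99.0.0", "public")`.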


           

          The solution

          To avoid these issues in our software build process, we should follow strict rules on how to declare component versions, which depend on the technology used. The important thing is that a specific version of a package, once published in a repository, should be immutable (to avoid breaking dependents, not only for security reasons).

          The general idea is to pin versions, always checking that the pinned versions of the components (including ALL the transitive dependencies) are free of malware. This is possible thanks to the lockfiles that many package managers offer. There is a delicate tradeoff between frequent version updates for fixing known vulnerabilities and version pinning to avoid non-deterministic builds and potential supply chain attacks. Let’s see how version pinning works for different package managers.

              • NPM:
                The npm and yarn package managers use different lockfiles (npm-shrinkwrap.json / package-lock.json and yarn.lock, respectively) that list fixed versions for all dependencies, direct and indirect. The lockfiles should be under version control, otherwise other developers / build nodes may end up with different versions. Avoid npm install except when, during development, you need to update the dependencies (e.g. to install security fixes). In general, use the more deterministic npm ci (Clean Install): the package manager will use the lockfile, or terminate with an error if there is no lockfile or it does not match package.json. If the versions listed were checked for malware, the lockfile ensures that the build installs exactly those versions.

                For internal components it is recommended to create an NPM scope managed by the organization (like @myorg), and use that scope in the dependency (like @myorg/my-private-dep), which could have private visibility only. This blocks dependency confusion attacks, as only members of the organization with write access can publish packages under such a scope.

              • Maven:
                Maven does not have native lockfiles (but see this StackOverflow article). Gradle offers dependency locking as an opt-in feature.

                Version ranges are not used with Maven/Gradle as much as in other ecosystems. Just avoid version ranges and the LATEST or RELEASE meta-versions. The versions of indirect dependencies should be checked as well. The Versions Maven Plugin is a nice tool for managing dependency versions.

                Please note that Maven has always had the concept of an organization scope (the groupId part of the dependency), and dependency confusion does not seem to be a problem for that ecosystem.

              • Pip:
                In Python there are different tools for handling lockfiles:

                – pipenv, which generates a Pipfile.lock lockfile.
                – poetry, which generates poetry.lock.
                – pip freeze, a command that generates a requirements.txt which acts as a lockfile. Check that all dependencies use fixed versions with the == operator. Then pip install -r requirements.txt installs the pinned dependencies.

            Remember that the lockfiles above should be under version control, and that the build command chosen should use the lockfile.

            The usual package repository used with pip (PyPI) does not have naming scopes, and it is vulnerable to dependency confusion attacks. Avoiding dependency confusion in the Python ecosystem is not easy. Some authors recommend using an internal repository that acts as a proxy for public dependencies fetched from PyPI, but serves the private dependencies first from the internal repository (--index-url should point to the internal repository, not PyPI, and --extra-index-url should be removed).
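Both pip recommendations (pin everything with `==`, and do not mix registries with `--extra-index-url`) are easy to check automatically. The following is a minimal sketch, not a replacement for a real auditing tool: the function name `audit_requirements` is our own, and it assumes a simple requirements.txt with one requirement per line, no hashes and no line continuations.

```python
import re
from typing import List

# A pinned requirement looks like "name==1.2.3", optionally with extras
# ("name[extra]==1.2.3") and an environment marker after ";".
# Wildcards such as "==1.2.*" are deliberately not accepted as pins.
PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[0-9][^\s;*,]*\s*(;.*)?$"
)


def audit_requirements(text: str) -> List[str]:
    """Return a list of findings for a requirements.txt given as a string."""
    findings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments (simplified)
        if not line:
            continue
        if line.startswith("--extra-index-url"):
            findings.append(f"line {number}: --extra-index-url mixes registries")
        elif line.startswith("-"):
            continue  # other pip options, such as --index-url, are not judged
        elif not PINNED.match(line):
            findings.append(f"line {number}: '{line}' is not pinned with ==")
    return findings
```

A build step could call this function on the contents of requirements.txt and fail when the returned list is not empty.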

            Some real attacks

            Getcookies attack: The actor dustin87 added to the popular npm mailparser package an indirect dependency on a malicious package with an RCE backdoor, triggered by request headers of the form gCOMMANDhDATAi:

            JSON.stringify(req.headers).replace(/g([a-f0-9]{4})h((?:[a-f0-9]{2})+)i/gi, (o, p, v) => {})

            Despite being deprecated (no reviewers!), mailparser still received about 64,000 weekly downloads. This was a near-miss attack, as the RCE was not actually exercised.
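To see what the pattern in the one-liner above picks out, here is the same regular expression transcribed to Python. This is only an illustration of the matching step: the header name `x-demo` and the payload are invented, and what the real backdoor did with each command is not modelled here.

```python
import re
from typing import List, Tuple

# Same pattern as the JavaScript one-liner; the "i" flag becomes re.IGNORECASE.
# Layout: "g" + 4 hex digits (command) + "h" + hex-encoded bytes (data) + "i".
BACKDOOR = re.compile(r"g([a-f0-9]{4})h((?:[a-f0-9]{2})+)i", re.IGNORECASE)


def extract_commands(serialized_headers: str) -> List[Tuple[str, bytes]]:
    """Return (command, decoded data) pairs hidden in a serialized header string."""
    return [
        (match.group(1), bytes.fromhex(match.group(2)))
        for match in BACKDOOR.finditer(serialized_headers)
    ]
```

For the made-up input `'{"x-demo":"g0001h68656c6c6fi"}'` the function returns `[("0001", b"hello")]`, while ordinary headers such as `'{"host":"example.com"}'` yield an empty list. Since g, h and i are not hexadecimal digits, they work as unambiguous delimiters inside an otherwise innocent-looking header value.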

            NPM published this post with details about the getcookies attack.

            Dependency Confusion:

            In 2021, Alex Birsan discovered the dependency confusion issue and published a post entitled “How I Hacked Into Apple, Microsoft and Dozens of Other Companies”.

            Remember that for npm the organization scope like @myorg should be reserved, and internal packages should be modified to use the scope.

            With pip, the common public registry PyPI does not have scopes/namespaces. For each private package, a placeholder public package could be registered (squatted) with the same name as the internal package, but empty, and perhaps raising an error when used, so that an accidental fetch can be identified.

            Node-ipc:

            When the Russia-Ukraine war began, the package owner injected malicious code that removed random files when installed on Russian and Belarusian hosts. The file ssl-geospec.js made this geographic distinction.

            Interestingly enough, other packages, like the popular Vue.js framework, used open versions for the node-ipc dependency, and their maintainers received an urgent appeal to pin the node-ipc dependency to a safe version.

            This post contains more details on this sabotage, which goes a step further than other protestware issues.

             

            Concluding Remarks

            Open versions should never be used in established software projects. They make builds non-reproducible, and attackers may exploit them to inject malware via attacks on the dependency tree, like the aforementioned dependency confusion.

            Misconfigurations like open versions, lack of version pinning, or unscoped internal components should be avoided. The first thing is to detect such issues, perhaps even blocking the build when they are found, and to have a standardized protocol for action.
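As an illustration of such a detection step, the sketch below scans a package.json for open version declarations and for internal packages that are not published under a scope. It is a simplified example under our own assumptions: the function name `audit_package_json` is hypothetical, the list of internal package names must be supplied by the caller, and anything that is not a plain MAJOR.MINOR.PATCH string (including pre-release versions) is reported as open.

```python
import json
import re
from typing import Iterable, List

# Only a plain "MAJOR.MINOR.PATCH" string counts as an exact pin here.
EXACT = re.compile(r"^\d+\.\d+\.\d+$")


def audit_package_json(text: str, internal_names: Iterable[str] = ()) -> List[str]:
    """Return findings for a package.json document given as a string."""
    manifest = json.loads(text)
    internal = set(internal_names)
    findings = []
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT.match(spec):
                findings.append(f"{name}: open version '{spec}'")
            # Scoped npm packages start with "@", e.g. @myorg/my-private-dep.
            if name in internal and not name.startswith("@"):
                findings.append(f"{name}: internal package without a scope")
    return findings
```

In a CI job, the build could be blocked by exiting with a non-zero status whenever the returned list is not empty.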

            Automatic detection of flaws and misconfigurations in dependencies, reporting suspect dependencies that could be vulnerable to specific supply-chain attacks like dependency confusion, all with actionable fix tools, is one of the main goals of the Xygeni platform.

            To read more
            Ohm M., Plate H., Sykosch A., Meier M.: “Backstabber’s Knife Collection: A Review of Open Source Software Supply Chain Attacks”. DIMVA 2020. Lecture Notes in Computer Science, vol 12223. Springer – 2020 (source of the dependency attack tree figure).
            @adam-npm: “Reported malicious module: getcookies”. npm Blog (archived) – May 2, 2018.
            Alex Birsan: “How I Hacked Into Apple, Microsoft and Dozens of Other Companies”. Medium – Feb 9, 2021.
            Ax Sharma: “BIG sabotage: Famous npm package deletes files to protest Ukraine war”. BleepingComputer – March 17, 2022.
