What can go wrong with CI/CD pipelines?

Continuous integration and continuous delivery (CI/CD) pipelines are the foundation of any organization that builds software in a “modern” way. Automation provides great power, but many developers overlook the responsibility that comes with it.

Developer: Yeah, we take CI/CD security seriously: we have strong control over code maintainers, we review commits before merges, jobs and pipelines are maintained by senior staff who take care not to leak secrets, and the tool was installed by people who know the thing. What can go wrong?

Dear developer, CI/CD systems are complex. Their wide attack surface attracts malicious actors. Better be wary and never overconfident.

Default configurations are often kept as-is and become the hacker’s best friend. Critical flaws can be present in the pipeline sources, in the configuration of the CI/CD system, or in the process and context around the pipeline and how it is triggered.

In this post we put ourselves in the bad actors’ shoes. Imagine we are reading the musings of M3M3N70 (Memento Mori?) and Swamp Rage somewhere on the dark web, probably in a non-Western language, but never forget that evil is spread worldwide.

 

In the good old times it was so easy…

M3M3N70: Back in the good old times our business was sooo easy… Zero-days were low-hanging fruit, apps were wide open with easy-to-exploit vulns, and we could move laterally in a snap.

Swamp Rage: Fu#@Hell! Some hare-brained ones are still out there, but things changed. The big guys put lots of dough on that AppSec shit.

M3M3N70: Yep. But the new fools are the devs. For us, it has been easier to go for the tools these guys use. The CI, in particular, is a gold mine! Cloud access tokens, SCM credentials, production database passwords, SSH private keys, other CI users’ credentials… Jumping from the boring dev things to the real meat was rather trivial.

Automating the build, test and deployment of software with a CI/CD tool often requires passing secrets to commands in pipeline steps. And those secrets often leak, with infamous consequences.

Pipelines need secrets that sometimes are leaked

M3M3N70: The code monkeys slipped their AWS keys into a pipeline in a GitHub commit; later they removed the thing but did not rewrite the Git history. It was trivial for our scripts to scan the history, grab the keys and wreak havoc.

A typical find from the good old times: an .env file in the Git history (the developer forgot to add it to .gitignore):

AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=wJalrXUtn...
AWS_REGION=us-east-1
APP_FOLDER=...
S3BUCKET=...

which was used in a GitHub workflow .github/workflows/deploy.yaml that contained something like this:

jobs:
  build:
    name: Automated build and deployment into AWS
    runs-on: ubuntu-22.04
    steps:
      # Load environment from .env file
      - id: dotenv
        uses: falti/dotenv-action@v1.0.2

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ steps.dotenv.outputs.aws_access_key_id }}
          aws-secret-access-key: ${{ steps.dotenv.outputs.aws_secret_access_key }}
          aws-region: ${{ steps.dotenv.outputs.aws_region }}

      # ... build steps skipped ...

      - name: Upload compiled code for deployment
        working-directory: ${{ steps.dotenv.outputs.app_folder }}/packaged_app
        run: aws s3 cp my-app.zip s3://${{ steps.dotenv.outputs.s3bucket }}

      - name: Deploy the app
        run: aws deploy create-deployment ...

M3M3N70: Wow! Those AWS keys were working! We first tested an innocuous change in the app, then added the sting as those guys seemed unaware. Bingo! What a campaign…

The bad actor simply used the AWS keys to upload a modified application containing malware and then ran the deploy command with those credentials, guided by the leaked secrets and the information contained in the pipeline. “What a campaign!” probably means that Memento wreaked havoc on the poor victim.

What Memento tells us here is that once a secret leaks, like the AWS access keys in the example, you have to revoke it (rotate the keys) immediately. There is always an exposure window between the leaking commit and the secret invalidation; rewriting Git history is hard (even the toughest authoritarian states have tried history rewriting, to no avail) and probably ineffective (our friends might already have cloned the repository with the leaking commit). Rotate keys immediately, and pray while reading the activity logs for the targeted account during the exposure window!

Organizations should probably ban long-term secrets in CI/CD pipelines and replace them with temporary credentials. In the previous example with AWS keys in GitHub Actions, it is safer to use an OpenID Connect (OIDC) provider to obtain the short-lived credentials needed by the workflow, as sketched below.
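
As an illustration, here is a minimal sketch of the OIDC approach; the IAM role ARN is a placeholder, and its trust policy for GitHub’s OIDC provider must already exist in the AWS account:

permissions:
  id-token: write      # lets the job request an OIDC token from GitHub
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-22.04
    steps:
      - name: Configure AWS Credentials (short-lived, via OIDC)
        uses: aws-actions/configure-aws-credentials@v2
        with:
          # Hypothetical role, trusted for this repository through GitHub's OIDC provider
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy
          aws-region: us-east-1

      # Subsequent AWS CLI calls use temporary credentials; there is no long-lived key to leak
      - run: aws s3 cp my-app.zip s3://my-app-deployments/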

Swamp Rage: You were so lucky! Leaking scripts with hardcoded keys was common practice in the old days, even on publicly accessible S3 buckets. All you needed to do was traverse the objects in the bucket and do some grepping to find interesting things.

Sometimes the area used for deployment (an AWS S3 bucket in this example) was open for reading by outsiders because of a configuration flaw (which went undetected). What Swamp Rage used was something like this:

aws s3 ls --recursive s3://<bucket_name>/<path>/ | \
awk '{print $4}' | \
xargs -I FNAME sh -c "echo FNAME; \
aws s3 cp s3://<bucket_name>/FNAME - | \
grep '<regex_pattern>'"

The bucket was probably created from a provisioning template, which could have been automatically scanned for security flaws.
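
For illustration only (the resource and bucket names are hypothetical), an Infrastructure-as-Code scanner would flag the public ACL in a CloudFormation snippet like this:

Resources:
  DeploymentBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: my-app-deployments   # hypothetical bucket name
      AccessControl: PublicRead        # flaw: outsiders can list and read deployed artifacts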


The tool’s default configuration was a toy for us

To give concrete examples let’s talk about Jenkins, one of the most popular CI tools.

Swamp Rage: Do you remember that “Enable Security” checkbox in Jenkins, and how many orgs chose not to activate it for the sake of convenience? And those “Anyone can do Anything” permission combos as default? And those pesky Jenkins plugins, like the GitHub OAuth plugin? The guy who configured it selected both “Grant READ permissions to all Authenticated Users” and “Use GitHub repository permissions”, giving us access to all their projects.

(Apologies, Jenkins, for using you as the example 😉)

Stay adept at (even addicted to) security principles. One is the secure-by-default principle: controls should default to the most secure settings possible. Security should be built into CI/CD tools and pipelines from the ground up, rather than being an afterthought. But user-friendliness and convenience often clash with security.

For the Jenkins case, the built-in authentication is too fragile: never use Jenkins’ built-in authentication mechanisms. Opt instead for a third-party mechanism (SAML, LDAP, Google…), combined with the Role-based Authorization Strategy (“RBAC”) plugin. And exercise extreme caution with the admin account.

Take care with how job and pipeline files are handled in Jenkins. The same goes for the Configuration-as-Code plugin and its config files, which drive the Jenkins configuration.
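
As a minimal sketch only (real setups should plug in an external security realm such as LDAP or SAML and the Role Strategy plugin), a Configuration-as-Code jenkins.yaml that at least disables self-signup and anonymous read could look like this:

jenkins:
  securityRealm:
    local:
      allowsSignup: false          # no self-registration of users
  authorizationStrategy:
    loggedInUsersCanDoAnything:
      allowAnonymousRead: false    # anonymous users get no access at all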

Moving from self-hosted CI/CD systems to cloud-based SaaS ones eliminates some risks, such as lateral movement within the organization’s network, but adds others, like having to open external connections between existing internal systems and the externalized CI/CD tool.

Organizations should exercise due care in hardening the CI/CD system, starting with the most restrictive settings and gradually opening up with the minimal required permissions for the pipeline steps.

Configuring security in CI/CD tools can be a complex feat. Many have plugins or extensions, which account for most of the vulnerabilities and need to be kept up to date.

Security misconfiguration scanners for such complex tools, or hardening benchmarks, may help.

Injecting code in pipeline commands for fun and profit

M3M3N70: Have you ever abused untrusted code checkouts, or those actions and scripts vulnerable to command injection?

This section shows that the pipeline itself can contain coding mistakes that let bad actors achieve arbitrary code execution in the pipeline without changing the pipeline source itself, for example using a pull request.

A first example of an unfortunate GitHub workflow:

# INSECURE. Provided as an example only.
on:
  pull_request_target                                    #1

jobs:
  build:
    name: Build and test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.event.pull_request.head.sha }} #2

      - uses: actions/setup-node@v1
      - run: |
          npm install                                    #3
          npm build
      # ... more steps ...

Combining the pull_request_target workflow trigger with an explicit checkout of an untrusted PR is a dangerous practice that may lead to repository compromise. In the example, the unfortunate combination is:

  • the pull_request_target event, which runs in the context of the target repository of the PR and by default has write permission to that repository and access to its secrets, even for PRs from external forks,
  • checking out the PR code from the untrusted source repository,
  • triggering scripts that may operate on PR-controlled contents, as npm install does (install scripts declared in a malicious package.json run automatically), and
  • not guarding the job with a condition so that the pull_request_target event runs only if some sort of ‘this PR was vetted’ label is assigned to the PR (external users cannot assign labels), as sketched below.
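
A minimal sketch of such a guard follows; the label name is an assumption, and maintainers would assign it only after reviewing the PR code:

on:
  pull_request_target:
    types: [labeled]               # re-evaluate when a maintainer labels the PR

jobs:
  build:
    # Hypothetical label that only maintainers can assign after vetting the PR
    if: contains(github.event.pull_request.labels.*.name, 'safe-to-test')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      # ... build steps on the vetted code ...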

A second example takes untrusted input (from an issue, comment or pull request) as source for arguments passed to a pipeline command via expressions. This is the pipeline version of the OS command injection vulnerability.

- name: Check title
  run: |
    title="${{ github.event.issue.title }}"
    if [[ ! $title =~ ^.*:\ .*$ ]]; then
      echo "Bad issue title"
      exit 1
    fi

The run operation generates a temporary shell script from the template, with the ${{ github.event.issue.title }} expression substituted inline, making it vulnerable to shell command injection. An attacker with a fake GitHub account could create an issue with the title a"; bad_code_goes_here;#, and boom!

Swamp Rage: Oh, those guys were opening the door for command injection by simply opening an issue…

There were code execution vulnerabilities in GitHub Actions, like gajira-comment, now fixed. Please read “Untrusted input in GitHub workflows” for full details.
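
A common mitigation, sketched below for the same title check, is to pass the untrusted value to the script through an environment variable, so it is handled as data rather than substituted into the script text:

- name: Check title
  env:
    TITLE: ${{ github.event.issue.title }}   # expanded by the runner, not spliced into the script
  run: |
    if [[ ! "$TITLE" =~ ^.*:\ .*$ ]]; then
      echo "Bad issue title"
      exit 1
    fi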

The moral of the story: never check out and build PRs from untrusted sources without first reviewing the PR. ‘Untrusted’ here, unless provenance is verified with draconian authentication, could mean any potentially hijacked developer account.

 

Unintended malware deployment here!

Continuous deployment is the automation climax, but that climax could be frustrated by the lack of appropriate approval controls on the pipeline flow.

The risks of fully automated deployment from source commit to production systems include the potential for malicious code to be deployed to production environments without being detected, as well as the potential for errors in the deployment process to cause disruptions or outages.

In order to mitigate these risks, it is often recommended that organizations implement a “hard break” in their deployment process, which requires human approval before releases are deployed to end environments.
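
In GitHub Actions, for example, one way to get such a hard break is a protected environment with required reviewers; the environment name and its protection rules below are assumptions configured in the repository settings, and the job pauses until someone approves:

jobs:
  deploy:
    runs-on: ubuntu-latest
    # 'production' is a hypothetical environment protected with required reviewers;
    # the job waits here until an authorized person approves the deployment
    environment: production
    steps:
      - name: Deploy the approved release
        run: ./deploy.sh           # hypothetical deployment script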

They are closing the doors

Swamp Rage: Those joyful default passwords in CI/CD tools are being wiped out. Accessing /var/lib/jenkins/secrets/initialAdminPassword is now a dead end. Many tools now provide 2FA, which Covid made popular, and even the laziest code monkey out there is using it!

M3M3N70: We are fighting 2FA, but it is not that easy. It is hard to spear-phish those guys, as “Scatter Swine” did with Twilio. With WebAuthn keys it is much more difficult. At least we can try to steal cookies to bypass MFA, but we need to break into the developer’s box.

Multi-Factor Authentication is a good step in the right direction for limiting the risk of authentication secret leaks. Most modern DevOps tools support MFA. And authentication keys under WebAuthn / U2F (see the FIDO2 project) are perhaps the best option for MFA in DevOps, if properly managed.

Swamp Rage: The DevOps guys are waking up. They have the damned “least privilege” thing in their blood. And they are not code monkeys anymore. We now get caught red-handed by reviewers.

In fact, pipelines are now a bit more robust than a couple of years ago, with weak actions and scripts removed, and with additional security testing steps that even detected our droppers hidden in stealth commits and packages we hijacked.

Question for the reader: is the process of building software from sources and deploying it into production a risky business? Is your DevOps still stuck in the good old times for the bad guys?

Final Recommendations

Where to start with CI/CD pipelines?

The first recommendation is simple: carefully review pipelines (they are critical resources) for security issues. Reviews are costly but necessary and should be done properly. Reviewers should know what to look for, and each step must be checked for flaws.

Perhaps a combination of expert reviewers armed with automated malicious code scanners could help.

The second recommendation is to train the developers who write and maintain pipelines on security. Things to consider:

  • How to properly handle authentication with internal and cloud services, avoiding the nuisance of handling long-term credentials.
  • How to limit pipelines to the exact set of resources they need access to. The principle of least privilege shines again.
  • How to write steps that make pipelines reproducible, for example with version pinning (see the sketch below), and how to avoid command injection vulnerabilities.
  • How to approve deployments from the security perspective (there are other perspectives!): which security standards should be met and how to add the corresponding checks/gates in the pipelines.
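
A minimal sketch of version pinning; the action name is hypothetical and the SHA is a placeholder, the point being that a mutable tag can be moved to malicious code while a reviewed commit SHA cannot:

steps:
  # Mutable reference: whoever controls the tag controls what runs in your build
  - uses: some-org/build-helper@v1
  # Pinned reference: the exact commit that was audited (replace the placeholder with the real SHA)
  - uses: some-org/build-helper@<full-40-character-commit-sha>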

A third recommendation is to configure the CI/CD system with due care. Strong authentication, no default passwords or insecure settings, minimal privileges… Watch for vulnerabilities in installed plugins and extensions. This could be the focus of following posts, so stay tuned.

The fourth recommendation is to leverage the CI/CD pipelines for security automation. Static application security testing (SAST), software composition analysis (SCA), secrets leak scanning, anti-malware tools, container security scanners, or automated runtime detectors (DAST and malware) can be run routinely in the pipeline. And your organization may enforce standards about the coverage of security scanning in CI/CD.
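
For example, a secrets-scanning step can run on every push; the sketch below assumes the community gitleaks action fits your setup, and any equivalent scanner would do:

- name: Scan repository history for leaked secrets
  uses: gitleaks/gitleaks-action@v2             # community action; swap in your preferred scanner
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}   # required by the action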

Remember, these tools do not yet remove expert review from the equation; otherwise you could end up with a false sense of security.

If you are fond of OWASP top tens, a nice recent project is the OWASP Top 10 CI/CD Security Risks.

Disclaimer Note

(1) The examples in this post use GitHub as the SCM, AWS as the cloud provider, and GitHub Actions or Jenkins as the CI/CD tool. They are not weaker or safer than their alternatives. No bad press intended! These tools are powerful and need to be used appropriately.

(2) M3M3N70 and Swamp Rage are fictional characters. Any resemblance to persons or groups, living or dead, is merely coincidental… or is it?

To read more

Unifying Risk Management from Code to Cloud with Xygeni ASPM Security