This post is inspired by the recent supply chain attack on coa, a popular command-line parser package for NodeJS, and the discussions that followed – both on LinkedIn and in the local InfoSec Slack community. This attack was not the first, and it definitely won’t be the last. It’s true that Node modules are small and bring a lot of sub-dependencies – but they are usually super-focused on what they are doing. The speed of development in Node is largely based on this modularity, which comes with a lot of inherent risks. So, how do you mitigate those risks?
It’s about time to introduce Dependency Hygiene.
Now, let’s dive deeper into those items. I’ll explain what I mean by each one of them and how they are relevant to mitigating modern supply chain attacks.
My primary working environment nowadays is NodeJS-based, so the examples will come from this world, but most of what I’m going to say applies to other ecosystems as well.
I have a pretty big monorepo, managed solely by Yarn Workspaces (no Lerna). One of the packages (“workspaces”) contains a set of third-party NodeJS libraries that we use as a shared layer for our Lambda functions, collected as dependencies in the package.json of this package. The build script for this package is supposed to collect all dependencies into a zip file that is later published by Terraform. Unfortunately, Yarn cannot build a single workspace from the monorepo, so initially we opted to use NPM directly – copy package.json to a build folder, then run “npm install --production” there and zip the resulting node_modules tree.
My main problem with this approach (besides mixing the build tools) was that the build was not repeatable – each time we ran npm install we could get a newer compatible version of any dependency, since the versions are “locked” by Yarn in the top-level yarn.lock file and NPM (obviously) is not aware of it. So I decided to dive deeper and see how it could be solved in a better way.
It turns out that while Yarn hoists all the dependencies to the node_modules of the top-level workspace, you can explicitly opt out of this behavior for some dependencies – or, in my case, for all dependencies of a given workspace.
Yarn Workspaces configuration before:
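As a rough sketch, assuming the workspaces live under a packages/ folder (the folder pattern and the "monorepo" name are illustrative assumptions, and JSON cannot carry comments to mark them), the root package.json would contain something like this:

```json
{
  "name": "monorepo",
  "private": true,
  "workspaces": [
    "packages/*"
  ]
}
```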
Yarn Workspaces configuration after the change, assuming the Lambda Layer dependencies are collected under the common-lambda workspace:
Note that the nohoist array should contain the workspace name (including the namespace when applicable), not the workspace folder.
After this change, packages/common-lambda/node_modules will contain the proper versions of all the dependencies to be packaged as a Lambda Layer. Those dependencies will be updated automatically on yarn install, and the node_modules folder can be packaged directly.
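A minimal sketch of how this might look in the root package.json, assuming Yarn 1 (classic), a packages/ folder layout, and a workspace whose name in its own package.json is exactly common-lambda. The "monorepo" name and the folder pattern are illustrative assumptions; a scoped workspace would use its full name in the pattern instead, for example a hypothetical @myorg/common-lambda/**.

```json
{
  "name": "monorepo",
  "private": true,
  "workspaces": {
    "packages": [
      "packages/*"
    ],
    "nohoist": [
      "common-lambda/**"
    ]
  }
}
```

The workspaces field changes from a plain array to an object so that the nohoist patterns can sit next to the packages list. It is worth checking the exact glob against the Yarn nohoist documentation for your Yarn version.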