Time to Introduce Dependency Hygiene

This post is inspired by the recent supply chain attack on coa, a popular command-line parser package for NodeJS,. and the discussions that followed that – both on LinkedIn and in the local InfoSec Slack community. This attack was not the first one, and definitely won’t be the last one. It’s true that Node modules are small and bring a lot of sub-dependencies – but they are usually super-focused on what they are doing. The speed of development in Node is largely based on this modularity, which comes with a lot of inherent risks. So, how do you mitigate those risks?

It’s about time to introduce Dependency Hygiene.

TL;DR:
Use lock files.
Pin your dependencies.
Run audits.
Patch frequently – but don’t rush fresh upgrades in.

Now, let’s dive deeper into those items. I’ll explain what do I mean by each one of them and how they are relevant in the mitigation of the modern supply chain attacks.

My primary working environment nowadays is NodeJS-based, so the examples will come from this world, but most of what I’m going to tell applies to other ecosystems as well.

Continue reading “Time to Introduce Dependency Hygiene”

Do not follow HTTP redirects with Gaxios

Earlier today I was enhancing some relatively old piece of NodeJS code, so I decided to convert it from axios to gaxios along the way. Most of the work was pretty transparent, since much smaller and better maintained gaxios is pretty much a drop-in replacement for axios, which seems to be stuck in it’s v0.x days forever.

However, there was one request that stood apart.

This request actually resulted in HTTP redirect, and my goal was to intercept the target URL and extract some query string parameters. In axios, that was straight-forward – it was enough to allow at most 0 redirects, and also make sure that HTTP 302 is not treated as an error:

const axios = require('axios');

const response = await axios.request({
  ...
  maxRedirects: 0,
  validateStatus: (status) => (status === 302)
});
const url = response.headers.location;

Yet, the same configuration produced different outcome in gaxios – the request was failing with "max redirects reached" error. My attempt to adjust maxRedirects value to 1 along with keeping status validation in place was not successful either – gaxios just issued another request to the new target.

To my surprise, quick googling did not reveal any relevant results. I took that as a hint that this is not only doable, but likely so trivial that nobody had to ask – so I continued with the digging. One of the merged PRs caused me to realize that gaxios is basically a wrapper on top of the Fetch API, and the configuration object is passed down to fetch implementation with minimal modifications. From here, the solution was simple –

const { request } = require('gaxios');

const response = await request({
  ...
  redirect: 'manual',
  validateStatus: (status) => (status === 302)
});
const url = response.headers.location;

Enjoy!

Repeatable build for Lambda Layers with Yarn Workspaces

I have pretty big monorepo, managed solely by Yarn Workspaces (no Lerna). One of the packages (“workspaces”) contains a set of 3rd party NodeJS libraries that we use as a shared layer for our Lambda functions, collected as dependencies in package.json of this package. Build script for this package is supposed to collect all dependencies in a zip file that will be later published by Terraform. Unfortunately, Yarn cannot build single workspace from the monorepo, so initially we opted to use NPM directly – copy package.json to a build folder, then run “npm install --production” there and zip the resulting node_modules tree.

My main problem with this approach (besides mixing the build tools) was that the build is not repeatable – each time we run npm install we could get newer compatible version of any dependent package, since the version is “locked” by Yarn in the top-level yarn.lock file and NPM (obviously) is not aware about it. So I decided to dive deeper and see how it can be solved in a better way.

It appears that while Yarn hoists all the dependencies to the node_modules of the top-level workspace, you can explicitly opt-out from this behavior for some dependencies – or, in my case, for all dependencies of the given workspace.

Yarn Workspaces configuration before:

"workspaces": [
  "packages/*"
]

Yarn Workspaces configuration after the change, assuming Lambda Layer dependencies are collected under common-lambda workspace:

"workspaces": {
  "packages": [
    "packages/*"
  ],
  "nohoist": [
    "common-lambda/**"
  ]
}

Note that nohoist array should contain the workspace name (including namespace when applicable) and not the workspace folder.

After this change packages/common-lambda/node_modules will contain proper versions of all the dependencies to be packaged as Lambda Layer. Those dependencies will be updated automatically on yarn install and the node_modules folder can be packaged directly.

Optimization of Lodash styling and bundle size

Lodash is one of the most widely used libraries in modern web development. Whether your webapp is based on Angular, React, Vue or something more exotic – chances that you use Lodash are pretty high. Lodash provides tons of utility methods, making your code more fast and elegant.

However, those tons of useful methods come with a cost – Lodash “weights” more than 70 KB of minified code. This is a significant addition even for the heavy apps. Moreover, chances are you are using just a handful of methods from the whole library. Tree shaking provided by your favorite packing tool could help, but here it comes the next issue – there are several possible ways you can import Lodash in your ES6-based code, potentially making it “unshakable”:

// Method 1:
// import the whole library
import _ from 'lodash';

// Method 2:
// import only relevant methods
import { map, find, filter } from 'lodash';

// Method 3:
// import individual modules
import map from 'lodash/map';
import find from 'lodash/find';
import filter from 'lodash/filter';

Each one of those methods has own pros and cons (and they are described in depth in this excellent article from BlazeMeter), but in many cases personal choice of the developer plays more significant role than formal recommendations. In big projects dealing with the styling of Lodash imports alone may quickly become a mess, and a single “whole library” import will eliminate all your bundle size optimization efforts.

Fortunately, there is a way to handle both size and styling issues together – with the help of Lodash plugins for Eslint and Babel.

Eslint plugin for Lodash enforces consistent and recommended styling for Lodash usage. I’ve found rules coming from plugin:lodash/canonical configuration to be a perfect fit for our React application. This set enforces whole package imports with _ as the name for Lodash variable, enforces usage of Lodash methods over native counterparts when available, and much more. The only caveat I found was in the testing code, where it confused enzyme.find() with native method. Fortunately, we have separate eslint configuration for testing code, so it was pretty easy to tweak the offending rule without sacrificing the style of the application code.

Babel plugin for Lodash implicitly transforms recommended _ imports to per-module imports, reducing “tree-shaked” bundle size. For the app that I was experimenting with, applying both plugins effectively reduced the size of the app bundle by more than 42 KB – but your mileage may vary.

In addition, I spent some time looking at webpack plugin for Lodash, but ultimately decided to skip it. The plugin strips out significant parts of internal Lodash implementation, claiming that those parts are rarely used. While the plugin is configurable, maintaining this configuration will be a pain. Moreover, the code that is unit-tested may be significantly different from the code that is being packaged, especially around edge cases. In my opinion, potential gain in bundle size does not justify the risks of the implementation.

There is one more thing that is worth mentioning in this context. Don’t use Lodash chaining – neither explicit (with _.chain(arr)...) nor implicit (with _(arr)...), since this again will import the whole Lodash library. There are more details about it in this Medium post. In some cases, where chaining seems to be the most simple and elegant way of doing things, see if you can get away with flow and lodash/fp combination:

import { flow, map, compact } from 'lodash/fp';

const result = 
  flow(
    map(...),
    compact
  )(arr);

You can read more about functional programming with Lodash here.

Happy Lodashing!