AsyncLocalStorage and how to use it to reduce repetition of log data
AsyncLocalStorage
is a Node.js class that makes it possible to store global data for one specific function execution. It doesn’t matter if the function is synchronous or asynchronous, but in practice it’s more relevant for async functions.
Let’s start with an example to understand what it does and how it works.
import { AsyncLocalStorage } from "node:async_hooks";
const asyncLocalStorage = new AsyncLocalStorage<{ task: number }>();
const tasks = [1, 2, 3, 4, 5];
async function longRunningWork() {
const duration = Math.floor(Math.random() * 1000) + 500;
await new Promise((resolve) => setTimeout(resolve, duration));
const task = asyncLocalStorage.getStore()?.task;
console.log(`Task ${task} completed`);
}
async function main() {
await Promise.all(
tasks.map(async (task) => {
const context = { task };
return asyncLocalStorage.run(context, async () => {
await longRunningWork();
});
}),
);
}
main();
// Example output:
// Task 4 completed
// Task 2 completed
// Task 5 completed
// Task 3 completed
// Task 1 completed
The tasks are all executed at the same time, and when the log is written they get the value of task
that’s specific to that particular execution context.
That behavior is impossible to replicate without AsyncLocalStorage
. Global objects, in comparison, exist only once for the entire application. Adding that value to a global object would only retain the last one, resulting in logging Task 5 completed
five times.
const globalStorage = { task: 0 };
const tasks = [1, 2, 3, 4, 5];
async function longRunningWork() {
const duration = Math.floor(Math.random() * 1000) + 500;
await new Promise((resolve) => setTimeout(resolve, duration));
const task = globalStorage.task;
console.log(`Task ${task} completed`);
}
async function main() {
await Promise.all(
tasks.map(async (task) => {
globalStorage.task = task;
await longRunningWork();
}),
);
}
main();
// Output:
// Task 5 completed
// Task 5 completed
// Task 5 completed
// Task 5 completed
// Task 5 completed
Use-cases
Global data should be used sparingly. It can make it difficult to follow the data flow and the code harder to understand. In the example above, it’d be simpler to add task
as parameter to longRunningWork
.
Prime candidates are auxiliary logic, which have nothing to do with the core business logic, but need to pass data down the execution stack. Being able to pass that data outside the normal control flow (i.e. not having to pass as function parameters) frees the code from it.
Tracking request IDs is a good example. The framework can set it when the request is received, and it’ll be available throughout the handling of the request. If another framework function is called that needs the request id, it can access it, without the developer having to do extra work.
Passing repetitive log data
With structured logging it’s possible to include data within log entries. Log aggregation software stores that data and makes it possible to filter logs based on it.
In my specific case, for Lighthouse, it’s helpful that I can filter all logs that are specific to one URL.
To make that possible, the url
field must be added to the data object every time a log line is written.
It is possible to do that manually by passing the URL to every function that writes logs. This is error-prone (I might forget somewhere) and makes the code harder to read. Not every function would need the URL otherwise. When reading the code a team member might be confused about why the function computeReadingTime
requires the URL in addition to the text.
Another option is to create a child logger and pass that through. Child loggers can store additional data, which is added to every log. Every log library I know has that feature. But child loggers must also be passed as parameter to the function.
With AsyncLocalStorage
, there’s essentially a side-channel for data, which the log functions can use.
How I use AsyncLocalStorage
to pass data
import { AsyncLocalStorage } from "node:async_hooks";
const asyncLocalStorage = new AsyncLocalStorage();
export function withData<R>(data: Record<string, any>, callback: () => R): R;
export function withData<R, TArgs extends any[]>(
data: Record<string, any>,
callback: (...args: TArgs) => R,
...args: TArgs
): R;
export function withData<R, TArgs extends any[]>(
data: Record<string, any>,
callback: (...args: TArgs) => R,
...args: TArgs
): R {
const currentStore = asyncLocalStorage.getStore() ?? {};
const combinedData = { ...(typeof currentStore === "object" ? currentStore : {}), ...data };
return asyncLocalStorage.run(combinedData, callback, ...args);
}
const logger = pino();
export function info(message: string, data?: Record<string, any>) {
const store = asyncLocalStorage.getStore();
const storeData = typeof store === "object" ? store : {};
const combinedData = { ...storeData, ...data };
logger.info({ ...combinedData, message });
}
This is the code of my logUtils.ts
. The real one has additional functions for other log levels, but they’re essentially the same. The info
function merges data from asyncLocalStorage
with any additional data passed to it directly.
The withData
function receives a data
and callback
parameter, combines the data
with the object currently in store, and calls the callback
with the combined object. In the context of logging, I view data as contextual information. The further down the execution gets, the more specific the context should become. Therefore context should only be added, never removed.
withData
is a tiny wrapper around asyncLocalStorage.run
that combines existing data with new data. The type of withData
is the same as that of asyncLocalStorage.run
.
Below is an example demonstrating how it works.
withData({ a: 1 }, () => {
info("Log 1", { b: 2 }); // { message: "Log 1", a: 1, b: 2 }
withData({ c: 3 }, () => {
info("Log 2", { d: 4 }); // { message: "Log 2", a: 1, c: 3, d: 4 }
});
});