Kevin W. McConnell

Add CloudWatch monitoring from inside your CDK constructs

Whenever I start a new project, I like to make sure I have some good monitoring in place. Having a nice view of how the system behaves, from day one, makes it easier to see the impact and behaviour of each new feature as it’s added.

I usually do this with a combination of CloudWatch dashboards, alarms, and X-Ray tracing for observability.

A pattern I’ve found useful for this is to extend some of the CDK constructs so that they add their own dashboard widgets. If the subclass takes an additional dashboard prop, its constructor can add widgets to that dashboard in a standard way.

For example, let’s say you want a section on your app’s dashboard that shows how all your Lambda functions are being executed. You’d like one chart per function, and each chart should show the duration (p95 and average), as well as counts of invocations, errors and throttles. One way you could set this up is to declare a custom LambdaFunction construct that adds the appropriate widget when constructed.

Here is an example from one of my projects that does this to the GoFunction construct:

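```typescript
// Import paths below assume CDK v2. In CDK v1 the same names come from
// @aws-cdk/core, @aws-cdk/aws-cloudwatch and @aws-cdk/aws-lambda-go.
import { Dashboard, GraphWidget } from 'aws-cdk-lib/aws-cloudwatch';
import { GoFunction, GoFunctionProps } from '@aws-cdk/aws-lambda-go-alpha';
import { Construct } from 'constructs';

export interface LambdaFunctionProps extends GoFunctionProps {
  // Dashboard that this function adds its own chart to
  dashboard: Dashboard;
}

export default class LambdaFunction extends GoFunction {
  constructor(scope: Construct, id: string, props: LambdaFunctionProps) {
    super(scope, id, props);

    // One chart per function: counts on the left axis, durations on the right
    props.dashboard.addWidgets(
      new GraphWidget({
        title: id,
        left: [
          this.metricInvocations(),
          this.metricErrors(),
          this.metricThrottles(),
        ],
        right: [
          this.metricDuration().with({ statistic: 'p95' }),
          this.metricDuration().with({ statistic: 'avg' }),
        ],
        leftYAxis: { min: 0 },
        rightYAxis: { min: 0 },
        width: 12,
      })
    );
  }
}
```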

I use this LambdaFunction throughout my app in place of the original GoFunction.

The nice thing about this pattern is that I can customise the dashboards to my liking, and it’s easy to change them later. If I decide to chart the p99 as well, or use some metric math to show the errors as rates rather than counts, I can make that change in one place, and all the charts will update.
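As a rough sketch of the error-rate idea, here is one way the metric math could look. It assumes CDK v2 (aws-cdk-lib) and uses a plain inline Lambda function instead of GoFunction so that it stands alone; the helper names errorRate and errorRateWidget are made up for this example and are not part of the CDK.

```typescript
import { App, Stack } from 'aws-cdk-lib';
import { Dashboard, GraphWidget, MathExpression } from 'aws-cdk-lib/aws-cloudwatch';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Hypothetical helper: errors as a percentage of invocations.
export function errorRate(fn: lambda.IFunction): MathExpression {
  return new MathExpression({
    expression: '100 * errors / invocations',
    usingMetrics: {
      errors: fn.metricErrors({ statistic: 'Sum' }),
      invocations: fn.metricInvocations({ statistic: 'Sum' }),
    },
    label: 'Error rate (%)',
  });
}

// Hypothetical helper: a chart showing only the error rate.
export function errorRateWidget(title: string, fn: lambda.IFunction): GraphWidget {
  return new GraphWidget({
    title,
    left: [errorRate(fn)],
    leftYAxis: { min: 0, max: 100 },
    width: 12,
  });
}

// Usage: a throwaway stack with one function and one dashboard.
const app = new App();
const stack = new Stack(app, 'DemoStack');
const dashboard = new Dashboard(stack, 'Dashboard');
const fn = new lambda.Function(stack, 'DemoFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async () => "ok";'),
});
dashboard.addWidgets(errorRateWidget('DemoFunction', fn));
```

In the LambdaFunction construct above, the same expression could replace this.metricErrors() in the widget, which is the "one place" where the change would be made.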

The same pattern works well for adding alarms. I don’t typically have the same alarms on all the functions, but it can be helpful to have some that are standardised. For example, alarms that track error rates exceeding a specific percentage, or rising too rapidly, can be useful to have by default. Passing an sns.Topic prop allows the construct to attach an action to the alarm; you can use that topic to route notifications to email, SMS or Slack channels.
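A minimal sketch of a standardised error-rate alarm might look like the following. It assumes CDK v2 and extends the plain lambda.Function so the example stands alone. The class name AlarmedFunction, the props alarmTopic and maxErrorRatePercent, the 5% default and the three evaluation periods are all illustrative choices, not something from my project.

```typescript
import { App, Stack } from 'aws-cdk-lib';
import {
  Alarm,
  ComparisonOperator,
  MathExpression,
  TreatMissingData,
} from 'aws-cdk-lib/aws-cloudwatch';
import { SnsAction } from 'aws-cdk-lib/aws-cloudwatch-actions';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { ITopic, Topic } from 'aws-cdk-lib/aws-sns';
import { Construct } from 'constructs';

export interface AlarmedFunctionProps extends lambda.FunctionProps {
  // Topic that receives a notification when the alarm fires
  alarmTopic: ITopic;
  // Illustrative threshold: alarm when the error rate exceeds this percentage
  maxErrorRatePercent?: number;
}

export class AlarmedFunction extends lambda.Function {
  public readonly errorRateAlarm: Alarm;

  constructor(scope: Construct, id: string, props: AlarmedFunctionProps) {
    super(scope, id, props);

    // Errors as a percentage of invocations
    const errorRate = new MathExpression({
      expression: '100 * errors / invocations',
      usingMetrics: {
        errors: this.metricErrors({ statistic: 'Sum' }),
        invocations: this.metricInvocations({ statistic: 'Sum' }),
      },
      label: 'Error rate (%)',
    });

    this.errorRateAlarm = new Alarm(this, 'ErrorRateAlarm', {
      metric: errorRate,
      threshold: props.maxErrorRatePercent ?? 5,
      comparisonOperator: ComparisonOperator.GREATER_THAN_THRESHOLD,
      evaluationPeriods: 3,
      // Periods with no data are not counted as breaching
      treatMissingData: TreatMissingData.NOT_BREACHING,
    });

    // Publish to the topic whenever the alarm goes into the ALARM state
    this.errorRateAlarm.addAlarmAction(new SnsAction(props.alarmTopic));
  }
}

// Usage: every AlarmedFunction shares the same notification topic.
const app = new App();
const stack = new Stack(app, 'DemoStack');
const alarmTopic = new Topic(stack, 'Alarms');
const fn = new AlarmedFunction(stack, 'DemoFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromInline('exports.handler = async () => "ok";'),
  alarmTopic,
});
```

Email, SMS or chat subscriptions are then added to the topic itself, so the construct never needs to know where the notifications end up.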

I find this can be useful for any resource type that you’d want to monitor. So in addition to Lambda functions, it’s also handy for topics & queues, load balancers, storage volumes, Fargate services, and so on.
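To illustrate with a different resource type, here is a sketch of the same idea applied to an SQS queue, again assuming CDK v2. MonitoredQueue is a hypothetical name, and the choice of metrics is just one reasonable starting point.

```typescript
import { App, Stack } from 'aws-cdk-lib';
import { Dashboard, GraphWidget } from 'aws-cdk-lib/aws-cloudwatch';
import { Queue, QueueProps } from 'aws-cdk-lib/aws-sqs';
import { Construct } from 'constructs';

export interface MonitoredQueueProps extends QueueProps {
  // Dashboard that this queue adds its own chart to
  dashboard: Dashboard;
}

export class MonitoredQueue extends Queue {
  constructor(scope: Construct, id: string, props: MonitoredQueueProps) {
    super(scope, id, props);

    // Message counts on the left axis, age of the oldest message on the right
    props.dashboard.addWidgets(
      new GraphWidget({
        title: id,
        left: [
          this.metricApproximateNumberOfMessagesVisible(),
          this.metricNumberOfMessagesSent(),
        ],
        right: [this.metricApproximateAgeOfOldestMessage()],
        leftYAxis: { min: 0 },
        rightYAxis: { min: 0 },
        width: 12,
      })
    );
  }
}

// Usage: the queue is created exactly like a normal Queue, plus the dashboard.
const app = new App();
const stack = new Stack(app, 'DemoStack');
const dashboard = new Dashboard(stack, 'Dashboard');
const queue = new MonitoredQueue(stack, 'OrdersQueue', { dashboard });
```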

Posted August 6, 2021.