
Serverless Cold Starts - Mitigation Techniques



To start with: What is a Cold Start?


When a serverless function runs, it stays active (a.k.a. hot): its container stays alive, ready and waiting for the next execution.


After a period of inactivity, your cloud provider drops the container, and your function becomes inactive (a.k.a. cold).


A cold start happens when you execute an inactive function. The delay comes from your cloud provider provisioning your selected runtime container and then running your function.

In a nutshell, this process will considerably increase your execution time.


In other words, when a function in a cold state is invoked, the request takes additional time to complete, because there’s latency in starting up a new container. That’s the problem with cold starts: they make our applications respond more slowly. In the “instant-age” of the 21st century, that can be a big problem.


Some of the Function-as-a-Service offerings of the Big-3 cloud providers are as follows:


  • AWS Lambda

  • Azure Functions

  • Google Cloud Functions


These are all dynamically scaled and billed-per-execution compute services.

When a new instance handles its first request, the response time increases, which constitutes a Cold Start.



When Does a Cold Start Happen?


The very first cold start happens when the first request comes in after deployment.

After that request is processed, the instance stays alive to be reused for subsequent requests.


Function-as-a-Service: Behind the Scenes


Cloud providers keep a pool of generic, unspecialized workers in stock. Whenever a serverless application needs to scale up, whether from 0 to 1 instance or from N to N+1 instances, the runtime picks one of the spare workers and configures it to serve the named application.



This procedure takes time, so the latency of the application’s event handling increases. To avoid doing this for every event, the now-specialized worker is kept intact for some period. When another event comes in, this worker is available to process it as soon as possible. This situation is a warm start.



Thus, cloud providers try to find the right balance between wasting resources by keeping idle instances around for too long and slowing down too many requests with cold starts.


The reuse strategy differs considerably between cloud vendors:


Service                   Idle instance lifetime

AWS Lambda                5-7 min
Azure Functions           Mostly between 20-30 min
Google Cloud Functions    15 min



How to make them warm ♨


Ready to combat those cold starts? Here's how you do it.


Find out: where are the bottlenecks, and when?


To fix cold start problems, it is essential to know where your service’s performance bottlenecks are. In services small and large, it’s common to find one function that slows down your service logic because it doesn’t run often enough to keep its container alive.


How to solve – or mitigate – cold start latency


The following strategies could help mitigate the impact of container startup latency on your serverless applications:


Monitor performance and log relevant indicators

It’s recommended to always log timestamps during the execution of a function and to monitor duration outliers in your function’s invocation history. Whenever it performs worse than expected, go to the logs and identify which parts of your code contributed to the poor performance.
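
As a minimal sketch (assuming a Node.js Lambda; doWork is a hypothetical placeholder for your function’s logic), timestamp logging could look like this:

module.exports.handler = async (event) => {
  // Record the start time so the total duration can be logged.
  const start = Date.now()
  console.log('invocation start:', new Date(start).toISOString())

  const result = await doWork(event) // hypothetical: your actual logic

  // Duration outliers in these logs point at cold starts or slow code paths.
  console.log('invocation duration (ms):', Date.now() - start)
  return result
}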


Increase memory allocation

It’s been observed that functions with more memory allocated tend to start new containers faster. If the cost implication is not an issue for your use case, consider allocating more memory to the functions for which you need the best startup performance.
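
For example, with the Serverless Framework used later in this post, memory is configured per function via the memorySize property (the function name and value below are illustrative):

functions:
  hello:
    handler: handler.hello
    memorySize: 1024 # in MB; on AWS Lambda, more memory also means more CPU share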


Choose a faster runtime

Whenever possible, consider writing your serverless functions in a lightweight runtime such as Node.js or Python; as the comparison later in this post shows, heavier runtimes like Java and .NET tend to suffer longer cold starts.


Keep shared data in memory

Keep shared data in memory by loading it outside the main event handler function.

Everything declared and executed outside the handler remains in the container’s memory for as long as the container is kept alive. When the function is invoked again (from a warm state), the imports and fetched data don’t need to be loaded again and can be used directly from memory, speeding up your code’s execution time.
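
A minimal sketch, assuming a Node.js Lambda and a config.json file shipped with the deployment package:

const fs = require('fs')

// Runs once per container: the parsed config stays in the container's
// memory for as long as the container is kept alive.
const config = JSON.parse(fs.readFileSync('./config.json', 'utf8'))

module.exports.handler = async (event) => {
  // Warm invocations reuse `config` straight from memory;
  // only a cold start pays the loading cost again.
  return { table: config.table, id: event.id }
}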


Shrink package size

It’s important to clean up our package before deploying to production, removing everything that is not used or needed for our function to run. This contributes to a shorter cold start time by reducing internal networking latency: the provider fetches a smaller package file.
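
With the Serverless Framework, one way to do this is the package section of serverless.yml (the exclusions below are illustrative):

package:
  exclude:
    - tests/**
    - docs/**
    - '**/*.md'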


Keep a pool of pre-warmed functions

If you are still experiencing unbearable cold start latency, the last resort is to set up regular jobs that keep a pool of pre-warmed functions. It works like this:

Configure your functions to identify warming calls and short-circuit them, ending those requests very quickly without running the entire function code. This can be done by passing a pre-determined event to the function, such as: {"warm": true}. When your function detects this event argument, it should halt execution as fast as it can, as in the sketch below.
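
A minimal sketch of such a short-circuit, using the {"warm": true} event from above:

module.exports.handler = async (event) => {
  // Warming call: exit immediately, before any real work runs.
  if (event.warm === true) {
    console.log('Warming invocation - exiting early')
    return 'warmed'
  }

  // ... normal function logic goes here
}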


Example: Forced warm up on AWS Lambda Functions


WarmUP keeps your selected Lambdas hot by creating a scheduled Lambda that invokes them at a configured time interval (default: 5 minutes) or at a specific time, forcing those containers to stay alive.


This can be achieved by installing the WarmUP plugin.


Install via npm in the root of your Serverless service:

npm install serverless-plugin-warmup --save-dev

Add the plugin to the plugins array in your service’s serverless.yml:

plugins:
  - serverless-plugin-warmup

Add the warmup: true property to all functions you want to keep warm:

functions:
  hello:
    warmup: true

In order for WarmUP to be able to invoke lambdas, you'll also need to set the following Policy Statement in iamRoleStatements:

iamRoleStatements:
  - Effect: 'Allow'
    Action:
      - 'lambda:InvokeFunction'
    Resource: "*"

Add an early callback when the event source is serverless-plugin-warmup. Do this early exit before running your code logic; it will save execution duration and cost.

module.exports.lambdaToWarm = function(event, context, callback) {
  /** Immediate response for WarmUP plugin */
  if (event.source === 'serverless-plugin-warmup') {
    console.log('WarmUP - Lambda is warm!')
    return callback(null, 'Lambda is warm!')
  }

  // ... add lambda logic after
}


AWS Lambda Cold Start Language Comparisons, 2019


Here is a comparison of the cold start time of competing languages on the AWS platform.


Below are the cold start measurements, segmented into three sections (functions run with 128 MB, 1024 MB, and 3008 MB of memory), based on research performed in 2019.


[Figure: Average cold start times in milliseconds, per language and memory size]


The average cold start times visualisation helps illustrate that most languages now perform strongly. Cold start times also appear largely independent of the memory assigned, unlike in earlier years, with the exception of Java and .NET.


Conclusion

FaaS is an amazing serverless tool. If you want to use it effectively without running into pitfalls like cold starts, work out how to avoid the cold start issue using one of the techniques highlighted above.

Depending on your specific use case, always consider the fully loaded cost of running and scaling your service (not just the FaaS invocation runtime). Finally, whilst avoiding optimising too early, if you start to hit timeout issues in your API, then you know it is time to investigate the bottleneck and figure out whether your system demands optimisation and a cold start mitigation strategy.

