AWS Lambda: Cold boot and mean response times in Scala vs. Java

AWS Lambda is a popular service for hosting microservice functions in the cloud without provisioning actual servers. It supports Node.js, Python, Go, C#, PowerShell and Java, more specifically java-1.8.0-openjdk. As Scala 2.12 is compatible with JVM 8, we can also run Scala code serverless in the cloud! But does using Scala affect performance compared to plain old Java? How do the cold start and mean response times compare? Let’s find out!

tl;dr: Mean response times are equal, cold start times are slower with Scala than with Java, but improve with increased memory.

Project structure

First we create two projects: one Java project using Maven and one Scala project using sbt, so that we build completely independent JAR files. When using AWS Lambda, we have to supply all dependencies in a fat JAR. By splitting the projects, we get a minimal JAR for each Lambda function. Both build files declare dependencies on the AWS Lambda libraries com.amazonaws » aws-lambda-java-core and com.amazonaws » aws-lambda-java-events, which provide the application with the APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent and Context data structures. These encapsulate the HTTP request and response of an AWS API Gateway and offer a safe way to read the request and build a valid response. The API Gateway is the gate between the internet and our functions. The Scala JAR file additionally includes the Scala library.

// assemblyJarName requires the sbt-assembly plugin
lazy val root = (project in file("."))
  .settings(
    name := "aws_lambda_bench_scala",
    organization := "de.codecentric.amuttsch",
    description := "Benchmark Service for AWS Lambda written in Scala",
    licenses += "Apache License, Version 2.0" -> url(""),
    version := "0.1",
    scalaVersion := "2.12.8",
    assemblyJarName in assembly := "aws_lambda_bench_scala.jar",
    libraryDependencies ++= Seq(
      "com.amazonaws" % "aws-lambda-java-core" % "1.2.0",
      "com.amazonaws" % "aws-lambda-java-events" % "2.2.5"
    )
  )

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns=""

Lambda functions

Next, we implement the actual handler functions in both Scala and Java. They just return an HTTP 200 response and don’t do any processing, so that we measure the impact of the language itself rather than that of some arbitrary computation.

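package de.codecentric.amuttsch.awsbench.scala
import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.{APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent}

class ScalaLambda {
  def handleRequest(event: APIGatewayProxyRequestEvent, context: Context): APIGatewayProxyResponseEvent = {
    new APIGatewayProxyResponseEvent().withStatusCode(200)
  }
}

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

public class JavaLambda {
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent event, Context context) {
        return new APIGatewayProxyResponseEvent().withStatusCode(200);
    }
}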

The bytecode of the two functions is almost identical. The only difference is how Scala and Java handle the 200 argument of withStatusCode. Java uses java.lang.Integer.valueOf, whereas Scala makes use of its implicit conversion scala.Predef.int2Integer.
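To look at the boxing step in isolation, here is a minimal Java sketch. BoxingSketch and its statusCode method are hypothetical stand-ins for a setter that takes a java.lang.Integer, as withStatusCode does. They are not part of the AWS libraries.

```java
final class BoxingSketch {
    // Hypothetical stand-in for a setter with an Integer parameter.
    static Integer statusCode(Integer code) {
        return code;
    }

    // Passing an int literal: javac inserts a call to Integer.valueOf(200).
    static Integer autoboxed() {
        return statusCode(200);
    }

    // The same call with the boxing written out explicitly.
    static Integer explicit() {
        return statusCode(Integer.valueOf(200));
    }

    public static void main(String[] args) {
        // Both paths produce an Integer holding 200, so this prints "true".
        System.out.println(autoboxed().equals(explicit()));
    }
}
```

In the Scala handler the compiler inserts scala.Predef.int2Integer at the same spot, so both handlers end up passing a boxed 200.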

After building the fat JARs with sbt assembly and mvn package, we see the first big difference: the Scala JAR is roughly eight times larger than the Java one, 5.8 MB vs. 0.7 MB. This is due to the bundled Scala library, which is around 5 MB in size.
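If we want to check where the bytes in a fat JAR come from, a small helper can sum the compressed entry sizes per top-level directory. This is only a sketch: JarSizeSketch and its method names are made up for illustration, and the JAR path is passed in as an argument.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class JarSizeSketch {
    // "scala/collection/Seq.class" -> "scala"; entries without a directory -> "(root)".
    static String topLevelDir(String entryName) {
        int slash = entryName.indexOf('/');
        return slash < 0 ? "(root)" : entryName.substring(0, slash);
    }

    // Sums the compressed size of all file entries, grouped by top-level directory.
    static Map<String, Long> compressedBytesByTopLevelDir(String jarPath) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (ZipFile jar = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                // getCompressedSize() returns -1 when the size is unknown; count that as 0.
                long size = Math.max(entry.getCompressedSize(), 0L);
                totals.merge(topLevelDir(entry.getName()), size, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a JAR, e.g. scala/target/scala-2.12/aws_lambda_bench_scala.jar
        compressedBytesByTopLevelDir(args[0])
            .forEach((dir, bytes) -> System.out.println(dir + "\t" + bytes));
    }
}
```

Run against the Scala JAR, the scala directory should account for most of the total if the Scala library is indeed the main contributor.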


Now we have to deploy the services to the cloud. For this we use Serverless, a toolkit for building serverless applications. We define our two functions in a YAML configuration file and give each of them a separate API Gateway HTTP endpoint. With a single command we can then deploy our serverless application to the cloud.

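service: lambda-java-scala-bench

provider:
  name: aws
  runtime: java8
  region: eu-central-1
  logRetentionInDays: 1

package:
  individually: true

functions:
  # The function keys (scalaLambda, javaLambda) are assumed names
  scalaLambda:
    handler: de.codecentric.amuttsch.awsbench.scala.ScalaLambda::handleRequest
    reservedConcurrency: 1
    package:
      artifact: scala/target/scala-2.12/aws_lambda_bench_scala.jar
    events:
    - http:
        path: scala
        method: get
  javaLambda:
    # handler: fully qualified name of JavaLambda followed by ::handleRequest
    reservedConcurrency: 1
    package:
      artifact: java/target/aws_lambda_bench_java-0.1.jar
    events:
    - http:
        path: java
        method: get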

After defining the name of our service, we set the provider to AWS and the runtime to java8. Since we use separate JAR files for our services, we have to set the individually key to true in the package section. Otherwise Serverless will look for a global package. In the functions themselves we set the handler, the package and an HTTP event. We do not take concurrent execution into consideration, so we limit the number of simultaneously active Lambdas to one using the reservedConcurrency key. We use the default memorySize of 1024 MB.

Now we deploy our stack with serverless deploy. After successful execution we get our service information containing the URLs to our functions:

  GET -
  GET -

Using curl, we can check that they are available and return an HTTP 200 response: curl -v
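The same check can be scripted in Java with nothing but the standard library. The following is a rough sketch under a few assumptions: EndpointCheck is a made-up class name, the base URL printed by serverless deploy is passed as the first argument, and the paths scala and java are the ones from the configuration above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class EndpointCheck {
    // Joins base URL and path with exactly one slash between them.
    static String endpoint(String baseUrl, String path) {
        String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        String tail = path.startsWith("/") ? path.substring(1) : path;
        return base + "/" + tail;
    }

    // Sends a GET request and returns the HTTP status code.
    static int statusOf(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("GET");
        // Generous timeouts, since the first request may hit a cold Lambda.
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(10_000);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String baseUrl = args[0]; // base URL of the deployed stage, supplied by the caller
        for (String path : new String[] {"scala", "java"}) {
            String url = endpoint(baseUrl, path);
            System.out.println(url + " -> " + statusOf(url));
        }
    }
}
```

Both lines should end in 200 if the deployment succeeded.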


The next step is to build a benchmark. For this we use Gatling, a load testing tool written in Scala. It makes it easy to build a load test and export a graphical report after the execution. In our case we are interested in two metrics: the response times of cold and of warm Lambdas. AWS kills inactive Lambda instances after an unspecified period of time to free up resources. When the function is triggered afterwards, the JVM has to start up again, which takes some time. So we create a third project and build a test case:

package de.codecentric.amuttsch.awsbench

import ch.qos.logback.classic.{Level, LoggerContext}
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import org.slf4j.LoggerFactory

import scala.concurrent.duration._

class LambdaBench extends Simulation {
  val context: LoggerContext = LoggerFactory.getILoggerFactory.asInstanceOf[LoggerContext]
  // Suppress logging (logger name and level are assumptions)
  context.getLogger("io.gatling.http").setLevel(Level.WARN)

  val baseFunctionUrl: String = sys.env("AWS_BENCH_BASE_URL")

  val httpProtocol = http
    .baseUrl(baseFunctionUrl)
    .acceptEncodingHeader("gzip, deflate")
    .userAgentHeader("Mozilla/5.0 (X11; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0")

  // Request names are assumptions; the paths match the endpoints in the Serverless configuration
  val scalaScenario = scenario("ScalaScenario")
    .exec(http("Scala").get("/scala"))

  val javaScenario = scenario("JavaScenario")
    .exec(http("Java").get("/java"))

  // Keep exactly one request in flight per scenario for two minutes
  setUp(
    scalaScenario.inject(constantConcurrentUsers(1) during(120 seconds)),
    javaScenario.inject(constantConcurrentUsers(1) during(120 seconds))
  ).protocols(httpProtocol)
}

First we suppress some logging, as Gatling logs every request to the console. We read our endpoint URL from the environment variable AWS_BENCH_BASE_URL and define an HTTP protocol configuration, in which we set the base URL, some headers and the user agent. It is used later for executing the specific requests. Next, we define two scenarios that point to the Scala and Java HTTP endpoints of our serverless application. In the last step we set up both scenarios so that each one constantly keeps a single request in flight for 120 seconds. Now we can start sbt and run the benchmark using gatling:test. We have to make sure the Lambdas are cold, otherwise we won’t get any cold boot timings. We can either wait a few minutes or remove and redeploy the stack. As soon as the run finishes, it prints a text report and provides us with a URL to the graphical report:

Benchmark with 1024MB RAM

benchmark result AWS runtime 1024MB

Each function was called around 3100 times within the two-minute time span. The time in the max column is the time of the first request when the Lambda function was cold. We can observe that the time until the first response is around 1.6 times as long for Scala as it is for Java. This observation holds true for multiple runs. The mean response time for both Scala and Java is around 38 ms.
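To make the distinction between the cold first request and the warm remainder concrete, here is a small Java sketch of the measurement idea. It is not how Gatling works internally. ColdWarmSketch and its method names are made up, the request is an arbitrary Runnable, and the latencies in main are invented for illustration, not taken from the benchmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ColdWarmSketch {
    // Issues requests one after another (one request in flight at a time)
    // until the time budget is used up, and records each latency in milliseconds.
    static List<Long> runSequentially(Runnable request, long durationMillis) {
        List<Long> latenciesMillis = new ArrayList<>();
        long deadline = System.nanoTime() + durationMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            long start = System.nanoTime();
            request.run();
            latenciesMillis.add((System.nanoTime() - start) / 1_000_000L);
        }
        return latenciesMillis;
    }

    // The first request hits the cold Lambda.
    static long firstMillis(List<Long> latenciesMillis) {
        return latenciesMillis.get(0);
    }

    // Mean over all requests except the first; NaN if there are none.
    static double warmMeanMillis(List<Long> latenciesMillis) {
        return latenciesMillis.stream()
            .skip(1)
            .mapToLong(Long::longValue)
            .average()
            .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Invented sample values, for illustration only.
        List<Long> sample = Arrays.asList(1600L, 40L, 36L, 38L);
        System.out.println("cold: " + firstMillis(sample) + " ms");
        System.out.println("warm mean: " + warmMeanMillis(sample) + " ms");
    }
}
```

Note that the mean in the Gatling report covers all requests including the first one, whereas this sketch deliberately excludes it so that the two numbers can be read side by side.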

Assigning 2048 MB RAM improved the startup time by ~300 ms for the Scala function and by ~200 ms for the Java function. The mean response time improved only negligibly:

Benchmark with 2048MB RAM

benchmark results AWS runtime 2048MB


Scala works great with AWS Lambda as it can be compiled to compatible Java 8 bytecode. You can use all the great features of the language when programming serverless applications. The startup time of a cold function is a bit longer than that of its Java counterpart, but improves when the function memory is increased. This test only focuses on the overhead of using the Scala runtime on top of the JVM. The results may vary for production-grade functions that actually perform CPU- or network-intensive tasks, and they depend heavily on the implementation and the libraries used.

You can find the code of the projects and the benchmark here: GitLab

Fullstack software engineer at codecentric with a focus on React, Typescript and testing.

