Scala Arrays – functional vs imperative

15.2.2016 | 5 minutes of reading time

The Scala collections, which are part of the standard library, are known for their vast number of high-level functional operations like map, flatMap, filter, sliding or groupBy, just to name a handful. These not only allow for high developer productivity – just imagine implementing something like groupBy yourself every time you need it – but usually also give us reasonable or even excellent performance. This proves true in particular when dealing with concurrent programming, because the default collections in Scala are immutable, and using immutable objects instead of synchronization or defensive copies often results in better overall performance.

Nevertheless there are some situations when we have to pay a substantial penalty for using these nice high-level and thread-safe collections. Luckily Scala is a multi-paradigm language geared to real-world applications and hence lets us pick the right tool among several for the job at hand: In these situations, when collections and functional programming don’t give us the performance we need, we can use arrays and imperative programming.

The Use Case: Calculate Hex-Code for Bytes

Let’s take a look at the following use case: Write a function that takes an array of bytes – which is represented as an Array[Byte] in Scala source code and as a native JVM array after compilation – and returns an array of characters (an Array[Char]) representing the concatenated hex-codes of all the bytes.

To give an example: Array[Byte](0, 1, 15, 16) should be transformed to Array[Char]('0', '0', '0', '1', '0', 'F', '1', '0'). As each byte corresponds to two hex characters, the resulting array has twice the size of the input.

Now you might ask how this is related to collections. Well, thanks to implicit conversions in the standard library, Scala allows us to treat an array as a collection of type scala.collection.Seq, i.e. like a sequence. Therefore we can apply all these high-level functional operations to arrays, e.g.:

scala> Array(1, 2, 3).map(_ + 1)
res0: Array[Int] = Array(2, 3, 4)

Of course we could use an existing library, e.g. Apache Commons Codec, but maybe we don’t want to depend on an external library for such a simple task or we just want to have some fun and hack some Scala ourselves.

Implementation Design

Scala allows us to add extension methods to any existing type without subclassing, simply by defining an implicit class that wraps a value of the type to be extended. By using a value class via extending AnyVal the Scala compiler is able to avoid creating instances of the wrapper and instead inline everything, so there’s negligible runtime overhead.

As we want to be able to call toHex and/or toHexString on any byte array, our implementation looks like this:

implicit class ByteArrayOps(val bytes: Array[Byte]) extends AnyVal {
  def toHexString: String = new String(toHex)
  def toHex: Array[Char] = ???
}

We are going to provide two implementations, one using a high-level and functional approach and one using imperative programming tuned for arrays.

Benchmarking

As we are interested in the performance of the different implementations, we obviously have to run some benchmarks. For the JVM, JMH is the de-facto standard for micro-benchmarks and since my former coworker Konrad Malawski has created sbt-jmh – an sbt-plugin for JMH – we can easily run JMH benchmarks from sbt, which is the build tool of our choice. All we have to do is add the following line to the plugins definition of our sbt project, which resides under project/plugins.sbt:

addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.2.6")

Functional Approach

As mentioned, we can treat an array as a sequence and hence we can use the aforementioned method flatMap to transform the given Array[Byte]:

def toHex: Array[Char] = bytes.flatMap { byte =>
  val high = digits((byte & 0xF0) >>> 4) // digits is Array('0', '1', ..., 'E', 'F')
  val low = digits(byte & 0x0F)
  Array(high, low)
}

For each byte from the given input we calculate the high and low hex character simply by using a suitable bit mask, some bit shifting when needed and a lookup of the appropriate hex character. Then we return a new array consisting of the two hex characters which is the reason why we have to use flatMap instead of map.
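The masking and shifting can be tried out on single bytes. The following is a minimal sketch, not the code from the post: the object name HexDigitsSketch and the helper hexPair are hypothetical, and digits is assumed to be a 16-element lookup table built from the string "0123456789ABCDEF". It also shows why the mask is applied before the shift: a negative byte is sign-extended when widened to Int, and the mask discards the extra high bits.

```scala
object HexDigitsSketch {
  // Assumed lookup table: index 0..15 maps to the matching hex character.
  val digits: Array[Char] = "0123456789ABCDEF".toCharArray

  // Hypothetical helper: the two hex characters for a single byte.
  def hexPair(byte: Byte): (Char, Char) = {
    // The byte is widened to Int (with sign extension) before `&` is applied;
    // masking with 0xF0 keeps only bits 4..7, the shift moves them to 0..3.
    val high = digits((byte & 0xF0) >>> 4)
    // Masking with 0x0F keeps only bits 0..3.
    val low = digits(byte & 0x0F)
    (high, low)
  }

  def main(args: Array[String]): Unit = {
    println(hexPair(15.toByte))   // (0,F)
    println(hexPair(16.toByte))   // (1,0)
    println(hexPair((-1).toByte)) // (F,F), the bit pattern 0xFF
  }
}
```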

For anybody familiar with the Scala collections, this approach should look straightforward. It should also be obvious that this implementation allocates a small intermediate array for each element of the given input. With a little pondering it becomes clear that flatMap cannot preallocate a single array for the return value either, because it does not know in advance how many elements each step produces; instead it has to grow the result as it goes. Hence we expect this approach to involve many allocations and a substantial amount of copying, two factors which might negatively impact performance. But let’s wait and see what the benchmarks tell us.

Imperative Approach

Now, instead of transforming the given input, we preallocate the result – which we can do because we know its size for this special case – and use a loop to index into the input and result:

def toHex: Array[Char] = {
  val hex = Array.ofDim[Char](bytes.length * 2) // 2 hex chars for each byte
  var n = 0
  while (n < bytes.length) {
    hex(n * 2) = digits((bytes(n) & 0xF0) >>> 4)
    hex(n * 2 + 1) = digits(bytes(n) & 0x0F)
    n += 1
  }
  hex
}

Of course this code is harder to understand, because instead of simply declaring what needs to be done it expresses in great detail how to compute the result. On the other hand we may expect better performance, because we don’t allocate any arrays except for the final result and perform all element access via index which is known to be very fast for arrays.
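Before comparing speed, it is worth checking that both variants produce the same output. The following is a minimal sketch under a few assumptions: the two approaches are written as plain functions rather than extension methods, the names HexEquivalenceSketch, toHexFunctional and toHexImperative are made up for this illustration, and digits is again assumed to be built from "0123456789ABCDEF".

```scala
object HexEquivalenceSketch {
  // Assumed lookup table: index 0..15 maps to the matching hex character.
  val digits: Array[Char] = "0123456789ABCDEF".toCharArray

  // Functional variant: one two-element array per input byte, flattened.
  def toHexFunctional(bytes: Array[Byte]): Array[Char] =
    bytes.flatMap { byte =>
      Array(digits((byte & 0xF0) >>> 4), digits(byte & 0x0F))
    }

  // Imperative variant: preallocated result, filled by index in a loop.
  def toHexImperative(bytes: Array[Byte]): Array[Char] = {
    val hex = Array.ofDim[Char](bytes.length * 2)
    var n = 0
    while (n < bytes.length) {
      hex(n * 2) = digits((bytes(n) & 0xF0) >>> 4)
      hex(n * 2 + 1) = digits(bytes(n) & 0x0F)
      n += 1
    }
    hex
  }

  def main(args: Array[String]): Unit = {
    // All 256 possible byte values, including the negative ones.
    val all: Array[Byte] = (-128 to 127).map(_.toByte).toArray
    // Compare via String, because == on arrays is reference equality.
    assert(new String(toHexFunctional(all)) == new String(toHexImperative(all)))
    println(new String(toHexImperative(Array[Byte](0, 1, 15, 16)))) // 00010F10
  }
}
```

Covering every possible byte value keeps the check cheap while also exercising the sign-extension case for negative bytes.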

Benchmark Results

Using the sbt-jmh plugin we can run some benchmarks. For a byte array of size 1,024 we get the following results:

jmh:run -wi 10 -i 10 -f 2 -t 1
...
[info] Benchmark                        Mode  Cnt       Score      Error  Units
[info] Benchmarks.benchmarkImperative  thrpt   20  544482.362 ± 4692.282  ops/s
[info] Benchmarks.benchmarkNaive       thrpt   20   40273.748 ±  733.762  ops/s

Of course we all know that we have to be very careful when interpreting results of micro-benchmarks. Nevertheless these results clearly show that the imperative approach achieves roughly 13.5 times the throughput of the functional one (about 544,000 vs. 40,000 ops/s), i.e. it is about one order of magnitude faster, which matches our earlier assumptions.
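The benchmark class itself is not shown in this post. The following is only a sketch of what it might look like: the names Benchmarks, benchmarkNaive and benchmarkImperative are taken from the output above, whereas the HexImpl helper object, the seeded random input and the overall structure are assumptions. It needs the JMH annotations on the classpath, which the sbt-jmh plugin provides; depending on the plugin version the project may also have to enable the plugin in its build definition.

```scala
import org.openjdk.jmh.annotations.{ Benchmark, Scope, State }

// Hypothetical helper object holding both variants as plain functions.
object HexImpl {
  private val digits: Array[Char] = "0123456789ABCDEF".toCharArray

  def naive(bytes: Array[Byte]): Array[Char] =
    bytes.flatMap(b => Array(digits((b & 0xF0) >>> 4), digits(b & 0x0F)))

  def imperative(bytes: Array[Byte]): Array[Char] = {
    val hex = Array.ofDim[Char](bytes.length * 2)
    var n = 0
    while (n < bytes.length) {
      hex(n * 2) = digits((bytes(n) & 0xF0) >>> 4)
      hex(n * 2 + 1) = digits(bytes(n) & 0x0F)
      n += 1
    }
    hex
  }
}

@State(Scope.Benchmark)
class Benchmarks {
  // 1,024 pseudo-random input bytes, created when the state object is
  // instantiated; the fixed seed is an arbitrary choice for repeatability.
  private val bytes: Array[Byte] = {
    val bs = new Array[Byte](1024)
    new scala.util.Random(42).nextBytes(bs)
    bs
  }

  // Returning the result hands it to JMH, so the JIT compiler cannot
  // discard the computation as dead code.
  @Benchmark
  def benchmarkNaive(): Array[Char] = HexImpl.naive(bytes)

  @Benchmark
  def benchmarkImperative(): Array[Char] = HexImpl.imperative(bytes)
}
```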

Conclusion

We have shown that in some situations an imperative approach using arrays can be much more performant than using the functional collection API. Of course this is not to promote imperative programming, but instead to show the flexibility of Scala and the freedom to pick the right tools.

The full source code is on GitHub. As always, comments are welcome.

Blog author

Heiko Seeberger
