Overview

Phantom Types in Scala

1 Comment

Inspired by a recent conversation with my former colleague Brendan McAdams and my current coworker Markus Hauck, I decided to put together a quick post about phantom types, a topic perfectly suited for demonstrating the power of the type system of Scala.

Phantom types are called this way, because they never get instantiated. Really? So what are they good for? Simply to encode type constraints, i.e. prevent some code from being compiled in certain situations.

Let’s look at an example: Software developers are known for their vast need of coffee. So let’s define the class Hacker that defines the two methods hackOn and drinkCoffee.

class Hacker {
 
  def hackOn: Hacker = {
    println("Hacking, hacking, hacking!")
    new Hacker
  }
 
  def drinkCoffee: Hacker = {
    println("Slurp ...")
    new Hacker
  }
}

In the spirit of functional programming and immutable objects, both methods return a new instance of Hacker.

Of course the current implementation doesn’t encode the following constraint which is imposed by the very nature of human beings in general and software developer in particular: After hacking a Hacker can’t continue doing so, but needs to drink some coffee. Put another way, we should not be able to call hackOn again on the Hacker returned by a previous call of hackOn:

scala> val hacker = new Hacker
hacker: Hacker = Hacker@3fb4f649
 
scala> hacker.hackOn.hackOn // This should not compile, but it does!
Hacking, hacking, hacking!
Hacking, hacking, hacking!
res0: Hacker = Hacker@2ff5659e

We could solve this problem by introducing two subclasses of Hacker – e.g. ExhaustedHacker and CaffeinatedHacker – which only define the respective methods returning the other subclass. But sometimes this is not a viable option, e.g. if we need subclassing to implement some other concern. Just think of a class hierarchy of animals or such: In this case the former approach would lead to an explosion of classes.

Therefore let’s implement the two states a hacker can be in using type parameters. First we define the possible states in the companion object:

object Hacker {
 
  sealed trait State
  object State {
    sealed trait Caffeinated extends State
    sealed trait Decaffeinated extends State
  }
}

Here we use sealed to prevent the possible states from being extended. Notice that we don’t define any concrete subclasses, hence nobody will ever be able to instantiate State, State.Caffeinated or State.Decaffeinated.

Next we add a type parameter to Hacker and constrain it to a subtype of Hacker.State using an upper bound:

class Hacker[S <: Hacker.State] {
  import Hacker._
 
  def hackOn: Hacker[State.Decaffeinated] = {
    println("Hacking, hacking, hacking!")
    new Hacker
  }
 
  def drinkCoffee: Hacker[State.Caffeinated] = {
    println("Slurp ...")
    new Hacker
  }
}

Unfortunately we’re not yet there:

scala> val hacker = new Hacker[Hacker.State.Caffeinated]
hacker: Hacker[Hacker.State.Caffeinated] = Hacker@33e5ccce
 
scala> hacker.hackOn
Hacking, hacking, hacking!
res0: Hacker[Hacker.State.Decaffeinated] = Hacker@34340fab
 
scala> hacker.hackOn.hackOn  // This should not compile, but it still does!
Hacking, hacking, hacking!
Hacking, hacking, hacking!
res1: Hacker[Hacker.State.Decaffeinated] = Hacker@3b6eb2ec

While calling hackOn returns a decaffeinated hacker, we’re still able to call hackOn a second time. This is not a surprise, because we haven’t constrained calling any of the two methods yet.

Type Bounds

Let’s fix that with a neat trick: In order to only allow calling hackOn on an instance of Hacker[State.Caffeinated], we have to make the compiler fail, if the Hacker is parameterized with any other State. Therefore we parameterize hackOn and constrain the parameter with an upper and a lower bound:

def hackOn[T >: S <: State.Caffeinated]: Hacker[State.Decaffeinated]

When the compiler tries to infer a type for T, it has to find one that is a supertype of S which represents the hacker’s state and at the same time a subtype of State.Caffeinated. Obviously this cannot work if S is State.Decaffeinated, because neither State.Decaffeinated nor any of its supertypes is a subtype of State.Caffeinated.

So essentially by sandwiching the type parameter T of the method in between the parameter S of the class and the desired state we constrain calling the method to such instances which are parameterized with the desired state. Let’s complete the code for Hacker, thereby making the constructor private and providing factory methods for Hackers in each of the two states:

object Hacker {
 
  sealed trait State
  object State {
    sealed trait Caffeinated extends State
    sealed trait Decaffeinated extends State
  }
 
  def caffeinated: Hacker[State.Caffeinated] = new Hacker
  def decaffeinated: Hacker[State.Decaffeinated] = new Hacker
}
 
class Hacker[S <: Hacker.State] private {
  import Hacker._
 
  def hackOn[T >: S <: State.Caffeinated]: Hacker[State.Decaffeinated] = {
    println("Hacking, hacking, hacking!")
    new Hacker
  }
 
  def drinkCoffee[T >: S <: State.Decaffeinated]: Hacker[State.Caffeinated] = {
    println("Slurp ...")
    new Hacker
  }
}

Now we can only call hackOn on a Hacker[State.Caffeinated], i.e. after calling drinkCoffee and vice versa:

scala> Hacker.caffeinated.hackOn.drinkCoffee.hackOn.drinkCoffee
Hacking, hacking, hacking!
Slurp ...
Hacking, hacking, hacking!
Slurp ...
res0: Hacker[Hacker.State.Caffeinated] = Hacker@6b3b6b21
 
scala> Hacker.caffeinated.hackOn.hackOn
<console>:14: error: type arguments [Hacker.State.Decaffeinated] do not conform to method hackOn's type parameter bounds [T >: Hacker.State.Decaffeinated <: Hacker.State.Caffeinated]

Unfortunately the error messages we get from this sandwiching trick are not immensely helpful: type arguments [Hacker.State.Decaffeinated] do not conform to method hackOn’s type parameter bounds [T >: Hacker.State.Decaffeinated <: Hacker.State.Caffeinated]

Implicit Evidence

Therefore we can use another approach: We replace the respective type bounds with an implicit parameter of type S =:= State.Caffeinated or S =:= State.Decaffeinated:

def hackOn(implicit ev: S =:= State.Caffeinated): Hacker[State.Decaffeinated] = {
  println("Hacking, hacking, hacking!")
  new Hacker
}
 
def drinkCoffee(implicit ev: S =:= State.Decaffeinated): Hacker[State.Caffeinated] = {
  println("Slurp ...")
  new Hacker
}

This makes the compiler look for an implicit value of type =:= parameterized with the proper types. Predef defines an implicit method which provides such instances if the left and right hand sides are of the exact same type:

object =:= {
   implicit def tpEquals[A]: A =:= A = singleton_=:=.asInstanceOf[A =:= A]
}

Therefore we get the same effect like before, i.e. we can only call hackOn if Hacker is parameterized with State.Caffeinated and drinkCoffee if Hacker is parameterized with State.Decaffeinated:

scala> Hacker.caffeinated.hackOn.drinkCoffee.hackOn.drinkCoffee
Hacking, hacking, hacking!
Slurp ...
Hacking, hacking, hacking!
Slurp ...
res0: Hacker[Hacker.State.Caffeinated] = Hacker@3461511d
 
scala> Hacker.caffeinated.hackOn.hackOn
<console>:14: error: Cannot prove that Hacker.State.Decaffeinated =:= Hacker.State.Caffeinated.

As you can see, we get a much better error message than before: Cannot prove that Hacker.State.Decaffeinated =:= Hacker.State.Caffeinated

Conclusion

We have shown that phantom types can be used to encode type constraints. Sometimes these can be implemented in a different way, e.g. via subclassing, but when that’s not possible, they are a powerful tool. Some examples for real world use cases – not saying that the affinity of software developers for coffee isn’t real – are state machines and builders.

If you’re interested in the full source code, check out the project on GitHub. Any questions or feedback are highly appreciated.

Tags

Kommentare

  • Markus Hauck

    9. February 2016 von Markus Hauck

    In case you still have some difficulties understanding the “sandwich”, here is a little more explanation:

    When the compiler tries to instantiate the T type parameter of the method, it has to find a type that is:
    1) a supertype of S, which represents the hacker’s current state (either State.Caffeinated or State.Decaffeinated)
    2) a subtype of State.Caffeinated

    To understand this, keep in mind that the inheritance relation is reflexive, this means that for a type A, both A <: A and A >: A. If we also take into account that the S parameter of Hacker is either “State.Caffeinated” or “State.Decaffeinated” we can have the following concrete type bounds:

    1) S = State.Decaffeinated, then the bound on hackOn is [T >: State.Decaffeinated <: State.Caffeinated] 2) S = State.Caffeinated, then the bound on hackOn is [T >: State.Caffeinated <: State.Caffeinated]For 1) there is no possible way to instantiate T without violating the type bounds. On the other hand in case 2) there is a single way to instantiate T and the compiler will happily infer it (which?)

Comment

Your email address will not be published. Required fields are marked *