Depth-subtyping and Mutation

November 26, 2018,
keywords: object-oriented programming, type safety

This is the blog version of a talk I did for the PLSE seminars at UofA.

In this post I am going to discuss the problems with adding mutation to an object-oriented programming language with depth-subtyping and propose a type-system which safely supports mutation while keeping a notion of depth-subtyping.

I will avoid formal definitions for this post and instead motivate ideas through examples. Many of these ideas will be familiar for anyone who has programmed in an object-oriented language before. I will also be very loose with terminology. For example, I might say objects are subtypes when I mean their types are subtypes.

What is Typing?

Before I can discuss depth-subtyping or even subtyping, I am going to discuss what I mean by typing. For this post I am going to focus on typing for values, rather than typing for program fragments.

So what are values? Values are things such as true, 1, 1.3, and the object {x: 2, y: 3}. This last example is an object with fields x and y which contain the numbers 2 and 3.

If you are familiar with object-oriented languages you will recognize the above values as having the types Bool, Int, Float, and {x: Int, y: Int}.

For this post, we are going to think of typing as the following:

The type of a value represents the operations we are allowed to perform on the value.

For example, if we have an Int, we are allowed to add it to another Int, but we are not allowed to add it to an object:

  • 1 + 3 is allowed
  • 1 + {x: 2, y: 3} is not allowed

What is Subtyping?

Let’s start with some examples:

  • Nat <: Int. We often think of natural numbers as being a subtype of integers.
  • {x: Int, y: Int, z: Int} <: {x: Int, y: Int}. If we have a object with fields x, y, and z are subtypes of objects with fields x and y.

But what does subtyping mean? Given the above notion of typing:

We say that T is a subtype of U (written T<:U) if all operations allowed on values of type U are allowed on values of type T.

Width-subtyping

Before we discuss depth-subtyping, let’s discuss a contrasting idea: width-subtyping. Let’s go back to the last example:

{x: Int, y: Int, z: Int} <: {x: Int, y: Int}

Any operations allowed on {x: Int, y: Int} are also allowed on {x: Int, y: Int, z: Int}; we just ignore the field z. This idea is known as width-subtyping: objects which are wider in fields are subtypes.

Most languages which support subtyping support width-subtyping in some way or the other. For example, in Java we can extend an object type with more fields.

What is Depth-subtyping?

Depth-subtyping is the idea that if we have subtypes T<:U, then objects containing Ts are subtypes of objects containing Us: {field: T} <: {field: U}. Contrasted with width-subtyping, instead of going wider in fields, we go deeper in fields.

For example, since we had Nat<:Int, we have PointNat<:PointInt where PointNat and PointInt are defined as:

    PointNat = {x: Nat, y: Nat}
    PointInt = {x: Int, y: Int}

And it can be depth-subtyping all the way down: objects containing {field: T} are subtypes of objects containing {field: U}, and so on.

We want Depth-subtyping!

Depth-subtyping can be very useful. Consider the following function:

    Int diff(PointInt inp){
      return (inp.x - inp.y)
    }

diff takes an object which contains Ints in the field x and y, and subtracts the x value from the y value and returns the answer.

Now suppose we have a PointNat p.

    PointNat p = new PointNat(3,4)

It is completely safe to pass p to the function diff, and depth-subtyping would allow us to reuse the diff function for PointNats. Without depth-subtyping we would have to rewrite the function for PointNats.

The diff function only relies on the input object having fields x and y containing values which can be treated as Ints. In general, it is safe to pass a PointNat to a function which expects its input to be a PointInt… unless the input is being mutated.

Mutation is Bad for Depth-subtyping!

To see why mutation is bad for depth-subtyping, we are going to introduce some more types.

    Point3D = {x: Int, y: Int, z: Int}
    Container[T] = {field: T}

Usual width-subtyping implies that Point3D <: PointInt. Consider the following program. We will explain it shortly.

1.    Point3D p3d = new Point3D(1, 2, 3)
2.    PointInt p2d = new PointInt(1, 2)
3.    Container[Point3D] c = new Container(p3d)
4.    Container[PointInt] c2 = c
5.    c2.field := p2d
6.    Point3d p = c.field
7.    return p.x + p.y + p.z

We first create points p3d and p2d, the first being a 3d-point (Point3D) and the second being a 2d-point (PointInt). Then, we create a 3d-point container, c. Next, we use depth-subtyping to cast c to a 2d-point container c2. Then we mutate the container with p2d using the reference c2. Now we use the original reference c, which is treated as a 3d-point container, to read its field as a 3d-point. Then we read the fields of the object and sum them up.

If we run this code, it will throw a runtime exception when p.z is executed! This is because, we when we read c.field, it actually contains a 2d-point, so it does not have a z field. So trying to read the non-existent z field throws a runtime exception.

Fun fact: If we used Array instead of Container, Java would compile the above code! The Array upcasting in Java was well intentioned, it would allow a form of depth-subtyping for arrays, but it is ultimately unsafe.

What went Wrong?

So what went wrong? The problem was that the operation “assign 2d-point”:

  • should be allowed for 2d-point containers, but
  • should not be allowed for 3d-point containers.

So not all operations allowed for 2d-point containers should be allowed for 3d-point containers, even though 3d-points are subtypes of 2d-points.

This might lead us to believe that we cannot allow depth-subtyping in languages with field mutation. Fortunately, this turns out not be true! We can recover depth-subtyping, or at least a notion thereof, if we use something which I call bounded field-typing.

Bounded Field-typing

In bounded field-typing, fields have bounded types such as the following:

    {fld: T..U}

We call the lower bound T the setter type of the field fld, and U the getter type. Usually, the setter type is a subtype of the getter type, T<:U.

If x has the above object type, reading the field fld produces a value of type U. Under bounded field-typing, field reads produce the getter type. For mutation, the assignment x.fld:=y is allowed if y has type T; i.e. we allow assignments of the setter type.

Recovering Depth-subtyping

In the bounded setting, we allow depth-subtyping for getter-types:

  • S<:U implies
  • {fld: T..S}<:{fld: T..U}.

{fld: T..U} allows us to read a U from the field fld. Since all operations on Us are allowed on Ss, we are allowed to read an S from a {fld: T..S} and treat it as an U.

We also allow depth-subtyping for setter-types, but in the other direction, i.e. subtyping is contravariant in the setter type.

  • S<:T implies
  • {fld: T..U}<:{fld: S..U}.

Notice how, S and T appear in opposite directions of the <: symbol. If we are allowed to write a value which allows the operations of T to be performed on it, then it is safe to write a value which allows more operations.

In the bounded setting, the previous types such as PointInt and PointNat are written with the setter and getter types being equal:

    PointNat = {x: Nat..Nat, y: Nat..Nat}
    PointInt = {x: Int..Int, y: Int..Int}

Safe Depth-subtyping

In the bounded setting, PointNat is not a subtype of PointInt. Since we do not have that Int<:Nat, their setter types are not in the subtyping relationship required for depth-subtyping. However, since Nat<:Int we have:

    PointNat <: {x: Nat..Int, y: Nat..Int}

Using the same subtyping relationship for the setter types for PointInt, we have:

    PointInt <: {x: Nat..Int, y: Nat..Int}

Using this, we can write a function diff which works for both PointInts and PointNats.

    Int diff({x: Nat..Int, y: Nat..Int} inp){
      return (inp.x - inp.y)
    }

At this point, you will notice that the above, diff function is not quite general enough. If we had a value of the following type, we would not be able to use diff on it, even integers are neither subtypes nor supertypes of natural numbers.

    PointEvenInt = {x: EvenInt..EvenInt, y: EvenInt..EvenInt}

For this reason, under bounded field-typing it is very useful to have a bottom type, . There is no way to create a value of type , but we assume that all operations are allowed on values of type . More importantly, ⊥<:T for any type T. This allows us to write diff as the following.

    Int diff({x:..Int, y:..Int} inp){
      return (inp.x - inp.y)
    }

Notice that we use for the setter types of our input parameter. We can now call the diff function with a PointEvenInt as its input, or any object as long as it contains field x and y which contain values which can be treated as Ints. So we have effectively recovered what we wanted from depth-subtyping!

3d-point Containers are not 2d-point Containers

Under bounded field-typing, we still have that 3d-points (Point3D) are subtypes of 2d-points (PointInt), since width-subtyping is still safe. As we discussed in the previous example, 3d-point Containers are not 2d-point Containers because they do not allow the operation “Assign 2d-point”. This is now reflected at the type level.

1.    Point3D p3d = new Point3D(1, 2, 3)
2.    PointInt p2d = new PointInt(1, 2)
3.    Container[Point3D] c = new Container(p3d)
4.    Container[PointInt] c2 = c
5.    c2.field := p2d
6.    Point3d p = c.field
7.    return p.x + p.y + p.z

Here the assignment in line 4 is not allowed and caught by the type system. To make the assignment valid, we can change the container type to a bounded type.

1.    Point3D p3d = new Point3D(1, 2, 3)
2.    PointInt p2d = new PointInt(1, 2)
3.    Container[Point3D] c = new Container(p3d)
4.    Container[Point3d..PointInt] c2 = c
5.    c2.field := p2d
6.    Point3d p = c.field
7.    return p.x + p.y + p.z

But now, the assignment in line 5 is not valid since it is only safe to assign 3d-points to c2. If we change the original Container of c type to PointInt, then the assignment is allowed.

1.    Point3D p3d = new Point3D(1, 2, 3)
2.    PointInt p2d = new PointInt(1, 2)
3.    Container[PointInt] c = new Container(p3d)
4.    Container[PointInt] c2 = c
5.    c2.field := p2d
6.    Point3d p = c.field
7.    return p.x + p.y + p.z

But now, the field read in line 6 is not allowed, since we are only allowed to assume that the field contains a PointInt. No matter what we do, the type system does not allow us to assign a 2d-point and read it back as a 3d-point.

Bounded field-typing really is safe.

This concludes the post. I hope this shed some light into why the “getter and setter methods” meme in Java exists. I’m going to leave you guys by saying that bounded field-typing really is safe! I formally proved it to be safe in an extension of the DOT calculus called κDot (officially kappa-dot, informally kay-dot). You can find the Coq code at here and the accompanying paper which I presented at the Scala Symposium here.

Updates

A friend pointed out that this post is being discussed on Reddit.