This is the blog version of a talk I did for the PLSE seminars at UofA.

In this post I am going to discuss the problems with adding mutation to an object-oriented programming language with depth-subtyping and propose a type-system which safely supports mutation while keeping a notion of depth-subtyping.

I will avoid formal definitions for this post and instead motivate ideas through examples. Many of these ideas will be familiar for anyone who has programmed in an object-oriented language before. I will also be very loose with terminology. For example, I might say objects are subtypes when I mean their types are subtypes.

# What is Typing?

Before I can discuss depth-subtyping or even subtyping, I am going to discuss what I mean by typing. For this post I am going to focus on typing for values, rather than typing for program fragments.

So what are values? Values are things such as `true`

, `1`

, `1.3`

, and the object `{x: 2, y: 3}`

. This last example is an object with fields `x`

and `y`

which contain the numbers `2`

and `3`

.

If you are familiar with object-oriented languages you will recognize the above values as having the types `Bool`

, `Int`

, `Float`

, and `{x: Int, y: Int}`

.

For this post, we are going to think of typing as the following:

The type of a value represents the operations we are allowed to perform on the value.

For example, if we have an `Int`

, we are allowed to add it to another `Int`

, but we are not allowed to add it to an object:

`1 + 3`

is allowed`1 + {x: 2, y: 3}`

is**not**allowed

# What is Subtyping?

Let’s start with some examples:

`Nat <: Int`

. We often think of natural numbers as being a subtype of integers.`{x: Int, y: Int, z: Int} <: {x: Int, y: Int}`

. If we have a object with fields`x`

,`y`

, and`z`

are subtypes of objects with fields`x`

and`y`

.

But what does subtyping mean? Given the above notion of typing:

We say that`T`

is a subtype of`U`

(written`T<:U`

) if all operations allowed on values of type`U`

are allowed on values of type`T`

.

# Width-subtyping

Before we discuss depth-subtyping, let’s discuss a contrasting idea: **width-subtyping**. Let’s go back to the last example:

`{x: Int, y: Int, z: Int} <: {x: Int, y: Int}`

Any operations allowed on `{x: Int, y: Int}`

are also allowed on `{x: Int, y: Int, z: Int}`

; we just ignore the field `z`

. This idea is known as width-subtyping: objects which are wider in fields are subtypes.

Most languages which support subtyping support width-subtyping in some way or the other. For example, in `Java`

we can extend an object type with more fields.

# What is Depth-subtyping?

Depth-subtyping is the idea that if we have subtypes `T<:U`

, then objects containing `T`

s are subtypes of objects containing `U`

s: `{field: T} <: {field: U}`

. Contrasted with width-subtyping, instead of going wider in fields, we go deeper in fields.

For example, since we had `Nat<:Int`

, we have `PointNat<:PointInt`

where `PointNat`

and `PointInt`

are defined as:

And it can be depth-subtyping all the way down: objects containing `{field: T}`

are subtypes of objects containing `{field: U}`

, and so on.

# We want Depth-subtyping!

Depth-subtyping can be very useful. Consider the following function:

`diff`

takes an object which contains `Int`

s in the field `x`

and `y`

, and subtracts the `x`

value from the `y`

value and returns the answer.

Now suppose we have a `PointNat p`

.

It is **completely safe** to pass `p`

to the function `diff`

, and depth-subtyping would allow us to reuse the diff function for `PointNat`

s. Without depth-subtyping we would have to rewrite the function for `PointNat`

s.

The `diff`

function only relies on the input object having fields `x`

and `y`

containing values which can be treated as `Int`

s. In general, it is safe to pass a `PointNat`

to a function which expects its input to be a `PointInt`

… unless the input is being mutated.

# Mutation is Bad for Depth-subtyping!

To see why mutation is bad for depth-subtyping, we are going to introduce some more types.

Usual width-subtyping implies that `Point3D <: PointInt`

. Consider the following program. We will explain it shortly.

```
1. Point3D p3d = new Point3D(1, 2, 3)
2. PointInt p2d = new PointInt(1, 2)
3. Container[Point3D] c = new Container(p3d)
4. Container[PointInt] c2 = c
5. c2.field := p2d
6. Point3d p = c.field
7. return p.x + p.y + p.z
```

We first create points `p3d`

and `p2d`

, the first being a 3d-point (`Point3D`

) and the second being a 2d-point (`PointInt`

). Then, we create a 3d-point container, `c`

. Next, we use depth-subtyping to cast `c`

to a 2d-point container `c2`

. Then we mutate the container with `p2d`

using the reference `c2`

. Now we use the original reference `c`

, which is treated as a 3d-point container, to read its field as a 3d-point. Then we read the fields of the object and sum them up.

If we run this code, it will throw a runtime exception when `p.z`

is executed! This is because, we when we read `c.field`

, it actually contains a 2d-point, so it does not have a `z`

field. So trying to read the non-existent `z`

field throws a runtime exception.

Fun fact: If we used `Array`

instead of `Container`

, `Java`

would compile the above code! The `Array`

upcasting in Java was well intentioned, it would allow a form of depth-subtyping for arrays, but it is ultimately unsafe.

# What went Wrong?

So what went wrong? The problem was that the operation “assign 2d-point” is:

- should be allowed for 2d-point containers, but
- should not be allowed for 3d-point containers.

So not all operations allowed for 2d-point containers should be allowed for 3d-point containers, even though 3d-points are subtypes of 2d-points.

This might lead us to believe that we cannot allow depth-subtyping in languages with field mutation. Fortunately, this turns out not be true! We can recover depth-subtyping, or at least a notion thereof, if we use something which I call **bounded field-subtyping**.

# Bounded Field-typing

In bounded field-typing, fields have bounded types such as the following:

` {fld: T..U}`

We call the lower bound `T`

the *setter type* of the field `fld`

, and `U`

the *getter type*. Usually, the setter type is a subtype of the getter type, `T<:U`

.

If `x`

has the above object type, reading the field `fld`

produces a value of type `U`

. Under bounded field-typing, field reads produce the getter type. For mutation, the assignment `x.fld:=y`

is allowed if `y`

has type `T`

; i.e. we allow assignments of the setter type.

# Recovering Depth-subtyping

In the bounded setting, we allow depth-subtyping for getter-types:

`S<:U`

implies`{fld: T..S}<:{fld: T..U}`

.

`{fld: T..U}`

allows us to read a `U`

from the field `fld`

. Since all operations on `U`

s are allowed on `S`

s, we are allowed to read an `S`

from a `{fld: T..S}`

and treat it as an `U`

.

We also allow depth-subtyping for setter-types, but in the other direction, i.e. subtyping is covariant in the setter type.

`S<:T`

implies`{fld: T..U}<:{fld: S..U}`

.

Notice how, `S`

and `T`

appear in opposite directions of the `<:`

symbol. If we are allowed to write a value which allows the operations of `T`

to be performed on it, then it is safe to write a value which allows more operations.

In the bounded setting, the previous types such as `PointInt`

and `PointNat`

are written with the setter and getter types being equal:

# Safe Depth-subtyping

In the bounded setting, `PointNat`

is not a subtype of `PointInt`

. Since we do **not** have that `Int<:Nat`

, their setter types are not in the subtyping relationship required for depth-subtyping. However, since `Nat<:Int`

we have:

Using the same subtyping relationship for the setter types for `PointInt`

, we have:

Using this, we can write a function `diff`

which works for both `PointInt`

s and `PointNat`

s.

At this point, you will notice that the above, `diff`

function is not quite general enough. If we had a value of the following type, we would not be able to use `diff`

on it, even integers are neither subtypes nor supertypes of natural numbers.

For this reason, under bounded field-typing it is very useful to have a bottom type, `⊥`

. There is no way to create a value of type `⊥`

, but we assume that all operations are allowed on values of type `⊥`

. More importantly, `⊥<:T`

for any type `T`

. This allows us to write `diff`

as the following.

Notice that we use `⊥`

for the setter types of our input parameter. We can now call the `diff`

function with a `PointEvenInt`

as its input, or any object as long as it contains field `x`

and `y`

which contain values which can be treated as `Int`

s. So we have effectively recovered what we wanted from depth-subtyping!

# 3d-point Containers are not 2d-point Containers

Under bounded field-typing, we still have that 3d-points (`Point3D`

) are subtypes of 2d-points (`PointInt`

), since width-subtyping is still safe. As we discussed in the previous example, 3d-point Containers are not 2d-point Containers because they do not allow the operation “Assign 2d-point”. This is now reflected at the type level.

```
1. Point3D p3d = new Point3D(1, 2, 3)
2. PointInt p2d = new PointInt(1, 2)
3. Container[Point3D] c = new Container(p3d)
4. Container[PointInt] c2 = c
5. c2.field := p2d
6. Point3d p = c.field
7. return p.x + p.y + p.z
```

Here the assignment in line 4 is not allowed and caught by the type system. To make the assignment valid, we can change the container type to a bounded type.

```
1. Point3D p3d = new Point3D(1, 2, 3)
2. PointInt p2d = new PointInt(1, 2)
3. Container[Point3D] c = new Container(p3d)
4. Container[Point3d..PointInt] c2 = c
5. c2.field := p2d
6. Point3d p = c.field
7. return p.x + p.y + p.z
```

But now, the assignment in line 5 is not valid since it is only safe to assign 3d-points to `c2`

. If we change the original Container of `c`

type to `PointInt`

, then the assignment is allowed.

```
1. Point3D p3d = new Point3D(1, 2, 3)
2. PointInt p2d = new PointInt(1, 2)
3. Container[PointInt] c = new Container(p3d)
4. Container[PointInt] c2 = c
5. c2.field := p2d
6. Point3d p = c.field
7. return p.x + p.y + p.z
```

But now, the field read in line 6 is not allowed, since we are only allowed to assume that the field contains a `PointInt`

. No matter what we do, the type system does not allow us to assign a 2d-point and read it back as a 3d-point.

# Bounded field-typing really is safe.

This concludes the post. I hope this shed some light into why the “getter and setter methods” meme in Java exists. I’m going to leave you guys by saying that bounded field-typing really is safe! I formally proved it to be safe in an extension of the DOT calculus called *κ*Dot (officially kappa-dot, informally kay-dot). You can find the Coq code at here and the accompanying paper which I presented at the Scala Symposium here.