Recently I have been working through a compiler book on the side, and there were two places where I wanted custom types but did not want to define them globally. The types were going to be used locally within a function, and it felt odd to define them globally when no other function was going to use them. Even defining them locally in the same file and then then hiding it in the interface file felt odd, I just needed the types for single functions, nothing else in the module!
This is what I mean by Conjuring Types from the Ether: having access to types without having to declare them! Generally you want lightweight syntax for creating values of these types, otherwise it would be worth the effort to declare them.
Both the places where I had the problem are simply cases of Boolean Blindness, and both stem from me trying to keep pattern matching simple!
Boolean Blindness
The first case where I wanted to conjure types from the ether is a more classic version of Boolean Blindness. I had some code like this:
let global_function_exposed_in_module params =
...let check_function args = ... in
match check_function params with
true -> ??
| false -> ?? |
As I was trying to fill in the above ??
, I realized that I had forgotten what
true
meant and what false
meant.
And the actualy nome of check_function
was weird enough that I couldn’t figure
out if true
meant success or not, check_function
is just a place holder name
I’m using for this post.
This is exactly Boolean Blindness! As Bob Harper put it I have “blinded” myself by “reducing the information” I had “at hand to a bit, and then trying to recover that information later by remembering the provenance of that bit”.
The JavaScript/TypeScript Solution:
At this point, if I were using a in a language like JavaScript, I would go and
modify check_function
to return the strings "Success"
or "Fail"
.
This is somewhat natural in JavaScript because you can case match on strings.
TypeScript with it’s literal types improves things!
The case match on "Success"
and "Fail"
would be compile time checked to be
exhaustive!
(Disclaimer: I haven’t actually checked if literal types can be conjured from
the ether or not.)
The Racket/Lisp Solution:
In Racket, or any Lisp with pattern matching, the situation is ever so slightly
improved because you have access to symbols.
You can return the symbols 'Success
or 'Fail
, and the situation is again
improved by using typed racket for exhastivity checking.
But overall this is very similar to the JavaScript/TypeScript case.
The OCaml Solution: I was programming in OCaml, and I was really happy that OCaml has something similar to symbols, called Polymorphic Variants. Generally, Polymorphic Variants can carry around data other pieces of data just like like Algebraic Data Types, and have interesting structural subtyping between themselves. These properties weren’t that useful to me, but what was useful was being able to conjure the Polymorphic Variants from the Ether without having to declare them globally.
let global_function_exposed_in_module params =
...let check_function args: [`Success|`Fail] = ... in
match check_function params with
| `Success -> ?? | `Fail -> ??
The use of the type [``Success|``Fail]
is contained completely within this
function, and I also get a exhaustivity check!
Position Blindness?
This second case where I wanted to conjure types from the other involved tuples
and records.
I was traversing the AST, and creating sets of two kinds of variables.
Initially my code looked like the following, collect_exp
, and collect_dec
are mutually recursive.
let collect_exp (env: (var Set.t * var Set.t)) params:
Set.t * var Set.t =
var
...and collect_dec (env: (var Set.t * var Set.t)) params:
Set.t * var Set.t =
var
...let (collect_a?, collect_b?) = collect_exp args in
...
Midway through collect_dec
I forgot what kind of variable collect_exp
returned in which position.
Normally, would know which position is what based on the differences in the
types in the two positions, but here both the types are the same.
This again is another kind of boolean blindness. To illustrate this, I’m gong to switch to Standard ML syntax.
let x = collect_exp args in
let collect_a = x#1? in
let collect_b = x#2? in
...
To choose between variable kind a
and variable kind b
, I am relying on the
boolean 1|2
!
The solution here in an untyped setting is to modify the code to use
dictionaries/records instead of tuples.
So collect_exp
would return {kind_a: var Set.t, kind_b: var Set.b}
.
We are no longer blind to any positions, because we have names instead, just
like names in variants/symbols.
But declaring a type like the following feels overkill.
type record_used_in_foo_function_only =
Set.t
{ kind_a: var Set.t
; kind_b: var }
In OCaml, you can conjure object types from the ether, but the syntax there is a little heavy weight. For example, here’s how you would create an object representing the above record.
let my_object =
object
val kind_a = Set.empty
val kind_b = Set.empty
end
I consider the use of the keyword object
and the delimiter end
to be fairly
heavy.
Plus they cannot be easily destruct like tuples and records, and copy syntax for
objects is also somewhat ugly.
For records you can do { x with kind_a = Set.union y z }
.
In the end, I ended up just keeping the code as is and just going back to
collect_exp
to figure out which position is what, but I really wish I could
conjure a record type from the ether.
This is possible in other languages.
The examples I can think of easily right now are records in
Flix and
Elm.
In these languages you can destruct records with types conjured from the ether
using record style patern matching, and they also have lightweight copy syntax.
let {kind_a = y; kind_b = z} = collect_exp args in
An aternative that I did consider but then decided against is changing the tuples to contain polymorphic variants.
let collect_exp env params: ([`A of var Set.t], [`B of var Set.t]).
As I mentioned above, polymorphic variants can carry data, and here the variant
`A
caries a variable set with it.
Position Blindness in Function Arguments
This kind of positional blindness also happens when passing arguments to
functions of the same type.
For example, I can never remember which of the arguments of memcpy
is the
source and which is the destination without looking at the manpage.
The declared type is memcpy(void *, void *, size_t)
.
Most modern languages solve this by using named arguments.
Some languages like (iirc)
Swift, are
strict about not letting you pass named arguments positionally, but other
languages like
C#
allow for flexibility.
Alternatively, you can also ask that the arguments be records or structs.
E.g. what if the argument to memcpy(MemCpy)
was a struct?
typedef struct {
void * dest, src;
size_t size;
} MemCpy;