Type Checker and Interpreter for System F

System F is a type system wtih parametric polymorphism based on Simple Typed Lambda Calculus (STLC).

Compared with STLC, System F adds universal quantification over types. For example, a polymorphic identity function can be written like this $ \Lambda \alpha . \lambda x:\alpha . x $, and its has a forall type $ \forall \alpha. \alpha \rightarrow \alpha $. Or using type rules to state it: $ \vdash \Lambda \alpha . \lambda x:\alpha . x : \forall \alpha . \alpha \rightarrow \alpha$.

The capital lambda $\Lambda$ here is type abstraction, which introduces a fresh type variable $\alpha$. Just like applying a value to a normal lambda abstraction, we can also apply a concrete type to a type abstraction and obtain a specialized type. For example, apply type $num$ to this polymorphic identity function $(\Lambda \alpha . \lambda x:\alpha . x)[num]$ will derive a function $ \lambda x : num . x $, and it has function type $num \rightarrow num$.

This article shows the implementation of type checker and interpreter for a System F language with explicit type annotations (type inferences will be another topic). The language it supports has S-Expression based grammar, integer and boolean as primitive types, type level functions and type variables, as well as lambda abstractions. Here we use LAMBDA as type abstraction, and use @ for type-level application.

As a demostration, here is an example of a polymorphic identity function and plus arithmetic operations:

{let {[id : {forall {a} {a -> a}}
      [LAMBDA [a] {lambda {[x : a]} x}]]}
  {+ {[@ id num] 1} {{[@ id {num -> num}] {lambda {[x : num]} x}} 2}}}

; After type applications:

{+ {{lambda {[x : num]} x} 1} 
   {{{lambda {[x : {num -> num}]} x} {lambda {[x : num]} x}} 2}}

It is stating that there is an type-level identity function id with type $ \forall \alpha . \alpha \rightarrow \alpha$, we first apply type $num$ to id and obtain a specialized identity function with type $ num \rightarrow num $ and instantly apply it with number 1; then we apply type $num \rightarrow num$ to id and obtain a specialized identity function with type $(num \rightarrow num) \rightarrow (num \rightarrow num)$ and instantly apply it with {lambda {[x : num]} x}, the result comes out is also a function {lambda {[x : num]} x}. Later we apply this {lambda {[x : num]} x} with number 2. Finally, it is just (+ 1 2).

Interestingly, since we are just studying explicitly typed System F without any type reconstruction, the type checker and interpreter we implemented support Rank-$n$ types for any $n$. This means polymorphic functions (before type instantiation) can be used as first-class values. For example, this is a valid program but can not be written in SML which only supports Rank-1 types:

{let {[f : {forall {a} {a -> {forall {b} {{a -> b} -> b}}}}
  [LAMBDA [a] {lambda {[x : a]}
    [LAMBDA [b] {lambda {[g : {a -> b}]} {g x}}]}]]}
{[@ {[@ f num] 3} bool] {lambda {[x : num]} true}}}

In the above program, f has type $\forall \alpha . \alpha \rightarrow \forall \beta . (\alpha \rightarrow \beta) \rightarrow \beta$. You can see more examples in the last of this post.

The type checker and interpreter are based on lecture of CS6510 at University of Utah, but re-implemented in Racket.

Parsing

At the very beginning, we define some structures as AST which represent Values, Expressions and Types in our language.

;; Implementation of System F
#lang racket

(struct NumV (n) #:transparent)
(struct BoolV (b) #:transparent)
(struct ClosureV (arg body env) #:transparent)
(struct PolyV (body env) #:transparent)

(struct NumE (n) #:transparent)
(struct BoolE (b) #:transparent)
(struct IdE (id) #:transparent)
(struct PlusE (l r) #:transparent)
(struct MultE (l r) #:transparent)
(struct LamE (arg arg-type body) #:transparent)
(struct AppE (fun arg) #:transparent)
(struct TyLamE (arg body) #:transparent)
(struct TyAppE (tyfun tyarg) #:transparent)

(struct NumT () #:transparent)
(struct BoolT () #:transparent)
(struct FunT (arg result) #:transparent)
(struct VarT (name) #:transparent)
(struct ForallT (name tbody) #:transparent)

The parsers are rather straight forward, just transforming S-Expressions to the structures we defined above.

;; Parser

(define (parse s)
  (match s
    [(? number? x) (NumE x)]
    ['true (BoolE #t)]
    ['false (BoolE #f)]
    [(? symbol? x) (IdE x)]
    [`(+ ,l ,r) (PlusE (parse l) (parse r))]
    [`(* ,l ,r) (MultE (parse l) (parse r))]
    [`(let ([,var : ,ty ,val]) ,body)
     (AppE (LamE var (parse-type ty) (parse body)) (parse val))]
    [`(lambda ([,var : ,ty]) ,body)
     (LamE var (parse-type ty) (parse body))]
    [`(LAMBDA [,tvar] ,body) (TyLamE tvar (parse body))]
    [`(@ ,tyfun ,tyarg) (TyAppE (parse tyfun) (parse-type tyarg))]
    [`(,fun ,arg) (AppE (parse fun) (parse arg))]
    [else (error 'parse "invalid expression")]))

(define (parse-type t)
  (match t
    ['num (NumT)]
    ['bool (BoolT)]
    [(? symbol? tvar) (VarT tvar)]
    [`(,tyarg -> ,tyres) (FunT (parse-type tyarg) (parse-type tyres))]
    [`(forall (,tvar) ,t) (ForallT tvar (parse-type t))]
    [else (error 'parse-type "invalid type")]))

Environment and Type Environment

Then we define a make-lookup function to generate lookup functions of evaluation environment and type environment. We also have a type-var-lookup function, which is used for looking up type variable in type environment.

;; Environment & Type Environment

(define (make-lookup error-hint isa? name-of val-of)
  (λ (name vals)
    (cond [(empty? vals) (error error-hint "free variable")]
          [else (if (and (isa? (first vals))
                         (equal? name (name-of (first vals))))
                    (val-of (first vals))
                    ((make-lookup error-hint isa? name-of val-of) name (rest vals)))])))

(struct Binding (name val) #:transparent)
(define lookup (make-lookup 'lookup Binding? Binding-name Binding-val))
(define ext-env cons)

(struct TypeVar (id) #:transparent)
(struct TypeBinding (name type) #:transparent)
(define type-lookup (make-lookup 'type-lookup TypeBinding? TypeBinding-name TypeBinding-type))
(define type-var-lookup (make-lookup 'type-var-lookup TypeVar? TypeVar-id TypeVar-id))
(define ext-tenv cons)

Type Checker

The type checker is basically similar with a type checker for STLC, except the type abstraction TyLamE and type application TyAppE. For type abstraction TyLamE, we generate a forall type ForallT, and also type-check the body with an extended type environment. For type application, we need to firstly check the if the type argument is valid; then type-check the type function and expect a ForallT type, and substitute the formal type argument with the actual type argument in the body.

;; Type Checker

(define (type-error e ty)
  (error 'type-error "~a should has type: ~a" e ty))

(define (type-check e tenv)
  (match e
    [(NumE n) (NumT)]
    [(BoolE b) (BoolT)]
    [(IdE n) (type-lookup n tenv)]
    [(PlusE l r) (type-check-nums l r tenv)]
    [(MultE l r) (type-check-nums l r tenv)]
    [(LamE arg-name arg-type body)
     (type-var-check arg-type tenv)
     (FunT arg-type (type-check body
                                (ext-tenv (TypeBinding arg-name arg-type) tenv)))]
    [(AppE fun arg)
     (match (type-check fun tenv)
       [(FunT arg-type res-type)
        (if (equal? arg-type (type-check arg tenv))
            res-type
            (type-error arg arg-type))]
       [else (type-error fun "function")])]
    [(TyLamE n body)
     (ForallT n (type-check body (ext-tenv (TypeVar n) tenv)))]
    [(TyAppE tyfun tyarg)
     (type-var-check tyarg tenv)
     (match (type-check tyfun tenv)
       [(ForallT n body) (type-subst n tyarg body)]
       [else (type-error tyfun "polymorphic function")])]))

(define (type-check-nums l r tenv)
  (match* ((type-check l tenv) (type-check r tenv))
    [((NumT) (NumT)) (NumT)]
    [((NumT) _) (type-error r (NumT))]
    [(_ _) (type-error l (NumT))]))

(define (type-var-check arg-type tenv)
  (match arg-type
    [(NumT) (values)]
    [(BoolT) (values)]
    [(FunT arg res)
     (type-var-check arg tenv) (type-var-check res tenv)
     (values)]
    [(VarT id) (type-var-lookup id tenv) (values)]
    [(ForallT id ty) (type-var-check ty (ext-tenv (TypeVar id) tenv))]))

(define (type-subst what for in)
  (match in
    [(NumT) (NumT)]
    [(BoolT) (BoolT)]
    [(FunT arg res) (FunT (type-subst what for arg)
                          (type-subst what for res))]
    [(VarT n) (if (equal? what n) for in)]
    [(ForallT n body)
     (cond [(equal? what n) (ForallT n body)]
           [(free-type-var? n for)
            (define new-n (gen-name n 1 for body))
            (define new-body (type-subst n (VarT new-n) body))
            (type-subst what for (ForallT new-n new-body))]
           [else (ForallT n (type-subst what for body))])]))

(define (gen-name n i for body)
  (let ([new-n (string->symbol (string-append (symbol->string n)
                                              (number->string i)))])
    (if (or (free-type-var? new-n for)
            (free-type-var? new-n body))
        (gen-name n (+ i 1) for body)
        new-n)))

(define (free-type-var? n ty)
  (match ty
    [(NumT) #f]
    [(BoolT) #f]
    [(FunT a r) (or (free-type-var? n a) (free-type-var? r a))]
    [(VarT n^) (equal? n^ n)]
    [(ForallT n^ body)
     (if (equal? n n^) #f (free-type-var? n body))]))

Interpreter

The interpreter is a recursive function defined on expression structures we define at the beginning. Two interesting things: 1) evaluating a type abstraction expression TyLamE, we obtain a PolyV value, which will be used later; 2) applying a type variable to a PolyV value, we simply evaluate the body of PolyV value because it is guaranteed to be a well-typed program after type checking.

;; Interpreter

(define (interp expr env)
  (match expr
    [(IdE x) (lookup x env)]
    [(NumE n) (NumV n)]
    [(BoolE b) (BoolV b)]
    [(PlusE l r) (NumV (+ (NumV-n (interp l env))
                          (NumV-n (interp r env))))]
    [(MultE l r) (NumV (* (NumV-n (interp l env))
                          (NumV-n (interp r env))))]
    [(LamE n t body) (ClosureV n body env)]
    [(AppE fun arg) (match (interp fun env)
                      [(ClosureV n body env*)
                       (interp body (ext-env (Binding n (interp arg env)) env*))]
                      [else (error 'interp "not a function")])]
    [(TyLamE n body) (PolyV body env)]
    [(TyAppE tyfun tyarg)
     (match (interp tyfun env)
       [(PolyV body env*) (interp body env*)]
       [else (error 'interp "not a polymorphic function")])]))

(define mt-env empty)
(define mt-tenv empty)
(define (run prog)
  (define prog* (parse prog))
  (type-check prog* mt-tenv)
  (interp prog* mt-env))

Examples in System F

To run with some programs in System F, we can just use run function.

For examples:

(run '{let {[x : num 4]}
        {let {[y : num 5]}
          {{{lambda {[x : num]}
              {lambda {[y : num]}
                {+ x y}}} x} y}}})

(run '{let {[id :  {forall {a} {a -> a}}
            [LAMBDA [a] {lambda {[x : a]} x}]]}
            {+ {[@ id num] 1} {{[@ id {num -> num}] {lambda {[x : num]} x}} 2}}}

Also, we can encode booleans and numbers in System F. Here are some examples for boolean encodings:

(define Bool '{forall [a] {a -> {a -> a}}})
(define True '{LAMBDA [a] {lambda {[x : a]} {lambda {[y : a]} x}}})
(define False '{LAMBDA [a] {lambda {[x : a]} {lambda {[y : a]} y}}})
(define And `{lambda {[x : ,Bool]} {lambda {[y : ,Bool]} {{[@ x ,Bool] y} ,False}}})
(define Bool->Num `{lambda {[x : ,Bool]} {{[@ x num] 1} 0}})

(run `{let {[t : ,Bool ,True]}
        {let {[f : ,Bool ,False]}
          {let {[and : {,Bool -> {,Bool -> ,Bool}} ,And]}
            {,Bool->Num {{and t} f}}}}})

(run `{let {[t : ,Bool ,True]}
        {let {[f : ,Bool ,False]}
          {let {[and : {,Bool -> {,Bool -> ,Bool}} ,And]}
            {,Bool->Num {{and t} t}}}}})