### A Soundness Proof of STLC by Definitional Interpreters in Agda

Proving type soundness of the simply typed lambda-calculus (STLC) is a common and fun exercise in programming language study. This post is basically a literature Agda file that does this excercise. Different from Wright and Felleisen (1994)’s syntatic approach that proves the soundness via the progress and preservation lemma, here I show a simpler proof using definitional interpreters (Amin and Rompf 2017).

The idea is to first encode the result of a step-indexed definitional interpreters with two Maybe (or, option) monads: the first level represents whether the interpreter produces a result within a given step constraint, the second level represents whether the interpreter produces a meaningful value or an error when terminated.

Then, using this notion, we may express the type soundness theorem as: for any well-typed program e of type \tau, if the interpreter terminates within a step bound, the result must be a value of the same type \tau. In other words, there is no undefined or erroneous behavior during the runtime.

The following code/proof works under Agda 2.6.0.1 and standard library 1.2.

#### 1` `Imports

First, we import some datatypes from the Agda standard library.

```
open import Data.Sum
open import Data.Nat
open import Data.Maybe
open import Data.Nat.Properties
open import Data.Unit hiding (_≤_)
open import Data.List hiding (length)
open import Data.Empty using (⊥; ⊥-elim)
open import Data.Bool hiding (_<_; _≤_)
open import Data.Product renaming (_,_ to ⟨_,_⟩)
import Relation.Binary.PropositionalEquality as Eq
open Eq using (_≡_; refl; sym; cong; inspect)
```

#### 2` `A Few Helper Functions/Lemmas

The first helper function is the length function for list. There is indeed a length function provided by the standard library, which computes the length through foldr, and it complicates the proof a little bit. Here, we use a slightly simpler version that directly recurs over the input list xs.

```
length : ∀ {X : Set} (xs : List X) → ℕ
length [] = 0
length (x ∷ xs) = suc (length xs)
```

The next one case_of_ is just a nicer syntax for pattern-matching over some data, of which the second argument f can be a case lambda term.

```
case_of_ : ∀ {a b} {A : Set a} {B : Set b} → A → (A → B) → B
case x of f = f x
```

Then we may use two simple lemmas about natural numbers and ≤: one states that n is always less or equal than 1 + 1 + n; the other states that if 1 + n ≡ m, then n must be less or equal than m.

```
n≤ssn : ∀ (n : ℕ) → n ≤ suc (suc n)
n≤ssn zero = z≤n
n≤ssn (suc n) = s≤s (n≤ssn n)
sn≡m→n≤m : ∀ (n m : ℕ) → (suc n ≡ m) → n ≤ m
sn≡m→n≤m zero m eq = z≤n
sn≡m→n≤m (suc n) (suc (suc n)) refl = s≤s (n≤1+n n)
```

#### 3` `Definitions of STLC

The definitions of identifiers, environments, as well as type environments are quite standard. Identifiers are using the de Bruijin notation; environments and type environments are both lists.

```
data Value : Set
data Type : Set
Id = ℕ
Env = List Value
TEnv = List Type
```

We assume there is a base type \mathbb{B}, and we elect to use booleans for that purpose. Function types are denoted by _↠_.

```
data Type where
𝔹 : Type
_↠_ : Type → Type → Type
```

Literals such as tt and ff are both terms. Variables take the indices to construct terms. Applications need two terms, one is the function being applied, the other is its operand. When building lambda terms, we only need the body term, as the variables are represented by the de Bruijin notations.

```
data Term : Set where
tt : Term
ff : Term
var : Id → Term
app : Term → Term → Term
lam : Term → Term
```

Booleans are immediate values; functions with their defining environments are also values.

```
data Value where
boolv : Bool → Value
clov : Env → Term → Value
```

##### 3.1` `Accessing Environments

Environment lookups are conducted over the list representation. Given the variable, represented by a natural number, we count the list from right to left, starting from index 0, to find the right position of this variable. For example, [1,2,3] ⇓ 0 ≡ 3, and [1,2,3] ⇓ 2 ≡ 1. The lookup function _⇓_ returns a value of Maybe X, where X is the element type of the list, indicating that it is possible to fail looking up.

```
_⇓_ : ∀ {X : Set} (xs : List X) (n : ℕ) → Maybe X
[] ⇓ n = nothing
(x ∷ xs) ⇓ n = if n ≡ᵇ (length xs) then just x else xs ⇓ n
```

##### 3.2` `Type Judgement under Contexts

The first two cases are judgement about boolean literals: they are well-typed under any context Γ.

The third case ty-var states that if x is defined with type τ under context Γ, then var x is well-typed of τ.

The ty-app case is inductive: if term e1 has type τ1 ↠ τ2 under context Γ, and term e2 has type τ1 under the same context Γ, then we may conclude that app e1 e2 has type τ2 under this context Γ.

The last one ty-abs extends the context in its premise: if e has type τ2 under the extended context τ1 :: Γ, where τ1 is the argument type, then lam e is function of type τ1 ↠ τ2.

```
data _⊢_∶_ : TEnv → Term → Type → Set where
ty-tt : ∀ (Γ : TEnv) → Γ ⊢ tt ∶ 𝔹
ty-ff : ∀ (Γ : TEnv) → Γ ⊢ ff ∶ 𝔹
ty-var : ∀ (Γ : TEnv) (x : Id) (τ : Type)
→ Γ ⇓ x ≡ just τ
→ Γ ⊢ var x ∶ τ
ty-app : ∀ (Γ : TEnv) (e1 e2 : Term) (τ1 τ2 : Type)
→ Γ ⊢ e1 ∶ (τ1 ↠ τ2)
→ Γ ⊢ e2 ∶ τ1
→ Γ ⊢ app e1 e2 ∶ τ2
ty-abs : ∀ (Γ : TEnv) (e : Term) (τ1 τ2 : Type)
→ (τ1 ∷ Γ) ⊢ e ∶ τ2
→ Γ ⊢ lam e ∶ (τ1 ↠ τ2)
```

##### 3.3` `Types of Values

Besides the judgements of well-typed terms, we also have well-typed values judgement in form of v : τ. The first case of boolean values is simple: any value of boolv v has type 𝔹. The second case states that if ρ and Γ are well-formed environments, and term e has type τ2 under τ1 ∷ Γ, then we can conclude the closure value clov ρ e has function type τ1 ↠ τ2.

```
data wf-env : Env → TEnv → Set
data _∶_ : Value → Type → Set
data _∶_ where
ty-bool-val : ∀ (v : Bool) → (boolv v) ∶ 𝔹
ty-clo-val : ∀ (ρ : Env) (Γ : TEnv) (e : Term) (τ1 τ2 : Type)
→ wf-env ρ Γ
→ (τ1 ∷ Γ) ⊢ e ∶ τ2
→ (clov ρ e) ∶ (τ1 ↠ τ2)
```

##### 3.4` `Well-formed Environments

The well-formedness datatype relates a value environment and a type environment, stating that they must be consistent and extended in lock-step fashion.

```
data wf-env where
wf-nil : wf-env [] []
wf-cons : ∀ (v : Value) (τ : Type) (ρ : Env) (Γ : TEnv)
→ v ∶ τ
→ wf-env ρ Γ
→ wf-env (v ∷ ρ) (τ ∷ Γ)
```

Also, there is a lemma asserts that if well-formed environment ρ and type environment Γ has the same length:

```
wf-length : ∀ (ρ : Env) (Γ : TEnv)
→ wf-env ρ Γ
→ length ρ ≡ length Γ
wf-length [] [] wf-nil = refl
wf-length (v ∷ ρ) (τ ∷ Γ) (wf-cons v τ ρ Γ x wf) =
cong suc (wf-length ρ Γ wf)
```

#### 4` `The Interpreter

Now, we may define an interpreter for the simply typed λ-calculus. In fact, the interpreter only use the term definition – it does not use any judgement of well-type terms/values or well-formed environments.

Since we are programming in a total language, all function must terminate, we implement the interpreter eval with a step counter n as the decreasing term. If n is 0, nothing is returned, indicating that the interpreter has consumed all “fuel”. Otherwise, we can evaluate at least one more step. The cases for literal boolean values and closures (at the bottom) are straightforward. For variables, we need to loop up the environment. Note that the expression ρ ⇓ x is already type Maybe Value, so we just need to wrap another just to conform the return type of eval.

The function application is the interesting case. We now have terms e₁ and e₂, and we need to recursively invoke eval on both. Here to extract the result of recursive calls, we use Agda’s with-abstraction – essentially it performs pattern matching at the left-hand side of the equation =. For the first recursive call eval n ρ e₁, we need to consider 4 cases.

The first one nothing indicates that the interpreter has terminated too early when evaluating e₁, so we cannot do anything further. The second case just nothing indicates that there is already a run-time type error when evaluating; and the third case just (boolv x) indicates we now have a type error for this application. In either case, we cannot proceed and just nothing is returned. In the last case, e₁ is evaluated to a closure body and ρ′, then we may proceed to the evaluation of e₂, which is similar to the former one. When having a value of e₂, we do not need to destruct it further, as the argument can be any value. So, the final thing to do is to recursively call eval on the body expression with an extended environment.

```
eval : (n : ℕ) (ρ : Env) (e : Term) → Maybe (Maybe Value)
eval zero ρ e = nothing
eval (suc n) ρ tt = just (just (boolv true))
eval (suc n) ρ ff = just (just (boolv false))
eval (suc n) ρ (var x) = just (ρ ⇓ x)
eval (suc n) ρ (app e₁ e₂) with eval n ρ e₁
... | nothing = nothing
... | just nothing = just nothing
... | just (just (boolv x)) = just nothing
... | just (just (clov ρ′ body)) with eval n ρ e₂
... | nothing = nothing
... | just nothing = just nothing
... | just (just v) = eval n (v ∷ ρ′) body
eval (suc n) ρ (lam e) = just (just (clov ρ e))
```

#### 5` `Soundness

Finally, we can show the soundness proof. Formally, the safety theorem states that given two well-formed environments ρ and Γ, if a term e has type τ under context Γ, i.e., well-typed, and if the evaluation of e under ρ terminates, i.e., producing a result res, then this result res must be in the form of just v for some value v. Meanwhile, this value v has the same type τ as the expression before evaluation. In other word, the interpreter can neither produce a value of type other than τ, nor produce nothing (i.e., just nothing).

The return type of this function is a existential (Sigma) type, where we have to give a witness of such value v of value Value.

```
safety : ∀ (n : ℕ) (e : Term) (τ : Type) (ρ : Env)
(Γ : TEnv) (res : Maybe Value)
→ Γ ⊢ e ∶ τ
→ wf-env ρ Γ
→ eval n ρ e ≡ just res
→ Σ[ v ∈ Value ] (just v ≡ res) × (v ∶ τ)
```

To prove it, we first perform pattern matching (induction) over the “fuel” n. Agda is smart enough to figure out that n cannot be 0, as this contradicts with our hypothesis eval n ρ e ≡ just res. So we just need to consider the inductive case n = suc n′. Similar to the interpreter, literal values are straightforward to handle. We just need to construct a pair, where the left-hand side is the value, and the right-hand side is also a pair that shows res must be just v and v : τ.

```
safety (suc n) tt 𝔹 ρ Γ res wt wf refl =
⟨ boolv true , ⟨ refl , ty-bool-val true ⟩ ⟩
safety (suc n) ff 𝔹 ρ Γ res wt wf refl =
⟨ boolv false , ⟨ refl , ty-bool-val false ⟩ ⟩
safety (suc n) (lam e) τ ρ Γ res (ty-abs Γ e τ1 τ2 wt) wf refl =
⟨ (clov ρ e) , ⟨ refl , ty-clo-val ρ Γ e τ1 τ2 wf wt ⟩ ⟩
```

To show that variable reference is sound, we need to use a lemma index-safe-ex, which gives us the safety of ρ ⇓ x from Γ ⇓ x, if ρ and Γ are related by the well-formedness relation. We show the proof of this lemma at the end of this section.

```
safety (suc n) (var x) τ ρ Γ res (ty-var Γ x τ Γ⇓x) wf refl
with index-safe-ex ρ Γ x τ wf Γ⇓x
... | ⟨ v , ⟨ ρ⇓x , v∶τ ⟩ ⟩ = ⟨ v , ⟨ sym ρ⇓x , v∶τ ⟩ ⟩
```

Again, the most interesting and tricky case is all about function applications. As the interpreter pattern-matches eval n ρ e1 first, here we do the same thing using the with-abstraction. We also use inspect to extract the equivalence proof term, shown as Eq.[_] below. Fortunately, Agda knows that eval n ρ e1 cannot be nothing, because it contradicts with the hypothesis. Then we have three cases to consider:

just nothing: this case should be a contradiction. We may use a recursive call of safety over e1 to help Agda realize that. Note that in the λ term of the case, we directly write down the absurd pattern ().

just (boolv b): this case is also a contradiction. We use the same fashion as above to derive conflict.

- just (clov ρ′ bd): now e1 is evaluated to a closure value. We may proceed to destruct the evaluation of e2, which has two cases to consider:
nothing: Note that we have also matched ev (i.e., the evidence of eval n ρ e ≡ just res) at the left-hand side of the clause. If eval n ρ e2 is nothing, then ev cannot be a valid evidence of eval n ρ e ≡ just res. Therefore, we use () to denote this is a contradiction (at left-hand side of =).

just v2: e2 is evaluated to some potential value v2, which has type Maybe value. Now we have everything to apply our inductive hypothesis of e1 and e2 and build up the safety proof of applications. The inductive hypotheses give us evidences of e1 and e2 evaluating to well-typed values, where just v2′ ≡ v2. After a few preparation of the needed values, we can apply safety recursively over the function body bd with the extended environments v2’ ∷ ρ’ and τ1 ∷ Γ₁. Then the theorem is proved.

```
safety (suc n) (app e1 e2) τ ρ Γ res (ty-app Γ e1 e2 τ1 τ e1wt e2wt) wf ev
with eval n ρ e1 | inspect (eval n ρ) e1
... | just nothing | Eq.[ eqv1 ] =
case safety n e1 (τ1 ↠ τ) ρ Γ nothing e1wt wf eqv1 of
λ{ () }
... | just (just (boolv b)) | Eq.[ eqv1 ] =
case safety n e1 (τ1 ↠ τ) ρ Γ (just (boolv b)) e1wt wf eqv1 of
λ{ ⟨ (boolv b) , ⟨ refl , () ⟩ ⟩ }
... | just (just (clov ρ’ bd)) | Eq.[ eqv1 ]
with eval n ρ e2 | inspect (eval n ρ) e2 | ev
... | nothing | v2eq | ()
... | just v2 | Eq.[ eqv2 ] | w
with safety n e1 (τ1 ↠ τ) ρ Γ (just (clov ρ’ bd)) e1wt wf eqv1
| safety n e2 τ1 ρ Γ v2 e2wt wf eqv2
... | ⟨ clov ρ’ bd , ⟨ refl , ty-clo-val ρ’ Γ₁ bd τ1 τ wf’ app-res-tp ⟩ ⟩
| ⟨ v2’ , ⟨ refl , v2’t ⟩ ⟩ =
let ρ* = v2’ ∷ ρ’
Γ* = τ1 ∷ Γ₁
wf* = wf-cons v2’ τ1 ρ’ Γ₁ v2’t wf’
in safety n bd τ ρ* Γ* res app-res-tp wf* w
```

To prove the soundness above, we use a lemma stating that given two well-formed environments ρ and Γ, if we can extract a type τ at Γ ⇓ i, then we can also extract a value v at the same position from ρ. This value v has the same type τ. The proof of this lemma is shown below.

```
index-safe-ex : ∀ (ρ : Env) (Γ : TEnv) (i : ℕ) (τ : Type)
→ wf-env ρ Γ
→ Γ ⇓ i ≡ just τ
→ Σ[ v ∈ Value ] (ρ ⇓ i ≡ just v) × (v ∶ τ)
index-safe-ex (v₁ ∷ ρ) (τ₁ ∷ Γ) i τ (wf-cons v₁ τ₁ ρ Γ v₁∶τ₁ wf) Γ⇓i
rewrite wf-length ρ Γ wf
with i ≡ᵇ length Γ | Γ⇓i
... | false | w = index-safe-ex ρ Γ i τ wf w
... | true | refl = ⟨ v₁ , ⟨ refl , v₁∶τ₁ ⟩ ⟩
```

#### Bibliography

Nada Amin and Tiark Rompf. Type Soundness Proofs with Definitional Interpreters. In Proc. Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, 2017. https://doi.org/10.1145/3009837.3009866 |

A.K. Wright and M. Felleisen. A Syntactic Approach to Type Soundness. Inf. Comput. 115(1), pp. 38–94, 1994. https://doi.org/10.1006/inco.1994.1093 |