An Interpreter for a Tiny Object-Oriented Language

Kepp Calm, image from http://www.keepcalm-o-matic.co.uk/p/keep-calm-and-object-oriented-program/

This post talks about implementing an Object-Oriented language with S-Expression syntax. Though it is a small langauge, but with features of object, class, and inheritance. It sounds like complicated, but actually just takes more than a hundred lines of code in Racket (including a parser).

Object-Oriented language is a big topic, but here I just want to focus on core concepts and their implementation such as object, class and inheritance.

Object and Class

The most essential idea of object, is 1) an object cloud have some properties (or fields), 2) we can send some messages to an object (or say method invocation), then 3) the object may do something (or nothing) as a respond to that message.

For example, we can view a number as an object. 1 is an integer object, as well as 2. We can send a message plus to object 1 with an argument object 2, and 1 should response another integer object as 3.

Conceptually, we can imagine an object as a map, which maps the fields or methods name to value. Here our value cloud be numbers, and also cloud be functions.

Class is an abstraction of object. We define some fields and methods in the class, but without any concrete data (at least in our tiny language).

Grammar

We will use get statement to get a property from an object, and use send statement to invoke a method of an object, and use new statement to instantiate a class.

The tiny object-oriented language is defined as below:

<expr> := <symbol>
        | <number>
        | (<op> <expr> <expr>)
        | (lambda <var> <expr>)
        | this
        | (get <expr> <symbol>)
        | (new <symbol> <expr>*)
        | (send <expr> <symbol> <expr>)
        | (expr expr)

<op> := + | *

For example, this piece of Java code does some simple things: 1) create a new Point object, 2) add its x coordinates with 3, 3) create another Point object, 4) and get its x coordinates.

(new Point(1, 2)).moveX(3)

(new Point(2, 4)).get("x")

In our OO language, it can be expressed as the follows:

(send (new Point (1 2)) moveX 3)

(get (new Point(2 4)) x)

And the class definition part:

<class> := (class <symbol>
             (<field>*)
             (<method>*))

<method> := (<symbol> <expr>)

<field> := <symbol>

In the body of a class method, we use keyword this to point to the object itself. And we can access a field value through get this field-name, or call its method through send this msg arg.

So, for the previous example, we can define a Point class:

(class Point
  (x y)
  ((moveX (lambda (offset) (new Point ((+ (get this x) offset)
                                       (get this y)))))
   (add (lambda (p) (new Point ((+ (get this x) (get p x))
                                (+ (get this y) (get p y))))))))

We can see that the Point class has two methods, moveX and add. The add method just add coordinates of two points.

Implementation

Our implementation is based on a simple call-by-value interpreter with function and closure. To support object-oriented features, we have some new structs: This, Get, New and Send. And to support Class, we add Method and Class structs to the interpreter. To store field values of an object, we just use a linear associated list.

The parse function simply does the lexical and syntax parsing, which transform the input language to our intermediate representation structure (defined in the head of our Racket code). And parse-class and parse-method methods are used for parsing class and method definitions.

The main part of interpreter is interp function. It takes four arguments: the expression we need to evaluate, the environment, the defined classes list, and current this object.

When interpreting a New struct, we firstly find the definition of class that we wanted to instantiate. And evaluate all arguments of constructor, then check the number of arguments and the number of formal parameters whether are same, if so we create an Object value which has the name and all arugment values comes from new statement.

For the Get struct, we first evaluate the obj part, and check if it is an real Object. That is because in a untyped language setting, a user may pass a number or a lambda to send, so we have to do some runtime checks. Then we find the class definition of object. In the class definition we stored all the field names as a list, then we cloud simply do a linear search to find the value of field.

Dealing with the Send struct is very similar with Get. But the difference is after we get the definition of class we need to get the definition of method. Since the body of method is merely a Lambda structure, so here we can construct a new App construct with method body and argument, then pass the App to interp function recursively. Another thing here is we need to change the this-val of interp to make sure it is pointing to the current object.

The complete code is as follows:

#lang racket

;; Intermediate representation of language
(struct Id (x) #:transparent)
(struct Num (n) #:transparent)
(struct Plus (l r) #:transparent)
(struct Mult (l r) #:transparent)
(struct Lambda (arg-name body) #:transparent)
(struct This () #:transparent)
(struct Get (obj field-name) #:transparent)
(struct New (class-name args) #:transparent)
(struct Send (obj method-name arg) #:transparent)
(struct App (fun arg) #:transparent)

;; Constructors of method and class
(struct Method (name body) #:transparent)
(struct Class (name field-names methods) #:transparent)

;; Value
(struct Closure (arg body env) #:transparent)
(struct Object (class-name field-vals) #:transparent)

; Environment operations
(define mt-env '())
(define ext-env cons)
(define bind (lambda (n v) (cons n v)))
(define bind-name car)
(define bind-val cdr)

; symbol * (listof binding) -> Value
(define lookup
  (lambda (x env)
    (cond [(null? env) (error 'lookup "free variable")]
          [(equal? x (bind-name (car env)))
           (bind-val (car env))]
          [else (lookup x (cdr env))])))

; symbol -> Expr
(define (parse exp)
  (match exp
    ['this (This)]
    [(? number? x) (Num x)]
    [(? symbol? x) (Id x)]
    [`(+ ,l ,r) (Plus (parse l) (parse r))]
    [`(* ,l ,r) (Mult (parse l) (parse r))]
    [`(lambda (,arg) ,body) (Lambda arg (parse body))]
    [`(get ,obj ,name) (Get (parse obj) name)]
    [`(new ,class ,args) (New class (map parse args))]
    [`(send ,obj ,msg ,arg) (Send (parse obj) msg (parse arg))]
    [`(,fun ,arg) (App (parse fun) (parse arg))]))

; symbol -> Class
(define parse-class
  (lambda (c)
    (match c
      [`(class ,class-name ,args ,meths)
       (Class class-name args (map parse-method meths))])))

; symbol -> Method
(define parse-method
  (lambda (m)
    (match m
      [`(,method-name ,method-body)
       (Method method-name (parse method-body))])))

; Expr * Env * (listof Class) * This -> Value
(define interp
  (lambda (exp env clzz this-val)
    (let ([interp* (lambda (e) (interp e env clzz this-val))])
      (match exp
        [(Num n) n]
        [(Id x) (lookup x env)]
        [(This) this-val]
        [(Plus l r) (+ (interp* l) (interp* r))]
        [(Mult l r) (* (interp* l) (interp* r))]
        [(Lambda arg-name body) (Closure arg-name body env)]
        [(App fun arg)
         (let ([fun-v (interp* fun)]
               [arg-v (interp* arg)])
           (cond [(Closure? fun-v)
                  (interp (Closure-body fun-v)
                          (ext-env (bind (Closure-arg fun-v) arg-v)
                                         (Closure-env fun-v))
                          clzz this-val)]
                 [else (error 'interp "not a function")]))]
        [(New clz-name field-exprs)
         (let ([clz (find-class clz-name clzz)]
               [field-vals (map interp* field-exprs)])
           (if (= (length field-vals) (length (Class-field-names clz)))
               (Object clz-name field-vals)
               (error 'interp "wrong number of fields")))]
        [(Get obj field-name)
         (let ([obj-v (interp* obj)])
           (cond [(Object? obj-v)
                  (let ([clz (find-class (Object-class-name obj-v) clzz)])
                    (get-field field-name
                               (Class-field-names clz)
                               (Object-field-vals obj-v)))]
                 [else (error 'interp "not an object")]))]
        [(Send obj method-name arg)
         (let ([obj-v (interp* obj)])
           (cond [(Object? obj-v)
                  (letrec ([clz (find-class (Object-class-name obj-v) clzz)]
                           [mth (get-method method-name (Class-name clz) clzz)])
                    (interp (App (Method-body mth) arg) env clzz obj-v))]
                 [else (error 'interp "not an object")]))]))))

; symbol * symbol * (listof Class) -> Method
(define get-method
  (lambda (method-name class-name clzz)
    (letrec ([clz (find-class class-name clzz)]
             [mths (map (lambda (m) (cons (Method-name m) m))
                        (Class-methods clz))]
             [result (assoc method-name mths)])
      (if (equal? result #f) (error 'get-method "can not found")
          (cdr result)))))

; symbol * (listof symbol) * (listof Value) -> Value
(define get-field
  (lambda (name names vals)
    (let [(result (assoc name (map cons names vals)))]
      (if (equal? result #f) (error 'get-field "can not found")
          (cdr result)))))

; ('a -> symbol) -> symbol * (listof 'a) -> 'a
(define make-find
  (lambda (get-name)
    (lambda (name xs)
      (cond [(null? xs) (error 'find (string-append "not found " (symbol->string name)))]
            [(equal? name (get-name (car xs))) (car xs)]
            [else ((make-find get-name) name (cdr xs))]))))

; symbol * (listof Class) -> Class
(define find-class
    (make-find Class-name))

(define Point '{class Point
                 {x y}
                 {{moveX {lambda {offset} {new Point {{+ {get this x} offset}
                                                      {get this y}}}}}
                  {add {lambda {p} {new Point {{+ {get this x} {get p x}}
                                               {+ {get this y} {get p y}}}}}}}})

(define pre-defined-clzz
  (list (parse-class Point)))

(define interp-prog
  (lambda (exp)
    (interp (parse exp) mt-env pre-defined-clzz (void))))

; Test

(interp-prog '{* 2 3})                          ; => 6
(interp-prog '{get {new Point {3 4}} y})        ; => 4
(interp-prog '{{lambda {x} {+ x 1}} 2})         ; => 3
(interp-prog '{new Point {1 2}})                ; => (Object 'Point '(1 2))
(interp-prog '{send {new Point {1 2}} moveX 1}) ; => (Object 'Point '(2 2))

(interp-prog '{send
               {send {new Point {2 2}} moveX 2}
               moveX -2})                       ; => (Object 'Point '(2 2))

(interp-prog '{send
               {new Point {1 2}}
               add
               {new Point {3 4}}})              ; => (Object 'Point '(4 6))