演算空明诀 — 用 Emacs Lisp 实现 Simply Untyped λ Calculus

最近为了了解PL,开始看 Types and Programming Languages (下称TAPL)。知乎上说最好自己实现一遍解释器才会对λ演算有深入的了解,所以我就实现了一遍。实现过以后好像没有对λ演算醍醐灌顶, de Bruijn index 倒确实比以前明白了。

首先是Term和Context的(伪)定义:

;; Term = (Var index context-size)
;;      | (Abs term bound-var-name)
;;      | (App term term)
;; Context = [(name, nil)]

和书里不同的是去除了info,Context里的Binding用nil代替,其他一样。由于Emacs Lisp没有与Haskell、OCaml类似的类型系统,我就用S表达式代替(所以周围加了括号)这样正好方便类型匹配(pattern matching是这么说么)。类型匹配用的是Stefan Monnier写的pcase库。

下面是几个方便函数。

(defun var-name (context-elm)
  "Get name of CONTEXT-ELM."
  (car context-elm))

(defun add-var (var context)
  "Add VAR (string) to CONTEXT"
  (cons (list var nil) context))

(defun valuep (term)
  (eq (car term) 'Abs))

打印

把de Bruijin index转成变量名(x, y, etc)的函数。

(defun index2name (context index)
  "Translate INDEX to variable name in CONTEXT."
  (var-name (nth index context)))

(defun unique-name (context name)
  "Return a list (CONTEXT’ NAME’).
If NAME is in CONTEXT, generate a unique one (NAME’) and push to
CONTEXT (CONTEXT’), otherwise return (CONTEXT NAME)."
  (cl-labels ((f (context-elm) (equal (var-name context-elm) name))
              ;; f returns t if CONTEXT-ELM has the same name as NAME
              (or2 (a b) (or a b))) ; or is a special form
    (if (and context (cl-reduce #'or2 (mapcar #'f context)))
        ;; exists another NAME
        (unique-name context (concat name "'"))
      ;; NAME is unique, push to context
      (list (add-var name context) name))))

接下来就是打印函数。用pcase匹配 term 处理三种情况。打印的时候尽量不用括号,除非application里左边是abstraction,或者右边是application。

(defun show (term &optional context raw)
  "Display TERM in CONTEXT."
  (pcase term
    ;; variable
    (`(Var ,index ,size)
     (if (eq (length context) size)
         (if raw (number-to-string index)
           (index2name context index))
       (error "Bad index: %s, term: %s context: %s" index term context)))
    ;; abstraction
    (`(Abs ,body ,x)
     (pcase-let ((`(,context1 ,x1) (unique-name context x)))
       (format "λ%s.%s" (if raw "" x1) (show body context1 raw))))
    ;; application
    (`(App ,t1 ,t2)
     (let ((t1t (show t1 context raw))
           (t2t (show t2 context raw)))
       ;; add parenthesis for abs and app, but not var
       (format "%s%s"
               (if (eq (car t1) 'Var) t1t (concat "(" t1t ")"))
               (if (eq (car t2) 'App) (concat "(" t2t ")") t2t))))
    (_ (error "No matching pattern: %s" term))))

在scratch buffer里试试:

(show '(Var 0 1) '(("x" nil)))
"x"
(show '(App (Var 0 2) (Var 1 2)) '(("x" nil) ("y" nil)))
"xy"
(show '(Abs (App (Var 0 2) (Var 1 2)) "x")
      '(("y" nil)))
"λx.xy"

解析

手写内部结构有点痛苦,正好Emacs Lisp有方便的S表达式,写个简单的parser很容易。

;; Term := Abs | App | Var | (Term)
;; Abs  := λ Var Term
;; App  := Term Term
;; Var  := Symbol

(defun parse (term &optional context)
  "Parse TERM into internal structure.
CONTEXT is current context."
  (pcase term
    (`(λ ,var . ,body)
     `(Abs ,(parse body (add-var (symbol-name var) context))
           ,(symbol-name var)))
    (`(,t1 ,t2) `(App ,(parse t1 context)
                      ,(parse t2 context)))
    (`(,stuff) (parse stuff context))
    ((pred symbolp)
     (cl-labels ((f (a b) (equal a (var-name b))))
       `(Var ,(or (cl-position (symbol-name term) context :test #'f)
                  (error "No previous declaration: %s" term))
             ,(length context))))
    (_ (error "No matching pattern: %s" term))))

同样,在scratch buffer里试试:

(show (parse '(λ x x)))
"λx.x"
(show (parse '(λ x (λ y (x y)))))
"λx.λy.xy"
(show (parse '(((λ x (λ y (x y))) (λ x x)))))
"(λx.λy.xy)λx.x"
(show (parse '(λ x (λ x (x x)))))
"λx.λx'.x'x'"
(show (parse '(λ x ((λ x x) x))))
"λx.(λx'.x')x"

还可以看看de Bruijin index

(show (parse '(λ x ((λ x x) x))) nil t)
"λ.(λ.0)0"
(show (parse '(λ x (λ y (x y)))) nil t)
"λ.λ.10"
(show (parse '(λ x (λ y (λ z (x y) z)))) nil t)
"λ.λ.λ.(21)0"

Shift & substitution

书里用了一个本地的匿名函数(所以省略了 d ),我觉得分开写更清楚。基本上把书里shift和substitution的规则转化成代码就好了。

(defun shift (d term)
  (shift1 0 d term))

(defun shift1 (c d term)
  "Shift up D for indexes >= C in TERM."
  (pcase term
    ;; when we shift up, that always means we are put into another
    ;; layer of abstraction, so size + 1
    (`(Var ,idx ,size)
     (if (>= idx c) `(Var ,(+ idx d) ,(+ size d))
       `(Var ,idx ,(+ size d))))
    ;; go into one more layer of abstraction, c+1
    (`(Abs ,body ,var) `(Abs ,(shift1 (1+ c) d body) ,var))
    ;; simply recurs
    (`(App ,t1 ,t2) `(App ,(shift1 c d t1)
                          ,(shift1 c d t2)))
    (_ (error "No matching pattern: %s" term))))

(defun subst (j s term)
  "Substitute J with S in TERM.
I.e. “[j → s] t”."
  (pcase term
    (`(Var ,index ,size)
     (if (= index j) s term))
    (`(Abs ,body ,var)
     `(Abs ,(subst (1+ j) (shift 1 s) body) ,var))
    (`(App ,t1 ,t2)
     `(App ,(subst j s t1) ,(subst j s t2)))
    (_ (error "No matching pattern: %s" term))))

(defun beta (t1 v)
  "Perform β-reduction.
T1 is body of function, V is argument."
  (pcase t1
    (`(Abs ,body ,_) (shift -1 (subst 0 (shift 1 v) body)))
    (_ (error "No matching pattern: %s" t1))))

TAPL把shift和substitution抽象成一个,不太好理解。所以我就没费劲,还是写成两个。

Evaluation

同样,基本上是TAPL里三个evaluation规则的直译:

(defun eval1 (term &optional step context)
  "Evaluate TERM in CONTEXT.
STEP, if non-nil, is the number of steps to evaluate."
  (let* ((step (or step 1.0e+INF)))
    ;; t := var | abs | app
    ;; v := abs
    (if (<= step 0)
        term
      (pcase term
        (`(Var . ,_) term)
        (`(Abs . ,_) term)
        ;; first reduce t1 to a value
        ((and `(App ,t1 ,t2) (guard (not (valuep t1))))
         (eval1 `(App ,(eval1 t1 step context) ,t2) (1- step) context))
        ;; then reduce t2 to a value
        ((and `(App ,t1 ,t2) (guard (not (valuep t2))))
         (eval1 `(App ,t1 ,(eval1 t2 step context)) (1- step) context))
        ;; finally apply
        ((and `(App ,t1 ,t2) (guard (valuep t1)) (guard (valuep t2)))
         (eval1 (beta t1 t2) (1- step) context))
        ;; no more to apply
        (_ term)))))

如果是Typed lambda calculus,遇到最后没有下一步可走的情况应该报错,但是这里只是停止。来试几个简单的例子:

(show (parse '((λ x x) (λ y y))))
"(λx.x)λy.y"
(show (eval1 (parse '((λ x x) (λ y y)))))
"λy.y"

(show (parse '(((λ x x) (λ n (λ m m n))) ((λ y y) (λ z z)))))
"((λx.x)λn.λm.mn)((λy.y)λz.z)"
(show (eval1 (parse '(((λ x x) (λ n (λ m m n))) ((λ y y) (λ z z)))) 1))
"(λn.λm.mn)((λy.y)λz.z)"
(show (eval1 (parse '(((λ x x) (λ n (λ m m n))) ((λ y y) (λ z z)))) 2))
"(λn.λm.mn)λz.z"
(show (eval1 (parse '(((λ x x) (λ n (λ m m n))) ((λ y y) (λ z z)))) 3))
"λm.mλz.z"

感觉没问题,如果读者有发现错误,请不吝赐教。

编辑于 2019-12-29

文章被以下专栏收录