picopass: let's break nanopass down

1. Introduction

In the excellent library nanopass, we can define a language via define-language and make pattern matching on it by using define-pass or nanopass-case, the most important thing nanopass given here is the generated code for these syntaxes. This is extremely useful because usually, a compiler will have many uninterested paths, especially in multi-passes solutions, we have to write tons of code to handle them. However, Do I only get these from nanopass, a framework that cannot extend? No!

2. Breakdown

I break the problem into two parts

  1. a mark to help match recursion
  2. add missing cases that might occur in the domain

2.1. Mark recursion target

First, let's look at the usage

(struct add (l r))

(define (eval-+ e)
  (record eval-+)
  (match e
    [(add (R l) (R r)) (+ l r)]
    [(? number? v) v]))

(eval-+ (add 1 2))
  • record records the recursion function
  • R will let the pattern be re-applied with the recursion function

    This means if we have a pattern x, it will bind to (recur-function x)

The following is the implementation, quite simple actually.

(define-for-syntax recur-on? #f)
(define-syntax (record stx)
  (syntax-parse stx
    [(_ x:id) (set! recur-on? #'x)
	      #'(void x)]))

(define-match-expander R
  (λ (stx)
    (syntax-parse stx
      [(_ v:id)
       #`(app #,recur-on? v)])))
  • record puts it's target id into *phase 1*(compile time)
  • R pattern relies on app pattern

2.2. Generate missing cases

Here, we need to define what's the domain. For me, it's a set of structures I care about in the current pass. That's why I write the below code.

(define-for-syntax (struct->clause id-stx)
  (syntax-parse id-stx
     (define fields
       (reverse (struct-field-info-list (syntax-local-value id-stx))))
     #`[(s #,@(map (λ (x) `(R ,x)) fields))
	(s #,@fields)]]))

(define-syntax-parser rmatch
  [(_ target:expr
      #:recur on:id
      #:complete (s:id ...)
      c* ...)
       (record on)
       (match target
	 #,@(stx-map struct->clause #'(s ...))
	 c* ...))])

Here rmatch redo all we just done, and generates cases from the structure identifier list given by #:complete. What we care about here is struct->clause, this function extracts fields from the structure and generates R pattern for them, then preserves the structure. The following is how will you use it.

(struct add (l r))
(struct mul (l r))

(define (eval-+ e)
  (rmatch e
	  #:recur eval-+
	  #:complete (mul)
	  [(add (R l) (R r)) (+ l r)]
	  [(? number? v) v]))

(define (eval-* e)
  (rmatch e
	  #:recur eval-*
	  #:complete (add)
	  [(mul (R l) (R r)) (* l r)]
	  [(? number? v) v]))

((compose eval-*
 (mul (add 1 2)

This example has trouble of course since it's an evaluate function, breaking it into two functions will make invoke order that must correspond to the given AST structure. However, this also shows how the idea works.

3. Conclusion

You now get a workable macro rmatch, which can

  1. recur on any pass function you like to do so
  2. handle any structures you don't care yet

What can you do to get more?

  1. (? number? v) pattern should also be generated, extend #:complete part to get more possible input.
  2. #:complete takes a mark of collection/language. For example, #:complete λ0 generates all possible patterns for language λ0
  3. matching on multiple targets, and still cover all cases (should be a cartesian product now). In type checker, such use cases are quite usual, with this one can write

    (rmatch (tm ty)
      #:recur tyck
      #:complete (...)
      [((? number? v) Number) 'ok]
  4. handle typed/racket to get help from the type checker
  5. generated pattern should only recur on the interested type, here we didn't hit the problem, since our target is Expr itself and eval-+ is a function with domain Expr.

More ideas? Welcome to show your thought here, have a nice day.

Date: 2022-09-17 Sat 00:00

Author: Lîm Tsú-thuàn