Understand macro and its strategies by tracking typed/racket

typed/racket is huge and seems impossible to understand, so I want to make try about this situation. This article will give you a systematic way to learn how to break down a complex macro system, by showing you a short tracking into typed/racket.

The following code is all based on commit: 20acc4.

First, please download or browse the typed/racket repository and open your DrRacket editor. In the beginning, we start from a tiny example.

#lang typed/racket

(define a : Integer 1)

Put the code into DrRacket and click the Macro Stepper button, you will see a window showing some information like the following.

(#%plain-module-begin (define a : Integer 1))

→ [Macro transformation]

(#%plain-module-begin (begin (: a Integer) (define a 1)))

Thus, you know the very first transformation is:

(define <name> : <Typ> <exp>)
; ===>
(: <name> <Typ>)
(define <name> <exp>)

You can try to rewrite the code like this, and put the cursor on to a in (: a Integer), it refers to a in (define a 1). Don't you feel amazing? They are separate codes, but related to each other? Let's explore the next part to figure out this. Keep clicking the Macro Stepper, you will get the following.

(#%plain-module-begin
 (define-values () (:-helper (: a Integer) #f a Integer))
 (define a 1))

→  [Macro transformation]

(#%plain-module-begin
 (define-values () (:-helper (: a Integer) #f a Integer))
 (define-values (a) 1))

:-helper is the target we should care about, it will expand to :-internal. However, the definition isn't that clear, thus I made a small pruned version for it.

#lang racket
(require syntax/parse/define
         (for-syntax racket/syntax))

(define-syntax-parser :
  [(_ n:id ty)
   (syntax-property
    (syntax/loc #'n
      (begin (quote-syntax (:-internal n ty) #:local)))
    'disappeared-use (syntax-local-introduce #'n))])
(define-syntax-parser def
  #:datum-literals (:)
  [(_ n:id : ty exp)
   (with-syntax ([tmp-name (generate-temporary)])
     #'(begin
         (: n ty)
         (define n exp)))])

(def a : Int 0)
(: b Int)
(define b "abc")

This : skip :-helper stage and produces :-internal, in a quote-syntax! Another important thing here is the 'disappeared-use, this will tell racket where is the #'n came from, and (syntax-local-introduce #'n) refers to the context to know it(It's easy to determine because we generate (define n exp)). Keep tracking :-internal, we will see syntax-class type-declaration is trying to recognize it.

(define-internal-classes
  ...
  [type-declaration
    (:-internal id:identifier type)]
  ...)

This syntax-class is used in typed-racket-lib/typed-racket/typecheck/tc-toplevel.rkt tc-toplevel/pass1 for recording information based on :-internal, them will be used later.

;; syntax? -> (listof def-binding?)
(define (tc-toplevel/pass1 form)
  (parameterize ([current-orig-stx form])
    (syntax-parse form
      #:literals (values define-values #%plain-app begin define-syntaxes make-struct-type)
      ...
      [t:type-declaration
       (register-type/undefined #'t.id (add-extracted-props-to-lexical-env
                                        (-id-path #'r.name)
                                        (parse-type #'t.type)))
       (register-scoped-tvars #'t.id (parse-literal-alls #'t.type))
       (list)]
      ...)

The next is the type checking code, in typed-racket-lib/typed-racket/typecheck/tc-toplevel.rkt tc-toplevel/pass2 typed/racket extract type information what it just recorded, then invoke tc-expr/check to check #'expr.

;; typecheck the expressions of a module-top-level form
;; no side-effects
;; syntax? -> (or/c 'no-type tc-results/c)
(define (tc-toplevel/pass2 form [expected (-tc-any-results #f)])
  (do-time (format "pass2 ~a line ~a"
                   (if #t
                       (substring (~a (syntax-source form))
                                  (max 0 (- (string-length (~a (syntax-source form))) 20)))
                       (syntax-source form))
                   (syntax-line form)))
  (parameterize ([current-orig-stx form])
    (syntax-parse form
      #:literal-sets (kernel-literals)
      ...
      [(define-values (var:typed-id^ ...) expr)
       (let ([ts (attribute var.type)])
         (when (= 1 (length ts))
           (add-scoped-tvars #'expr (lookup-scoped-tvars (stx-car #'(var ...)))))
         (tc-expr/check #'expr (ret ts)))
       'no-type]
      ...)

Finally, we can see the typed-racket-lib/typed-racket/typecheck/tc-expr-unit.rkt tc-expr/check, as the following code. If you have interest, you can keep digging, but I'm going to jump out and introduce the big picture of the idea.

;; form : what expression are we typechecking?
;; expected : what is the expected tc-result (can be #f)
;; existential? : do we want to create an existential
;;   identifier for this expression if it does not
;;   have a non-trivual object? This is useful when
;;   the type of other expressions can depend on
;;   the specific type of this term.
(define (tc-expr/check form expected [existential? #f])
  (parameterize ([current-orig-stx form])
    ;; the argument must be syntax
    (unless (syntax? form)
      (int-err "bad form input to tc-expr: ~a" form))
    (define result
      ;; if there is an annotation, use that expected type for internal checking
      (syntax-parse form
        [exp:type-ascription^
         (add-scoped-tvars #'exp (parse-literal-alls (attribute exp.value)))
         (tc-expr/check/internal #'exp (parse-tc-results (attribute exp.value)))]
        [_ (reduce-tc-results/subsumption
            (tc-expr/check/internal form expected))]))
    ;; if 'existential?' is true, then it means this expression should be
    ;; given an existential identifier as an object if it has no object
    (define adjusted-result
      (cond
        [existential?
         (match result
           [(tc-result1: t ps (not (? Object?)))
            (ret t ps (-id-path (gen-existential-id)))]
           [_ result])]
        [else result]))
    (add-typeof-expr form adjusted-result)
    (cond-check-below adjusted-result expected)))

Back to tc-toplevel/pass1, I can tell the big picture now, the following code type-check will check the module and invoke tc-toplevel/pass1 for every top-level form.

;; actually do the work on a module
;; produces prelude and post-lude syntax objects
;; syntax-list -> (values syntax syntax)
(define (type-check forms0)
  ...)

In the very beginning, we already know the transformation. Hence, what typed/racket does, is convert your code into two parts:

The type information and others meta data be quoted by quote-syntax.
The code will be executed in the final and need to be checked, with the information collects above.

This is all, hope you learn something and realize what can the interceptor point like #%module-begin can bring, and maybe have some ideas about how to use these!