Tuesday, June 23, 2009

Serializable Closures in PLT Scheme

PLT Scheme supports an extensible serialization system for structures. A structure is serializable if it has a prop:serializable property. There are many properties in PLT Scheme for other extensions, such as applicable structures and custom equality predicates.

The PLT Web application development framework uses these features to provide serializable continuations through a number of source transformations and a serializable closure structure.

Warning: This remainder post refers to features only available in the latest SVN revision of PLT Scheme.

I've recently made these closures more accessible to non-Web programs through web-server/lang/serial-lambda. Here's a demo:

#lang scheme
(require web-server/lang/serial-lambda
         scheme/serialize)

(define f
  (let ([z 5])
    (serial-lambda
     (x y)
     (+ x y z))))

(define (test-it)
  (printf "~S~n" (f 1 2))
  (let ([fs (serialize f)])
    (printf "~S~n" fs)
    (let ([df (deserialize fs)])
      (printf "~S~n" df)
      (printf "~S~n" (df 1 2)))))

> (test-it)
8
((2) 1 ((#"/Users/jay/Dev/svn/plt/collects/web-server/exp/test-serial.ss" . "lifted.6")) 0 () () (0 5))
#(struct:7a410aca70b31e88b4c2f0fe77fa7ffe:0 #)
8

Now, let's see how it is implemented. web-server/lang/serial-lambda is thin wrapper around web-server/lang/closure, which has two syntax transformer functions: define-closure! which defines the closure structure and make-closure which instantiates the closure. (The two tasks are separated to easily provide a user top-level definition syntax for named closures with different free identifires, rather than simply anonymous lambdas with fixed free identifiers.)

make-closure does the following:

  1. Expands the procedure syntax using local-expand, so it can use free-vars to compute the free identifires.
  2. Uses define-closure! to define the structure and get the name for the constructor.
  3. Instantiates the closure with the current values of the free identifiers.

The more interesting work is done by define-closure!. At a high-level, it needs to do the following:

  1. Create a deserialization function.
  2. Create a serialization function that references the deserializer.
  3. Define the closure structure type that references the serializer.
  4. Provide the deserializer from the current module so that arbitrary code can deserialize instances of this closure type.

These tasks are complicated in a few ways:

  1. The deserializer needs the closure structure type definition to create instances and the serializer needs the closure structure type to access their fields.
  2. The serializer needs the syntactic identifier of the deserializer so that scheme/serialize can dynamic-require it during deserialization.
  3. The deserializer must be defined at the top-level, so it may be provided.
  4. All this may occur in a syntactic expression context.

Thankfully, the PLT Scheme macro system is powerful to support all this.

The only complicated piece is allowing the deserializer and serializer to refer to the closure structure constructor and accessors. This is easily accomplished by first defining lifting boxes that will hold these values and initializing them when the structure type is defined. This is safe because all accesses to the boxes are under lambdas that are guaranteed not to be run before the structure type is defined.

An aside on the closure representation. The closure is represented as a structure with one field: the environment. The environment is represented as a thunk that returns n values, one for each of the free identifiers. This ensures that references that were under lambdas in the original syntax, remain under lambdas in the closure construction, so the serializable closures work correctly inside letrec. This thunk is applied by the serializer and the free values are stored in a vector. The closure also uses the prop:procedure structure property to provide an application function that simply invokes the environment thunk and binds its names, then applys the original procedure syntax to the arguments.

An aside on the serializer. The deserializer is bound to lifted identifier which is represented in PLT Scheme as an unreadable symbol. Version 4.2.0.5 added support for (de)serializing these.

Monday, June 1, 2009

PLT Scheme v4.2

PLT Scheme version 4.2 is now available from

  http://plt-scheme.org/
Internally, this version includes a conversion from C++ to Scheme for part of the GUI toolbox --- specifically, 25k lines of code that implement the general text and pasteboard editor. This conversion is a start on a larger reimplementation of the GUI toolbox. Although we believe that this change will help make PLT Scheme better in the long run, we must expect bugs in the short term due to porting errors. Users should therefore be aware of the change, even though the new implementation is meant to behave the same as previous versions.
  • A new statistical profiler is now available; see the "profiler" manual for more information. Currently, the profiler supports only textual output, but future plans include a GUI interface and DrScheme integration.
  • The world teachpack is now deprecated. Its replacement universe has a new interface that uses strings instead of symbols and characters.
  • Web-server: Native continuations in the stateless servlet language support capturing continuations from untransformed contexts; soft state library for stateless servlets.
  • DrScheme's Stepper can now jump to a selected program expression.
  • New in scheme/base: hash-has-key?, hash-ref!, in-sequences, in-cycle. New in scheme: count, make-list (from scheme/list), const (from scheme/function).
  • Some performance improvements, including faster start-up for small programs. The latter is a result of delaying module invocations at higher phases (compile time, meta-compile time, etc.) until compilation is demanded at the next lower phase; this on-demand instantiation is per-phase, as opposed to per-module within a phase.
[Note that mirror sites can take a while to catch up with the new downloads.] Feedback Welcome.

Monday, May 25, 2009

Typed Scheme 2.0

Typed Scheme version 2.0 is now available from SVN.

One persistent limitation of Typed Scheme has been that while this expression works as expected:

(if (number? x) (add1 x) 7)

The simple transformation of making x a part of a structure breaks Typed Scheme's ability to reason about the code. So this expression doesn't typecheck:

(if (number? (car x)) (add1 (car x)) 7)

With the newest version of Typed Scheme, now available in SVN, both of these will now work. In general, Typed Scheme can now follow paths into arbitrary immutable structures, including pairs.

This is part of a more general reworking of underlying mechanisms of the Typed Scheme typechecker, which makes it both simpler and more flexible. I hope that it will be possible, sing this new foundation to add additional features that make more programs easy to express in Typed Scheme.

Of course, these changes mean that Typed Scheme may be more unstable, so if you notice any new bugs, please let us know.

Unfortunately, this won't be available in the upcoming 4.2 release, but it will be in the release after that.

If you have any questions or comments or feature requests for Typed Scheme, please let us know.

Sunday, May 24, 2009

Explicit Renaming Macros; Implicitly

It's been one too many times that I hear respectable Schemers talk about how they like explicit renaming macros — not because they're more powerful, but because using them is close to using simple defmacros. In this post I'll show how you can easily write ER-like macros in PLT, just so I won't need to explain the same thing once again. Disclaimers:

  • If you're not interested in ER-macros, then you shouldn't read this.
  • I'm not claiming that ER macros are not respectable, I'm just surprised at the knee jerk reaction to syntax-case.
  • This is not an attempt at providing some portable library or even a PLT library. The intention is to show that syntax-case-style macros are "as convenient" as ER macros, if you really want to get down to that level.
  • This is also not an attempt at any kind of formal claim of equivalence in any direction, only a demonstration that you can get the same kind of convenience.
  • The bottom line here should be just the convenience point, addressed at people who like ER macros for that, and who think that syntax-case macros are somehow magical or that you lose the ability to play with S-expressions.
The important fact here is that while PLT's syntax-case macro system does not give you raw S-expressions, what you get is a simple wrapper holding them. A macro is a syntax transformer: a function that consumes a syntax value and returns one. For example:
  (define-syntax (foo stx)
    #'123)
is a macro that always expands to 123 (with #'123 being the usual shorthand for (syntax 123)). A syntax object in PLT Scheme (the input to macro functions) is an S-expression, with some lexical information added — this includes the lexical context (in an opaque form), source location, and a few more things. To be more precise, a syntax value is a nested structure of wrappers holding lists and pairs, holding more wrappers, with identifiers at the leaves, where an identifier is a wrapper holding a symbol. It's easy to strip off all wrappers using syntax->datum if you like to work with S-expressions, but you don't want to strip it off of identifiers, since that will lose the important gravy. (In fact, the defmacro library works by stripping off all information, even from identifiers, then reconstructing it by trying to match names in the output form with the original input.) Instead of throwing away all information, what we want to do is preserve identifiers. We can use syntax->list for this: this is a function that takes a syntax value that contains a list, and strips off the top-level extra information leaving you with a list of syntaxes for the sub-expressions (returning #f if the input syntax does not hold a list). Once we have such a list, we can do the usual kind of processing with it, and when we're done turn the result back into a syntax using datum->syntax (which "borrows" the original input expression's information). For example,
  (define-syntax (add1 stx)
    (let ([+ #'+])
      (datum->syntax stx `(,+ 1 ,@(cdr (syntax->list stx))) stx)))
That's a very simple example though; if you try something a little more complicated, you quickly find out that all this unwrapping is inconvenient:
  (define-syntax (mylet stx)
    (let ([*lambda #'lambda])
      (datum->syntax
       stx
       `((,*lambda ,(map (lambda (x) (car (syntax->list x)))
                         (syntax->list (cadr (syntax->list stx))))
                   ,@(cddr (syntax->list stx)))
         ,@(map (lambda (x) (cadr (syntax->list x)))
                (syntax->list (cadr (syntax->list stx)))))
       stx)))
(Note also the *lambda that is used to avoid the lambda expressions used in the meta-code.) What can help here is some helper function that receive a syntax value as its input, and turn all wrapped lists into real lists recursively, but will leave identifiers intact:
  (begin-for-syntax
    (define (strip stx)
      (let ([maybe-list (syntax->list stx)])
        ;; syntax->list returns #f if the syntax is not a list
        (if maybe-list (map strip maybe-list) stx))))
But as long as we're writing a syntax utility, we can make it do a litte more work: feed the resulting tree to your transformer, grab its result, and do the necessary datum->syntax voodoo on it:
  (begin-for-syntax
    (define (er-like-transformer transformer)
      (define (strip stx)
        (let ([maybe-list (syntax->list stx)])
          ;; syntax->list returns #f if the syntax is not a list
          (if maybe-list (map strip maybe-list) stx)))
      (lambda (stx)
        (datum->syntax stx (transformer (strip stx)) stx))))
With this utility defined, the above macro becomes much easier to deal with:
  (define-syntax mylet
    (er-like-transformer
     (lambda (exp)
       (let ((vars  (map car (cadr exp)))
             (inits (map cadr (cadr exp)))
             (body  (cddr exp)))
         `((,(syntax lambda) ,vars ,@body)
           ,@inits)))))
...and this is almost identical to the explicit renaming version of the macro; for example, compare it with the sample code in the MIT-Scheme manual. The only change is that (rename 'lambda) is replaced with (syntax lambda). Obviously, this is very close, but doesn't show intentional captures. So I just grabbed the loop example from the same page, and did the same change — only this time I used #'foo instead of (syntax foo) (and I also changed the one-sided if to a when). The resulting macro works fine:
  (define-syntax loop
    (er-like-transformer
     (lambda (x)
       (let ((body (cdr x)))
         `(,#'call-with-current-continuation
           (,#'lambda (exit)
            (,#'let ,#'f () ,@body (,#'f))))))))
  
  (define-syntax while
    (syntax-rules ()
      ((while test body ...)
       (loop (when (not test) (exit #f))
             body ...))))
  
  (let ((x 10))
    (while (> x 0)
      (printf "~s\n" x)
      (set! x (- x 1))))
This is pretty close to a library, and indeed, as I was writing this text I found a post by Andre van Tonder on the Larceny mailing list that uses a very similar approach and does make a library out of it. This is done by adding two arguments to the expected ER-transformation function — one is a rename function (since the above method uses syntax it is limited to immediate identifiers), and the other is always passed as free-identifier=?. Such a compatibility library is, however, not the purpose of this post. Finally, there is still a minor issue with this — PLT has an implicit #%app which is used wherever there are parentheses that stand for a function application — and in this code they are used unhygienically. This is usually not a noticeable problem, and if it is, you can add explicit #%apps. It might also be possible to find a more proper solution (e.g., use a hash table to keep track of lists that were disassembled by the client transformer), but at this point it might be better to just use the more natural syntax-case anyway.

Monday, May 18, 2009

The Two State Solution: Native and Serializable Continuations in the PLT Web Server

One of the annoyance of the stateless Web application language that comes with the PLT Web Server is that you can't call third-party higher-order library procedures with arguments that try to capture serializable continuations. (I know, you try to do that all the time.) For example:

(build-list
 3
 (lambda (i)
   (call-with-serializable-current-continuation
    (lambda (k) (serialize k)))))

The problem is that the stateless language performs a transformation on your program to extract the continuations into a serializable representation. If you really need to do this, we've developed a compromise called "The Two State Solution": one state on the client and the other on the server. Only the third-party parts of the continuation (in this case, the code inside build-list) are stored on the server; everything else is shipped to the client. You just need to annotate your code slightly to indicate where the transition is:

(serial->native
 (build-list
  3
  (lambda (i)
    (native->serial
     (call-with-serializable-current-continuation
      (lambda (k) (serialize k)))))))

serial->native signals the transition to the third-party and native->serial signals the transition back.

It is still a little annoying to find when you've called these third-party higher-order library procedures with arguments that try to capture serializable continuations, so there's a simple macro that provides a transitioning wrapper for you:

(define-native (build-list/native _ ho) build-list)

expands to:

(define (build-list/native fst snd)
  (serial->native
   (build-list
    fst
    (lambda args
      (native->serial
       (apply snd args))))))

This new feature is documented in the online manual, of course.

Soft State in the PLT Web Server

Many Web applications and network protocols have values in the continuation that are necessary to complete the computation, but that may be recomputed if they are not available. This is "soft state".

For example, a Web application may cache a user's preferences from a database and deliver it to a Web browser as a hidden value; when the value is returned to the application in subsequent steps, it is used to customize the display. However, if the preferences were not available (or were corrupted in some way), the application could retrieve them from the database.

When using the PLT Web Server's native continuations, this roughly corresponds to the use of a weak box: a box that the GC is allowed to erase the contents of. When using the PLT Web Server's serializable continuations it roughly corresponds to a weak box and a weak hash table (that holds its keys weakly) to give the box a serializable value as an identifier.

This programming pattern is a bit difficult to get right. So, a library that implements it is now provided with PLT Scheme: web-server/lang/soft.

Here's a trivial example:

#lang web-server

(provide interface-version start)
(define interface-version 'stateless)

(define softie
  (soft-state
   (printf "Doing a long computation...~n")
   (sleep 1)))

(define (start req)
  (soft-state-ref softie)
  (printf "Done~n")
  (start
   (send/suspend
    (lambda (k-url)
      `(html (body (a ([href ,k-url]) "Done")))))))

Scheme Workshop: deadline NOT extended!

We're holding the line on our submission deadline; it's still June 5, so that gives you about three weeks to write something awesome.

Re-posting the entire CfP on a blog seems a bit tacky, so instead I'll just post the link:

http://blog.plt-scheme.org/2009/01/cfp-scheme-workshop-2009.html

We look forward to your submissions!