This is part 2 of a series that started here.
First, a quick correction: I claimed that Lisp macros didn't have the ability to change the basic syntax of Lisp. I believed that because I hadn't yet learned about "read" macros. Using "read" macros in Lisp you can indeed adjust some pretty basic syntax rules. That can't be done in JaM yet, but it's something that can probably be added later.
So let's look more closely at the first stages of what JaM does.
As stated previously, JaM.include("foo.js") fetches a JavaScript source file as text. It then passes that text to JaM.eval( ... ). The eval function does a lexical analysis of the text (currently using code from JSLint) to generate a JavaScript array of token objects.
A JaM token object is basically just a JSLint token object. For each word read from the JavaScript source, it contains the word itself, the line number and column where it was seen, and other bits of metadata. Here are a couple of examples:
{ value: "function", line: 99, character: 16, reserved: true, identifier: true }
{ value: "'bar'", line: 134, character: 32, type: '(string)' }
I'm not sure if all this will be needed or useful in the long run, but JSLint produces it, and I have no good reason to throw any of it away. So far the "value" has been the only part that has regularly been useful in the macros I've written.
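To make the shape of that token array concrete, here is a rough sketch of a lexer that produces similar objects. It is not the JSLint code that JaM actually uses: the tokenize function and its regular expression are made up for illustration, the 1-based line and character numbering is an assumption, and it ignores comments, regex literals and multi-character operators.

```javascript
// Hypothetical mini-lexer, for illustration only. JaM itself relies on
// JSLint's lexer, which records more metadata (reserved, identifier, type).
function tokenize(source) {
  // identifiers | numbers | single-quoted strings | double-quoted strings | one punctuation char
  var pattern = /[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|'(?:[^'\\\n]|\\.)*'|"(?:[^"\\\n]|\\.)*"|[^\s\w]/g;
  var tokens = [];
  source.split("\n").forEach(function (text, index) {
    var match;
    pattern.lastIndex = 0; // restart the scan for each line
    while ((match = pattern.exec(text)) !== null) {
      tokens.push({
        value: match[0],            // the word exactly as written
        line: index + 1,            // assumed 1-based
        character: match.index + 1  // assumed 1-based column
      });
    }
  });
  return tokens;
}
```

Calling tokenize("alert('hi!');") yields five token objects whose values are alert, (, 'hi!', ) and ;, each carrying its own position.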
Next, eval uses a few simple grouping rules to generate a tree of nested arrays of token objects. For example, the JavaScript expression "alert('hi!');" would be translated into the following tree:
[ { value: "alert", ... }, [ { value: "(", ... }, [ [ { value: "'hi!'", ... } ] ], { value: ")", ... } ], { value: ";", ... } ]
These grouping rules have to be simple and loose because they must correctly parse not only normal JavaScript but also the code which will be the input to all our macros. (This is the stage where Lisp-like "read" macros could take the place of the grouping that is currently hardcoded in JaM.)
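One way such loose grouping could work is sketched below. The rules here are guessed from the example trees in this post rather than taken from JaM's source: a bracket becomes a three-element array of opener, contents and closer; the contents are a list of segments; and a ";" ends the current segment and starts a new one. The names groupTokens, readSegments and CLOSERS are hypothetical.

```javascript
// Hypothetical sketch of the grouping step; not JaM's actual code.
var CLOSERS = { "(": ")", "[": "]", "{": "}" };

function groupTokens(tokens) {
  var pos = 0;

  // Collect segments until the expected closing bracket or the end of input.
  function readSegments(closer) {
    var segments = [[]];
    while (pos < tokens.length && tokens[pos].value !== closer) {
      var token = tokens[pos++];
      var current = segments[segments.length - 1];
      if (Object.prototype.hasOwnProperty.call(CLOSERS, token.value)) {
        var inner = readSegments(CLOSERS[token.value]);
        var close = tokens[pos++]; // undefined if the bracket is never closed
        current.push([token, inner, close]);
      } else {
        current.push(token);
        if (token.value === ";") {
          segments.push([]); // a semicolon starts a fresh segment
        }
      }
    }
    return segments;
  }

  return readSegments(null); // top level: no closer to wait for
}

// Usage: build bare token objects and group them.
var tokens = ["alert", "(", "'hi!'", ")", ";"].map(function (v) {
  return { value: v };
});
var tree = groupTokens(tokens);
```

Under these assumed rules, tree[0] has the same shape as the tree shown above for "alert('hi!');", and the function never needs to know what any identifier means, which is the property the macros depend on.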
For example, if we plan to define an unless macro, JaM at this stage does not know what the word unless means, but it needs to generate an appropriate tree anyway. So for this expression:
unless( false ) { alert( 'hi' ); }
JaM generates the tree below. For brevity I'll show just the value of each token object:
[ "unless", ["(",[["false"]],")"], ["{", [ [ "alert", ["(",[["'hi'"]],")"], ";" ], [] ], "}"] ]
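The value-only view used above is easy to produce from a real tree. The helper below is a small sketch (valuesOnly is a made-up name, not part of JaM) that walks the nested arrays and keeps just the value of each token object.

```javascript
// Hypothetical helper: reduce a tree of token objects to a tree of values.
function valuesOnly(node) {
  if (Array.isArray(node)) {
    // Recurse into nested groups, preserving the nesting.
    return node.map(function (child) {
      return valuesOnly(child);
    });
  }
  return node.value; // a leaf is a single token object
}
```

Something like this is mostly useful for debugging, since the full token objects carry line and column information that would otherwise clutter the output.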
This nested tree of token objects is the data structure that JaM macros operate on. What exactly a macro might do with this structure will be covered in the next installment.