Tuesday, 17 January 2012

Packrat Parser combinators for haXe.

Hi!

Last time I presented you Monax (http://lambdabrella.blogspot.com/2012/01/monax-crossplatform-monads-via-haxe.html), a compiler macro that brings monadic syntactic sugar (ala haskell) to haXe and enables defining your own monads (and hooks to optimize their compositions).

I said I had several other projects waiting to be released.

So I'm proud to release now Parsex, a parser combinator which started as a side project effort to ease working with external DSLs and to which I've decided to bring some additional love by making it a packrat parser (this is largely motivated by the awesome packrat parsers which are part of the awesome Scala language libraries).

You may found some information about packrat parsers here:
http://bford.info/packrat/

The important aspect of it is that a packrat parsers memoize their results in a context (based on parsing position), lowering backtracking cost *a lot*.

Thanks to this memoization, there's no more real need to separate the lexer and the parser (it's usually the case only for performance reasons).

We're also able to detect left recursive grammars and handle them properly.

This is at the time of writting, the first general purpose Parser API in haXeland.


Here's an exmaple from the test to parse a Json array:


Combinatorial style:

static var jsonArrayP =   leftBracketP._and(jsonValueP.repsep(commaP).and_(rightBracketP)).then(JsArray);

Monadic style (using monax and which perform slighty faster):

static var jsonArrayP =
  ParserM.dO({
    leftBracketP; // Parses '['
    jsons <= jsonValueP.repsep(commaP); // Parses json values separated by some commas
    rightBracketP; // Parses ']'
    ret(JsArray(jsons)); // returns the result wrapped in a JSArray enum (case classes in Scala)
  });

I'll not enter into the details as it speaks for itself.

Speed wise, it's not as fast as hand written parser; the additional cost may be aleviated with future versions of haXe with better expression generation (JS) and more inlining control.

But I think it's the fastest path for developpement speed.

Available on my github: https://github.com/sledorze/Parsex

and released on haxelib (the haXe libraries repository): http://lib.haxe.org/p/Parsex

Enjoy!

2 comments:

  1. Sent it over to a haskell friend :P

    I did actually develop last year a lexer/parser generator for haXe/C++ for LALR(1) parsers. Granted that's not as nice as a parsec parser :P

    example: https://github.com/deltaluca/flappy/blob/master/UserClient/daide/lexer.hlx and https://github.com/deltaluca/flappy/blob/master/UserClient/daide/parser.hlr

    ReplyDelete
  2. Oh! didn't know about this one..
    So we're more than a bunch of Haskellers out there! ;)

    ReplyDelete