Tuesday 17 January 2012

Packrat Parser combinators for haXe.

Hi!

Last time I presented Monax (http://lambdabrella.blogspot.com/2012/01/monax-crossplatform-monads-via-haxe.html), a compiler macro that brings monadic syntactic sugar (à la Haskell) to haXe and lets you define your own monads (with hooks to optimize their compositions).

I said I had several other projects waiting to be released.

So I'm proud to now release Parsex, a parser combinator library that started as a side project to ease working with external DSLs, and to which I've decided to bring some additional love by making it a packrat parser (largely motivated by the awesome packrat parsers that ship with the Scala libraries).

You can find more information about packrat parsers here:
http://bford.info/packrat/

The important aspect is that a packrat parser memoizes its results in a context (keyed by parsing position), which lowers the cost of backtracking *a lot*.

Thanks to this memoization, there's no longer a real need to separate the lexer from the parser (that separation usually exists only for performance reasons).

We're also able to detect left-recursive grammars and handle them properly.

This is, at the time of writing, the first general-purpose parser API in haXe land.


Here's an example from the tests that parses a JSON array:


Combinatorial style:

static var jsonArrayP =   leftBracketP._and(jsonValueP.repsep(commaP).and_(rightBracketP)).then(JsArray);

Monadic style (using Monax; it performs slightly faster):

static var jsonArrayP =
  ParserM.dO({
    leftBracketP; // Parses '['
    jsons <= jsonValueP.repsep(commaP); // Parses json values separated by some commas
    rightBracketP; // Parses ']'
    ret(JsArray(jsons)); // returns the result wrapped in the JsArray enum constructor (similar to case classes in Scala)
  });

I won't go into the details, as the code speaks for itself.

Speed-wise, it's not as fast as a hand-written parser; the extra cost may be alleviated by future versions of haXe with better expression generation (for JS) and more inlining control.

But I think it's the fastest path in terms of development speed.

Available on my github: https://github.com/sledorze/Parsex

and released on haxelib (the haXe libraries repository): http://lib.haxe.org/p/Parsex

Enjoy!

Tuesday 3 January 2012

Monax - crossplatform Monads via haXe.

Monax


Motivation


If, like me, you code on NodeJs, you may find it easy to end up with spaghetti code. Even when properly written, such code can smell.

Another problem is how to make sure you did not forget a use case while writing your code.
It would be nice to have rock-solid, safe code composition that requires less test code.

Every seasoned statically typed functional programmer will tell you to build an abstraction for that, and I'm pretty sure they mean monads!
I really invite you to read more about them.

Now, that's a good idea, but not all languages provide the very useful syntactic sugar that makes monads clean to read and write.

I've missed monads in haXe too much, so I decided to unleash macro capabilities to get them working - it is now here, for your pleasure:

Monax on gitHub!


Introducing the Monax macro library


This library, built on top of haXe macros, gives you the ability to define specific monad instances and automatically get access to a Haskell-like sugared syntax.
About monads: Monads on Wikipedia

The syntactic sugar relies on the existence of 'ret' and 'flatMap' in the monad instance.
It also requires the existence of 'map' to enable automatic optimisations.


Learn the (not so) hard way - Defining a Monad instance.


class OptionM {

  @:macro public static function dO(body : Expr) return // the function that triggers the Monad macro.
    Monad.dO("OptionM", body, Context)

  inline public static function ret<T>(x : T) return // creates an element
    Some(x)

  inline public static function map<T, U>(x : Option<T>, f : T -> U) : Option<U> {
    switch (x) {
      case Some(x) : return Some(f(x));
      default : return None;
    }
  }

  inline public static function flatMap<T, U>(x : Option<T>, f : T -> Option<U>) : Option<U> {
    switch (x) {
      case Some(x) : return f(x);
      default : return None;
    }
  }
}

Using a Monad instance (Option here)


OptionM.dO({
          value <= ret(55);
          value1 <= ret(value * 2);
          ret(value1 + value);
        });


Thanks to optimisations, this code reduces to Some(165); those optimisations are applied by default.
You can define your own optimisations (by feeding the Monad.dO method its last parameter):

@:macro public static function dO(body : Expr) return
    Monad.dO("OptionM", body, Context, myCustomTransformation)

(Please refer to the default transformation's code to see how to write an optimisation - macro knowledge required.)

or using no optimisation at all:

@:macro public static function dO(body : Expr) return
    Monad.dO("OptionM", body, Context, Monad.noOpt)


Next..


Now that I have this up and running, I plan to release a packrat parser combinator that is waiting on my github - accompanied by its own monad instance.

Later, I'll also release some abstractions for NodeJs.

Saturday 22 October 2011

Compile time generation of Lenses in haxe.

After my previous work around externs generation, I started to think about something that worried me a lot while programming in haXe (it also applies to other languages).


Let me put it simply with this snippet:

typedef Toto = {
    var name : String
  }

  var totoNames = arrayOfTotos.map(function (x) return x.name);

As a programmer I am lazy like hell (I often think it's a quality).
I would like not to have to write that function; I want it to be generated for me automatically, so that I may write:

var totoNames = arrayOfTotos.map(Toto_.name_.get);

Also it makes sense to provide a setter the same way we provided a getter.

Something like:

arrayOfTotos.map(callback(Toto_.name_.set, "georges"));

(Note that callback is a way to perform partial application in haXe; I personally think a special, more general syntax would bring more awesomeness, but at least we have it. :) )


What we've just defined is in fact a Lense.

Here's the actual Lense signature in haXe :

typedef Lense < A, B > = {
    get : A -> B,
    set : B -> A -> A
  }

One interesting thing with Lenses is that we can combine them with some combinators, so let's define some extensions:

class LenseExtension {

    inline public static function mod< A, B >(l1 : Lense < A, B > , f : B -> B, a : A) : A  return
      l1.set(f(l1.get(a)), a)

    public static function andThen < A, B, C > (l1 : Lense < A, B > , l2 : Lense < B, C > ) : Lense < A, C > return {
      get : function (a : A) return l2.get(l1.get(a)),
      set : function (c : C, a : A) : A return
              mod(l1, function (b) return l2.set(c, b), a)
    }

    inline public static function compose < A, B, C > ( l2 : Lense < B, C >, l1 : Lense < A, B > ) : Lense < A, C > return andThen(l1, l2) 
  }

With this we can then compose and build some interesting abstractions:

given this:

var userA : User = {
    age : 6,
    name : "georges",
    say : function (x) return "Beh..." + x
  };

  var userGroup :  UserGroup = {
    users : [userA, userA],
    lead : userA,  
  };

We can, for instance, do this:

var ageOfUserGroupLead = UserGroup_.lead_.andThen(User_.age_);

and use it like that:

var newGroup = ageOfUserGroupLead.set(5, userGroup);

The interesting bit is that newGroup is structurally equivalent to userGroup except that its lead's age is modified; in fact, all the data is shared except the modified parts. It's an instance of an immutable structure 'update'.

This proves particularly helpful when coding against types that have a lot of fields.

An additional, zero-cost extension can bring some syntactic sugar:

class LenseSyntactic {

    inline public static function set< A, B >(a : A, l : Lense< A, B >, v : B) : A return
      l.set(v, a)

    inline public static function get< A, B >(a : A, l : Lense< A, B >) : B return
      l.get(a)
  }

so that you can now write this:

userGroup.set(ageOfUserGroupLead, 5);

It is quite clean as an instance of declarative programming.

Here is how the Lenses are built:

import com.mindrocks.macros.Lense;
  using com.mindrocks.macros.Lense;

  class User_ implements LensesFor< User > {}
  class UserGroup_ implements LensesFor< UserGroup > { }

It can't be simpler.


Note that you also have the ability to provide a modification through a function applied to a lense (syntactic sugar version):

function complexUserModification(user) {
      var newUser : User = Reflect.copy(user);
      // do your complex stuff
      return newUser;
    }
    
    userGroup.mod(UserGroup_.lead_, complexUserModification);


Performance-wise, it is comparable to a hand-written, factorised version; we could gain a lot of speed if/when an inlining limitation (namely the impossibility to inline functions containing closures) is relaxed and another optimization removing temporary unused objects ('escape analysis'?) is done - if possible.

Anyway, for those who want to favor coding speed and then optimize on demand, this is a valuable tool.

The code is accessible on my misnamed github repo:
https://github.com/sledorze/haxeExtensionBuilder

Feel free to comment!


P.S.: Specific (think containers) Lenses could/would/will? be generated by hand in a near? future.

Thursday 20 October 2011

Compile time extension generation in haXe.

If you don't know haxe, it's a very flexible, statically typed cross platform language.

I'm currently using it for almost all my jobs.

Recently, I've started using it with nodeJs and needed to create some externs (signatures to use host-language objects) for various existing APIs (mongodb, for instance); you may find those here:
https://github.com/sledorze/nodejs_externs

Doing so, I've used an interesting haXe capability: defining several variations of a function using some special syntax (for extern classes and types) while keeping the typing right.


for instance:
@:native("Db")
  extern
  class Db {


    @:overload(function () : Cursor {})
    @:overload(function (callBack : Error -> Cursor -> Void) : Void {})
    public function collectionsInfo(collection_name : String, callBack : Error -> Cursor -> Void) : Void;

    ...
  }

That helps a lot; however, there are currently (haXe evolves quickly) some drawbacks associated with this usage, namely:

- Completion with these alternative signatures does not work yet (haXe can provide completion information thanks to its lightning-fast compiler, which is able to return type information for a file position in real time, as you type).
- You can't use some advanced haXe niceties (like extension methods), as this overloading mechanism is implemented as a fallback mode in the compiler, so it can't combine this extra information with other features.

Another issue I had, specifically with the JavaScript target, is the difficulty of capturing common JavaScript patterns found in libraries, JavaScript being so flexible.
One of those is merging objects' functionality.

A common case is JQuery extensions.
HaXe provides some definitions of JQuery, which is nice: you get completion for the API and it catches your errors ahead of a crash.

But at the same time, you cannot use your beloved JQuery plugins, as you cannot extend this definition.

In fact, after some discussion on the mailing list, Juraj came up with an interesting idea.

Use a compile-time macro to create an extern class defining extension methods, in order to provide inline proxies to the plugin methods. (I promise I've tried to shrink this sentence.)

I decided to implement it as my first haXe macro.
It took less than two hours, and I knew nothing about the macro API beforehand; thanks to completion and the lightning-fast compiler, it was a joy to explore, very rewarding.

Let's see some code..

Here's how one would define a small extension (here three methods):

  import JQueryExtension;
  import js.JQuery;

  class MyJQueryExtension implements Extends< JQuery > {
    @:native("val") public function valGet() : Dynamic;
    @:native("val") public function valSet(x : Dynamic) : JQuery;
    @:native("val") public function valFun(f : Int -> Dynamic -> Dynamic) : JQuery;
  }

One needs to implement the extension interface and prefix each function with an @:native(...) annotation.

And here's a usage:
  import js.JQuery;
  using MyJQueryExtension;

  class Main {
    static function main() {
      var jq = new JQuery("#someId");

      jq.valGet(); // generates jq.val();
      jq.valSet("content"); // generates jq.val("content");
      jq.valFun(function (i, v) return v); // jq.val(function (i, v) { return v;});
    }
  }
You can see that the programmer can now code against the newly defined API and will have code completion working.

And still have this generated:

Main.main = function() {
    var jq = new js.JQuery("#someId");
    jq.val ();
    jq.val ("content");
    jq.val (function(i,v) {
      return v;
    });
  }

I hope you'll enjoy it; you may find this example on my github repository:
https://github.com/sledorze/haxeExtensionBuilder

As always, feel free to ask.. and no trolling please!

Wednesday 3 August 2011

New HaXe book: a secret weapon under the spotlight.

I'm a veteran game developer; I've done it for decades.

Then, when I had to learn web dev, it was disturbing; I had to learn completely new stacks to come up to speed (in fact, even 'stack' was a new word for me).
I was kind of lost; there was so much new stuff: HTML, CSS, JavaScript, PHP, ActionScript, and I obviously had to give up most of the static typing and functional programming I had just started to love (think Scala, Haskell, etc.).

So at that time I was dreaming of it: a blessed, wonderful language which could compile to all targets, in which I could use functional programming and static typing, the whole thing backed by a very responsive IDE with backed-by-compiler completion, etc. (i.e. you know what I'm not thinking about).
You know, the silver bullet..

Then I recalled that MotionTwin was using... the haXe language, thanks to Nicolas Cannasse, its creator.

At that time I was not so sure about how usable it would be, so I gave it a shot.
Damn, it was the second best decision I've ever made (the first being watching the SICP lectures, which revived my declining interest in programming).

Why the hell aren't more people using haXe? It's AWESOME.
haXe does not have the most advanced type system ever (don't underestimate it, though; it's well above average) nor the most beautiful syntax or anything like that.
haXe has evolved and is evolving through very pragmatic choices made by some very smart people using it for real-life jobs over many years, taking care of specific target implementations for great good.
That's why haXe is also called "the ultimate web language" (and it is surely cross-platform, with its Cpp and upcoming Java and C# backends).

I think haXe is not really well known because, first, it's a wonderful secret weapon that sounds too good to be true (most people would think it's just buzzwords for another crappy solution) and, second, there's not enough commercial push behind it.
If you don't know it yet, help yourself: give it a go!

When I started, it took me some time to discover the best practices; I would have loved to have an example-backed introduction book.
However, it's difficult to cover haXe and all its features in a single book; it would take a tree per book, it would be discouraging, and that's not required.
No, what you're really looking for when starting out is seeing some representative jobs done, so that you can make a decision about it.

Happily, Benjamin Dasnois has done it! Here it is:


haXe 2 Beginner's Guide


'haXe 2 Beginner's Guide' really succeeds at giving a very easy, progressive introduction to haXe.
It covers the most important topics with a lot of good examples backed by structured explanations.
You'll never be lost and, in addition, Benjamin is kind enough to introduce the very neat tricks every seasoned haXe developer must know.

It tries (and succeeds) to bring beginners up to speed on most targets and shows you how to use haXe and the tools of its ecosystem to deliver a real product.
You'll end up knowing how to get most jobs done very effectively, and you'll then start to look at things from a new perspective; it's a game changer!

I would have loved for it to have been released some years back, as it is really a book that takes you by the hand and shows you the way. Thanks, Benjamin!

Feel free to comment on your experience with haXe as it deserves it!

Oh!.. and for those missing a fully functional API, there's the wonderful Stax functional API by
@jdegoes

Monday 25 July 2011

A Safe & Asynchronous Promise implementation explained - Part 2

Hi!

This is the second post of a two-part series about the Promise implementation (first part here).

This second post covers the implementation details of the MutablePromise, the LazyPromise and the AsyncPromise.

MutablePromise

This is the base implementation of Promises; it provides their registration and dispatch mechanisms.

If you recall our Promise definition, it has two methods to implement:

abstract class Promise[+T] {
  def foreachEither(f: Either[Throwable, T] => Unit): Unit
  def nowEither(): Option[Either[Throwable, T]]
}


'nowEither' returns the current promise state.
'foreachEither' provides a way to register a closure to be called immediately if the value is already fulfilled, or later, when it becomes fulfilled.

In order to achieve good performance, we design a non-blocking system for registration and value access.
We'll use a ConcurrentLinkedQueue to store the callbacks to execute when the value is fulfilled, and a volatile variable to store the actual value.

private val _callBacks= new ConcurrentLinkedQueue[Either[Throwable, T] => Unit]
  @volatile
  protected var _value: Option[Either[Throwable, T]] = None

We also need two methods to set a successful value or a failure on the MutablePromise:

final def set(x: => T) = setPromiseEither(Right(x))
  final def fail(ex: => Throwable) = setPromiseEither(Left(ex))

They wrap their respective parameter in an Either and forward it to the setPromiseEither method we've already seen in the previous post.
(Note that both methods' parameters are thunks (call by name), and that setPromiseEither's parameter is also a thunk, so any failure while evaluating them is caught by the setPromiseEither method.)

Here is its implementation:

final def setPromiseEither(ev: => Either[Throwable, _ <: T]) = {
    try {  
      _value = Some(ev) // evaluation could throw
    } catch {
      case e: Throwable => _value = Some(Left(e))
    }
    synchronized {
      notifyAll()
    }
    propagate()
    this
  }

'ev' is a call-by-name parameter so that we can catch any possible exception when evaluating it.
If it throws, we set the Promise as failed with that exception; otherwise, the promise is set to the value passed as parameter.

After setting the value, we notify any possibly waiting threads so they can make progress.
Note: this is only required in order to support the unsafe methods (waitUntil, etc.); in a purely non-blocking usage (the desired one), we could completely get rid of this (and of nowEither in the Promise abstract class).

After having woken up those threads, we propagate the value to all registered callbacks.

private def propagate(): Unit =
    nowEither() match {
      case Some(finalValue) =>
        @tailrec
        def internPropagate: Unit = {
          var finished = false;
          try {
            var elem = _callBacks.poll()
            while (elem != null) {
              elem.apply(finalValue)
              elem = _callBacks.poll()
            }
            finished = true;
          } catch {
            case e => e.printStackTrace
          }
          if (!finished)
            internPropagate // continue to consume the poll
        }
        internPropagate
      case _ =>
    }


propagate takes the current value of the promise.
If the value is not set, it does nothing.
If the value is set, it consumes the ConcurrentLinkedQueue using its atomic poll method.
Note that during the execution of this method, some other thread may register new callbacks; as we use a ConcurrentLinkedQueue, no thread will block.
Each closure we poll out of the queue is executed in a try/catch, so that any failure does not prevent the remaining closures from executing.
This try/catch should never catch anything, as all our combinators take care of that, but it is useful when one uses the lower-level API to develop new ones.

The propagate method tries to consume all the callbacks from the queue; however, it is not guaranteed that the queue is empty when exiting the method, as some other thread could have refilled it.
The only contract is that all callbacks present in the queue at the time the method is called will be executed.
Consuming all callbacks, even those added while draining the queue, is not really fair regarding global progress, but it is faster; this would need to be parametrized.

Now, let's see how newly registered callbacks are executed (if the value is fulfilled):

def foreachEither(f: Either[Throwable, T] => Unit): Unit = {
    _callBacks offer f
    propagate()
  }

We add the callback to the queue and call propagate.
From our contract, the propagate method consumes all registered callbacks if and only if a value is set, so we're sure the callback will either be executed during this call or, if the value is still undefined, by the call to propagate issued from setPromiseEither when the value is eventually set (even on another thread).
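
A tiny sketch of both registration paths (just an illustration, reusing the names above):

val p = new MutablePromise[Int]
p.foreachEither(v => println("registered early: " + v)) // queued; runs when the value is set
p.set(41 + 1)                                           // fulfills and propagates: prints "registered early: Right(42)"
p.foreachEither(v => println("registered late: " + v))  // value already set: runs right away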

Ok, so now we have a working MutablePromise:

class MutablePromise[T] extends Promise[T] {
  private val _callBacks = new ConcurrentLinkedQueue[Either[Throwable, T] => Unit]
  
  @volatile // may be accessed from different threads.
  protected var _value: Option[Either[Throwable, T]] = None
  
  final def set(x: => T) = setPromiseEither(Right(x))
  final def fail(ex: => Throwable) = setPromiseEither(Left(ex))

  def nowEither(): Option[Either[Throwable, T]] = _value
  def foreachEither(f: Either[Throwable, T] => Unit): Unit = {
    _callBacks offer f
    propagate()
  }

  final def setPromiseEither(ev: => Either[Throwable, _ <: T]) = {
    try {  
      _value = Some(ev) // evaluation could throw
    } catch {
      case e: Throwable => _value = Some(Left(e))
    }
    synchronized {
      notifyAll()
    }
    propagate()
    this
  }

  private def propagate(): Unit =
    nowEither() match {
      case Some(finalValue) =>
        @tailrec
        def internPropagate: Unit = {
          var finished = false;
          try {
            var elem = _callBacks.poll()
            while (elem != null) {
              elem.apply(finalValue)
              elem = _callBacks.poll()
            }
            finished = true;
          } catch {
            case e => e.printStackTrace
          }
          if (!finished)
            internPropagate // continue to consume the poll
        }
        internPropagate
      case _ =>
    } 
}


Let's see how to implement the AsyncPromise:

final class AsyncPromise[T] extends MutablePromise[T] {
  override def foreachEither(f: Either[Throwable, T] => Unit): Unit =
    super.foreachEither {x => PromiseExec newSpawn f(x) }
}

We extend MutablePromise, override foreachEither, and forward the execution of its parameter function to a thread pool using newSpawn, which registers a task that executes 'f(x)'.
'f(x)' is passed as a thunk and is actually executed by the task, not during this registration.
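
For reference, here is a purely hypothetical sketch of what such a spawn helper could look like; the actual PromiseExec shipped with the library may well differ:

import java.util.concurrent.Executors

// Hypothetical sketch only: a spawn helper in the spirit of PromiseExec.newSpawn,
// running a by-name task on a shared thread pool.
object PromiseExecSketch {
  private val pool = Executors.newCachedThreadPool()

  def newSpawn(task: => Unit): Unit =
    pool.execute(new Runnable { def run() { task } })
}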

Here's its usage in the PromiseIsAsynchronous extension:

final class PromiseIsAsynchronous[+T](outer: Promise[T]) {

  def async: Promise[T] = {
    val prom = Promise.async[T]() // returns an AsyncPromise that is not yet bound
    outer.foreachEither { prom setPromiseEither _ }
    prom
  }

  ...

}


Now it's time to review how we've achieved the LazyPromise.
I decided that a LazyPromise should execute when its value is directly or indirectly required (which makes sense).

A LazyPromise takes its pending execution as a parameter.

Here's its code:

final class LazyPromise[T](var lazyAction: () => Unit) extends MutablePromise[T] {

  final def doLazyActionIfRequiered() = {
    if (lazyAction != null) synchronized { // narrowing test
      if (lazyAction != null) {
        val toEvaluate = lazyAction
        lazyAction = null // we free the lazy thunk memory
        try {
          toEvaluate()
        } catch {
          case e => fail(e) // could be _value = Some(Left(e))
        }
      }
    }
  }

  override def nowEither(): Option[Either[Throwable, T]] = {
    doLazyActionIfRequiered() // the reason we've overridden nowEither: triggering the lazy computation.
    super.nowEither()
  }
}


So the LazyPromise extends MutablePromise, overriding its nowEither method.
The idea is to intercept all accesses to the value and, if required, execute the pending lazyAction.

The evaluation is rather simple; note that we do a narrowing test (check, then check again inside the lock) to avoid unnecessary synchronization, and that we set lazyAction to null, which both indicates that the lazyAction has been executed and frees the thunk's memory (for the GC).
lazyAction does not need to be volatile, as it is accessed and modified only inside a synchronized block.

Note that we're not forced to call fail(e) in case of failure, because performing the full propagation is unnecessary; indeed, in the case of LazyPromise, doLazyActionIfRequiered is executed during the first call to nowEither, which happens when the first callback registration calls propagate.

The case of success, which could happen asynchronously, is handled in the usual way as you can see in the lazily method implementation in the PromiseIsLazy extension:

final class PromiseIsLazy[+T](outer: => Promise[T]) {
  def lazily : Promise[T] = {
    lazy val prom: LazyPromise[T] = // lazy to solve initialisation order problem
      new LazyPromise[T](() => outer foreachEither (prom setPromiseEither _))
    prom
  }
}

The outer promise is passed as a thunk in order to be evaluated on demand.
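
As a toy illustration of the on-demand behaviour (just a sketch reusing the classes above; 'set' comes from MutablePromise):

// Sketch: the lazy action only runs on the first (direct or indirect) access to the value.
lazy val lp: LazyPromise[Int] =
  new LazyPromise[Int](() => lp.set({ println("computing..."); 42 }))

println(lp.nowEither()) // triggers the action: prints "computing..." then Some(Right(42))
println(lp.nowEither()) // the action does not run again: prints Some(Right(42))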

Here it is; we've covered the second and last post on this Promise implementation.
It is far from perfect and would need some evolutions (like parametrizing the executors), but I hope it was interesting to some of you.

The repo can be found on github here.

As usual comments are welcome :)

Friday 22 July 2011

A Safe & Asynchronous Promise implementation explained - Part 1

Building safe, asynchronous, composable computations is notoriously difficult.
But a program that always terminates, handles failure scenarios, and is easy to write, read and reason about should be the bread and butter of most programming activity.
Today I will show you a Promise library I wrote for this purpose in my free time, three years ago (in the old Scala 2.7 days).
While it may be better, these days, to use the Akka or Scalaz abstractions to stay in the momentum, I still think it is valuable to provide a nice explanation of its underlying architecture, as it will help most of you figure out how to build great functional abstractions with close-to-the-metal performance.

This first post will introduce the Promise base class and some abstractions built on top of it.
The second post will show you their lower-level implementation.
(This is an old library, so do not expect any 2.8+ features (like continuations); the expected reader is experienced in Scala, but not necessarily at an advanced level.)

Safe, asynchronous and composable computations?

Let's explain the intent a bit further:
  • Composable: the code is built using combinators on top of the Promises and/or other combinators (functional programming style).

  • Asynchronous: when doing a URL request, you don't want to waste a thread waiting for the answer; basically, this means you use composition to define the rest of the computation ahead of its execution.

  • Safe: no implicit exception throwing in your program; exceptions are caught and you have to deal with them, statically enforced (not the Java way).
(The Promise abstraction presented in this post integrates several concepts in one implementation; this was done for performance reasons but a cleaner separation could be achieved).


Gimme some food, please

Ok, enough talk; I'll show you some code and explain as I go.
abstract class Promise[+T] {
  def nowEither(): Option[Either[Throwable, T]]
  def foreachEither(f: Either[Throwable, T] => Unit): Unit
}

So here's the Promise abstract class; it has two methods:
  • nowEither: it returns a snapshot of the Promise value, revealing its signature Option[Either[Throwable, T]]; the Option part indicates whether the computation has terminated, and the Either part encodes either the parametrized value Right(v) or the error Left(err). In my opinion, it is better practice to keep things separated with meaning rather than creating three new case classes to define the Promise state; it saves you from rewriting all the combinators Option and Either already have.

  • foreachEither: you pass it a function taking an Either[Throwable, T], and it executes that function as a side effect with the promise's value once fulfilled.
Promise is covariant in its type parameter T ([+T]), meaning it plays well with subtyping (i.e. a Promise[Ferrari] can be used as a Promise[Car] if Ferrari is a subtype of Car).
We could have gotten rid of nowEither, as it is not strictly necessary, but then we would not be able to implement some pragmatic debug tools without introducing overhead.
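
For instance, here is a minimal sketch of how the three possible states can be inspected through nowEither (assuming someProm is a Promise[Int] already in scope):

someProm.nowEither() match {
  case None             => println("still pending")        // computation not terminated yet
  case Some(Right(v))   => println("succeeded with " + v)  // successful result
  case Some(Left(err))  => println("failed with " + err)   // failure, carried as a value
}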

With this Promise definition, we will build some implementations (Mutable, Asynchronous and Lazy, in the next post) and a lot of nice extensions/combinators to unleash their power (this post).

Here are some implicits built in order to bring the Promise functionality into scope:
object PromiseImplicits {
  implicit def toMonadPlus[T](p: Promise[T]) = new PromiseIsMonadPlus(p)
  implicit def toAsync[T](p: Promise[T]) = new PromiseIsAsynchronous(p)
  implicit def toAbstractFailure[T](p: Promise[Either[Throwable, T]]) = new PromiseCanAbstractFailure(p)
  implicit def toExposeFailure[T](p: Promise[T]) = new PromiseCanExposeFailure(p)
  implicit def toEitherMonadPlus[T](p: Promise[T]) = new PromiseIsEitherMonadPlus(p)
  implicit def toCombinators[T](p: Promise[T]) = new PromiseCombinators(p)
}

object PromiseUnsafeImplicits {
  implicit def toUnsafe[T](p: Promise[T]) = new PromiseCanBeUnsafe(p)
}

We define two objects, one of them called PromiseUnsafeImplicits because it provides some functions that compromise the safety of the promises but can be helpful while developing.
(If you're not used to implicits, it's a must-know pattern and I invite you to read more about it: 'Poor man's type classes' by Odersky.)
Let's start with it, so we can forget about it sooner:
final class PromiseCanBeUnsafe[+T](outer: Promise[T]) {

  final def getOrThrow(): T =
    outer.nowEither() match {
      case Some(Right(v)) => v
      case Some(Left(err)) => throw err
      case _ => throw Promise.undefined
    }  
  
  final def waitUntil(delay: Long) = {
    outer.nowEither(); // to trigger lazyPromises before synchronizing
    synchronized {
      if (outer.nowEither().isEmpty) {
        wait(delay) // could throw an InterruptedException, we're in unsafe land..
      }
    }
    outer
  }
  
  final def waitBefore(delay: Long): Promise[T] = 
    waitUntil(delay).now() match {
      case Some(_) => outer
      case None => Promise.timeoutPromise // not ready yet
    }

  final def waitFulfilled(): Promise[T] =
    waitUntil(0)

  final def waitAndGetUntil(timeOut: Long): Option[T] = {
    waitUntil(timeOut).now()
  }

  final def waitAndGet() = waitFulfilled().now()
}


It takes an outer Promise as a parameter, which its methods use to provide new functionality; a common pattern with implicit extensions.
  • getOrThrow simply asks for the current value of the computation (nowEither) and pattern matches on it.
    If it's not fulfilled (the None case, handled by '_'), it throws an 'undefined' exception defined by the Promise object (which implements promise constructors and constants).
    If it is fulfilled, it returns the value if the Either value is Right (synonym of success), or throws the exception stored in Left (the exception carries the reason why it failed, wherever it came from).
    So, yes, it throws an exception, and that's why it's unsafe (and a smell of bad coding style).
    (In the same way, all methods directly or indirectly using waitUntil are unsafe, as they block the current thread.)

  • You can see a 'now()' method call in 'waitAndGetUntil'; 'now()' is provided implicitly by 'PromiseIsAsynchronous' and gives an instant view of the promise's current value (using nowEither under the hood). It is kept in 'PromiseIsAsynchronous' because it is not unsafe, and I would like a programmer to feel comfortable knowing they're not using unsafe features when removing the PromiseUnsafeImplicits while keeping this functionality.

  • 'waitFulfilled' and 'waitAndGet' block for an unbounded amount of time; they're otherwise equivalent to 'waitUntil' and 'waitAndGetUntil'.

  • 'waitBefore' returns the Promise if it is fulfilled within a certain amount of time, or a timeout Promise defined in the Promise object otherwise; it is also blocking (an equivalent, non-blocking solution has been implemented but is not part of this post, as it relies on another component).

Example:

def unsafeExample(prom : Promise[String]) {    
    val boundedProm = prom.waitBefore(56) // returns the string if provided in the next 56 ms or error
    val myString = boundedProm.getOrThrow()
    println("myString", myString)
  }
Again, all those usages should be banned from your code as much as possible; they're just here in case you want to do something unsafe during development.
This introduction was rough and not very interesting, but I had to tell you about these methods before you start wondering about them.
Now let's go back to a better world. :)

Promise is a Monad.

A useful concept from functional programming that Scala enables is the monad.
Some well-known monads: List, Option, Either, etc. They're all monadic, which means you can combine them in nice, consistent ways (quickly said) and also use Scala's for-comprehension syntactic sugar.
There are plenty of papers on the subject, so I invite you to read more about them.
Before introducing the 'Promise monad' (kind of), I must say this version is not usable (in this form) with for comprehensions, because it does not follow the right naming convention; in fact it has the functionality, but it is meant to be exposed later using a particular method, so that the API feels more natural (you'll understand why later).

final class PromiseIsEitherMonadPlus[+T](outer: Promise[T]) {

  final def mapEither[U](f: Either[Throwable, T] => Either[Throwable, U]): Promise[U] = {
    val prom = Promise[U]()
    outer.foreachEither { x => prom setPromiseEither f(x) }
    prom
  }

  final def flatMapEither[U](f: Either[Throwable, T] => Promise[U]): Promise[U] = {
    val prom = Promise[U]()
    outer.foreachEither { eith =>
      try {
        f(eith) foreachEither { prom setPromiseEither _ }
      } catch {
        case e => prom fail e
      }
    }
    prom
  }

  final def filterEither(pred: Either[Throwable, T] => Boolean): Promise[T] = {
    val prom = Promise[T]()
    outer.foreachEither { x => prom.setPromiseEither(if (pred(x)) x else Promise.leftUndefined) }
    prom
  }

}
Some of you may have noticed that it is in fact a MonadPlus, because it has a filterEither method, which implies the existence of some 'null' or 'empty' element (Promise.leftUndefined) in the monad world. (If you read a Haskell introduction to monads, you may find that MonadPlus introduces choice using mzero and mplus; let's say they're equivalent for now.)
(In a normal MonadPlus, the empty element is a singleton; here it is not, because we consider all exceptions to be equal, which is wrong, but we can live with it.)
So far so good; let's walk through those methods (imperatively..):
  • mapEither takes a function, creates a Promise, calls foreachEither on the outer promise (the registered closure, when the outer is fulfilled, sets the newly created Promise with the value transformed by the parameter function f), and immediately returns the new Promise.

  • filterEither follows the same path but sets the Promise to the outer's value when available, or to leftUndefined if the predicate does not hold for this value.

  • flatMapEither is a little more involved but quite understandable. Notice how we carefully encapsulate the function evaluation to catch a possible exception and then bind its value to the new Promise; that's how we bring safety to the API, turning thrown exceptions into Left(err) values using a fail call on the promise.

We'll build most features on top of Promises this way: by creating new Promise instances and returning them after 'registering computations on them'. It is very imperative but, in the end, we get a declarative, usable API.
You may have noticed several things along the road:
  • We've used some new methods: where do the fail and setPromiseEither methods come from? They are not part of the Promise definition. In fact, you've just seen the first use of an implementation of Promise: a 'MutablePromise', which provides a way to create an empty Promise and set its value at a later time. MutablePromises are created via the 'Promise()' call. The API tends to hide it as soon as possible behind the Promise interface (it does not leak from the function). When wrapping external libraries, this is also the path to follow.

  • You may have wondered why the abstraction does not leak in the mapEither implementation, since there's no try/catch! In fact, setPromiseEither takes a thunk of computation as a parameter, not a value, deferring its actual evaluation to a safer place (surrounded by another try/catch).

  • Thanks to the Either representation, the 'exceptional' case is always handled as a value, Left(exception).
     

Example:

def mapEitherExample(fileNameProm : Promise[String], fileContent : String => Promise[String]) {
    
    // Not the code you're dreaming of..
    val lengthProm = 
      fileNameProm.flatMapEither{
       case Right(name) => fileContent(name)
       case Left(err) => Promise.failure(err)
      }.mapEither{
        case Right(v) => Right(v.length)
        case Left(l) => Left(l)
      }
    
    lengthProm.foreachEither {
      case Right(v) => println("successful text length is " + v)
      case Left(err) => println("we add an error " + err.toString());
    }
  }

Ok, but dealing with Either..

Indeed, it could become verbose and boring; I recall a great man saying: if you pattern match too much, you're lacking a combinator.
So here comes PromiseIsMonadPlus, with our beloved monad methods:
final class PromiseIsMonadPlus[+T](private val outer: Promise[T]) {

  final def foreach(f: T => Unit): Unit =
    outer.foreachEither {
      case Right(x) => f(x)
      case _ =>
    }

  final def map[U](f: T => U): Promise[U] =
    outer.mapEither {
      case Right(x) => Right(f(x))
      case Left(err) => Left(err)
    }

  final def flatMap[U](f: T => Promise[U]): Promise[U] =
    outer.flatMapEither {
      case Right(v) => f(v)
      case _ => outer.asInstanceOf[Promise[U]] // we're abusing the compiler but this is safe
    }

  final def filter(pred: T => Boolean): Promise[T] =
    outer.filterEither {
      case Right(x) => pred(x)
      case _ => false
    }

  final def failure(): Promise[Throwable] =
    outer.mapEither {
      case Left(err) => Right(err)
      case _ => Promise.leftUndefined
    }

}
Ok, so this one is easy to explain; all it does is wrap the calls to PromiseIsEitherMonadPlus in order to expose only methods dealing with values unboxed from the Either.
A few things to explain here:
  • I've used 'outer.asInstanceOf[Promise[U]]' because it costs something to construct a new promise just to hold the same exception; as the computation has already been performed, there's no semantic difference in returning the outer one, saving computation for higher performance (it's a hack).

  • 'failure' does something interesting: it returns a promise that holds the failure as its value, of type Either[Throwable, Throwable]; this is useful when you want to deal with a computation failure. (Note that subsequent failures of Promises built from it will not appear as values, as they are handled as usual in the Left part of the Either.)

  • As Either is already a monad, I could have used less pattern matching by relying on its Right/Left projections (read the scala.Either doc for more information); I've deliberately kept the code easy to grasp with basic knowledge, as I know early Scala developers discover Either way after List/Option. For the curious, a small sketch follows below.
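
Just to illustrate (a sketch only, not the code used in the library), map could lean on Either's RightProjection instead of an explicit pattern match, with the same behaviour:

// Equivalent map using Either's right projection: f is applied only to Right values,
// Left errors are passed through untouched.
final def map[U](f: T => U): Promise[U] =
  outer.mapEither(_.right.map(f))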

Example:

def mapExample(fileNameProm : Promise[String], fileContent : String => Promise[String]) {
    
    val lengthProm = 
      fileNameProm.flatMap(fileContent).map{_.length}
    
    val lengthProm2 = 
      for {
        fileName <- fileNameProm
        content <- fileContent(fileName)
      } yield content.length
    
    lengthProm.foreachEither {
      case Left(err) => println("we add an error " + err.toString())
      case Right(v) => println("successful text length is " + v)
    }
      
  }

Way better! You can also see an example of for comprehension (lengthProm2 is equivalent to lengthProm).

But sometimes I have to deal with it..

Perfect, let's introduce two useful extensions.
final class PromiseCanAbstractFailure[+T](outer: Promise[Either[Throwable, T]]) {
  def unliftEither: Promise[T] = {
    val prom = Promise[T]()
    outer.foreachEither {
      case Right(v) => prom setPromiseEither v
      case Left(err) => prom setPromiseEither Left(err)
    }
    prom
  }
}

final class PromiseCanExposeFailure[+T](outer: Promise[T]) {
  def liftEither: Promise[Either[Throwable, T]] = {
    val prom = Promise[Either[Throwable, T]]()
    outer.foreachEither { prom setPromise _ }
    prom
  }
}
They give you the ability to expose, or not, the Either boxing; recall that I told you we would get a more natural way to deal with the Promise in explicit Either form - here it is.
Note the outer type of PromiseCanAbstractFailure; it means that it can only be used on a Promise whose value type is already an Either. That's a way to use the power of implicits to provide functionality depending on the type.

Example:

def eitherMonadExample(fileNameProm : Promise[String], fileContent : String => Promise[String]) {
   
    val lengthProm = fileNameProm.flatMap(fileContent).map{_.length}
    val lengthPromEither = lengthProm.liftEither
    
    lengthPromEither.foreach{ // normal foreach (not foreachEither), we can now also use for comprehensions..
      case Left(err) => println("we had an error " + err.toString());
      case Right(v) => println("successful text length is " + v)
    }
    
    val lengthProm2 = lengthPromEither.unliftEither // back to Promise[Int]
 
  }



Asynchronicity.

In fact, a lot of asynchronicity has already been covered; it's at the heart of the promises, as they rely on callbacks, so each time we've called (directly or indirectly) foreachEither, we've built asynchronous computations.
Here's the implementation of some additional asynchronicity methods for the Promises:
final class PromiseIsAsynchronous[+T](outer: Promise[T]) {

  def async: Promise[T] = {
    val prom = Promise.async[T]()
    outer.foreachEither { prom setPromiseEither _ }
    prom
  }
 
  final def now(): Option[T] =
    outer.nowEither() match {
      case Some(Right(v)) => Some(v)
      case _ => None
    }
  
}
async is a forwarder onto a promise specifically built using Promise.async().
This constructor creates a promise and defers its execution using a thread pool executor, bringing the ability to start computations asynchronously (more to come in the next post).
'now()' is the helper method we discussed at the beginning of this post; it provides a way to access the current success value without the Either boxing.

Example:

def asyncExample(fileNameProm : Promise[String], fileContent : String => Promise[String]) = {
    println("DEBUG: current value " + fileNameProm.now()) // during dev
    fileNameProm.async.flatMap(fileContent) // I am sure I will return ASAP and it will be executed asynchronously
  }


But where's my orElse stuff?

Ok, you recall we said Promise was a MonadPlus, so it's time to unleash some of its power.
Here are some combinators, among which stand our beloved orElse and an interesting variation.
final class PromiseCombinators[+T](outer: Promise[T]) {

  def orElse[U >: T](other: => Promise[U]): Promise[U] = {
    val prom = Promise[U]()
    outer.foreachEither {
      case r@Right(_) => prom.setPromiseEither(r)
      case _ =>
        try {
          other.foreachEither { prom setPromiseEither _ }
        } catch {
          case err: Throwable => prom.setPromiseEither(Left(err))
        }
    }
    prom
  }

  def &&[U](other: => Promise[U]): Promise[(T, U)] = {
    val prom = Promise[(T, U)]()
    outer.foreachEither {
      case Right(t) =>
        try {
          other.foreachEither {
            case Right(u) => prom.setPromise((t, u))
            case Left(err) => prom.setPromiseEither(Left(err))
          }
        } catch {
          case err => prom.setPromiseEither(Left(err)) // caused by 'other' evaluation
        }
      case Left(err) => prom.setPromiseEither(Left(err))
    }
    prom
  }

  def or[U >: T](other: Promise[U]): Promise[U] = {
    val prom = Promise[U]()
    val status = new java.util.concurrent.atomic.AtomicInteger(0) // 0 -> not set, 2 -> value set (right), 1 -> one error (left)

    val updateAsNeeded: Either[Throwable, U] => Unit = {
      case r: Right[_, _] =>
        if (status.getAndSet(2) < 2) { 
          prom.setPromiseEither(r)
        }
      case l: Left[_, _] =>
        if (status.getAndSet(1) == 1) { ///we only want to write if we are the second error
          prom.setPromiseEither(l)
        }
    }

    outer.foreachEither(updateAsNeeded)
    other.foreachEither(updateAsNeeded)

    prom
  }

}
Let's analyze that a bit.
orElse gives you the ability to build a Promise whose value will be bound to the first promise if it succeeds, or to the second one otherwise (note that it can be chained: you can reduce a List of promises with it, as sketched after the list below). orElse is based on building blocks already exposed so far; the interesting new elements are:
  • U >: T, which means that U is defined as a supertype of T. Remember, as our Promises are covariant, they play nicely with subtyping. So if you use orElse with two Promises, the resulting promise will have its type equal to the least common upper bound of both types, ultimately Any.
     
  • The 'other' parameter is passed as a thunk, meaning it is not evaluated yet (i.e. the Promise instance may not have been created yet); each use of its name results in an evaluation, hence the 'call by name' naming (pun not intended :) ). This evaluation takes place inside a try/catch, which sets the failure if any.
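
As a small sketch of the chaining mentioned above (assuming a non-empty list and the PromiseImplicits in scope):

// Sketch: keep the first promise that succeeds; if they all fail, the result carries the last failure.
def firstSuccessful[T](promises: List[Promise[T]]): Promise[T] =
  promises.reduceLeft { (acc, next) => acc orElse next }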
'&&' method provides a way to build a Promise out of two other Promises, combining their result in a pair using two pattern matching. Nothing new is introduced here.

The 'or' method is a bit more interesting and brings us an example usage of the atomic package. Basically, 'or' is the same as orElse but without any notion of ordering, and it requires both, potentially lazy, computations (orElse requires the second only if the first has failed).
'or' returns a Promise whose value is equal to the value of the first successful Promise.
So we want to get the first successful promise, or the second failing one. We do not want to block (lock-free) and there is no specific order.
We'll use an atomic integer to represent our status, with three possibilities:
  • 0, nothing has happened yet (initial state).
  • 1, an error happened.
  • 2, a success happened.
When a Promise succeeds, we fetch the previous status to check whether we already had a success, while setting the status to 'success' atomically (under the hood this relies on the processor's blazing-fast, non-blocking atomic instructions, in the spirit of CAS (compare-and-swap), when available); if there was no prior success, we can fulfill the new Promise. If the other Promise had failed, the status could have been 1, and everything would be equally fine.
When a Promise fails, we check whether the status equals 1 (the other Promise has already failed) while setting it to 1; if that's true, then we can fail the new Promise safely.
(Note that a good extension to the Promise would be the ability to combine the two failing exceptions so as not to lose information; it could be done and would not be difficult. However, storing the first exception would require a volatile variable to make sure it is visible consistently from any thread (promises may be executed on different threads), as today's processors, mostly working in their caches at full speed, are not as synced with main memory as the uninitiated might guess.)

Example:

def orElseExample(localizedContent : Promise[String], defaultContent : Promise[String]) = {
    val content = localizedContent orElse defaultContent
    
    val localizedAndDefault = localizedContent && defaultContent
    
    val firstDelivered = localizedContent or defaultContent
  }


Ok, now we've covered a lot of functionality. We've also seen a way to implement concurrent atomic operations.
I hope this post was useful; I will make another post diving into the Promise implementations very soon.

You may find the github project here, I've put everything in a single file to ease playing with it: https://github.com/sledorze/Promises

Part two is here.

This is my first post on this blog, so be kind, no trolling; useful feedback is welcome, and share! :)