Monday, June 1, 2015

Actor Pattern Work Distribution Model

We determined that the combined implementation and support costs of Hadoop on RHEL were undesirable if they could be avoided. It was also clear to us that, in the time frame we were looking to implement, Hadoop on Windows had the following unappealing challenges:
  • it was a support problem for Microsoft internally, even with all the resources available to the Azure team
  • the online community was too limited to provide consistent consensus for making architectural choices
  • initial attempts to deploy to Windows laptops for local development resulted in frustration
  • OS licensing costs for Windows blades lowered the value proposition of Hadoop
I was given the assignment to research a framework to support the actor pattern as a way to build a lightweight, real-time analytics system on a Windows platform. We started by looking at Akka.NET and trying to understand what difference an actor framework would make over managing thread pools and concurrency directly. Several team members had been personally interested in Akka.NET, which is the only reason we did not start with the Orleans project (which may actually have been a better fit for us). We knew we really just needed a data abstraction layer like Spark (which is built on Akka in Scala) and a distribution framework (like Akka...).
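To make the contrast with hand-managed thread pools concrete, here is a minimal sketch of the actor idea in plain Python. It is not Akka.NET or Orleans code; the `CounterActor` class, its `tell` and `stop` methods, and the `run_demo` helper are all hypothetical names made up for illustration. The point it tries to show is that state owned by an actor is only ever touched by the one thread draining its mailbox, so the senders need no locks.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, one mailbox, one worker thread."""

    _STOP = object()  # sentinel message that ends the processing loop

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only the actor's own thread mutates this
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send; callers never touch the state directly."""
        self._mailbox.put(message)

    def stop(self):
        """Process everything already queued, then return the final count."""
        self._mailbox.put(self._STOP)
        self._thread.join()
        return self._count

    def _run(self):
        # Messages are handled strictly one at a time, in arrival order.
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            if message == "increment":
                self._count += 1


def run_demo(senders=4, messages_per_sender=1000):
    """Many threads send concurrently; no lock is needed around the counter."""
    actor = CounterActor()

    def send_many():
        for _ in range(messages_per_sender):
            actor.tell("increment")

    threads = [threading.Thread(target=send_many) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return actor.stop()
```

With a raw thread pool the same counter would need an explicit lock or an atomic; here the mailbox serializes access. A real framework adds what this sketch leaves out, such as remote messaging, supervision, and scheduling many actors over few threads.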

As Akka.NET is a port to .NET, much of the documentation was for either Java or Scala. I found that Typesafe had good documentation, as did Akka.io. Some of the basic concepts of the actor pattern were available via Coursera or YouTube.


Basic steps to prove out (partially complete):
Simple microservice
starting standalone HTTP server
handling simple file-based configuration
logging
routing
deconstructing requests
deserialize JSON messages to class entities
serialize class entities to JSON messages
error handling
issuing requests to external services
managing requests from external services
recovery of failed actors
queue and/or database persistence
integration testing with mocking of external services
operationalizing the code
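Of the steps above, "recovery of failed actors" is the one most specific to the actor pattern, so a rough sketch may help. The code below is a simplified, synchronous illustration loosely modeled on the restart-on-failure idea from actor supervision; it is not the Akka.NET API, and `Worker`, `Supervisor`, `deliver`, and `max_restarts` are hypothetical names chosen for this example.

```python
class Worker:
    """Hypothetical child actor that keeps a running total of integers."""

    def __init__(self):
        self.total = 0

    def receive(self, message):
        if not isinstance(message, int):
            raise ValueError(f"unsupported message: {message!r}")
        self.total += message


class Supervisor:
    """Replaces a failed child with a fresh instance, up to a restart limit."""

    def __init__(self, child_factory, max_restarts=3):
        self._child_factory = child_factory
        self._max_restarts = max_restarts
        self.restarts = 0
        self.child = child_factory()

    def deliver(self, message):
        try:
            self.child.receive(message)
        except Exception:
            if self.restarts >= self._max_restarts:
                raise  # give up and escalate the failure to the caller
            self.restarts += 1
            # Restart: the failed child and its state are discarded.
            self.child = self._child_factory()
```

For example, delivering `1`, `2`, `"bad"`, `5` in that order leaves the supervisor with one restart and a fresh worker whose total is 5, because the state accumulated before the failure is thrown away. Whether to restart, resume, or stop a failed actor is a design decision that the persistence step in the list would feed into.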

Some basic architectural guidelines that were provided for Orleans but may apply to any actor model:
• Significant number of loosely coupled entities (hundreds to millions)
• Entities are small enough to be single threaded
• Workload is interactive: request/response, start/monitor/complete
• Need or may need to run on >1 server
• No need for global coordination, only between a few entities at a time
• Different entities used at different times

Problematic fit:
• Entities need direct access to each other’s memory
• Small number of huge entities, multithreaded
• Global coordination/consistency needed
• Long running operations, batch jobs, SIMD
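The good-fit criteria (many small, loosely coupled entities, each single threaded, with different entities used at different times) can be pictured with a small sketch. This is plain Python rather than Orleans or Akka.NET, and `Session`, `EntityRegistry`, `send`, and `deactivate` are hypothetical names; the assumption being illustrated is that entities are addressed by ID, created on first use, and dropped when idle.

```python
class Session:
    """Hypothetical small entity: state for one user, one message at a time."""

    def __init__(self, user_id):
        self.user_id = user_id
        self.events = 0

    def receive(self, message):
        self.events += 1
        return f"{self.user_id}: {self.events} event(s), last={message}"


class EntityRegistry:
    """Routes messages by entity ID and activates entities lazily."""

    def __init__(self, factory):
        self._factory = factory
        self._active = {}  # only entities currently in use live here

    def send(self, entity_id, message):
        entity = self._active.get(entity_id)
        if entity is None:
            # First message for this ID: activate the entity on demand.
            entity = self._active[entity_id] = self._factory(entity_id)
        return entity.receive(message)

    def deactivate(self, entity_id):
        """Drop an idle entity; a later message would create a fresh one."""
        self._active.pop(entity_id, None)

    @property
    def active_count(self):
        return len(self._active)
```

Nothing here needs global coordination: each message touches exactly one entity. By contrast, the problematic cases listed above (shared memory, a few huge multithreaded entities, global consistency) would force cross-entity locking that this routing model has no place for.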

REFS:
http://research.microsoft.com/pubs/244727/Orleans%20Best%20Practices.pdf
https://www.typesafe.com/activator/template/akka-http-microservice
http://akka.io/