Thursday, July 9, 2015

NodeJS - Concurrency Model

NodeJS is magical. It is fast, easy, and monolingual from UI to DAL. It is the purest unicorn of technology. It supports concurrency but is single threaded, so your code executes in a single thread. However, all I/O is evented and asynchronous, so an I/O call won't block the server. The callback and promise abstractions provide easy access to the event loop: any I/O call saves the callback and returns control to the Node runtime environment.

One key idea is "CPU-intensive work should be split off to another process with which you can interact as with events, or by using an abstraction like WebWorkers." Competing ideas to ponder are the cluster module, as presented in Chris McDonald's example at Portland's user group, or PM2, as outlined in "clustering made easy" and "managing eventing".
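A minimal sketch of the evented I/O described above, assuming only Node's built-in fs module. The `order` array is a hypothetical helper used to record what happens when; it is not part of any Node API.

```javascript
const fs = require('fs');

const order = []; // illustration only: records the sequence of steps

// fs.stat is asynchronous: Node saves the callback and returns immediately.
fs.stat(process.cwd(), (err, stats) => {
  order.push('callback ran');
  if (err) {
    console.error('stat failed:', err.message);
    return;
  }
  console.log('is directory:', stats.isDirectory());
  console.log(order.join(' -> '));
});
order.push('stat requested');

// Synchronous code keeps running while the I/O is in flight.
order.push('sync work done');
```

The callback only runs once the synchronous code has finished and control is back in the event loop, which is why 'callback ran' is always the last entry.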

Basic Concepts
  • Continuation passing style: a functional programming paradigm in which each function takes an extra argument, a continuation, to which it passes its return value. That means that when invoking a CPS function, the calling function is required to supply a procedure to be invoked with the subroutine's "return" value. Expressing code in this form makes a number of things explicit which are implicit in direct style: procedure returns, which become apparent as calls to a continuation; intermediate values, which are all given names; order of argument evaluation, which is made explicit; and the final action of the called procedure (a tail call), which simply calls a procedure with the same continuation, unmodified, that was passed to the caller.
  • Streams: Readable and writable streams are an alternative way of interacting with (file|network|process) I/O.
  • Buffers: Buffers provide a binary-friendly, higher-performance alternative to strings by exposing raw memory allocation outside the V8 heap.
  • Events: Many Node.js core libraries emit events. You can use EventEmitters to implement this pattern in your own applications.
  • Timers: setTimeout for one-time delayed execution of code, setInterval for periodically repeating execution of code. See http://book.mixu.net/node/ch9.html
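A few of the concepts above can be sketched in a handful of lines. This is an illustration under stated assumptions, not a recommended design: `addCps`, `squareCps`, `sumThenSquare`, and the `tick` event name are all hypothetical, and the CPS functions here call their continuation synchronously, whereas real Node I/O functions call it later from the event loop.

```javascript
const EventEmitter = require('events');

// Continuation-passing style: instead of returning, each function hands its
// result to a continuation. Node's convention puts the error argument first.
function addCps(a, b, done) {
  done(null, a + b);
}

function squareCps(x, done) {
  done(null, x * x);
}

// Computes (a + b)^2. The intermediate value gets a name (sum), and the final
// step passes the caller's continuation through unmodified.
function sumThenSquare(a, b, done) {
  addCps(a, b, (err, sum) => {
    if (err) return done(err);
    squareCps(sum, done);
  });
}

let cpsResult;
sumThenSquare(2, 3, (err, value) => {
  cpsResult = value; // (2 + 3)^2 = 25
});

// Events: listeners registered with on() are called synchronously, in
// registration order, each time emit() is called.
const ticker = new EventEmitter();
const seen = [];
ticker.on('tick', (n) => seen.push(n));
ticker.emit('tick', 1);
ticker.emit('tick', 2);

// Buffers hold raw bytes, so byte length and string length can differ.
const text = 'h\u00e9llo'; // 5 characters
const buf = Buffer.from(text, 'utf8'); // 6 bytes, since U+00E9 takes two

console.log(cpsResult, seen, text.length, buf.length);
```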

Cautionary Tales and Coding Practices
http://callbackhell.com/
http://becausejavascript.com/node-js-process-nexttick-vs-setimmediate/
http://howtonode.org/understanding-process-next-tick

PayPal has a lot of data and a lot of concurrent users. This is a problem I want to have, so when I see them move from Java to Node, I pay attention. PayPal developed Java and Node pages side by side to benchmark performance. To quote, the node.js app was:
  • Built almost twice as fast with fewer people
  • Written in 33% fewer lines of code
  • Constructed with 40% fewer files

Both CRUD apps were simple (few routes, few API calls). Node outperformed Java, even given the fact that the Java app had a two-month head start. You can see the performance benchmarks here:
https://www.paypal-engineering.com/2013/11/22/node-js-at-paypal/

I care about Node because I want an easier path to building a site to secure access to business intelligence assets. Most data services teams I work with have no web developers at all, and rely on SharePoint and Power BI to deliver what-if analysis. Based on my review of Express with templating and the elegance of D3 on Angular for charting, I think Node becomes less "too hard" to implement for a reporting application team.

I wanted to understand the concurrency model as compared to using the actor pattern implemented in Hadoop via Akka.



Some ideas I liked:

"It is useful to understand how node and V8 interact. Node handles waiting for I/O or timers from the operating system. When node wakes up from I/O or a timer, it generally has some JavaScript callbacks to invoke. When node runs these callbacks, control is passed into V8 until V8 returns back to node.
So, if you do var ii = 1; ii++;, you will never find that ii is anything other than 2. All JavaScript runs until completion, and then control is passed back to node. If you do doSomething(); doSomething(); that will always run doSomething twice, and it will not return to node's event loop until the second invocation of doSomething returns. This means you can completely lock up node from a simple error like this:
for (var i=0 ; i >= 0 ; i++) {}
It doesn't matter how many I/O callbacks you have registered, timers set to go off, or sockets waiting to be read. Until V8 returns from that infinite loop, node does no more work.
This is part of what makes programming in node so nice. You never have to worry about locking. There are no race conditions or critical sections. There is only one thread where your JavaScript code runs."
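The run-to-completion behavior in the quote can be observed with a finite version of that loop. This is a rough sketch: `busyWait` and `log` are hypothetical names, and the 50 ms figure is an arbitrary stand-in for CPU-heavy work.

```javascript
const log = []; // illustration only: records the sequence of steps

// Ask for a callback as soon as possible (0 ms delay).
setTimeout(() => {
  log.push('timer fired');
  console.log(log.join(' -> '));
}, 0);

// Spin on the CPU for roughly `ms` milliseconds without yielding to Node.
function busyWait(ms) {
  const end = Date.now() + ms;
  let spins = 0;
  while (Date.now() < end) {
    spins++;
  }
  return spins;
}

// The timer is already due, but it cannot fire while this loop is running.
busyWait(50);
log.push('busy loop finished');
```

Even though the timer was due almost immediately, its callback only runs after the busy loop ends and control returns to Node. With a truly infinite loop it would never run at all.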

REFS
https://www.paypal-engineering.com/2013/11/22/node-js-at-paypal/
http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
http://stackoverflow.com/questions/5153492/models-of-concurrency-in-nodejs
https://creationix.com/jsconf.pdf
https://github.com/xk/node-threads-a-gogo
