Friday, February 27, 2015

Some Information on Data Visualization

Just a few links to help think about charting and color selection.

Notes
Color blindness need not drive every color palette choice if the graphic is still meaningful when rendered without color contrast. Where possible, provide either color-blind-safe colors or a strong complementary visualization, particularly if the objects are not discrete (an orange and red palette in a heat map or stacked bar chart).
-me

1. Sequential schemes are suited to ordered data that progress from low to high. Lightness steps dominate the look of these schemes, with light colors for low data values to dark colors for high data values.
2. Diverging schemes put equal emphasis on mid-range critical values and extremes at both ends of the data range. The critical class or break in the middle of the legend is emphasized with light colors and low and high extremes are emphasized with dark colors that have contrasting hues.
3. Qualitative schemes do not imply magnitude differences between legend classes, and hues are used to create the primary visual differences between classes. Qualitative schemes are best suited to representing nominal or categorical data.

Brewer, Cynthia A. 1994. Color use guidelines for mapping and visualization. Chapter 7 (pp. 123-147) in Visualization in Modern Cartography.

Theory
http://blog.visual.ly/the-use-of-yellow-in-data-design/
http://www-psych.stanford.edu/~bt/diagrams/papers/diagramsstockholm04.pdf

Technologies
https://plot.ly/plot
http://blog.visual.ly/using-selections-in-d3-to-make-data-driven-visualizations/


Palettes
Look at the color palettes used by classic painters
http://visual.ly/10-artists-10-years-color-palettes
http://colorbrewer2.org/


Sunday, February 8, 2015

Intro to Data Science - Threading Notes

When working through a simple tutorial I always tend to wander off on a tangent. This generally helps me engage with the content more fully. In some cases it just confuses the heck out of me for a half hour. Looking at a threading tutorial in C# in a Nutshell, I wanted to pass in values so I could see the actual changes occurring as threads were being started. It seemed very straightforward to me, but I was getting Invalid Arguments errors. I finally found this: "Thread methods needs (sic) to be a method with return type void and accepting no argument." (S M Kamran on Stack Overflow). Now this got me wondering why this is such a bad idea that it is disallowed.

    public class Asyncer
    {

        public static DateTime time1;
        public static long ticks1;
        public static DateTime time2 = System.DateTime.UtcNow;
        public static long ticks2 ;

        static void Main() {

            async();
            doSomeWork(" --A-- ", 12);
            doSomeWork(" --B-- ", 6);
            Console.ReadLine();
        }

        public static Thread async() {

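            // This is the line that fails to compile: it calls doSomeWork and passes
            // its void result, where the Thread constructor expects a delegate.
            Thread threads = new Thread(doSomeWork(" --A-- ", 12));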
            threads.Start();
       
        }
       
        public static void doSomeWork(string inPuter, Int32 inter) {
            for
                (Int32 cnt = 0; cnt <= inter; cnt++)

                //time1 = System.DateTime.UtcNow;
                //ticks1 = time1.Ticks;
                Console.Write(inPuter + Convert.ToString(ticks2) + Convert.ToString(ticks1));
        }
    }

Now this is not the point of what I was trying to learn, but a good question I wanted to write down for later.
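
For what it is worth, arguments can still reach a thread method. The following is a minimal sketch, not the book's example: the class name ThreadArgsSketch, the helper DoSomeWorkBoxed, and the shared Log buffer are all made up for illustration. It shows two options that the Thread constructor accepts: a lambda that captures the arguments (a ThreadStart), and a method taking a single object argument (a ParameterizedThreadStart).

```csharp
using System;
using System.Text;
using System.Threading;

public static class ThreadArgsSketch
{
    // Hypothetical shared buffer so the result can be inspected after the threads finish.
    private static readonly StringBuilder Log = new StringBuilder();

    public static void DoSomeWork(string label, int count)
    {
        for (int i = 0; i <= count; i++)
        {
            lock (Log) { Log.Append(label).Append(i); }
        }
    }

    // Matches ParameterizedThreadStart: void return type, one object argument.
    public static void DoSomeWorkBoxed(object state)
    {
        DoSomeWork((string)state, 2);
    }

    public static string Run()
    {
        Log.Clear();

        // Option 1: the lambda captures the arguments and is itself parameterless.
        Thread viaLambda = new Thread(() => DoSomeWork(" --A-- ", 2));
        viaLambda.Start();
        viaLambda.Join();

        // Option 2: a single object argument is handed over through Start(object).
        Thread viaState = new Thread(new ParameterizedThreadStart(DoSomeWorkBoxed));
        viaState.Start(" --B-- ");
        viaState.Join();

        return Log.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Each thread is joined before the next one starts, so the two runs do not interleave here; that is only to keep the sketch easy to read.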

Good things that I learned from this:

If you have a set of short-lived threads that you want to contain and sequence, use a thread pool.

Creating new threads is costly, so avoid it unless you want to force a thread to run in the foreground (keeping the process alive until the thread finishes) or actively synchronize long-running threads. Another question for later is whether context-sensitive threading requires creating the threads manually.

If you want to create threads that you manage yourself, but do not need them to run in the foreground, set the IsBackground property (threads.IsBackground = true;) when newing up the thread.

The ThreadState property of a thread reports its running state, including whether it is a background thread.
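
The notes above can be sketched roughly as follows. This is an illustration under assumptions, not code from the tutorial: PoolSketch, RunPooled, and DescribeManualThread are hypothetical names, and the work items do nothing useful. The first method hands short-lived work to the shared thread pool instead of creating threads; the second creates a thread by hand, marks it as background, and reads its ThreadState.

```csharp
using System;
using System.Threading;

public static class PoolSketch
{
    // Queues `items` short work items on the thread pool and waits for all of them.
    public static int RunPooled(int items)
    {
        int completed = 0;
        using (CountdownEvent done = new CountdownEvent(items))
        {
            for (int i = 0; i < items; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Increment(ref completed); // thread-safe counter update
                    done.Signal();                        // report one item finished
                });
            }
            done.Wait(); // block until every queued item has signalled
        }
        return completed;
    }

    // Creates a manually managed thread, marks it as background, and reports its state
    // as read before the thread is started.
    public static ThreadState DescribeManualThread()
    {
        Thread manual = new Thread(() => Thread.Sleep(10));
        manual.IsBackground = true;
        ThreadState state = manual.ThreadState;
        manual.Start();
        manual.Join();
        return state;
    }

    public static void Main()
    {
        Console.WriteLine("pooled items completed: " + RunPooled(5));
        Console.WriteLine("manual thread state before Start: " + DescribeManualThread());
    }
}
```

ThreadState is a flags enumeration, so the background flag can appear alongside another state value.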

Code executed in the Main method runs on a foreground thread. In the example below I execute two method calls directly and run two more on separate threads, where one thread is backgrounded and the other foregrounded.

This is the execution pattern for the code below on an i7 quad core laptop:
 --A-- 635590626227738088 BG 635590626227738090 FG 635590626227738089 BG 6355906
26228050713 FG 635590626228050712 --A-- 635590626228050711 --A-- 635590626228363
175 FG 635590626228363176 BG 635590626228363177 --A-- 635590626228675925 FG 6355
90626228675926 BG 635590626228675927 FG 635590626228988455 BG 635590626228988456
 --A-- 635590626228988454 FG 635590626229300719 --A-- 635590626229300717 BG 6355
90626229300719 FG 635590626229613232 BG 635590626229613234 --A-- 635590626229613
232 FG 635590626229925715 BG 635590626229925715 --A-- 635590626229925713 --A-- 6
35590626230238211 FG 635590626230238213 BG 635590626230238213 BG 635590626230550
780 --A-- 635590626230550778 FG 635590626230550779 FG 635590626230863286 BG 6355
90626230863287 --A-- 635590626230863285 FG 635590626231175797 BG 635590626231175
798 --A-- 635590626231175796 FG 635590626231488304 --A-- 635590626231488303 BG 6
35590626231488305 --B-- 635590626231800814 BG 635590626231800814 FG 635590626231
800815 FG 635590626232113326 BG 635590626232113327 --B-- 635590626232113325 FG 6
35590626232425841 --B-- 635590626232425840 BG 635590626232425842 FG 635590626232
738357 --B-- 635590626232738356 BG 635590626232738358 --B-- 635590626233050880 B
G 635590626233050880 FG 635590626233050881 BG 635590626233363393 FG 635590626233
363392 --B-- 635590626233363391 FG 635590626233675903 BG 635590626233675904 --B-
- 635590626233675902 BG 635590626233988419 FG 635590626233988418

I was surprised that the background thread executes before the second call to doSomeWork ever starts. It makes sense that all iterations of the first call to doSomeWork must complete before the second begins, and that there is processor time available for the BG thread while B is waiting. Less surprising is that the Console also introduces some synchronicity issues in returning the results. Sorted by timestamp, the last few executions occurred in this predictable order:
--B-- 635590626233363391
--B-- 635590626233675902
FG 635590626233675903
BG 635590626233675904
FG 635590626233988418
BG 635590626233988419

But were written in this WTF order:
--B-- 635590626233363391
FG 635590626233675903
BG 635590626233675904
--B-- 635590626233675902
BG 635590626233988419
FG 635590626233988418

 
using System;
using System.Threading;

namespace AsyncProject
{
    public class Asyncer
    {
        public static DateTime time1;
        public static DateTime time2 = System.DateTime.UtcNow;
        public static long ticks1;
        public static long ticks2;

        static void Main() {

            //execute a background thread
            Thread threads = new Thread(doSomeWorkBG);
            threads.IsBackground = true;
            threads.Start();

            //execute a foreground thread
            Thread threader = new Thread(doSomeWorkFC);
            threader.IsBackground = false; //default value setting for example clarity
            threader.Start();

            //execute some methods in foreground w/o explicitly creating a thread
            doSomeWork(" --A-- ", 12);
            doSomeWork(" --B-- ", 6);

            //hold open the console
            Console.ReadLine();
        }
        public static void doSomeWork(string inPuter, Int32 inter)
        {
            for
                (Int32 cnt = 0; cnt <= inter; cnt++)
            {
                time1 = System.DateTime.UtcNow;
                ticks1 = time1.Ticks;
                Console.Write(inPuter + Convert.ToString(ticks1));
                Thread.Sleep(20);
            }
        }

        public static void doSomeWorkFC()
        {
            for
                (Int32 cnt = 0; cnt <= 20; cnt++)
            {
                time1 = System.DateTime.UtcNow;
                ticks1 = time1.Ticks+1;
                Console.Write(" FG " + Convert.ToString(ticks1));
                Thread.Sleep(20);
            }
        }

        public static void doSomeWorkBG()
        {
            for
                (Int32 cnt = 0; cnt <= 20; cnt++)
            {
                time1 = System.DateTime.UtcNow;
                ticks1 = time1.Ticks+2;
                Console.Write(" BG " + Convert.ToString(ticks1));
                Thread.Sleep(20);
            }
        }
    }
}
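
One possible explanation for the timestamp order and the written order disagreeing is that each thread reads the clock and writes to the console as two separate steps, and that all three threads share the static time1 and ticks1 fields. That is an assumption, not something the output above proves. A minimal sketch of one way to rule it out keeps the timestamp in a local variable and takes the timestamp and writes it inside a single lock. OrderedLogSketch, Gate, and Written are hypothetical names introduced for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class OrderedLogSketch
{
    private static readonly object Gate = new object();
    private static readonly List<string> Written = new List<string>();

    public static void Work(string label, int count)
    {
        for (int i = 0; i < count; i++)
        {
            lock (Gate)
            {
                // Local variable: no other thread can overwrite it between read and write.
                long ticks = DateTime.UtcNow.Ticks;
                string entry = label + ticks;
                Written.Add(entry);
                Console.Write(entry);
            }
            Thread.Sleep(20); // sleep outside the lock so other threads can run
        }
    }

    // Runs one background thread, one foreground thread, and one direct call,
    // then returns how many entries were written in total.
    public static int Run()
    {
        lock (Gate) { Written.Clear(); }

        Thread bg = new Thread(() => Work(" BG ", 3));
        bg.IsBackground = true;
        Thread fg = new Thread(() => Work(" FG ", 3));

        bg.Start();
        fg.Start();
        Work(" --A-- ", 3);

        bg.Join();
        fg.Join();

        lock (Gate) { return Written.Count; }
    }

    public static void Main()
    {
        int entries = Run();
        Console.WriteLine();
        Console.WriteLine(entries + " entries written");
    }
}
```

Because the clock read and the write happen under the same lock, entries are written in the order their timestamps were taken, at the cost of serializing the three writers.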



REF:
http://stackoverflow.com/questions/230003/thread-vs-threadpool
http://stackoverflow.com/questions/5155979/c-sharp-thread-method

Sunday, February 1, 2015

SQL To Monitor Table Load Progress



-- Snapshot the row counts, wait, count again, and report the change.
-- Table1, Table2 and Table3 stand in for the tables being loaded.
DECLARE @1 TABLE (Startcnt1 BIGINT, Startcnt2 BIGINT, Startcnt3 BIGINT,
                  Secondcnt1 BIGINT, Secondcnt2 BIGINT, Secondcnt3 BIGINT);

-- First snapshot (COUNT_BIG returns BIGINT, matching the column types)
INSERT INTO @1
        ( Startcnt1 ,
          Startcnt2 ,
          Startcnt3
        )
SELECT
       (SELECT COUNT_BIG(1) FROM Table1),
       (SELECT COUNT_BIG(1) FROM Table2),
       (SELECT COUNT_BIG(1) FROM Table3);

-- Wait one minute between snapshots
WAITFOR DELAY '00:01:00';

-- Second snapshot
UPDATE @1
SET Secondcnt1 = (SELECT COUNT_BIG(1) FROM Table1),
    Secondcnt2 = (SELECT COUNT_BIG(1) FROM Table2),
    Secondcnt3 = (SELECT COUNT_BIG(1) FROM Table3);

-- Rows added to each table during the wait
SELECT Startcnt1 ,
       Secondcnt1 ,
       Secondcnt1 - Startcnt1 AS ChangeCnt1,
       Startcnt2 ,
       Secondcnt2 ,
       Secondcnt2 - Startcnt2 AS ChangeCnt2,
       Startcnt3 ,
       Secondcnt3 ,
       Secondcnt3 - Startcnt3 AS ChangeCnt3
FROM @1;