Tuesday, September 22, 2015

Running ETL Code in PowerShell Workflow

I needed a test harness to run multiple concurrent instances of the same Pentaho job. I wanted to verify that the pid file feature I added prevents a new execution from starting while the job is already running.
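
For context, a pid file guard generally works by creating a marker file atomically at startup and refusing to run if the file already exists. The pid file feature in my job is not shown in this post, so the following is only a minimal sketch of the general idea in PowerShell. The function names Enter-PidLock and Exit-PidLock and the pid file location are hypothetical, chosen for illustration.

```powershell
# Hypothetical helpers illustrating a pid file guard; not the job's actual implementation.
function Enter-PidLock {
    param([Parameter(Mandatory=$true)][string]$Path)
    try {
        # FileMode.CreateNew fails if the file already exists,
        # so only one caller can create the pid file.
        $stream = [System.IO.File]::Open($Path, [System.IO.FileMode]::CreateNew)
    }
    catch {
        # The file exists (or cannot be created): treat the job as already running.
        return $false
    }
    try {
        # Record the current process id in the pid file.
        $bytes = [System.Text.Encoding]::ASCII.GetBytes([string]$PID)
        $stream.Write($bytes, 0, $bytes.Length)
    }
    finally {
        $stream.Dispose()
    }
    return $true
}

function Exit-PidLock {
    param([Parameter(Mandatory=$true)][string]$Path)
    # Remove the pid file so the next execution can start.
    Remove-Item -LiteralPath $Path -ErrorAction SilentlyContinue
}

# Usage: the pid file name below is an assumption for this example.
$pidFile = Join-Path ([System.IO.Path]::GetTempPath()) 'Job_Ods_AggregateMongo.pid'
if (Enter-PidLock -Path $pidFile) {
    try {
        Write-Output 'Lock acquired; the job would run here.'
    }
    finally {
        Exit-PidLock -Path $pidFile
    }
}
else {
    Write-Output 'Job is already running; skipping this execution.'
}
```

With a guard like this in place, launching several executions at once should let exactly one proceed, which is what the harness below is meant to exercise.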

Create a file named ConcurrencyTest.ps1 with the following content:
param(
    # Number of concurrent executions to launch.
    # Mandatory already forces a value, so no default is needed.
    [Parameter(Position=0,
        Mandatory=$True,
        ValueFromPipeline=$True)]
    [int]$Attempts
)

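# Runs the Pentaho job once and writes its output to a per-iteration log file.
function DoStuff
{
    param(
        [Parameter(Position=0,
            Mandatory=$True,
            ValueFromPipeline=$True)]
        [int]$Iter
    )
    $root = $env:ProgramFiles
    # Quote the path: the Program Files directory contains a space.
    Set-Location "$root\Pentaho\design-tools\data-integration"
    # Kitchen.bat runs the job; each iteration logs to out.<Iter>.log.
    cmd /c .\Kitchen.bat /file:C:\Source\Trunk\Transforms\Job_Ods_AggregateMongo.kjb /Level:Detailed | Out-File "out.$Iter.log"
}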

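# Launches DoStuff once per attempt, with all attempts running concurrently.
workflow RunStuffParallel
{
    param(
        [Parameter(Position=0,
            Mandatory=$True)]
        [int]$MaxIter
    )

    $ExecutionAttempts = @(1..$MaxIter)

    # ForEach -Parallel is only available inside a workflow.
    ForEach -Parallel ($Attempt in $ExecutionAttempts)
    {
        DoStuff -Iter $Attempt
    }
}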

RunStuffParallel -MaxIter $Attempts

Execute the test using .\ConcurrencyTest.ps1 -Attempts 5
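
Once the run finishes, each attempt leaves an out.N.log file in the data-integration directory. A rough way to see which attempts were turned away is to search each log for the message the job writes when it finds an existing pid file. The sketch below assumes such a message exists; the function name Get-RunSummary and the text 'already running' are placeholders, so substitute whatever your job actually logs.

```powershell
# Hypothetical helper: summarizes which out.N.log files contain a given marker text.
function Get-RunSummary {
    param(
        [Parameter(Mandatory=$true)][string]$LogDir,
        # Placeholder marker; replace with the text your job logs when it is blocked.
        [string]$BlockedPattern = 'already running'
    )
    Get-ChildItem -Path $LogDir -Filter 'out.*.log' | ForEach-Object {
        # -SimpleMatch treats the pattern as literal text; -Quiet yields a true/false style result.
        $blocked = [bool](Select-String -Path $_.FullName -Pattern $BlockedPattern -SimpleMatch -Quiet)
        [pscustomobject]@{ Log = $_.Name; Blocked = $blocked }
    }
}
```

For example, Get-RunSummary -LogDir "$env:ProgramFiles\Pentaho\design-tools\data-integration" lists every log with a Blocked flag. If the pid file check works, all attempts but one should be flagged.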
