Parallel tasks in PowerShell

Posted on Fri 01 September 2023 in powershell

As the complexity of your application increases, your build and release pipeline will almost certainly get more complex too. You'll notice the release time gradually creeping up, which can lead to developer (or stakeholder!) frustration while waiting to validate a change once it hits your target environment.

The built-in Azure DevOps tasks can take you so far, but as complexity increases there is a good chance that you'll need to customise the build pipeline further. What better tool is there for that than PowerShell? It's extensible, powerful, and runs on Azure with great integration through either the Az modules or the full Azure CLI. I'm a fan; especially with my Linux background, I almost prefer it to bash at times.

Back to the point: increasing the complexity of your application will lead to increased build time, so it would be great if we could have some of those tasks run at the same time. There are a few options for this, but really the best way, although the most complex, is to use PowerShell's built-in runspaces feature. Each runspace runs on its own thread within your PowerShell process, so this is an incredibly powerful feature that can cut a task's runtime roughly in half when two independent pieces of work run side by side.

To get started, create and open a runspace pool like this:

$RunspacePool = [runspacefactory]::CreateRunspacePool(1, 2)  # Minimum 1, maximum 2 runspaces
$RunspacePool.Open()
$Runspaces = @()  # Collects each running instance and its async result

This defines a runspace pool where you can have a maximum of two runspaces (or threads) running concurrently.
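
If you want to double check the limits a pool was created with, it can report them back. This is only a small sketch; the $demoPool name and the other variable names are mine, made up for the illustration.

```powershell
# Create a pool with a minimum of 1 and a maximum of 2 runspaces.
$demoPool = [runspacefactory]::CreateRunspacePool(1, 2)
$demoPool.Open()

# Ask the pool for its configured limits and current free capacity.
$minRunspaces = $demoPool.GetMinRunspaces()
$maxRunspaces = $demoPool.GetMaxRunspaces()
$availableRunspaces = $demoPool.GetAvailableRunspaces()

"Min: $minRunspaces, Max: $maxRunspaces, Available: $availableRunspaces"

# Clean up the demo pool.
$demoPool.Close()
$demoPool.Dispose()
```

Any work queued beyond the maximum simply waits until one of the runspaces becomes free.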

$executionTime = Measure-Command -Expression {

}

Write-Host "Total execution time: $($executionTime.TotalSeconds) seconds." -ForegroundColor Cyan

For testing purposes, I like to wrap the actual work performed by the script in that Measure-Command block. It helps to evaluate whether the added complexity that runspaces bring is worth it for your task.
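
To make that comparison concrete, here is a rough sketch that times the same two tasks run one after the other and then on a pool. The $simulatedTask script block is a hypothetical stand-in that just sleeps for a second; swap in your real work to get meaningful numbers.

```powershell
# Hypothetical stand-in for a real build step: it just sleeps for one second.
$simulatedTask = { Start-Sleep -Seconds 1 }

# Baseline: run the task twice, one after the other.
$sequentialTime = Measure-Command -Expression {
    & $simulatedTask
    & $simulatedTask
}

# The same two tasks on a pool that allows two runspaces at once.
$parallelTime = Measure-Command -Expression {
    $pool = [runspacefactory]::CreateRunspacePool(1, 2)
    $pool.Open()

    # Start both tasks without waiting for either to finish.
    $jobs = foreach ($taskNumber in 1..2) {
        $instance = [powershell]::Create().AddScript($simulatedTask.ToString())
        $instance.RunspacePool = $pool
        @{ Instance = $instance; Result = $instance.BeginInvoke() }
    }

    # EndInvoke blocks until the given task has finished.
    foreach ($job in $jobs) {
        $job.Instance.EndInvoke($job.Result)
        $job.Instance.Dispose()
    }

    $pool.Close()
    $pool.Dispose()
}

Write-Host "Sequential: $($sequentialTime.TotalSeconds) seconds"
Write-Host "Parallel:   $($parallelTime.TotalSeconds) seconds"
```

The parallel figure includes the cost of creating the pool, so for very short tasks the gain may be small or even negative.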

$projectDirectories = Get-ChildItem -Path $applicationPath -Filter "$applicationName.*"

    # Creating npm build runspace
    $npmBuildInstance = [powershell]::Create().AddScript({
        param($projectDirectories, $root, $applicationName)
        Set-Location $root
        . .\BuildService.ps1
        foreach($projectDirectory in $projectDirectories) {
            $buildService = [BuildService]::new($root, $applicationName)
            $buildService.RunNPMBuild($projectDirectory)
        }
    }).AddParameter('projectDirectories', $projectDirectories).AddParameter('root', $root).AddParameter('applicationName', $applicationName)

    $npmBuildInstance.RunspacePool = $RunspacePool
    $Runspaces += ,@{
        Instance = $npmBuildInstance
        Result = $npmBuildInstance.BeginInvoke()
    }

This is a relatively simple example, as the actual work is hidden within the BuildService class. However, it gets the point across regarding how a thread is set up to run:

    $npmBuildInstance = [powershell]::Create().AddScript({

    })

    $npmBuildInstance.RunspacePool = $RunspacePool
    $Runspaces += ,@{
        Instance = $npmBuildInstance
        Result = $npmBuildInstance.BeginInvoke()
    }

That is how it's done, and the contents of the script block passed to 'AddScript' are what gets run! However, there's an important caveat: each runspace is its own PowerShell session and has no context passed to it from the main script, so things like the working directory or imported modules may have to be set or imported again inside the runspace.
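
A quick way to see that isolation for yourself is the sketch below. The $greeting variable and the other names are made up for the demo, and I've used the synchronous Invoke() rather than BeginInvoke() just to keep it short.

```powershell
# Defined in the main session only.
$greeting = 'hello from the main script'

# 1) The new runspace cannot see $greeting, so this comparison is true in there.
$isolated = [powershell]::Create().AddScript({ $null -eq $greeting })
$greetingIsMissing = $isolated.Invoke()[0]
$isolated.Dispose()

# 2) Passing the value in explicitly makes it available through param().
$withParameter = [powershell]::Create().AddScript({
    param($greeting)
    "Runspace received: $greeting"
}).AddParameter('greeting', $greeting)
$received = $withParameter.Invoke()[0]
$withParameter.Dispose()

"Missing without a parameter: $greetingIsMissing"
$received
```

This is why the npm build example passes everything it needs through AddParameter and dot-sources BuildService.ps1 inside the script block.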

The next step is to work out when the runspaces have finished running. We'll loop over the Runspaces collection and wait until every item has its IsCompleted flag set to true.

while($Runspaces | Where-Object { -not $_.Result.IsCompleted }) {
    Write-Host "Awaiting completion"
    Start-Sleep -Seconds 1
}
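
One thing that loop doesn't guard against is a task that never finishes. Below is a sketch of the same wait with a time limit added. The 30 second limit and the tiny demo task are assumptions of mine; pick values that suit your pipeline.

```powershell
# Demo setup: one short task on a small pool.
$pool = [runspacefactory]::CreateRunspacePool(1, 2)
$pool.Open()

$instance = [powershell]::Create().AddScript({ Start-Sleep -Milliseconds 500; 'done' })
$instance.RunspacePool = $pool
$Runspaces = @(@{ Instance = $instance; Result = $instance.BeginInvoke() })

# Hypothetical limit for the whole wait.
$timeout = [TimeSpan]::FromSeconds(30)
$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
$timedOut = $false

while ($Runspaces | Where-Object { -not $_.Result.IsCompleted }) {
    if ($stopwatch.Elapsed -gt $timeout) {
        $timedOut = $true
        # Stop whatever is still running so the script can move on.
        $Runspaces | Where-Object { -not $_.Result.IsCompleted } | ForEach-Object { $_.Instance.Stop() }
        break
    }
    Start-Sleep -Milliseconds 200
}

# Only collect output if the task actually completed.
if (-not $timedOut) { $output = $instance.EndInvoke($Runspaces[0].Result) }

$instance.Dispose()
$pool.Close()
$pool.Dispose()
```

In a release pipeline you would probably want to fail the step when $timedOut is true rather than carry on quietly.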

The penultimate step is to end each runspace invocation and write any output from it to our own session; otherwise we'll never see any of the output from the runspace.

# Process output or errors from each runspace
foreach($runspace in $Runspaces)
{
    # EndInvoke returns the runspace's output, which is written to our session
    $runspace.Instance.EndInvoke($runspace.Result)
    $runspace.Instance.Streams.Error | ForEach-Object {
        "[DEPLOY SERVICE ERROR] $($_.ToString())"
    }
    $runspace.Instance.Streams.Information | ForEach-Object {
        "[DEPLOY SERVICE INFO]  $($_.ToString())"
    }
    $runspace.Instance.Dispose()  # Release the instance once its streams have been read
}
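
To show where each kind of output ends up, here is a small self-contained sketch. The messages and variable names are invented for the demo, and it assumes PowerShell 5 or later, where Write-Host writes to the information stream.

```powershell
$instance = [powershell]::Create().AddScript({
    'regular output'                      # returned by EndInvoke
    Write-Host 'informational message'    # lands in Streams.Information
    Write-Error 'something went wrong'    # non-terminating, lands in Streams.Error
})

$result = $instance.BeginInvoke()
$output = $instance.EndInvoke($result)

# Read the side streams before disposing of the instance.
$errorLines = $instance.Streams.Error | ForEach-Object { "[ERROR] $($_.ToString())" }
$infoLines  = $instance.Streams.Information | ForEach-Object { "[INFO]  $($_.ToString())" }
$hadErrors  = $instance.HadErrors

$output
$infoLines
$errorLines

$instance.Dispose()
```

The HadErrors flag is a handy way to decide whether the pipeline step should fail after the output has been written.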

The final step is to close and dispose of the runspace pool. If you followed my example of wrapping the code in a Measure-Command block, you'll also want to close that off and write the execution time out.

$RunspacePool.Close()
$RunspacePool.Dispose()


}
Write-Host "Total execution time: $($executionTime.TotalSeconds) seconds." -ForegroundColor Cyan
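
If something throws part way through, the pool can be left open. One option, sketched here with a trivial made-up task, is to put the clean-up in a finally block so it runs either way.

```powershell
$pool = [runspacefactory]::CreateRunspacePool(1, 2)
$pool.Open()
$instance = $null

try {
    $instance = [powershell]::Create().AddScript({ 'work finished' })
    $instance.RunspacePool = $pool
    # Start the work and wait for it to complete.
    $output = $instance.EndInvoke($instance.BeginInvoke())
}
finally {
    # Runs whether or not the try block threw.
    if ($instance) { $instance.Dispose() }
    $pool.Close()
    $pool.Dispose()
}

$output
```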

And that's it! Setting up a runspace pool on its own isn't overly complex; the real difficulty comes from any overlapping operations that might be occurring within each thread. It gives enormous benefits in speed: for my purposes, I measured both build and publish times cut in half.