Using Parallel Jobs

Parallel loops are great for executing the same task multiple times — for example, pinging a range of computers. However, loops do not work well if you want to parallelize completely different tasks.

Jobs

PowerShell introduced Jobs to execute script blocks in the background, and each job can run its own code. Jobs are ideal for executing different tasks.

Keep in mind that jobs — like parallel loops — run in isolated environments that do not share variables or functions with your main script.
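
Because of this isolation, any value a job needs has to be handed over explicitly. The following minimal sketch shows two common ways to do that: the -ArgumentList parameter and the using: scope modifier. The variable $seconds and the doubling are purely illustrative and not part of the examples in this post.

```powershell
# a variable defined in the main script (illustrative only)
$seconds = 2

# the job runs in its own process, so $seconds does not exist there
$jobA = Start-Job -ScriptBlock { $null -eq $seconds }

# option 1: hand the value over as an argument
$jobB = Start-Job -ScriptBlock { param($s) $s * 2 } -ArgumentList $seconds

# option 2: read the caller's variable via the using: scope modifier
$jobC = Start-Job -ScriptBlock { $using:seconds * 2 }

$null = Wait-Job -Job $jobA, $jobB, $jobC
$isolated    = Receive-Job -Job $jobA   # $true: the variable was not visible
$viaArgument = Receive-Job -Job $jobB   # 4
$viaUsing    = Receive-Job -Job $jobC   # 4
Remove-Job -Job $jobA, $jobB, $jobC
```

Functions defined in your main script are not visible inside the job either, so they would have to be defined inside the script block itself.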

At first glance, jobs seem simple and straightforward:

  • Three example tasks need to run. Together, they take 6 + 8 + 5 seconds, so executing them sequentially would take 19 seconds.
  • By running two of them in the background while the main script runs the third, the total execution time is determined by the longest-running task: in this case, 8 seconds.

# three sample jobs 
# (simulating some activity for a given period of time,
#  then returning a status number)
$job1 = { Start-Sleep -Seconds 6; 1 }
$job2 = { Start-Sleep -Seconds 8; 2 }
$job3 = { Start-Sleep -Seconds 5; 3 }

# sending two of these to the background to execute in parallel
$handle1 = Start-Job -ScriptBlock $job1
$handle2 = Start-Job -ScriptBlock $job2

# executing one job ourselves
$result3 = & $job3

# waiting for ALL code to be finished
$null = Wait-Job -Job $handle1, $handle2

# receiving the results from the two background jobs
$result1 = Receive-Job -Job $handle1
$result2 = Receive-Job -Job $handle2

# cleaning up the two background jobs
Remove-Job -Job $handle1, $handle2

# results are now available in $result1, $result2, and $result3 

Execution times are as expected, just a little over 8 seconds. While 8 seconds may not seem like much, keep in mind that this is for illustration only. Time savings could just as easily be minutes or even hours, depending on how long your individual tasks take.
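
To check the timing on your own machine, you could wrap the whole sequence in a stopwatch. This is a rough sketch with the sleep times scaled down to 2, 3, and 1 seconds so that it finishes quickly. These durations and the $stopwatch variable are assumptions for illustration. Expect the measured time to be somewhat above the longest task, because starting each background process adds overhead.

```powershell
# scaled-down sample tasks (illustrative durations)
$job1 = { Start-Sleep -Seconds 2; 1 }
$job2 = { Start-Sleep -Seconds 3; 2 }
$job3 = { Start-Sleep -Seconds 1; 3 }

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

# two tasks run in the background, one in the foreground
$handle1 = Start-Job -ScriptBlock $job1
$handle2 = Start-Job -ScriptBlock $job2
$result3 = & $job3

$null = Wait-Job -Job $handle1, $handle2
$result1 = Receive-Job -Job $handle1
$result2 = Receive-Job -Job $handle2
Remove-Job -Job $handle1, $handle2

$stopwatch.Stop()

# sequential execution would need roughly 2 + 3 + 1 = 6 seconds
'Measured: {0:n1} seconds' -f $stopwatch.Elapsed.TotalSeconds
```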

Job Caveats

When you start using jobs in real-world scenarios, they can often perform worse than regular scripts.

By default, PowerShell executes jobs out-of-process: it starts a separate PowerShell process (powershell.exe in Windows PowerShell, pwsh in PowerShell 7) and copies results back to your main PowerShell session via XML serialization. As long as your jobs pass back only a little data (such as the status numbers in the previous example), performance is fine.

However, once you start returning objects, the required serialization can consume all the time you gained through parallelization. Try running the code above with these jobs:

$job1 = { Get-Hotfix }
$job2 = { Get-Service }
$job3 = { Start-Sleep -Seconds 5; 3 }

Even though these commands don’t take much time to run, running them as jobs can push the overall script execution time to nearly a minute. This time penalty is caused by the serialization required to wrap the returned data into XML.

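Worse still, as a consequence, the jobs no longer return the original objects. What arrives in your main session are deserialized copies: snapshots of the property values, without the original methods.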
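
You can make the difference visible by comparing a live object with the copy a job returns. The sketch below uses Get-Process on the current process purely as an illustration. The variable names are made up for this example.

```powershell
# a live object from the current session
$original = Get-Process -Id $PID

# the same kind of object, returned by a background job
$job  = Start-Job -ScriptBlock { Get-Process -Id $PID }
$copy = $job | Receive-Job -Wait -AutoRemoveJob

# the type name of the copy carries a "Deserialized." prefix
$original.PSObject.TypeNames[0]   # System.Diagnostics.Process
$copy.PSObject.TypeNames[0]       # Deserialized.System.Diagnostics.Process

# methods such as Kill() exist on the live object only
$originalHasKill = $original.PSObject.Methods.Name -contains 'Kill'
$copyHasKill     = $copy.PSObject.Methods.Name -contains 'Kill'
```

Any code that relies on calling methods on the returned objects will therefore fail with results from a regular job.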

Using Jobs Effectively

To use jobs effectively and actually benefit from faster execution times, there are two main strategies:

  • Use the jobs as-is, but ensure that the code executed by the job does not return large data. Limit the job code to returning just a status number. This way, you bypass the expensive data serialization, and jobs work very well.
  • Use threads instead of out-of-process jobs. The issue with jobs is that they run in separate PowerShell processes, which requires expensive data serialization. By using threads instead, no serialization is needed, and your jobs stay fast regardless of how much data they return.
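
The first strategy might look like the sketch below: the job does the heavy lifting and returns only a small summary. Returning the process count is an arbitrary choice for illustration. In practice the job could just as well write its full results to a file and return only a status number.

```powershell
# the job collects rich objects but returns only a single number
$job = Start-Job -ScriptBlock {
    $processes = Get-Process
    # only this small value crosses the process boundary
    @($processes).Count
}

$count = $job | Receive-Job -Wait -AutoRemoveJob
"The job found $count processes"
```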

Using Thread Jobs

PowerShell 7 ships with much more efficient thread-based jobs. They can also be added to Windows PowerShell, as described further below.

In PowerShell 7, all it takes is to replace Start-Job with Start-ThreadJob. Now the job executes within your existing PowerShell process, but in a separate thread — much like parallel looping. This eliminates the need for expensive data serialization:

# three sample jobs
# (this time returning rich objects
#  rather than simple status numbers)
$job1 = { Get-Process }
$job2 = { Get-Service }
$job3 = { Get-Date }

# sending two of these to the background to execute in parallel
$handle1 = Start-ThreadJob -ScriptBlock $job1 
$handle2 = Start-ThreadJob -ScriptBlock $job2

# executing one job ourselves
$result3 = & $job3

# waiting for ALL code to be finished
$null = Wait-Job -Job $handle1, $handle2

# receiving the results from the two background jobs
$result1 = Receive-Job -Job $handle1
$result2 = Receive-Job -Job $handle2

# cleaning up the two background jobs
Remove-Job -Job $handle1, $handle2

# results are now available in $result1, $result2, and $result3 

With Start-Job, a script like this took almost a minute. With Start-ThreadJob, it executes in the blink of an eye. Additionally, the data returned by your jobs is no longer serialized. Instead, you now get the real objects, including all their methods.
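
A quick way to convince yourself is to inspect what a thread job hands back. This is a minimal sketch that assumes Start-ThreadJob is available in your session. Get-Process on the current process and the variable names are illustrative choices.

```powershell
# run the command in a thread job instead of a separate process
$job    = Start-ThreadJob -ScriptBlock { Get-Process -Id $PID }
$result = $job | Receive-Job -Wait -AutoRemoveJob

# no "Deserialized." prefix: this is the real .NET object
$result.PSObject.TypeNames[0]     # System.Diagnostics.Process

# the object's methods are still there
$hasKill = $result.PSObject.Methods.Name -contains 'Kill'
```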

Thread Jobs in Windows PowerShell

Windows PowerShell does not include a built-in Start-ThreadJob, but you can easily add it by installing the ThreadJob module from the PowerShell Gallery:

Install-Module -Name ThreadJob -Scope CurrentUser 

Now Windows PowerShell also has Start-ThreadJob and can run jobs blazingly fast.
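
Before installing, it can be worth checking whether the cmdlet is already present, for example because the module was installed earlier. A small sketch of such a guard, where the variable name $available is just illustrative:

```powershell
# check whether Start-ThreadJob is already available in this session
$available = [bool](Get-Command -Name Start-ThreadJob -ErrorAction SilentlyContinue)

if (-not $available) {
    # install for the current user only (no admin rights required)
    Install-Module -Name ThreadJob -Scope CurrentUser
}
```

Depending on your configuration, Install-Module may ask you to confirm that you trust the PowerShell Gallery repository.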

Let’s summarize:

  • Jobs are ideal for running different code simultaneously.
  • Default jobs can create significant overhead when returning data due to expensive data serialization.
  • PowerShell 7 includes Start-ThreadJob, and Windows PowerShell can add this cmdlet via the ThreadJob module. Simply replacing Start-Job with Start-ThreadJob removes the serialization bottleneck, so jobs return real objects quickly.
