
Unlocking the Power of PowerShell: Tips for Success

Mastering PowerShell Tokenization for Efficient Scripting

The internal PowerShell parser processes any code before execution. With full access to this parser, you can analyze code directly. Why is this useful?

  • Understanding PowerShell code – Gain deeper insights into how scripts are structured and executed.
  • Auto-documenting scripts – Generate lists of commands, variables, and dependencies used in a script.
  • Security audits – Detect problematic commands in a script based on a blacklist.
  • Refactoring tools – Automatically rename variables, remove dependencies by copying function definitions from modules directly into your script, and much more.

Converting Code into Tokens

Today, we'll start with the basics by examining the tokenizer. It breaks any PowerShell script into tokens, the basic 'ingredients' of the code. These are the token types it can produce:


PS C:\> [Enum]::GetNames([System.Management.Automation.PSTokenType]) | Sort-Object
Attribute
Command
CommandArgument
CommandParameter
Comment
GroupEnd
GroupStart
Keyword
LineContinuation
LoopLabel
Member
NewLine
Number
Operator
Position
StatementSeparator
String
Type
Unknown
Variable

You can tokenize any code. This script tokenizes the code stored in $code:


# code to tokenize
$code = { # sample code
  $service = Get-Service |
    Where-Object Status -eq Running
}


# this empty variable must exist and will be filled by reference after the call
$syntaxErrors = $null

$tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$syntaxErrors)

# if syntax errors were found, they are now in $syntaxErrors
if ($syntaxErrors.Count -gt 0)
{
  # move the nested token up one level so we see all properties
  $syntaxError = $syntaxErrors | 
    Select-Object -ExpandProperty Token -Property Message
  
  $syntaxError
}
else
{
  $tokens
}

Since the sample code is valid, the script returns all tokens:


Content     : # sample code
Type        : Comment
Start       : 2
Length      : 13
StartLine   : 1
StartColumn : 3
EndLine     : 1
EndColumn   : 16

Content     : 
              
Type        : NewLine
Start       : 15
Length      : 2
StartLine   : 1
StartColumn : 16
EndLine     : 2
EndColumn   : 1

Content     : service
Type        : Variable
Start       : 19
Length      : 8
StartLine   : 2
StartColumn : 3
EndLine     : 2
EndColumn   : 11

Content     : =
Type        : Operator
Start       : 28
Length      : 1
StartLine   : 2
StartColumn : 12
EndLine     : 2
EndColumn   : 13

Content     : Get-Service
Type        : Command
Start       : 30
Length      : 11
StartLine   : 2
StartColumn : 14
EndLine     : 2
EndColumn   : 25 

(...)

Each token is an object that provides its type and exact location in the script. Many editors use these tokens to colorize your code.
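Because every token carries its type, a simple filter is enough to pull out specific ingredients. The following is a minimal sketch of the auto-documentation idea mentioned at the beginning: it lists the commands and variables used in a piece of code. The sample code is passed as a string here, and the variable names $commands and $variables are chosen only for illustration.

```powershell
# sample code to analyze (passed as a string)
$code = @'
$service = Get-Service |
  Where-Object Status -eq Running
'@

# this empty variable will be filled by reference after the call
$syntaxErrors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$syntaxErrors)

# unique command names used in the code
$commands = $tokens |
  Where-Object Type -eq Command |
  Select-Object -ExpandProperty Content |
  Sort-Object -Unique

# unique variable names (the token content omits the leading $)
$variables = $tokens |
  Where-Object Type -eq Variable |
  Select-Object -ExpandProperty Content |
  Sort-Object -Unique

"Commands : $($commands -join ', ')"
"Variables: $($variables -join ', ')"
```

For the sample code, this should report Get-Service and Where-Object as commands and service as the only variable.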

Identifying Syntax Errors

If you want to see how a syntax error is handled, you can't introduce one in the script block assigned to $code. The same parser checks a script block before it is ever assigned, and if the code is invalid, PowerShell reports the error and the assignment never happens. Script blocks can never contain syntax errors.
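You can observe this behavior without breaking your own script by creating a script block from text at runtime. This is only a sketch to illustrate the point: [scriptblock]::Create() parses the text immediately and throws if the text is not valid PowerShell. The variable names $badCode and $created are made up for this example.

```powershell
# code with a deliberate syntax error (unterminated string)
$badCode = '"missing quote at the end'

try
{
  # Create() runs the parser on the text right away
  $null = [scriptblock]::Create($badCode)
  $created = $true
}
catch
{
  $created = $false
  # the exception text includes the parser's error message
  $_.Exception.Message
}

"Script block created: $created"
```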

However, you can simply assign a string to $code:


# code to tokenize
$code = '  # testing a syntax error
  "missing quote at the end
  $service = Get-Service |
    Where-Object Status -eq Running
'

When you run the script again, this is the result:


Message     : The string is missing the terminator: ".
Content     : "missing quote at the end
                $service = Get-Service |
                  Where-Object Status -eq Running
              
Type        : Position
Start       : 30
Length      : 92
StartLine   : 2
StartColumn : 3
EndLine     : 5
EndColumn   : 1

This result precisely identifies what is syntactically incorrect and could serve as the foundation for intelligent tools that autocorrect such errors. In part 2, we’ll use the tokenizer to build practical tools.
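As a small taste of such a tool, here is a hedged sketch that wraps the tokenizer call in a function and reduces each syntax error to its line, column, and message. The function name Get-SyntaxError and its -Code parameter are hypothetical and not part of PowerShell; the properties Token and Message are the ones shown in the output above.

```powershell
function Get-SyntaxError
{
  param
  (
    [Parameter(Mandatory)]
    [string]
    $Code
  )

  # filled by reference with any syntax errors the tokenizer finds
  $syntaxErrors = $null
  $null = [System.Management.Automation.PSParser]::Tokenize($Code, [ref]$syntaxErrors)

  # emit one simplified object per syntax error
  foreach ($syntaxError in $syntaxErrors)
  {
    [PSCustomObject]@{
      Line    = $syntaxError.Token.StartLine
      Column  = $syntaxError.Token.StartColumn
      Message = $syntaxError.Message
    }
  }
}

# usage: returns nothing for valid code, one object per error otherwise
$result = Get-SyntaxError -Code '"missing quote at the end'
$result
```

Returning plain objects rather than formatted text keeps the result easy to filter, sort, or export when you check many scripts at once.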

 
