@HCRitter
Created May 7, 2024 12:20
How to get the first line of a really large file in PowerShell? Adam Bertram advised using Select-Object -First 1 instead of (Get-Content -Path "file.txt")[0]. That is correct, but maybe switch -File is even faster? Test based on https://x.com/adbertram/status/1781642880769032196
# How to get the first line of a really large file in PowerShell?
# Adam Bertram advised using Select-Object -First 1 instead of (Get-Content -Path "file.txt")[0].
# That is correct, but maybe switch -File is even faster?
# Test based on https://x.com/adbertram/status/1781642880769032196
$longfile = 'HugeFileList.txt' # One million lines - 49MB
# Retrieve the first line of a file very fast
$firstline = switch -File ($longfile) { default { $_; break } }
# Retrieve the first line of a file, also fast
$firstline = Get-Content -Path $longfile | Select-Object -First 1
# Compare the performance of both methods
0..1kb | ForEach-Object {
    [PSCustomObject]@{
        Name  = 'Switch'
        Ticks = (Measure-Command {
                $firstline = switch -File ($longfile) { default { $_; break } }
            }).Ticks
    }
    [PSCustomObject]@{
        Name  = 'SelectFirst'
        Ticks = (Measure-Command {
                $firstline = Get-Content -Path $longfile | Select-Object -First 1
            }).Ticks
    }
} | Group-Object -Property Name | ForEach-Object {
    # Average the tick counts per test name
    [PSCustomObject]@{
        TestName = $_.Name
        AVGTicks = ($_.Group.Ticks | Measure-Object -Average).Average
    }
} | Sort-Object -Property AVGTicks
<#
Result for 1025 runs (0..1kb):
TestName    AVGTicks
--------    --------
Switch       6236.88
SelectFirst 18622.26
#>
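
The switch technique above can be wrapped for reuse. This is only a minimal sketch: the function name Get-FirstLine and its -Path parameter are hypothetical and not part of the original gist. It shows the same idea, emitting the current line and leaving the switch with break as soon as the first line has been read.

```powershell
# Hypothetical helper (not part of the original gist):
# returns the first line of a text file using switch -File.
function Get-FirstLine {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )
    # switch -File processes the file line by line;
    # break leaves the switch after the first line has been emitted.
    switch -File $Path {
        default { $_; break }
    }
}

# Example call, using the file name from the script above:
# $firstline = Get-FirstLine -Path 'HugeFileList.txt'
```

For an empty file the function emits nothing, so the caller receives $null.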
@sassdawe

sassdawe commented May 7, 2024

What is this sorcery?

> Get-Command Switch

Get-Command: The term 'Switch' is not recognized as a name of a cmdlet, function, script file, or executable program

@HCRitter
Author

HCRitter commented May 7, 2024

@jhochwald

@HCRitter You could use the StreamReader option...

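$StreamReader = $null
try
{
   # Open the file and read only up to the first line break
   $StreamReader = (New-Object -TypeName System.IO.StreamReader -ArgumentList ($BigTextFile))
   $firstline = $StreamReader.ReadLine()
}
finally
{
   # Close the reader only if it was actually created
   if ($StreamReader) { $StreamReader.Close() }
   $StreamReader = $null
}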

StreamReader might be a bit faster than Get-Content, but not as fast as the switch! And it is never as efficient as the switch, because all other options read the complete file, while the switch statement does not.
And the bigger the file gets, the more the switch beats all other possible options. ;-)
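
How much of a file a given approach really consumes can be checked rather than assumed. The following is a rough sketch for the StreamReader case only: the function name Get-FirstLineReadStats is hypothetical, and it relies on the position of the reader's underlying stream to show how many bytes had been pulled from the file by the time the first line was returned. The path is resolved first because .NET resolves relative paths against the process working directory, not the current PowerShell location.

```powershell
# Hypothetical helper (not from the thread): reads the first line with a
# StreamReader and reports how far the underlying stream has advanced.
function Get-FirstLineReadStats {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )
    # Resolve to a full file system path before handing it to .NET.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $reader = $null
    try {
        $reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $fullPath
        $line = $reader.ReadLine()
        [PSCustomObject]@{
            FirstLine = $line
            # Bytes fetched from the file so far (includes read-ahead buffering)
            BytesRead = $reader.BaseStream.Position
            # Total size of the file in bytes
            FileSize  = $reader.BaseStream.Length
        }
    }
    finally {
        # Dispose the reader only if it was actually created
        if ($reader) { $reader.Dispose() }
    }
}

# Example call, assuming $BigTextFile holds the path used in this thread:
# Get-FirstLineReadStats -Path $BigTextFile
```

Comparing BytesRead with FileSize on a large file gives a direct indication of whether the whole file was read for that one line.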

@jhochwald

Here is my result:

TestName                  AvgTicks
--------                  --------
Switch            3756,21756097561
StreamReader      14313,5609756098
SelectFirst       21315,0292682927
SelectFirstIOFile 3009278,38536585

And here the test code for it:

0..1kb | ForEach-Object -Process {
   [PSCustomObject]@{
      Name  = 'Switch'
      Ticks = (Measure-Command -Expression {
            $firstline = switch -file ($BigTextFile){Default
               {
                  $_
                  break
               }
            }
      }).Ticks
   }
   [PSCustomObject]@{
      Name  = 'SelectFirst'
      Ticks = (Measure-Command -Expression {
            $firstline = (Get-Content -Path $BigTextFile | Select-Object -First 1)
      }).Ticks
   }
   [PSCustomObject]@{
      Name  = 'SelectFirstIOFile'
      Ticks = (Measure-Command -Expression {
            $firstline = ([io.file]::ReadAllLines($BigTextFile) | Select-Object -First 1)
      }).Ticks
   }
   [PSCustomObject]@{
      Name  = 'StreamReader'
      Ticks = (Measure-Command -Expression {
            try
            {
               $StreamReader = (New-Object -TypeName System.IO.StreamReader -ArgumentList ($BigTextFile))
               $firstline = ($StreamReader.ReadLine() | Select-Object -First 1)
            }
            finally
            {
               $StreamReader.Close()
               $StreamReader = $null
            }
      }).Ticks
   }
} | Group-Object -Property Name | ForEach-Object -Process {
   [PSCustomObject]@{
      TestName = $_.Name
      AvgTicks = ($_.Group.Ticks | Measure-Object -Average).Average
   }
} | Sort-Object -Property AvgTicks
