Created
May 7, 2024 12:20
How to get the first line of a really large file in PowerShell? Adam Bertram advised using Select-Object -First 1 instead of (Get-Content -Path "file.txt")[0], which is correct, but maybe switch -File is even faster? Test based on https://x.com/adbertram/status/1781642880769032196
# How to get the first line of a really large file in PowerShell?
# Adam Bertram advised using Select-Object -First 1 instead of (Get-Content -Path "file.txt")[0],
# which is correct, but maybe switch -File is even faster?
# Test based on https://x.com/adbertram/status/1781642880769032196
$longfile = 'HugeFileList.txt' # One million lines - 49MB
# retrieve the first line of a file very fast
$firstline = switch -File ($longfile) { default { $_; break } }
# retrieve the first line of a file, also fast
$firstline = Get-Content -Path $longfile | Select-Object -First 1
# Compare the performance of both methods
0..1kb | ForEach-Object {
    [PSCustomObject]@{
        Name  = 'Switch'
        Ticks = (Measure-Command {
            $firstline = switch -File ($longfile) { default { $_; break } }
        }).Ticks
    }
    [PSCustomObject]@{
        Name  = 'SelectFirst'
        Ticks = (Measure-Command {
            $firstline = Get-Content -Path $longfile | Select-Object -First 1
        }).Ticks
    }
} | Group-Object -Property Name | ForEach-Object {
    [PSCustomObject]@{
        TestName = $_.Name
        AVGTicks = ($_.Group.Ticks | Measure-Object -Average).Average
    }
} | Sort-Object -Property AVGTicks
<#
Result for 1025 runs (0..1kb is inclusive):
TestName    AVGTicks
--------    --------
Switch       6236.88
SelectFirst 18622.26
#>
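Before comparing timings, it may be worth confirming that both approaches return the same value. The following is a minimal sketch under stated assumptions: the scratch file name `FirstLineSample.txt` and the variable names are hypothetical and not part of the gist.

```powershell
# Hypothetical scratch file; any multi-line text file would do.
$sampleFile = Join-Path ([System.IO.Path]::GetTempPath()) 'FirstLineSample.txt'
1..1000 | ForEach-Object { "Line $_" } | Set-Content -Path $sampleFile

# switch -File hands over the file line by line; break leaves the switch after the first line.
$viaSwitch = switch -File $sampleFile { default { $_; break } }

# Select-Object -First 1 stops the pipeline once it has emitted one object.
$viaSelect = Get-Content -Path $sampleFile | Select-Object -First 1

# Both variables should hold the same string.
$viaSwitch -eq $viaSelect

Remove-Item -Path $sampleFile
```

If the comparison is true, any difference in the benchmark comes down to how each approach gets to that line, not to what it returns.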
Switch is a statement, not a command:
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_switch?view=powershell-7.4
Just as if statements are not commands either :)
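One way to see the statement/command distinction is to ask the PowerShell parser what it makes of each snippet. This is only an illustrative sketch; `Get-FirstStatementType` is a hypothetical helper written for this example, not an existing cmdlet.

```powershell
# Hypothetical helper: parse a snippet and report the AST type of its first statement.
function Get-FirstStatementType {
    param([string]$Code)

    $tokens = $null
    $errors = $null
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$tokens, [ref]$errors)

    $statement = $ast.EndBlock.Statements[0]
    # Commands are wrapped in a pipeline node, so unwrap its first element.
    if ($statement -is [System.Management.Automation.Language.PipelineAst]) {
        $statement = $statement.PipelineElements[0]
    }
    $statement.GetType().Name
}

Get-FirstStatementType -Code 'switch ($x) { default { $_ } }'
Get-FirstStatementType -Code 'if ($true) { 1 }'
Get-FirstStatementType -Code 'Get-Content -Path file.txt'
```

The first two should be reported as statement nodes (`SwitchStatementAst`, `IfStatementAst`), while the last one is a `CommandAst`.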
@sassdawe
@HCRitter You could use the StreamReader option...
try
{
    $StreamReader = (New-Object -TypeName System.IO.StreamReader -ArgumentList ($BigTextFile))
    $firstline = ($StreamReader.ReadLine() | Select-Object -First 1)
}
finally
{
    $StreamReader.Close()
    $StreamReader = $null
}
StreamReader might be a bit faster than Get-Content, but in my test it was not as fast as the switch! Of the options tested, only [IO.File]::ReadAllLines reads the complete file before returning anything; Get-Content with Select-Object -First 1 and StreamReader.ReadLine() stop early, so their gap to the switch is more likely pipeline and object-creation overhead.
And the bigger the file gets, the further ReadAllLines falls behind. ;-)
Here is my result:
TestName          AvgTicks
--------          --------
Switch            3756,21756097561
StreamReader      14313,5609756098
SelectFirst       21315,0292682927
SelectFirstIOFile 3009278,38536585
And here is the test code for it:
0..1kb | ForEach-Object -Process {
    [PSCustomObject]@{
        Name  = 'Switch'
        Ticks = (Measure-Command -Expression {
            $firstline = switch -file ($BigTextFile) {
                Default
                {
                    $_
                    break
                }
            }
        }).Ticks
    }
    [PSCustomObject]@{
        Name  = 'SelectFirst'
        Ticks = (Measure-Command -Expression {
            $firstline = (Get-Content -Path $BigTextFile | Select-Object -First 1)
        }).Ticks
    }
    [PSCustomObject]@{
        Name  = 'SelectFirstIOFile'
        Ticks = (Measure-Command -Expression {
            $firstline = ([io.file]::ReadAllLines($BigTextFile) | Select-Object -First 1)
        }).Ticks
    }
    [PSCustomObject]@{
        Name  = 'StreamReader'
        Ticks = (Measure-Command -Expression {
            try
            {
                $StreamReader = (New-Object -TypeName System.IO.StreamReader -ArgumentList ($BigTextFile))
                $firstline = ($StreamReader.ReadLine() | Select-Object -First 1)
            }
            finally
            {
                $StreamReader.Close()
                $StreamReader = $null
            }
        }).Ticks
    }
} | Group-Object -Property Name | ForEach-Object -Process {
    [PSCustomObject]@{
        TestName = $_.Name
        AvgTicks = ($_.Group.Ticks | Measure-Object -Average).Average
    }
} | Sort-Object -Property AvgTicks
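On the question of how much of the file an approach actually reads, one way to check for the StreamReader case is to look at the position of the underlying stream after a single ReadLine() call. This is a rough sketch, assuming a hypothetical scratch file `StreamPositionSample.txt`; the exact number of bytes read depends on the runtime's buffer sizes.

```powershell
# Hypothetical scratch file, made large enough to exceed any read buffer.
$sampleFile = Join-Path ([System.IO.Path]::GetTempPath()) 'StreamPositionSample.txt'
[System.IO.File]::WriteAllLines($sampleFile, [string[]](1..50000 | ForEach-Object { "Line $_" }))

$reader = [System.IO.StreamReader]::new($sampleFile)
try {
    $firstLine = $reader.ReadLine()
    # Position shows how many bytes have been pulled from the file so far.
    $bytesRead = $reader.BaseStream.Position
    $fileSize  = $reader.BaseStream.Length
}
finally {
    $reader.Dispose()
}

'{0}: read {1} of {2} bytes' -f $firstLine, $bytesRead, $fileSize
Remove-Item -Path $sampleFile
```

If the reported position is only a few kilobytes on a file of several hundred kilobytes, the reader stopped after filling its buffer rather than consuming the whole file.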
What is this sourcery?