@mikeschinkel
Last active August 23, 2024 11:51
Go json.Unmarshal() vs. PHP json_decode() performance on a large JSON file

GoLang json.Unmarshal() vs. PHP json_decode()

To evaluate if Go json.Unmarshal() is faster or slower than PHP json_decode() for arbitrary JSON I decided to run a quick benchmark on my 2015 MacBook Pro (Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz).

I used this ~25MB JSON file and ran the two attached programs under Go 1.22 and PHP 8.3.2, respectively.

My benchmarks were very unscientific but I was just looking for orders of magnitude.

Here are the results from three runs of each:

Go (avg 30.82)

  1. Unmarshalling took 31.082830437s for 100 iterations
  2. Unmarshalling took 30.713377126s for 100 iterations
  3. Unmarshalling took 30.665393891s for 100 iterations

PHP (avg 16.59)

  1. Decoding took 16.25067782402 seconds for 100 iterations
  2. Decoding took 17.178377866745 seconds for 100 iterations
  3. Decoding took 16.336436033249 seconds for 100 iterations
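
The averages for these runs, and the relative difference between them, can be reproduced with a few lines of arithmetic. This is only a sketch: the helper names mean, percentFaster and percentOfTime are made up for illustration, and "percent faster" is taken here to mean (slow / fast - 1) * 100.

```go
package main

import "fmt"

// mean returns the arithmetic mean of xs (0 for an empty slice).
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// percentFaster reports how much faster the quick run is than the slow one,
// relative to the quick run's time: (slow/fast - 1) * 100.
func percentFaster(slow, fast float64) float64 {
	return (slow/fast - 1) * 100
}

// percentOfTime reports the quick run's time as a percentage of the slow one.
func percentOfTime(slow, fast float64) float64 {
	return fast / slow * 100
}

func main() {
	// Seconds per 100 iterations, copied from the runs listed above.
	goRuns := []float64{31.082830437, 30.713377126, 30.665393891}
	phpRuns := []float64{16.25067782402, 17.178377866745, 16.336436033249}

	goAvg, phpAvg := mean(goRuns), mean(phpRuns)
	fmt.Printf("Go avg:  %.2fs\n", goAvg)
	fmt.Printf("PHP avg: %.2fs\n", phpAvg)
	fmt.Printf("PHP is %.0f%% faster and takes %.0f%% as long\n",
		percentFaster(goAvg, phpAvg), percentOfTime(goAvg, phpAvg))
}
```

The same two helpers work for any pair of averages in this write-up, as long as the slower time is passed first.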

PHP is faster than Go?!? (86% faster, or takes only 54% as long)

You may be surprised because you know PHP is interpreted and Go is compiled, and you might assume that would make Go much faster. But the fact is that PHP's json_decode() is written in C, while Go's json.Unmarshal() uses reflection, especially when loading into a type of any.
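
To make the any case concrete, here is a minimal sketch of what json.Unmarshal() builds when the target is any: every JSON object becomes a map[string]any, every array a []any, and every number a float64, each of which then needs a type assertion to use. The sample JSON and the helper names decodeAny and describe are assumptions for illustration; the sample is only loosely shaped like an element of the benchmark file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeAny unmarshals raw JSON into an untyped value.
func decodeAny(raw string) (any, error) {
	var v any
	err := json.Unmarshal([]byte(raw), &v)
	return v, err
}

// describe returns the dynamic Go type chosen for a decoded value.
func describe(v any) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	// Hypothetical sample data, not taken from the real file.
	raw := `[{"id":"1","public":true,"actor":{"id":42,"login":"octocat"}}]`

	v, err := decodeAny(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}

	// Each level has to be unwrapped with a type assertion.
	items := v.([]any)
	first := items[0].(map[string]any)
	actor := first["actor"].(map[string]any)

	fmt.Println(describe(v))           // []interface {}
	fmt.Println(describe(first))       // map[string]interface {}
	fmt.Println(describe(actor["id"])) // float64
}
```

All of those maps, slices and boxed values are allocated on every iteration, which is one plausible reason the any target is the slowest of the Go variants tried here.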

Optimizing Go

There are ways of making Go's JSON performance faster. Just follow the link to learn more.

I decided to explore a few of them and found that main2.go, which creates a slice of structs rather than using any, shaved 22% off the any time (Unmarshalling took 23.9677021s for 100 iterations).
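
To compare the two targets side by side in one program, a small timing helper can be sketched as follows. The trimmed-down Item struct, the inline sample data, and the helper name timeUnmarshal are assumptions for illustration and are not part of the attached files. Timings on a payload this small will not match the 25MB results.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Item is a trimmed-down, hypothetical version of the struct in main2.go.
type Item struct {
	Id     string `json:"id"`
	Type   string `json:"type"`
	Public bool   `json:"public"`
}

// timeUnmarshal decodes data `iterations` times, each time into a fresh
// target obtained from newTarget, and returns the total elapsed time.
// newTarget must return a non-nil pointer, as json.Unmarshal requires.
func timeUnmarshal(data []byte, iterations int, newTarget func() any) (time.Duration, error) {
	start := time.Now()
	for i := 0; i < iterations; i++ {
		if err := json.Unmarshal(data, newTarget()); err != nil {
			return 0, err
		}
	}
	return time.Since(start), nil
}

func main() {
	// Hypothetical sample data standing in for large-file.json.
	data := []byte(`[{"id":"1","type":"PushEvent","public":true},` +
		`{"id":"2","type":"ForkEvent","public":false}]`)
	const iterations = 10000

	untyped, err := timeUnmarshal(data, iterations, func() any { return new(any) })
	if err != nil {
		fmt.Println("any decode failed:", err)
		return
	}
	typed, err := timeUnmarshal(data, iterations, func() any { return new([]Item) })
	if err != nil {
		fmt.Println("struct decode failed:", err)
		return
	}

	fmt.Printf("any:    %v\n", untyped)
	fmt.Printf("[]Item: %v\n", typed)
}
```

Passing a constructor rather than a shared variable keeps each iteration decoding into a fresh value, the same way the attached programs declare jsonData inside the loop.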

Surprisingly, at least to me, switching to json.RawMessage in main3.go for the sub-structs Actor, Repo and Payload made effectively no difference from main2.go (Unmarshalling took 23.286783374s for 100 iterations).
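
For readers who have not used json.RawMessage, here is a minimal sketch of what it does: the nested value is kept as undecoded bytes during the first pass and can be decoded later, only if needed. The Event and Actor types, the sample data, and the helpers decodeEvents and decodeActor are assumptions for illustration. Note that the decoder still has to scan the nested value to find where it ends, which may be part of why the saving was so small.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Actor mirrors two of the actor fields used in main2.go.
type Actor struct {
	Id    int    `json:"id"`
	Login string `json:"login"`
}

// Event keeps the actor as raw, undecoded JSON bytes.
type Event struct {
	Id    string          `json:"id"`
	Actor json.RawMessage `json:"actor"`
}

// decodeEvents performs the first pass, leaving each actor undecoded.
func decodeEvents(raw []byte) ([]Event, error) {
	var events []Event
	err := json.Unmarshal(raw, &events)
	return events, err
}

// decodeActor performs the deferred second pass for one event.
func decodeActor(e Event) (Actor, error) {
	var a Actor
	err := json.Unmarshal(e.Actor, &a)
	return a, err
}

func main() {
	// Hypothetical sample data, not taken from the real file.
	raw := []byte(`[{"id":"1","actor":{"id":42,"login":"octocat"}}]`)

	events, err := decodeEvents(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("raw actor: %s\n", events[0].Actor)

	actor, err := decodeActor(events[0])
	if err != nil {
		fmt.Println("actor decode failed:", err)
		return
	}
	fmt.Printf("decoded actor: %+v\n", actor)
}
```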

Further, dropping the fields Actor, Repo and Payload in main4.go also made effectively no difference from main2.go or main3.go. (Unmarshalling took 22.251595239s for 100 iterations)

So it seems that ~22 seconds for 100 iterations is a floor for json.Unmarshal() for the given JSON file. That tells me if you need better performance then you'll need to evaluate one of the "fast JSON" packages mentioned here.

Hope this helps.

-Mike

P.S. Someone claimed the results would be different if I had used "object" in PHP, so I added a main2.php to test it. I had to increase the memory limit though, which I doubled to 512M from 256M.

The results were:

  1. Decoding took 17.953987836838 seconds for 100 iterations
  2. Decoding took 17.878257989883 seconds for 100 iterations
  3. Decoding took 17.94038105011 seconds for 100 iterations

Decoding to objects was about 8% slower in these results: 17.92 vs. 16.59 seconds on average.

// Baseline Go benchmark: decodes the whole file into a value of type any.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// Adjust the number of iterations as needed
const iterations = 100

func main() {
	filename := "large-file.json"

	// Read the JSON file outside of the timing loop
	data, err := os.ReadFile(filename)
	if err != nil {
		fmt.Printf("failed to read file: %v\n", err)
		return
	}

	start := time.Now() // Start the timer
	for i := 0; i < iterations; i++ {
		var jsonData any // Use this for JSON arrays
		// var jsonData map[string]interface{} // Use this instead if the JSON root is an object
		if err := json.Unmarshal(data, &jsonData); err != nil {
			fmt.Printf("failed to unmarshal json: %v\n", err)
			return
		}
	}
	duration := time.Since(start)

	fmt.Printf("Unmarshalling took %v for %d iterations\n", duration, iterations)
}

<?php
// Baseline PHP benchmark: decodes the whole file to associative arrays.
ini_set('memory_limit', '256M');

$filename = 'large-file.json';
$iterations = 100; // Adjust the number of iterations as needed

// Read the JSON file outside of the timing loop
$data = file_get_contents($filename);
if ($data === false) {
    echo "Failed to read file\n";
    return;
}

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $jsonData = json_decode($data, true); // Pass true for associative array, false for object
    if ($jsonData === null) {
        echo "Failed to decode JSON\n";
        return;
    }
}
$duration = microtime(true) - $start;

echo "Decoding took $duration seconds for $iterations iterations\n";

// main2.go: decodes the file into a slice of structs instead of any.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// Adjust the number of iterations as needed
const iterations = 100

// Item describes one element of the JSON array in large-file.json.
type Item struct {
	Id    string `json:"id"`
	Type  string `json:"type"`
	Actor struct {
		Id         int    `json:"id"`
		Login      string `json:"login"`
		GravatarId string `json:"gravatar_id"`
		Url        string `json:"url"`
		AvatarUrl  string `json:"avatar_url"`
	} `json:"actor"`
	Repo struct {
		Id   int    `json:"id"`
		Name string `json:"name"`
		Url  string `json:"url"`
	} `json:"repo"`
	Payload struct {
		Ref          string `json:"ref"`
		RefType      string `json:"ref_type"`
		MasterBranch string `json:"master_branch"`
		Description  string `json:"description"`
		PusherType   string `json:"pusher_type"`
	} `json:"payload"`
	Public    bool      `json:"public"`
	CreatedAt time.Time `json:"created_at"`
}

func main() {
	filename := "large-file.json"

	// Read the JSON file outside of the timing loop
	data, err := os.ReadFile(filename)
	if err != nil {
		fmt.Printf("failed to read file: %v\n", err)
		return
	}

	start := time.Now() // Start the timer
	for i := 0; i < iterations; i++ {
		var jsonData []Item // The JSON root is an array of items
		if err := json.Unmarshal(data, &jsonData); err != nil {
			fmt.Printf("failed to unmarshal json: %v\n", err)
			return
		}
	}
	duration := time.Since(start)

	fmt.Printf("Unmarshalling took %v for %d iterations\n", duration, iterations)
}

<?php
// main2.php: decodes the file to objects instead of associative arrays.
ini_set('memory_limit', '512M'); // Doubled from 256M; objects need more memory

$filename = 'large-file.json';
$iterations = 100; // Adjust the number of iterations as needed

// Read the JSON file outside of the timing loop
$data = file_get_contents($filename);
if ($data === false) {
    echo "Failed to read file\n";
    return;
}

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $jsonData = json_decode($data, false); // Pass true for associative array, false for object
    if ($jsonData === null) {
        echo "Failed to decode JSON\n";
        return;
    }
}
$duration = microtime(true) - $start;

echo "Decoding took $duration seconds for $iterations iterations\n";

// main3.go: like main2.go, but keeps Actor, Repo and Payload as json.RawMessage.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// Adjust the number of iterations as needed
const iterations = 100

// Item leaves the nested objects undecoded as raw JSON bytes.
type Item struct {
	Id        string          `json:"id"`
	Type      string          `json:"type"`
	Actor     json.RawMessage `json:"actor"`
	Repo      json.RawMessage `json:"repo"`
	Payload   json.RawMessage `json:"payload"`
	Public    bool            `json:"public"`
	CreatedAt time.Time       `json:"created_at"`
}

func main() {
	filename := "large-file.json"

	// Read the JSON file outside of the timing loop
	data, err := os.ReadFile(filename)
	if err != nil {
		fmt.Printf("failed to read file: %v\n", err)
		return
	}

	start := time.Now() // Start the timer
	for i := 0; i < iterations; i++ {
		var jsonData []Item // The JSON root is an array of items
		if err := json.Unmarshal(data, &jsonData); err != nil {
			fmt.Printf("failed to unmarshal json: %v\n", err)
			return
		}
	}
	duration := time.Since(start)

	fmt.Printf("Unmarshalling took %v for %d iterations\n", duration, iterations)
}

// main4.go: like main2.go, but drops the Actor, Repo and Payload fields entirely.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// Adjust the number of iterations as needed
const iterations = 100

// Item keeps only the top-level scalar fields.
type Item struct {
	Id        string    `json:"id"`
	Type      string    `json:"type"`
	Public    bool      `json:"public"`
	CreatedAt time.Time `json:"created_at"`
}

func main() {
	filename := "large-file.json"

	// Read the JSON file outside of the timing loop
	data, err := os.ReadFile(filename)
	if err != nil {
		fmt.Printf("failed to read file: %v\n", err)
		return
	}

	start := time.Now() // Start the timer
	for i := 0; i < iterations; i++ {
		var jsonData []Item // The JSON root is an array of items
		if err := json.Unmarshal(data, &jsonData); err != nil {
			fmt.Printf("failed to unmarshal json: %v\n", err)
			return
		}
	}
	duration := time.Since(start)

	fmt.Printf("Unmarshalling took %v for %d iterations\n", duration, iterations)
}

mikeschinkel commented Mar 11, 2024

@mfriedenhagen — Thank you for the comments, for catching my typo (fixed!), and especially for your contribution of your Python and Go timings.
