@RDux
Last active October 1, 2020 07:47
Basic GridFS operation that illustrates excessive memory usage in the official driver.

Description

Opening an MP4 video from GridFS leads to excessive memory usage after a few concurrent calls. The video in question is 43276570 bytes (about 41MiB) in size. I mention video because that is the scenario in which I first noticed the problem; however, I think it's fair to assume the results will be the same regardless of the content being loaded.

It wasn't hard to spot that something was going on: more than 100MiB of memory was in use after opening the video twice in different browser tabs. To get a feel for the gravity of the issue, I ran a siege benchmark with 50 concurrent users.

siege -c 50 -r 1 'http://localhost:8080/'
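If siege is not available, a rough stand-in for the call above can be sketched in Go. This is only an approximation of `siege -c 50 -r 1`, not a replacement for it: the helper names `runConcurrent` and `fetchAndDiscard` are made up for this sketch, and `io.Discard` assumes Go 1.16 or newer.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

// runConcurrent starts n goroutines that each call fetch once and
// returns the number of calls that reported an error.
func runConcurrent(n int, fetch func() error) int {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		failed int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := fetch(); err != nil {
				mu.Lock()
				failed++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return failed
}

// fetchAndDiscard downloads url and throws the body away, the way a
// benchmark client would.
func fetchAndDiscard(url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

func main() {
	const url = "http://localhost:8080/" // same endpoint as the siege call
	failed := runConcurrent(50, func() error { return fetchAndDiscard(url) })
	fmt.Printf("50 requests done, %d failed\n", failed)
}
```

Running it three times in a row corresponds to the three benchmark runs described in this report.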

After running this benchmark three times, the server's memory usage tops out at around 1.9GiB according to gnome-system-monitor. Afterwards you can look at the allocation profile of the example server; it's easy to spot that there is a lot of overhead in the readWireMessage function.

go tool pprof http://localhost:8080/debug/pprof/allocs
(pprof) list readWireMessage
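To cross-check the system monitor figure from inside the process, the Go runtime's own counters can be read with `runtime.ReadMemStats`. The following is a minimal sketch, not part of the example server; `heapMiB` is a hypothetical helper name, and the numbers it returns will not match the system monitor exactly, since the monitor reports what the OS sees for the process.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapMiB returns the heap currently in use and the total memory the
// runtime has obtained from the OS, both in MiB.
func heapMiB() (inUse, fromOS float64) {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	const mib = 1 << 20
	return float64(ms.HeapInuse) / mib, float64(ms.Sys) / mib
}

func main() {
	before, _ := heapMiB()

	// Allocate and touch 64 MiB so the difference is visible.
	buf := make([]byte, 64<<20)
	for i := range buf {
		buf[i] = 1
	}

	after, sys := heapMiB()
	fmt.Printf("heap in use: %.1f MiB -> %.1f MiB (from OS: %.1f MiB)\n", before, after, sys)
	runtime.KeepAlive(buf) // keep buf alive until after the second reading
}
```

In the example server, a helper like this could be called from a debug handler or logged after each benchmark run.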

Driver Info

{ driver: { name: "mongo-go-driver", version: "v1.1.1+prerelease" }, os: { type: "linux", architecture: "amd64" }, platform: "go1.13" }

Note

I hardcoded the ID of the content that led to the initial discovery to keep the example as simple as possible. If you want to test the code yourself, change the ID to whatever content you have available.

package main

import (
	"context"
	"flag"
	"io"
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers on the default mux

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/gridfs"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// Mongo bundles the client with the GridFS bucket used by the handler.
type Mongo struct {
	fs *gridfs.Bucket
	*mongo.Client
}

func main() {
	// CLI flags for basic options
	database := flag.String("d", "test", "Database to use")
	collection := flag.String("c", "test", "Collection to use")
	flag.Parse()

	// Configure client options
	cOpts := options.Client()
	cOpts.SetHosts([]string{"localhost:27017"})

	var m Mongo
	var err error

	// Connect to the database
	m.Client, err = mongo.Connect(context.TODO(), cOpts)
	if err != nil {
		log.Fatalln(err)
	}

	// Open the GridFS bucket; the -c flag is used as the bucket name
	m.fs, err = gridfs.NewBucket(
		m.Database(*database),
		&options.BucketOptions{Name: collection},
	)
	if err != nil {
		log.Fatalln(err)
	}

	// Setup a basic webserver
	http.HandleFunc("/", m.handleGridFS)
	log.Fatalln(http.ListenAndServe(":8080", nil))
}

func (m Mongo) handleGridFS(w http.ResponseWriter, r *http.Request) {
	// Open video using ID
	ds, err := m.fs.OpenDownloadStream("c31e1003-870e-41ac-6738-2d507ca76c48")
	if err != nil {
		// Report the failure without terminating the whole server
		log.Println(err)
		w.WriteHeader(http.StatusInternalServerError)
		return
	}
	defer ds.Close()
	w.Header().Set("Content-Type", "video/mp4")
	w.WriteHeader(http.StatusOK)

	// Stream the file to the client
	if _, err := io.Copy(w, ds); err != nil {
		log.Println(err)
	}
}
@mwmahlberg

mwmahlberg commented Oct 22, 2019

Do you have a sample video by any chance? I just want to get as close as possible to your setup.

@mwmahlberg

First thing I spot: you do not close the download stream. There is also Bucket.DownloadToStream: https://godoc.org/go.mongodb.org/mongo-driver/mongo/gridfs#Bucket.DownloadToStream
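For reference, a handler built on DownloadToStream might look roughly like the sketch below. The `streamer` interface, `serveFile`, `fakeBucket`, and `record` are names invented for this sketch; only the `DownloadToStream(fileID interface{}, stream io.Writer) (int64, error)` signature is taken from the v1 driver's `*gridfs.Bucket`. The fake bucket exists so the sketch runs without MongoDB. Whether this variant changes the allocations seen in readWireMessage has not been tested here.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// streamer is the subset of *gridfs.Bucket this sketch relies on.
type streamer interface {
	DownloadToStream(fileID interface{}, stream io.Writer) (int64, error)
}

// serveFile returns a handler that streams the file with the given ID.
func serveFile(b streamer, fileID interface{}, contentType string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", contentType)
		n, err := b.DownloadToStream(fileID, w)
		if err != nil {
			log.Printf("download %v: %v", fileID, err)
			if n == 0 {
				// Nothing was written yet, so an error status can still be sent.
				http.Error(w, "download failed", http.StatusInternalServerError)
			}
		}
	}
}

// fakeBucket stands in for *gridfs.Bucket (assumption: demo only).
type fakeBucket struct{ data string }

func (f fakeBucket) DownloadToStream(_ interface{}, w io.Writer) (int64, error) {
	n, err := io.WriteString(w, f.data)
	return int64(n), err
}

// record runs h against an in-memory request and returns the status
// code, Content-Type header, and body.
func record(h http.Handler) (int, string, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code, rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	code, ct, body := record(serveFile(fakeBucket{data: "video bytes"}, "some-id", "video/mp4"))
	fmt.Println(code, ct, body)
}
```

With the real driver, the gist's bucket could presumably be passed in directly, for example `http.Handle("/", serveFile(m.fs, "c31e1003-870e-41ac-6738-2d507ca76c48", "video/mp4"))`.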

@RDux
Author

RDux commented Oct 22, 2019

The download stream is closed by the defer on https://gist.github.com/RDux/1c9eb8c2a3a75b94f2f9f8d19ff90f9a#file-example-go-L61. I can't share the original video but you can test with one of these: https://download.blender.org/peach/bigbuckbunny_movies/. I tested against:

big_buck_bunny_720p_stereo.ogg                     30-Apr-2008 10:34           196898674

Change the Content-Type on line 62 from video/mp4 to video/ogg. With the same benchmark against this video, the example code sits at 2.0GiB in the system monitor.
