@kcmannem
Last active July 3, 2019 16:47
why zstd breaks

Doing some manual tests of baggageclaim on both linux and darwin.

#DARWIN

On Darwin, when we initialize a volume with contents and call stream-out with zstd encoding, the output is silently malformed: the library throws no error that we could use to detect this. I don't know why it's happening.

/t/test $ curl -X PUT -H "Accept-Encoding: zstd" localhost:7788/volumes/hello/stream-out > thing.zst
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   127  100   127    0     0   8401      0 --:--:-- --:--:-- --:--:--  8466
/t/test $ ls
thing.zst
/t/test $ zstd -d thing.zst
thing.zst            : 0 MB...     thing.zst : Read error (39) : premature end
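
A cheap sanity check on the saved file, independent of the zstd cli, is to look at its first four bytes. A regular zstd frame starts with the magic number 0xFD2FB528, stored little-endian as 28 b5 2f fd. This is only a sketch: the helper name is_zstd_frame is made up for illustration, and passing it says nothing about whether the rest of the frame is complete.

```shell
# Hypothetical helper (not part of baggageclaim or zstd):
# succeeds if the file starts with the zstd frame magic number.
# zstd stores 0xFD2FB528 little-endian, i.e. the bytes 28 b5 2f fd.
is_zstd_frame() {
  # Dump the first 4 bytes as hex and strip od's whitespace.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \t\n')
  [ "$magic" = "28b52ffd" ]
}

# Usage against the file saved above:
# is_zstd_frame thing.zst && echo "starts with a zstd frame" || echo "not a zstd frame"
```

If the magic is there but zstd -d still reports a premature end, the problem is further into the stream (or in how the cli reads it), not in the header.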

Repeating the same steps on darwin with gzip encoding produces output that is not malformed.

/t/test $ curl -X PUT -H "Accept-Encoding: gzip" localhost:7788/volumes/hello/stream-out | tar -tvzf -
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10240    0 10240    0     0   825k      0 --:--:-- --:--:-- --:--:--  833k
drwxr-xr-x  0 pivotal wheel       0 Jul  3 10:45 ./
-rw-r--r--  0 pivotal wheel      20 Jul  3 10:45 ./why

We're able to read out the table of contents of the tar file after gunzipping it. This also shows that we are writing the response body correctly.
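
Listing the tar contents is an indirect check. A more direct one, sketched here under the assumption that the response was first saved to a file, is to let gzip test the whole stream. The helper name gzip_ok is hypothetical.

```shell
# Hypothetical helper: verify an entire gzip stream without extracting it.
# gzip -t tests integrity; reading from stdin avoids any dependence on
# the file's suffix.
gzip_ok() {
  gzip -t < "$1" 2>/dev/null
}

# Usage, assuming the stream-out response was saved as thing.tar.gz:
# gzip_ok thing.tar.gz && echo "gzip stream intact" || echo "gzip stream damaged"
```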

Update 1: I tried redirecting zstd's stderr to /dev/null, and tar can actually read the whole archive successfully.

/t/test $ curl -X PUT -H "Accept-Encoding: zstd" localhost:7788/volumes/hello/stream-out | zstd -d 2>/dev/null | tar tvf -
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   127  100   127    0     0   8771      0 --:--:-- --:--:-- --:--:--  9071
drwxr-xr-x  0 pivotal wheel       0 Jul  3 10:45 ./
-rw-r--r--  0 pivotal wheel      20 Jul  3 10:45 ./why

#LINUX

When we repeat the above tests on linux, both zstd and gzip succeed.

root@9da28e5bb5a5:/tmp# curl -X PUT -H "Accept-Encoding: zstd" localhost:7788/volumes/hello/stream-out | zstd -d | tar tvf -
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    85  100    85    0     0  10452      0 --:--:-- --:--:-- --:--:-- 12142
drwxr-xr-x root/root         0 2019-07-03 14:35 ./

Here the zstd cli does not fail with the premature end error. It's clear that, for some reason, darwin isn't writing zstd correctly.

kcmannem commented Jul 3, 2019

Well, throw everything out: instead of testing with the cli, just piping between api calls shows that it works.

curl -X PUT -H "Accept-Encoding: zstd" localhost:7788/volumes/hello/stream-out | curl -X PUT -H "Content-Encoding: zstd" --data-binary "@-" localhost:7788/volumes/hello2/stream-in

This successfully transfers the contents of hello to an empty hello2 volume. The premature end problem might just have to do with the cli.

kcmannem commented Jul 3, 2019

I'm not convinced it's a problem on the worker side anymore.

kcmannem commented Jul 3, 2019

Another test: I spun up a linux worker via docker-compose at localhost:7799 and a darwin instance at localhost:7788, and tried streaming across platforms. We are able to do that as well.

curl -X PUT -H "Accept-Encoding: zstd" localhost:7799/volumes/hello/stream-out | curl -X PUT -H "Content-Encoding: zstd" --data-binary "@-" localhost:7788/volumes/hello/stream-in
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   155  100   155    0     0   8966      0 --:--:-- --:--:-- --:--:--  9117
/t/v/l/h/volume $ cat mailtodarwin
this is from linux

Linux successfully streamed out the file mailtodarwin over to the darwin worker.

kcmannem commented Jul 3, 2019

I ran the above tests on the prod deployment itself, against the darwin worker in question, and everything's ok.

DX261:tmptest DX261$ curl -X PUT -H "Accept-Encoding: zstd" localhost:7788/volumes/hello/stream-out | curl -X PUT -H "Content-Encoding: zstd" --data-binary "@-" localhost:7788/volumes/hello2/stream-in
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10240    0 10240    0     0   858k      0 --:--:-- --:--:-- --:--:--  833k

DX261:tmptest DX261$ cat /Users/DX261/work-dir/volumes/live/hello/volume/hello
hello this is a test
DX261:tmptest DX261$ cat /Users/DX261/work-dir/volumes/live/hello2/volume/hello
hello this is a test
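
The two cat calls above compare a single file by hand. For volumes with more content, a recursive diff of the two volume directories does the same job in one step. This is a minimal sketch; the helper name same_tree is hypothetical, and it only compares file contents and names, not ownership or permissions.

```shell
# Hypothetical helper: succeeds only if both directory trees contain
# the same files with the same contents.
same_tree() {
  diff -r "$1" "$2" >/dev/null
}

# Usage with the volume paths from the transcript above:
# same_tree /Users/DX261/work-dir/volumes/live/hello/volume \
#           /Users/DX261/work-dir/volumes/live/hello2/volume \
#   && echo "volumes match" || echo "volumes differ"
```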
