Created
October 10, 2018 18:43
Unfortunately it has been a while since I've played with this, and I don't immediately have a good answer for you. I recommend that you ask the folks at https://github.com/dask/dask-ml/issues/new
Has anybody tried to run this locally, for example with the data from
https://github.com/rambler-digital-solutions/criteo-1tb-benchmark
? How can I read data locally from these kinds of Parquet files?
Hi Matthew, what if you don't persist the data? I tried the same pre-processing on another dataset, which is too large to persist in memory. It raises `ValueError: ('Arrays chunk sizes are unknown: %s'...`. I have searched a lot for a solution but still don't know how to deal with this.