Fabien Campagne fac2003

nvidia-smi
Mon Jul 11 13:53:00 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.79     Driver Version: 352.79         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:04:00.0     Off |                  Off |
| N/A   38C    P0    59W / 149W |    207MiB / 12287MiB |      0%      Default |
[Stage 4:===> (2 + 30) / 32]CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:2831 code=77(<unknown>) "cudaStreamSynchronize(*stream)"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3921 code=77(<unknown>) "cudaStreamSynchronize(*pStream)"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3921 code=77(<unknown>) "cudaStreamSynchronize(*pStream)"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3947 code=77(<unknown>) "result"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3947 code=77(<unknown>) "result"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3921 code=77(<unknown>) "cudaStreamSynchronize(*pStream)"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3947 code=77(<unknown>) "result"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3921 code=77(<unknown>) "cudaStreamSynchronize(*pStream)"
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:3947 code=77(<unknown>) "result"
CUDA error at /skymind/libnd
checkout on 53f098b231f819f51a5109476b52c446567a1856
diff --git a/pom.xml b/pom.xml
index 532a3ad..ce08171 100644
--- a/pom.xml
+++ b/pom.xml
@@ -127,7 +127,8 @@
<dependency>
<groupId>org.nd4j</groupId>
o.d.e.m.MultiGpuLenetMnistExample - Build model....
o.d.e.m.MultiGpuLenetMnistExample - Train model....
CUDA error at /skymind/libnd4j/blas/cuda/NativeOps.cu:5907 code=77(<unknown>) "cudaStreamSynchronize(*stream)"
o.d.p.ParallelWrapper - Averaged score: 2.079001024173573
Exception in thread "main" java.lang.RuntimeException: Can't allocate [HOST] memory: 32; threadId: 1
at org.nd4j.jita.memory.impl.CudaDirectProvider.malloc(CudaDirectProvider.java:59)
at org.nd4j.jita.memory.impl.CudaCachingZeroProvider.malloc(CudaCachingZeroProvider.java:113)
at org.nd4j.jita.memory.impl.CudaFullCachingProvider.malloc(CudaFullCachingProvider.java:77)
at org.nd4j.jita.handler.impl.CudaZeroHandler.alloc(CudaZeroHandler.java:218)
at org.nd4j.jita.handler.impl.CudaZeroHandler.alloc(CudaZeroHandler.java:239)
java -Xmx60g -cp target/dl4j-cuda-specials-0.4-rc0-SNAPSHOT-bin.jar org.deeplearning4j.examples.multigpu.MultiGpuLenetMnistExample
Peer access [0] -> [4] isn't possible
Peer access [0] -> [5] isn't possible
Peer access [0] -> [6] isn't possible
Peer access [0] -> [7] isn't possible
Peer access [1] -> [4] isn't possible
Peer access [1] -> [5] isn't possible
Peer access [1] -> [6] isn't possible
Peer access [1] -> [7] isn't possible
Peer access [2] -> [4] isn't possible
package org.campagnelab.dl.framework.mixup;
import cern.jet.random.Beta;
import cern.jet.random.engine.RandomEngine;
import it.unimi.dsi.util.XorShift1024StarRandom;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.MultiDataSet;
import org.nd4j.linalg.dataset.api.MultiDataSetPreProcessor;
import org.nd4j.linalg.factory.Nd4j;
while (iterator.hasNext() && shouldWork.get()) {
    DataSet smth = null;
    if (useWorkspace) {
        try (MemoryWorkspace ws = workspace.notifyScopeEntered()) {
            smth = iterator.next();
            if (callback != null)
                callback.call(smth);
        }