@richfitz
Last active August 29, 2015 14:27
FROM stateline
# Python dependencies for the example:
RUN apt-get update && apt-get install -y \
    python \
    python-dev \
    python-matplotlib \
    python-numpy \
    python-pip && \
  pip install \
    pyzmq \
    triangle_plot
# R dependencies for building our own example:
RUN apt-get update && apt-get install -y \
    r-base \
    r-base-dev \
    r-recommended \
    vim-tiny
# Build rzmq from source, then remove both the archive and the unpacked tree:
RUN wget https://github.com/armstrtw/rzmq/archive/master.zip && \
  unzip master.zip && R CMD INSTALL rzmq-master && \
  rm -rf master.zip rzmq-master
# Replace the demo config with a corrected copy (adds the missing annealLength)
COPY python-demo-config.json ./python-demo-config.json
{
  "dimensionality": 4,
  "boundaries": {
    "min": [-10, 0, -10, -10],
    "max": [ 10, 10, 10, 2]
  },
  "duration": 20,
  "sigma": 1.0,
  "jobTypes": ["job"],
  "annealLength": 1,
  "sigmaAdaption": {
    "windowSize": 100000,
    "adaptionLength": 100000,
    "stepsPerAdapt": 2500,
    "optimalAcceptRate": 0.24,
    "adaptRate": 0.2,
    "adaptFactor": {
      "min": 0.8,
      "max": 1.25
    }
  },
  "parallelTempering": {
    "stacks": 4,
    "chains": 4,
    "sigmaFactor": 1.5,
    "betaFactor": 1.5,
    "swapInterval": 10,
    "betaAdaption": {
      "windowSize": 100000,
      "adaptionLength": 100000,
      "stepsPerAdapt": 2500,
      "optimalSwapRate": 0.24,
      "adaptRate": 0.2,
      "adaptFactor": {
        "min": 0.8,
        "max": 1.25
      }
    }
  },
  "output": {
    "directory": "python-demo-output",
    "cacheLength": 20
  }
}

Rather than trying to get the whole installation working on OS X, I'm working in a Docker container. Ideally the container would expose whatever ports are required, but I understand if that doesn't fit the design.

Start a container set up to run the server, mapping the local directory cpp-demo-output to the directory where the demo case writes its output:

docker run -v ${PWD}/cpp-demo-output:/tmp/build/cpp-demo-output --rm stateline ./stateline --config=cpp-demo-config.json

Start a worker:

docker exec -it $(docker ps -l -q) ./demo-worker

When the job has finished, both the server and the worker exit and their terminal windows return to the shell.

So far, so good.

Repeating this process with the Python demo does not work for me:

docker run -v ${PWD}/python-demo-output:/tmp/build/python-demo-output --rm stateline ./stateline --config=python-demo-config.json

It fails with this error:

terminate called after throwing an instance of 'std::domain_error'
  what():  type must be number, but is null

I get the same error when running interactively in the container. The cause appears to be that the annealLength parameter is missing from python-demo-config.json. I added it, and also built a new image (stateline2) that wraps the stateline one and provides all the prerequisites for the Python worker (and for a future R worker).
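The "type must be number, but is null" message gives no hint about which key is missing, so a quick check before launching the server can save some guessing. The sketch below is a minimal illustration, not part of stateline: the list of expected keys is simply copied from the top level of the working python-demo-config.json in this gist, and stateline's real requirements may differ.

```python
import json
import sys

# Top-level keys of the working python-demo-config.json shown in this gist.
# Assumption: stateline wants all of them; its actual schema may differ.
EXPECTED_KEYS = [
    "dimensionality", "boundaries", "duration", "sigma", "jobTypes",
    "annealLength", "sigmaAdaption", "parallelTempering", "output",
]


def missing_keys(config, expected=EXPECTED_KEYS):
    """Return the expected top-level keys that are absent or null."""
    return [key for key in expected if config.get(key) is None]


def check_config_file(path):
    """Load a JSON config file and list its missing top-level keys."""
    with open(path) as handle:
        return missing_keys(json.load(handle))


if __name__ == "__main__":
    # Usage: python check_config.py python-demo-config.json
    for path in sys.argv[1:]:
        missing = check_config_file(path)
        print(path, "missing:", ", ".join(missing) if missing else "nothing")
```

With the key added and the stateline2 image built, the server and worker start the same way as before: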

docker run -v ${PWD}/python-demo-output:/tmp/build/python-demo-output --rm stateline2 ./stateline --config=python-demo-config.json
docker exec -it $(docker ps -l -q) python demo-worker.py

That all works, except that instead of terminating cleanly, the server prints:

I0820 13:13:30.837076     6 router.cpp:75] Router main's Poll thread has exited loop, must be shutting down
F0820 13:13:30.884371    10 thread.hpp:21] Exception thrown in child thread: Context was terminated
*** Check failure stack trace: ***
    @     0x7fecbe02aa0d  google::LogMessage::Fail()
    @     0x7fecbe02c8c0  google::LogMessage::SendToLog()
    @     0x7fecbe02a5d2  google::LogMessage::Flush()
    @     0x7fecbe02d2de  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fecbe2951d0  _ZZN9stateline13startInThreadINS_5comms15ServerHeartbeatEISt17reference_wrapperIN3zmq9context_tEES3_IKNS1_17HeartbeatSettingsEEEEESt6futureIbERbDpOT0_ENKUlOS6_OS9_E_clESG_SH_
    @     0x7fecbe293fce  std::_Function_handler<>::_M_invoke()
    @     0x7fecbe2757a2  std::__future_base::_State_baseV2::_M_do_set()
    @     0x7fecbd8ea36b  __pthread_once_slow
    @     0x7fecbe29a8d5  _ZNSt6thread5_ImplISt12_Bind_simpleIFZNSt13__future_base17_Async_state_implIS1_IFZN9stateline13startInThreadINS4_5comms15ServerHeartbeatEJSt17reference_wrapperIN3zmq9context_tEES8_IKNS6_17HeartbeatSettingsEEEEESt6futureIbERbDpOT0_EUlOSB_OSE_E_SB_SE_EEbEC4EOSP_EUlvE_vEEE6_M_runEv
    @     0x7fecbddcae30  (unknown)
    @     0x7fecbd8e36aa  start_thread
    @     0x7fecbd618eed  (unknown)
*** Aborted at 1440076410 (unix time) try "date -d @1440076410" if you are using GNU date ***
PC: @     0x7fecbd549036 (unknown)
*** SIGSEGV (@0x0) received by PID 1 (TID 0x7fecba5a8700) from PID 0; stack trace: ***
    @     0x7fecbd8ecd10 (unknown)
    @     0x7fecbd549036 (unknown)
    @     0x7fecbe03422c google::DumpStackTraceAndExit()
    @     0x7fecbe02aa0d google::LogMessage::Fail()
    @     0x7fecbe02c8c0 google::LogMessage::SendToLog()
    @     0x7fecbe02a5d2 google::LogMessage::Flush()
    @     0x7fecbe02d2de google::LogMessageFatal::~LogMessageFatal()
    @     0x7fecbe2951d0 _ZZN9stateline13startInThreadINS_5comms15ServerHeartbeatEISt17reference_wrapperIN3zmq9context_tEES3_IKNS1_17HeartbeatSettingsEEEEESt6futureIbERbDpOT0_ENKUlOS6_OS9_E_clESG_SH_
    @     0x7fecbe293fce std::_Function_handler<>::_M_invoke()
    @     0x7fecbe2757a2 std::__future_base::_State_baseV2::_M_do_set()
    @     0x7fecbd8ea36b __pthread_once_slow
    @     0x7fecbe29a8d5 _ZNSt6thread5_ImplISt12_Bind_simpleIFZNSt13__future_base17_Async_state_implIS1_IFZN9stateline13startInThreadINS4_5comms15ServerHeartbeatEJSt17reference_wrapperIN3zmq9context_tEES8_IKNS6_17HeartbeatSettingsEEEEESt6futureIbERbDpOT0_EUlOSB_OSE_E_SB_SE_EEbEC4EOSP_EUlvE_vEEE6_M_runEv
    @     0x7fecbddcae30 (unknown)
    @     0x7fecbd8e36aa start_thread
    @     0x7fecbd618eed (unknown)

I've ported most of the Python demo worker over to R. Unfortunately, the rzmq package won't do the job as it stands: it will only send raw (serialized) R objects, whereas stateline appears to expect a plain string (and certainly not a serialized R object).

I can fork the rzmq package and try to improve it, or we can write a more limited ZeroMQ interface if that's useful. Looking at the package code, there isn't a great deal there, so neither option should take much work, but I don't understand what's going on well enough to know the best way forward.
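The mismatch is easier to see with a rough analogy in Python, where pickle stands in for R's serialize(). Everything here is an assumption for illustration: the colon-delimited format and the function names are hypothetical, and this is not stateline's actual wire protocol. The point is only that a receiver which decodes a message as text cannot make sense of a language-specific serialized object.

```python
import pickle


def as_plain_string(values):
    """Encode numbers as a delimited UTF-8 string (hypothetical format)."""
    return ":".join(repr(float(v)) for v in values).encode("utf-8")


def as_serialized_object(values):
    """Encode the same numbers as a language-specific serialized object."""
    return pickle.dumps(list(values))


def parse_plain_string(payload):
    """What a text-expecting receiver might do with an incoming message."""
    return [float(field) for field in payload.decode("utf-8").split(":")]


if __name__ == "__main__":
    print(parse_plain_string(as_plain_string([1, 2.5])))
    try:
        parse_plain_string(as_serialized_object([1, 2.5]))
    except ValueError as error:
        # The pickled bytes are not valid text, so parsing fails.
        print("receiver rejected serialized payload:", type(error).__name__)
```

An R worker would therefore need a way to put the bare bytes of a string on the socket rather than a serialized object.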

That's where I've left this for now.
