Running on qwin (clusterhyp2)
Drop the raw files and the fasta in /lustre_qubic/MaxQuantShare/<taskname>:
.
├── inputfile1.raw
├── inputfile2.raw
├── mouse.fasta
└── params.json
and then run
touch START
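The staging steps above can be scripted; a minimal sketch, where the function name `stage_task` and the example file names are ours, while the share layout and the START trigger come from the description above:

```shell
# Stage a MaxQuant task directory; stage_task and "mytask" are
# illustrative names, START is the file the watcher reacts to.
stage_task() {
    local share="$1" task="$2"
    shift 2
    mkdir -p "$share/$task"
    cp -- "$@" "$share/$task/"      # raw files, fasta and params.json
    touch "$share/$task/START"      # signal the watcher to start MaxQuant
}

# Example:
# stage_task /lustre_qubic/MaxQuantShare mytask inputfile1.raw inputfile2.raw mouse.fasta params.json
```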
Param file format:

globalParams:
  defaults: default
  fixedModifications: ["Carbamidomethyl (C)"]
rawFiles:
  - files:
      - name: inputfile1
        fraction: 1
      - name: inputfile2
        fraction: 2
    params:
      defaults: default
      variableModifications:
        - "Acetyl (Protein N-term)"
        - "Oxidation (M)"
fastaFiles:
  fileNames: ["mouse"]
  firstSearch: []
MSMSParams:
  defaults: default
topLevelParams:
  defaults: default
MaxQuant should start. Watch the files STATUS, FAILED and SUCCESS in the task directory.
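A minimal polling sketch; the function name `wait_for_result` and the 5-second interval are our choices, the SUCCESS/FAILED/STATUS files are the ones named above:

```shell
# Poll the task directory until MaxQuant reports a result.
wait_for_result() {
    local task_dir="$1"
    while true; do
        if [ -e "$task_dir/SUCCESS" ]; then
            echo "success"
            return 0
        elif [ -e "$task_dir/FAILED" ]; then
            echo "failed"
            return 1
        fi
        sleep 5     # STATUS could be inspected here for progress
    done
}
```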
Still issues to work out:
- The test on Monday timed out after 5 h because I forgot to change the default timeout...
- Also: improve documentation, write more tests
- Resources on the hypervisor?
- Status of QEMU on the cluster?
Adrian Seyboldt
#!/bin/bash
# Initialise the gUSE portal after boot: detect the current IP address,
# reset the information service tables if the IP changed, patch the portal
# URL in the config, and walk through the configuration wizard.
. ~/.guserc

export CATALINA_PID="$CATALINA_HOME"/catalina.pid

# Find the IP address of the default network interface.
DEFAULT_IFACE=$(/sbin/ip route | awk '/default/ { print $5 }')
IP_ADDR=$(/sbin/ip addr show "$DEFAULT_IFACE" | awk '/inet /{split($2,a,"/");print a[1]}')
OLD_IP=$(cat ~/.prev_init_IP 2> /dev/null)

# If the IP changed since the last run, drop the information service tables
# so the wizard recreates them.
if [ "$OLD_IP" != "$IP_ADDR" ]; then
    mysql -u guse --password="guse" guse -e "drop table OPENJPA_SEQUENCE_TABLE, OptionBean, service, servicecomdef, serviceresource, serviceproperties, services_user, servicetype, serviceuser;"
fi

# Point the portal URL in the information service config at the current IP.
sed -i -e "s|<property name=\"portal.url\" value=\"http://.*:8080/wspgrade\" />|<property name=\"portal.url\" value=\"http://$IP_ADDR:8080/wspgrade\" />|" "$CATALINA_HOME"/webapps/information/WEB-INF/config/service.xml

# Start Tomcat and wait until it reports a finished startup.
"$CATALINA_HOME"/startup.sh
while true; do
    tail "$CATALINA_HOME"/logs/catalina.out | grep -q "INFO: Server startup in " && break
    sleep 5
done

# Walk through the information service wizard step by step.
wget -o /dev/null -O /dev/null --http-user=admin --http-password=admin "http://$IP_ADDR:8080/information/wizzard?pdriver=org.gjt.mm.mysql.Driver&pjdbc=jdbc:mysql://localhost/guse&puser=guse&ppass=guse"
sleep 5
wget -o /dev/null -O /dev/null --http-user=admin --http-password=admin "http://$IP_ADDR:8080/information/wizzard?phost=http://$IP_ADDR:8080"
sleep 5
wget -o /dev/null -O /dev/null --http-user=admin --http-password=admin "http://$IP_ADDR:8080/information/wizzard?pfine=ok"
sleep 5

# Re-register the service URLs only when the IP actually changed.
if [ "$OLD_IP" != "$IP_ADDR" ]; then
    wget -o /dev/null -O /dev/null --http-user=admin --http-password=admin "http://$IP_ADDR:8080/information/wizzard?pwspgrade=http://$IP_ADDR:8080/wspgrade&pwfs=http://$IP_ADDR:8080/wfs&presource=http://$IP_ADDR:8080/dci_bridge_service&pstatvisualizer=http://$IP_ADDR:8080/statvisualizer&pstorage=http://$IP_ADDR:8080/storage&pwfi=http://$IP_ADDR:8080/wfi"
    sleep 5
fi

wget -o /dev/null -O /dev/null --http-user=admin --http-password=admin "http://$IP_ADDR:8080/information/init?service.url=http://$IP_ADDR:8080/information&resource.id=/services/urn:infoservice&guse.system.database.driver=com.mysql.jdbc.Driver&guse.system.database.url=jdbc:mysql://localhost/guse&guse.system.database.user=guse&guse.system.database.password=guse"
wget -o /dev/null -O /dev/null "http://$IP_ADDR:8080/stataggregator"

# Remember the IP for the next run.
echo "$IP_ADDR" > ~/.prev_init_IP
- ~200 000 lines of code in > 1000 files
- no one here understands the details
- if it breaks, we can't repair it
Writing gUSE workflows is easy for non-developers
But we are developers
- Few restrictions on workflows (shell script, Makefile, KNIME(?), Galaxy, ruffus, snakemake, IPython.parallel, ...)
- Run jobs manually, without support from the database or web server
- Versioning of workflows (reproducibility)
- Individual UI for each workflow; the web server does not care about it
- UI can use OpenBis metadata
- Restart each component without losing work or stopping jobs
- Report crashes and failures
- No user can see or manipulate the jobs of other users
- Isolate the web server: if the web server is hacked, minimise the damage
- No unencrypted/unauthenticated communication
- Database
    - Stores workflow metadata
    - Stores the status of jobs ("running workflows")
- Web server
    - Stateless, read-only db access
    - Talks to the db and the bootstrap script
- State server
    - Workflows inform this server about their state
    - Changes the db
- Bootstrap script / server
    - Starts new jobs on the cluster
simple_workflow
├── meta.json
├── start_job
└── webview.js
- Does no work on its own
- Uses the cluster with DRMAA or qsub
- Handles SIGTERM
- Talks to the state server
- Expects parameters in conf.json
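A hypothetical conf.json might look like this; all keys are illustrative, since the real schema would be defined per workflow via the form_schema in meta.json:

```json
{
    "task_name": "example_run",
    "fasta": "mouse.fasta",
    "raw_files": ["inputfile1.raw", "inputfile2.raw"]
}
```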
webview.js contains functions that create the web UI and the parameter file conf.json
var view = {
    // Build the submission form from the JSON schema in meta.json.
    make_submit_view: function (elem, meta, submit) {
        elem.append('<form class="submit-form"></form>');
        elem.children().jsonForm({
            schema: meta['form_schema'],
            onSubmit: submit
        });
    },
    // Render the status view for a running job (not implemented yet).
    make_job_view: function (elem, meta, job) {
    }
};
- Unfinished
- No other users
- Do we have a JavaScript developer?
- How to handle hanging/crashing jobs?
- ZeroMQ security is new
- Needs networking on the cluster (ZeroMQ)
- Firewall on the cluster?
- How to store / pass around OpenBis passwords?