Testing the VirtualFlow tutorial on AWS ParallelCluster
https://docs.virtual-flow.org/tutorials/-LdE94b2AVfBFT72zK-v/
- example config file for AWS ParallelCluster
[aws]
aws_region_name = us-east-1
[global]
cluster_template = default
update_check = true
sanity_check = true
[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}
[cluster default]
key_name = <KEY_NAME>
base_os = centos7
scheduler = slurm
master_instance_type = c5.xlarge
compute_instance_type = c5.2xlarge
maintain_initial_size = true
vpc_settings = default
master_root_volume_size = 1000
dcv_settings = dcv1
max_queue_size = 100
tags = {"Project": "ParallelCluster-virtualflow"}
[vpc default]
vpc_id = <VPC_ID>
master_subnet_id = <MASTER_SUBNET_ID>
compute_subnet_id = <COMPUTE_SUBNET_ID>
use_public_ips = false
[dcv dcv1]
enable = master
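- create the cluster with the ParallelCluster v2 CLI (this step is implied by the config above; the placeholders are illustrative, as elsewhere in these notes)
pcluster create -c <CONFIG_FILE> <CLUSTER_NAME>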
- the default Slurm settings did not work as-is; the changes below were needed
- edit the partition setting in templates/all.ctrl
partition=compute
- edit the Slurm settings in /opt/slurm/etc/slurm.conf
sudo emacs /opt/slurm/etc/slurm.conf
The value for RealMemory can be obtained by running /opt/slurm/sbin/slurmd -C on a compute node.
See also aws/aws-parallelcluster#1517.
NodeName=DEFAULT RealMemory=14938
include slurm_parallelcluster_nodes.conf
PartitionName=compute Nodes=ALL Default=YES MaxTime=INFINITE State=UP
- after changing these settings, restart slurmctld
sudo service slurmctld restart
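- to confirm the changes (a quick check, not part of the tutorial), the standard Slurm query commands can be used
sinfo                                   # the "compute" partition should be listed and UP
scontrol show nodes | grep RealMemory   # should report the RealMemory value set above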
https://docs.virtual-flow.org/documentation/-LdE8RH9UN4HKpckqkX3/vftools/installation-1
- install OpenBabel
sudo yum install openbabel
- install VFTools
wget https://github.com/VirtualFlow/VFTools/archive/master.tar.gz
tar -xvzf master.tar.gz
mv VFTools-master VFTools
- add VFTools to the PATH
export PATH="<parent folder>/VFTools/bin:$PATH"
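- quick sanity check (not part of the tutorial); to make the PATH change permanent, the same export line can be appended to ~/.bashrc
obabel -V                                    # prints the Open Babel version
command -v vfvs_pp_prepare_dockingposes.sh   # should resolve to the VFTools bin folder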
- https://docs.virtual-flow.org/tutorials/-LdE94b2AVfBFT72zK-v/vfvs-tutorial-1/the-completed-workflow
- the command in the tutorial document does not work; the fixed version is below
vfvs_pp_prepare_dockingposes.sh ../../../output-files/complete/qvina02_rigid_receptor1/results/ meta_tranch compounds dockingsposes overwrite
$ ./vf_start_jobline.sh 1 12 templates/template1.slurm.sh submit 1
:: :: :: ::::. :::::: :: :: .::::. :: ::::: :: .::::. :: ::
:: :: :: :: :: :: :: :: :: :: :: :: :: :: :: :: :: ::
:::: :: :::. :: :: :: :::::: :: ::::: :: :: :: ::::::::
:: :: :: :: :: :::: :: :: :::: :: ::::: '::::' :: ::
Syncing the jobfile of jobline 1 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 2 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 3 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 4 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 5 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 6 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 7 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 8 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 9 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 10 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 11 with the controlfile file ../../workflow/control/all.ctrl.
Syncing the jobfile of jobline 12 with the controlfile file ../../workflow/control/all.ctrl.
Submitted batch job 4
The job for jobline 1 has been submitted at Sun Jul 12 06:41:50 UTC 2020.
Submitted batch job 5
The job for jobline 2 has been submitted at Sun Jul 12 06:41:51 UTC 2020.
Submitted batch job 6
The job for jobline 3 has been submitted at Sun Jul 12 06:41:52 UTC 2020.
Submitted batch job 7
The job for jobline 4 has been submitted at Sun Jul 12 06:41:53 UTC 2020.
Submitted batch job 8
The job for jobline 5 has been submitted at Sun Jul 12 06:41:54 UTC 2020.
Submitted batch job 9
The job for jobline 6 has been submitted at Sun Jul 12 06:41:55 UTC 2020.
Submitted batch job 10
The job for jobline 7 has been submitted at Sun Jul 12 06:41:56 UTC 2020.
Submitted batch job 11
The job for jobline 8 has been submitted at Sun Jul 12 06:41:57 UTC 2020.
Submitted batch job 12
The job for jobline 9 has been submitted at Sun Jul 12 06:41:58 UTC 2020.
Submitted batch job 13
The job for jobline 10 has been submitted at Sun Jul 12 06:41:59 UTC 2020.
Submitted batch job 14
The job for jobline 11 has been submitted at Sun Jul 12 06:42:00 UTC 2020.
Submitted batch job 15
The job for jobline 12 has been submitted at Sun Jul 12 06:42:01 UTC 2020.
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
4 compute t-1.1 centos R 5:37 1 ip-10-0-19-5
5 compute t-2.1 centos R 5:34 1 ip-10-0-19-5
6 compute t-3.1 centos R 5:34 1 ip-10-0-19-5
7 compute t-4.1 centos R 5:34 1 ip-10-0-19-5
8 compute t-5.1 centos R 5:33 1 ip-10-0-19-5
9 compute t-6.1 centos R 5:31 1 ip-10-0-19-5
10 compute t-7.1 centos R 5:31 1 ip-10-0-19-5
11 compute t-8.1 centos R 5:28 1 ip-10-0-19-5
12 compute t-9.1 centos R 1:30 1 ip-10-0-22-46
13 compute t-10.1 centos R 1:30 1 ip-10-0-22-46
$ ./vf_report.sh -c workflow
:: :: :: ::::. :::::: :: :: .::::. :: ::::: :: .::::. :: ::
:: :: :: :: :: :: :: :: :: :: :: :: :: :: :: :: :: ::
:::: :: :::. :: :: :: :::::: :: ::::: :: :: :: ::::::::
:: :: :: :: :: :::: :: :: :::: :: ::::: '::::' :: ::
Sun Jul 12 06:48:26 UTC 2020
Workflow Status
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Joblines
................................................................................................
Number of jobfiles in the workflow/jobfiles/main folder: 12
Number of joblines in the batch system: 10
Number of joblines in the batch system currently running: 10
* Number of joblines in queue "compute" currently running: 10
Number of joblines in the batch system currently not running: 0
* Number of joblines in queue "compute" currently not running: 0
Number of cores/slots currently used by the workflow: 10
Collections
................................................................................................
Total number of ligand collections: 68
Number of ligand collections completed: 6
Number of ligand collections in state "processing": 10
Number of ligand collections not yet started: 52
Ligands (in completed collections)
................................................................................................
Total number of ligands: 1123
Number of ligands started: 8
Number of ligands successfully completed: 8
Number of ligands failed: 0
Dockings (in completed collections)
................................................................................................
Docking runs per ligand: 2
Number of dockings started: 16
Number of dockings successfully completed: 16
Number of dockings failed: 0
$ ./vf_report.sh -c vs -d qvina02_rigid_receptor1 -n 10
:: :: :: ::::. :::::: :: :: .::::. :: ::::: :: .::::. :: ::
:: :: :: :: :: :: :: :: :: :: :: :: :: :: :: :: :: ::
:::: :: :::. :: :: :: :::::: :: ::::: :: :: :: ::::::::
:: :: :: :: :: :::: :: :: :::: :: ::::: '::::' :: ::
Sun Jul 12 06:49:09 UTC 2020
Preliminary Virtual Screening Results
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Binding affinity - statistics
................................................................................................
Number of ligands screened with binding affinity between 0 and inf kcal/mole: 0
Number of ligands screened with binding affinity between -0.1 and -5.0 kcal/mole: 1
Number of ligands screened with binding affinity between -5.0 and -5.5 kcal/mole: 5
Number of ligands screened with binding affinity between -5.5 and -6.0 kcal/mole: 7
Number of ligands screened with binding affinity between -6.0 and -6.5 kcal/mole: 4
Number of ligands screened with binding affinity between -6.5 and -7.0 kcal/mole: 0
Number of ligands screened with binding affinity between -7.0 and -7.5 kcal/mole: 1
Number of ligands screened with binding affinity between -7.5 and -8.0 kcal/mole: 0
Number of ligands screened with binding affinity between -8.0 and -8.5 kcal/mole: 0
Number of ligands screened with binding affinity between -8.5 and -9.0 kcal/mole: 0
Number of ligands screened with binding affinity between -9.0 and -9.5 kcal/mole: 0
Number of ligands screened with binding affinity between -9.5 and -10.0 kcal/mole: 0
Number of ligands screened with binding affinity between -10.0 and -10.5 kcal/mole: 0
Number of ligands screened with binding affinity between -10.5 and -11.0 kcal/mole: 0
Number of ligands screened with binding affinity between -11.0 and -11.5 kcal/mole: 0
Number of ligands screened with binding affinity between -11.5 and -12.0 kcal/mole: 0
Number of ligands screened with binding affinity between -12.0 and -12.5 kcal/mole: 0
Number of ligands screened with binding affinity between -12.5 and -13.0 kcal/mole: 0
Number of ligands screened with binding affinity between -13.0 and -13.5 kcal/mole: 0
Number of ligands screened with binding affinity between -13.5 and -14.0 kcal/mole: 0
Number of ligands screened with binding affinity between -14.0 and -14.5 kcal/mole: 0
Number of ligands screened with binding affinity between -14.5 and -15.0 kcal/mole: 0
Number of ligands screened with binding affinity between -15.0 and -20.0 kcal/mole: 0
Number of ligands screened with binding affinity between -20.0 and -inf kcal/mole: 0
Binding affinity - highest scoring compounds
................................................................................................
Rank Ligand Collection Highest-Score
1 PV-001938623963_2_T1 GACBEG_00000 -7.0
2 Z3013447159_1_T1 GACCAD_00000 -6.3
3 PV-001873781580_1 HACBAE_00000 -6.3
4 PV-001873781580_2 HACBAE_00000 -6.0
5 PV-001847295098_1 HACBFF_00000 -6.0
6 PV-001938623963_1_T1 GACBEG_00000 -5.8
7 PV-001873781822_1 HACBAE_00000 -5.7
8 Z2801168368_1_T1 GACACC_00000 -5.6
9 Z2092504580_1_T1 GAFFCG_00000 -5.6
10 Z2092508107_1_T1 GAFFCG_00000 -5.6
$ vfvs_pp_prepare_dockingposes.sh ../../../output-files/complete/qvina02_rigid_receptor1/results/ meta_tranch compounds dockingsposes overwrite
*********************************************************************
Extracting the winning structrures
*********************************************************************
* The output folder dockingsposes does already exist. Removing...
* The file compounds.energies does already exist. Deleting...
*** Preparing structure Z2624037004_3 ***
GACEBG/00000.tar.gz
00000/Z2624037004_3_replica-1.pdbqt
9 molecules converted
9 files output. The first is Z2624037004_3.rank-1.pdb
9 molecules converted
9 files output. The first is Z2624037004_3.rank-1.sdf
9 molecules converted
*** Preparing structure Z2624037004_4 ***
GACEBG/00000.tar.gz
00000/Z2624037004_4_replica-1.pdbqt
5 molecules converted
5 files output. The first is Z2624037004_4.rank-1.pdb
5 molecules converted
5 files output. The first is Z2624037004_4.rank-1.sdf
5 molecules converted
*** Preparing structure Z2087260951_4 ***
GACEBG/00000.tar.gz
00000/Z2087260951_4_replica-1.pdbqt
9 molecules converted
9 files output. The first is Z2087260951_4.rank-1.pdb
9 molecules converted
9 files output. The first is Z2087260951_4.rank-1.sdf
9 molecules converted
*** Preparing structure Z2087256678_1 ***
GACEBG/00000.tar.gz
00000/Z2087256678_1_replica-1.pdbqt
9 molecules converted
9 files output. The first is Z2087256678_1.rank-1.pdb
9 molecules converted
9 files output. The first is Z2087256678_1.rank-1.sdf
9 molecules converted
*** Preparing structure Z2087260951_2 ***
GACEBG/00000.tar.gz
00000/Z2087260951_2_replica-1.pdbqt
9 molecules converted
9 files output. The first is Z2087260951_2.rank-1.pdb
9 molecules converted
9 files output. The first is Z2087260951_2.rank-1.sdf
9 molecules converted
$ pwd
/home/centos/virtualflow/VFVS_GK/pp/docking_poses/qvina02_rigid_receptor1/dockingsposes.plain
$ ls
100_Z1669288933_1_T1.pdb 32_Z2700583334_1.pdb 55_Z1175719058_2.pdb 78_Z2046069599_1.pdb
10_Z2624037004_1.pdb 33_Z2638723223_1.pdb 56_Z2505285340_1_T1.pdb 79_Z1668414848_2_T1.pdb
11_Z2211137992_4.pdb 34_Z2230216305_1_T1.pdb 57_Z2378042591_6_T1.pdb 7_Z2087256678_2.pdb
12_Z2211139111_1.pdb 35_Z1237025175_1.pdb 58_PV-001089728404_1_T1.pdb 80_PV-001873778304_2.pdb
13_Z2624037004_2.pdb 36_PV-001282503059_2.pdb 59_Z2364809982_1_T1.pdb 81_PV-001915879035_1.pdb
14_PV-001701895824_2.pdb 37_PV-001377853194_1_T1.pdb 5_Z2087260951_4.pdb 82_Z2144418621_1.pdb
15_Z2087260951_1.pdb 38_Z2364787117_1_T1.pdb 60_Z1897122191_4.pdb 83_Z2144418621_4.pdb
16_Z2700583334_2.pdb 39_PV-001288562049_2.pdb 61_PV-001826919885_7.pdb 84_Z2221447237_1.pdb
17_Z2211137992_2.pdb 3_Z2087256678_1.pdb 62_Z2717222271_4.pdb 85_Z1175719058_1.pdb
18_Z2211139111_2.pdb 40_PV-001702179999_2.pdb 63_Z2042828126_1.pdb 86_Z2723142397_2_T1.pdb
19_Z2700586182_2.pdb 41_Z2596550737_1.pdb 64_Z2144418621_2.pdb 87_Z829994926_1_T1.pdb
1_Z2624037004_4.pdb 42_Z2713537244_2.pdb 65_Z2593207602_1_T1.pdb 88_PV-000256089451_1_T1.pdb
20_PV-001958058751_2.pdb 43_Z1418769667_1.pdb 66_Z510613592_1_T2.pdb 89_PV-000979741330_1_T1.pdb
21_PV-001702179999_1.pdb 44_Z1418769667_2.pdb 67_PV-000902495780_1_T1.pdb 8_Z2087260951_3.pdb
22_PV-000380950674_1.pdb 45_PV-001288562049_1.pdb 68_PV-000902495780_2_T1.pdb 90_PV-001002400892_1_T1.pdb
23_PV-001958058751_1.pdb 46_PV-000378673869_1.pdb 69_Z2084379853_2_T1.pdb 91_PV-001378044208_2_T1.pdb
24_Z2211137992_3.pdb 47_PV-001826919885_1.pdb 6_PV-001701895824_1.pdb 92_PV-001743414951_1_T1.pdb
25_Z2717222271_1.pdb 48_PV-001826919885_2.pdb 70_Z1656518334_1.pdb 93_Z2364787117_2_T1.pdb
26_PV-000376279119_1.pdb 49_Z2211137992_1.pdb 71_Z1656518334_2.pdb 94_Z2366885184_1_T1.pdb
27_PV-000376279119_2.pdb 4_Z2087260951_2.pdb 72_Z1897122191_2.pdb 95_Z1103196794_1_T1.pdb
28_PV-001958058751_4.pdb 50_Z812712648_1.pdb 73_Z2155602585_3.pdb 96_Z1897122191_3.pdb
29_Z2700586182_1.pdb 51_Z812712648_5.pdb 74_Z2155602585_4.pdb 97_PV-001826919885_5.pdb
2_Z2624037004_3.pdb 52_Z812712648_8.pdb 75_Z2717222271_2.pdb 98_PV-001826919885_6.pdb
30_PV-000286243379_1_T1.pdb 53_Z2713204296_1.pdb 76_Z2717222271_3.pdb 99_Z812712648_7.pdb
31_Z2893380031_1_T1.pdb 54_Z449211618_1.pdb 77_PV-001287209271_1.pdb 9_PV-001958058751_3.pdb
- connect to the cluster via NICE DCV
pcluster dcv connect -k <KEY_NAME> <CLUSTER_NAME>
- visualize the docking pose PDB files in PyMOL
- some docking simulations failed
- use Spot instances with Slurm (the SBATCH --requeue option); see the sketch below
- adjust the number of joblines
- use FSx for Lustre; see the sketch below
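- for the Spot and FSx items, the ParallelCluster v2 config could be extended along the following lines; this is only a sketch based on the v2 parameter names (cluster_type, fsx_settings, shared_dir, storage_capacity), and the values shown are untested placeholders
[cluster default]
cluster_type = spot
fsx_settings = fs
[fsx fs]
shared_dir = /fsx
storage_capacity = 1200
In the Slurm job template, #SBATCH --requeue would let jobs interrupted by Spot reclaims be requeued automatically.
- the script below walks a collection folder laid out as <input_folder>/<metatranche>/<tranche>.tar and writes one "<tranche>_<collection> <ligand count>" line per collection archive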
#!/bin/bash -xe
# Build a collections list ("<tranche>_<collection> <ligand count>" per line) from a
# ligand library folder laid out as <input_folder>/<metatranche>/<tranche>.tar,
# where each tranche tar is assumed to contain <tranche>/<collection>.tar.gz files.
input_folder=CF
temp_folder=tmp
output_filename=collections.txt
mkdir -p ${temp_folder}/${input_folder}
for metatranche in $(ls ${input_folder}); do
    for tranche in $(ls ${input_folder}/${metatranche}); do
        echo " * Extracting ${tranche} to ${temp_folder}"
        tar -xf ${input_folder}/${metatranche}/${tranche} -C ${temp_folder}/${input_folder} || true
        tranche_name=${tranche%%.*}   # e.g. GACEBG.tar -> GACEBG
        for file in $(ls ${temp_folder}/${input_folder}/${tranche_name}); do
            echo " * Adding file ${temp_folder}/${input_folder}/${tranche_name}/${file} to ${output_filename}"
            # number of ligands (pdbqt files) inside the collection archive
            count=$(tar tf ${temp_folder}/${input_folder}/${tranche_name}/${file} | grep "\.pdbqt" | wc -l)
            echo "${tranche_name}_${file%%.*} ${count}" >> ${output_filename}
        done
    done
done
rm -r ${temp_folder}