Created November 14, 2024 19:53
extract.sh:

#!/bin/bash
type=$1
num=$2
datafile=$3
$type -n $num $datafile > $datafile.sub
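The script takes three positional arguments — a command (e.g. `head` or `tail`), a line count, and a data file — and writes the command's output to `<datafile>.sub`. A quick local dry run (the sample file name here is made up for illustration):

```shell
# Make a 5-line stand-in data file
printf 'line1\nline2\nline3\nline4\nline5\n' > sample.fastq

# Same logic as extract.sh above, written to a local copy for the demo
cat > extract_demo.sh <<'EOF'
#!/bin/bash
type=$1
num=$2
datafile=$3
$type -n $num $datafile > $datafile.sub
EOF

bash extract_demo.sh head 3 sample.fastq
wc -l sample.fastq.sub    # the .sub file holds the first 3 lines
```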
[christina.koch@ap40 automation]$ cat simple.def
Bootstrap: docker
From: hub.opensciencegrid.org/htc/rocky:9

%files
extract.sh /opt

%environment
export PATH=/opt/:$PATH
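Assuming Apptainer is available on the access point, the definition file can be built into a `.sif` image and staged to the OSDF location that extract.py references. The local filesystem path used for the OSDF namespace below is an assumption (it mirrors the `osdf:///` URL in extract.py); adjust for your own account:

```shell
# Build the image from the definition file shown above
apptainer build extract.sif simple.def

# Stage it to the OSDF origin directory referenced by extract.py
# (assumed local path for osdf:///ospool/ap40/data/christina.koch/...)
cp extract.sif /ospool/ap40/data/christina.koch/singularity_imgs/extract.sif
```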
extract.py:

import htcondor
import classad

input_list = ["SRR2584866_2", "SRR2589044_1", "SRR2589044_2", "SRR2584866_1", "SRR2584863_1", "SRR2584863_2"]

itemdata = []
for samplename in input_list:
    itemdata.append({"sample": samplename})

function = 'head'
num = 5
upload_bucket = "osdf:///ospool/ap40/data/christina.koch/output-buckets"
singularity_image = "osdf:///ospool/ap40/data/christina.koch/singularity_imgs/extract.sif"
input_bucket = "osdf:///ospool/uc-shared/public/osg-training/tutorial-fastqc/data"

job = htcondor.Submit({
    "container_image": singularity_image,
    "universe": "container",
    "executable": "/opt/extract.sh",
    "transfer_executable": "False",
    "should_transfer_files": "True",
    "transfer_input_files": f"{input_bucket}/$(sample).trim.sub.fastq",
    "arguments": f"{function} {num} $(sample).trim.sub.fastq",
    "transfer_output_remaps": f'"$(sample).trim.sub.fastq.sub={upload_bucket}/$(ClusterID)/$(sample).trim.sub.fastqc.sub"',
    "output": "logs/extract-$(ProcId).out",  # output and error for each job, using the $(ProcId) macro
    "error": "logs/extract-$(ProcId).err",
    "log": "logs/extract.log",  # we still send all of the HTCondor logs for every job to the same file (not split up!)
    "request_cpus": "1",
    "request_memory": "1GB",
    "request_disk": "1GB",
})

schedd = htcondor.Schedd()
submit_result = schedd.submit(job, itemdata=iter(itemdata))  # submit one job for each item in the itemdata
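The `$(sample)` macro in the submit description is expanded once per itemdata row, so the six-entry list above yields six jobs. A small, HTCondor-free sketch of that expansion (the string substitution here only mimics what `Submit` does internally):

```python
# Mimic HTCondor's per-job macro expansion over the itemdata rows
input_list = ["SRR2584866_2", "SRR2589044_1", "SRR2589044_2",
              "SRR2584866_1", "SRR2584863_1", "SRR2584863_2"]
itemdata = [{"sample": s} for s in input_list]

function, num = "head", 5
arg_template = f"{function} {num} $(sample).trim.sub.fastq"

# One expanded arguments string per job
per_job_args = [arg_template.replace("$(sample)", row["sample"]) for row in itemdata]

print(len(per_job_args))   # 6 jobs, one per sample
print(per_job_args[0])     # head 5 SRR2584866_2.trim.sub.fastq
```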
slack transcript:
- can you use the python bindings to submit jobs to a remote schedd, without having a full htcondor install on your local machine?
- https://htcondor.readthedocs.io/en/latest/apis/python-bindings/api/version2/htcondor2/schedd.html#htcondor2.Schedd
- You would generally write something like
collector = htcondor2.Collector("remote-pool-cm.host.tld")
location = collector.locate(htcondor2.DaemonType.Schedd, "name-of-schedd-if-more-than-one")
schedd = htcondor2.Schedd(location)
- You do not need anything installed beyond the bindings, but you may need a Condor config file to set authentication-related settings (or set them directly in your python script)
- (... I don't remember what locate() returns if it fails to find anything; if it's None you may need to check for that explicitly to avoid accidentally submitting/trying to submit to the local schedd, instead.)
- But if you have an IDToken in the usual location, that will work without any config files.
- …or do the bindings complain if the default config file doesn’t exist?
- The bindings may well complain, but I think they work anyway.
- They spit out a warning if they don't see a config file
- A blank file in the right spot, or even pointing the CONDOR_CONFIG environment variable at /dev/null, is enough to keep it from emitting that warning
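Pulling the thread together, a hedged sketch of remote submission with an explicit check for a failed locate() — the host name and schedd name are placeholders, and (as noted in the transcript) what locate() returns on failure was not confirmed, so the guard below is an assumption:

```python
import htcondor2

collector = htcondor2.Collector("remote-pool-cm.host.tld")  # placeholder host
location = collector.locate(htcondor2.DaemonType.Schedd,
                            "name-of-schedd-if-more-than-one")

# Guard against silently falling back to the local schedd if lookup failed
# (assumes a falsy return on failure, which the thread did not confirm)
if not location:
    raise RuntimeError("could not locate the remote schedd")

schedd = htcondor2.Schedd(location)
```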
container.def and extract.sh are used to build a container, which can be added to the OSDF. extract.py takes a list of input data and submits a job (using the container/script) for each item, putting the results back into an OSDF location.