Documentation¶
Submitting a task with MyQueue typically works like this:
$ mq submit <task> -R <resources>
or:
$ mq submit "<task> <arguments>" -R <resources>
And checking the result looks like this:
$ mq list -s <states> # or just: mq ls
Below, we describe the important concepts Tasks, Arguments, Resources and States.
Tasks¶
There are five kinds of tasks: Python module, Function in a Python module, Python script, Shell command and Shell-script.
Python module¶
Examples:
modulemodule.submodule(a Python submodule)
These are executed as python3 -m module so Python must be able to import
the modules.
Function in a Python module¶
Examples:
module@functionmodule.submodule@function
These are executed as python3 -c "import module; module.function(...) so
Python must be able to import the function from the module.
Python script¶
Examples:
script.py(usescript.pyin folders where tasks are running)./script.py(usescript.pyfrom folder where tasks were submitted)/path/to/script.py(absolute path)
Executed as python3 script.py.
Shell command¶
Example:
shell:command
The command must be in $PATH.
Shell-script¶
Example:
./script
Executed as . ./script.
Arguments¶
All tasks can take extra arguments by enclosing task and arguments in quotes like this:
"<task> <arg1> <arg2> ..."
Arguments will simply be added to the command-line that executes the task, except for Function in a Python module tasks where the arguments are converted to Python literals and passed to the function. Some examples:
$ mq submit "script.py ABC 123"
would run python3 script.py ABC 123 and:
$ mq submit "mymod@func ABC 123"
would run python3 -c "import mymod; mymod.func('ABC', 123).
Using a Python virtual environment¶
If a task is submitted from a virtual environment then that venv will also
be activated in the script that runs the task. MyQueue does this by looking
for an VIRTUAL_ENV environment variable.
Resources¶
A resource specification has the form:
[use-mpi:]cores[:processes][:gpus][:nodename]:tmax[:weight]
use-mpi: Usep(as in parallel) if you want to start your job withmpiexec; uses(as in serial) if you don’t. The default is to usempiexecunless you have specified'use_mpi': Falsein your configuration file: Using MPI or not?.cores: Number of cores to reserve.processes: Number of MPI processes to start (defaults to number of cores).gpus: Number of GPUs per node to allocate. Example:4G. Default is no GPUs (0G).nodename: Node-name (defaults to best match in the list of node-types).tmax: Maximum time (use s, m, h and d for seconds, minutes, hours and days respectively). Examples:1h,2d. Default is10m.weight: weight of a task. Can be used to limit the number of simultaneously running tasks. See Task weight. Defaults to 0.
Both the submit and resubmit commands
as well as the myqueue.task.task() function, take
an optional resources argument (-R or --resources).
Default resources are a modest one core and 10 minutes.
Examples:
1:1h1 core and 1 process for 1 hour64:xeon:2d64 cores and 64 processes on “xeon” nodes for 2 days24:1:30m24 cores and 1 process for 30 minutes (useful for OpenMP tasks)s:24:30m24 cores and 24 processes for 30 minutes (useful for tasks that do their own mpiexec call)96:4:4G:10h96 cores and 4 processes and 4 GPUs per node for 10 hours
Resources can also be specified via special comments in scripts:
# MQ: resources=40:1d
from somewhere import run
run('something')
Preamble¶
The value of the MYQUEUE_PREAMBLE environment variable
will be inserted at the beginning of the script that will be
submitted.
Tip
To see the script that you are about to submit, use:
$ mq submit ... -vz # --verbose --dry-run
States¶
These are the 8 possible states a task can be in:
queued |
waiting for resources to become available |
hold |
on hold |
running |
actually running |
done |
successfully finished |
FAILED |
something bad happened |
MEMORY |
ran out of memory |
TIMEOUT |
ran out of time |
CANCELED |
a dependency failed or ran out of memory or time |
The the -s or --states options of the
list, resubmit, remove and
modify use the following abbreviations: q, h, r,
d, F, C, M and T. It’s also possible to use a as a
shortcut for the all the “good” states qhrd and A for the “bad” ones
FCMT.
Examples¶
Sleep for 2 seconds on 1 core using the
time.sleep()Python function:$ mq submit "time@sleep 2" -R 1:1m 1 ./ time@sleep 2 +1 1:1m 1 task submitted
Run the
echo helloshell command in two folders (using the defaults of 1 core for 10 minutes):$ mkdir f1 f2 $ mq submit "shell:echo hello" f1/ f2/ Submitting tasks: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2/2 2 ./f1/ shell:echo hello +1 1:10m 3 ./f2/ shell:echo hello +1 1:10m 2 tasks submitted
Run
script.pyon 8 cores for 10 hours:$ echo "x = 1 / 0" > script.py $ mq submit script.py -R 8:10h 4 ./ script.py 8:10h 1 task submitted
You can see the status of your jobs with:
$ mq list
id folder name args info res. age state time error
── ────── ────────── ───── ──── ───── ──── ────── ──── ───────────────────────────────────
1 ./ time@sleep 2 +1 1:1m 0:02 done 0:02
2 ./f1/ shell:echo hello +1 1:10m 0:00 done 0:00
3 ./f2/ shell:echo hello +1 1:10m 0:00 done 0:00
4 ./ script.py 8:10h 0:00 FAILED 0:00 ZeroDivisionError: division by zero
── ────── ────────── ───── ──── ───── ──── ────── ──── ───────────────────────────────────
done: 3, FAILED: 1, total: 4
Remove the failed and done jobs from the list with (notice the dot meaning the current folder):
$ mq remove -s Fd -r .
1 ./ time@sleep 2 +1 1:1m 0:02 done 0:02
2 ./f1/ shell:echo hello +1 1:10m 0:00 done 0:00
3 ./f2/ shell:echo hello +1 1:10m 0:00 done 0:00
4 ./ script.py 8:10h 0:00 FAILED 0:00 ZeroDivisionError: division by zero
4 tasks removed
The output files from a task will look like this:
$ ls -l f2
total 4
-rw-rw-r-- 1 jensj jensj 0 Oct 28 10:46 shell:echo.3.err
-rw-rw-r-- 1 jensj jensj 6 Oct 28 10:46 shell:echo.3.out
$ cat f2/shell:echo.3.out
hello
If a job fails or times out, then you can resubmit it with more resources:
$ mq submit "shell:sleep 4" -R 1:2s
5 ./ shell:sleep 4 +1 1:2s
1 task submitted
$ mq list
id folder name args info res. age state time
── ────── ─────────── ──── ──── ──── ──── ─────── ────
5 ./ shell:sleep 4 +1 1:2s 0:02 TIMEOUT 0:02
── ────── ─────────── ──── ──── ──── ──── ─────── ────
TIMEOUT: 1, total: 1
$ mq resubmit -i 5 -R 1:1m
6 ./ shell:sleep 4 +1 1:1m
1 task submitted