[MINC-users] Announce - qbatch 1.0 - Execute shell command lines in parallel (serial farm) on SGE/PBS clusters

Gabriel A. Devenyi gdevenyi at gmail.com
Wed May 11 21:42:37 EDT 2016


Thanks for your feedback! See responses inline.

On Wed, May 11, 2016 at 8:30 PM, Andrew Janke a.janke at gmail.com wrote:

I’ve taken a proper look at this now. It currently is in no way a drop-in
replacement, but from what I can see it shouldn’t be too hard. I’m
more than happy to do away with my old Perl version; for one, your
handling of dependencies is much nicer and self-contained. The big
differences are:

qbatch (mine) was designed to be equivalent to nohup, i.e. write a
command, test it, then add qbatch:

$ for i in *.mnc; do process_something.sh -args $i; done

When happy you’d do this:

$ for i in *.mnc; do qbatch --queue my.q --name $i -- process_something.sh -args $i; done

Whereas qbatch (python) expects commands to be in a file, which makes it
difficult to use things like globs with qbatch:

$ qbatch --logfile blah.log -- do_something_args.pl *.mnc

I see a PR which is along these lines:

https://github.com/pipitone/qbatch/issues/90

How hard would it be to support a single command after “--”? I note
that you can pipe things to qbatch via echo, but this is a real pain to
use from Perl and Python scripts. OK in shell though.

We do support job lists input via STDIN:

# From this:
$ for i in *.mnc; do process_something.sh -args $i; done
# To this:
$ for i in *.mnc; do echo process_something.sh -args $i; done | qbatch -

Would this work for you?
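For the record, that STDIN pattern can be driven entirely from shell; a minimal sketch (process_something.sh, the .mnc files, and the paths are all placeholders, and the actual submissions are commented out since they need a cluster):

```shell
# Build a job list, one command per line, then hand it to qbatch either
# as a file argument or on STDIN.
rm -rf /tmp/qbatch_demo && mkdir -p /tmp/qbatch_demo && cd /tmp/qbatch_demo
touch a.mnc b.mnc                      # stand-in input files
: > joblist.txt
for i in *.mnc; do
    echo "process_something.sh -args $i" >> joblist.txt
done
cat joblist.txt                        # review before submitting
# Then either:
#   qbatch joblist.txt
# or:
#   qbatch - < joblist.txt
```

The file variant has the advantage that you can inspect and re-submit the exact same job list later.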

We’ve thought a bit about the single-command case, as you’ve seen; we may
end up providing a simple wrapper to handle your use case.

The version with “--” is also interesting, hadn’t thought of that,
will note it in the issue. Cleaner without needing a wrapper.

I find diagnostic output in the logfiles invaluable when debugging
things, stuff like the hostname and time of execution:

https://github.com/andrewjanke/qbatch/blob/master/qbatch#L151

and

https://github.com/andrewjanke/qbatch/blob/master/qbatch#L172

This means that I can quickly grep for “exit status” on a directory of logfiles.

On our main PBS cluster the prolog and epilog scripts provide this for
us. I’ve actually been scraping the net for prolog and epilog examples
for SGE and have yet to find any.

Would you be amenable to allowing --header-file and --footer-file
inserts into the joblist? I’m not sure about hard-coding that kind of
status line into the tool.

I’ll open an issue to discuss this further:
https://github.com/pipitone/qbatch/issues/93
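For illustration only, since --header-file/--footer-file are just a proposal at this point: the inserted files might carry exactly the prolog/epilog-style lines discussed above. A sketch (flag semantics and file paths are assumptions, not an existing qbatch interface):

```shell
# Hypothetical header/footer snippets that qbatch would prepend/append
# to each generated job script, mimicking PBS prolog/epilog output.
cat > /tmp/qbatch_header.sh <<'EOF'
echo "== start: host=$(hostname) time=$(date) =="
EOF
cat > /tmp/qbatch_footer.sh <<'EOF'
echo "== exit status: $? time=$(date) =="
EOF
```

Logs produced this way stay greppable in the same way as before, e.g. `grep "exit status" logs/*.log`.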

I note slight differences in how we handle cwd; is there a
reason for your cd?

https://github.com/pipitone/qbatch/blob/master/bin/qbatch#L67

I found this problematic for NFS mounts when using relative paths, and
let gridengine sort things out itself:

https://github.com/andrewjanke/qbatch/blob/master/qbatch#L108

You’re pointing to the “local” template, which is for qbatch to
parallelize work on a local machine; for the SGE and PBS templates we
use the cluster workdir features.

See https://github.com/pipitone/qbatch/blob/master/bin/qbatch#L51 and
https://github.com/pipitone/qbatch/blob/master/bin/qbatch#L34

There are a number of smaller things (a job name longer than 15
characters == FAIL in PBS). This causes issues when tracking
dependencies, as your unique job name may be longer than this. You can
either bomb and error out to the user (choose a shorter job name!) or
you have to track an internal database of job numbers (capture the
return value of qsub) and match these to input job names. Currently I
think your implementation will fail on this.

If we were using PBS’s awful qstat text output, which is indeed
truncated, that would be the case; however, we’re parsing PBS’s XML
qstat output, which gives the full job name properly (at least on our
version of 4.2.something).

See https://github.com/pipitone/qbatch/blob/master/bin/qbatch#L138-L171
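To illustrate why the XML output sidesteps the truncation: the <Job_Name> element carries the full name, so even a crude text extraction recovers it. qbatch uses a proper parser; the one-liner and the sample name below are just a sketch over canned XML:

```shell
# Sample fragment of what `qstat -x` might emit for a long job name.
xml='<Job><Job_Name>a-very-long-unique-jobname-exceeding-pbs-limits</Job_Name></Job>'
# Pull out the element, then strip the tags to get the full name.
echo "$xml" | grep -o '<Job_Name>[^<]*</Job_Name>' | sed 's/<[^>]*>//g'
# -> a-very-long-unique-jobname-exceeding-pbs-limits
```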

There are also issues around re-running jobs:

https://github.com/andrewjanke/qbatch/blob/master/qbatch#L103

as this can also screw up dependency checking. In your case I think
this is handled better, as you query based upon name. I couldn’t see
from the code: can you specify a dependency as a job number?

I hadn’t thought at all about job re-running. Currently the dependency
support relies on the cluster, so I’m not sure of the semantics of a
“soft failure” that allows a re-run: does the job get back its old job
number? If it does, this should work; if it doesn’t, I presume the
cluster kills the dependent job. Will need to investigate this
further.

See https://github.com/pipitone/qbatch/issues/94

As for depending on a job number, we don’t handle that right now, but
it’s easily added; noted here: https://github.com/pipitone/qbatch/issues/95
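For reference, a job-number dependency maps directly onto the two schedulers' native qsub flags; a small illustrative wrapper (the helper function is made up for this sketch, but `-W depend=afterok:` and `-hold_jid` are standard PBS and SGE qsub options):

```shell
# Translate (scheduler, job id) into the native dependency flag.
depend_flag() {
    local system=$1 jobid=$2
    case "$system" in
        pbs) echo "-W depend=afterok:$jobid" ;;   # PBS/Torque: run after job succeeds
        sge) echo "-hold_jid $jobid" ;;            # SGE: hold until job completes
    esac
}
depend_flag pbs 12345   # -> -W depend=afterok:12345
depend_flag sge 12345   # -> -hold_jid 12345
# e.g.  qsub $(depend_flag pbs 12345) job.sh
```

SGE's `-hold_jid` also accepts job-name patterns, which is what the name-based matching mentioned below relies on.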

ta

Thanks for all your feedback!


a

On 10 May 2016 at 21:23, Gabriel A. Devenyi gdevenyi at gmail.com wrote:

Simon/Andrew, indeed this was written as an intended replacement for
sge_batch and/or the original qbatch (we at CoBrALab also had an earlier
internal qbatch which was not Andrew’s version).
Having said that, we didn’t ensure that it was a one-to-one drop-in
replacement, so use may require some modification.

Andrew:
Yes, dependencies in PBS/SGE (and LSF, but I haven’t tested LSF at all yet)
are already implemented, just missing from the README, will add :) My
antsRegistration-MaGeT pipeline makes extensive use of them already.

For PBS we parse the XML joblist and do pattern matching on names:
https://github.com/pipitone/qbatch/blob/master/bin/qbatch#L138-L171
And for SGE we use jobname pattern matching built into SGE:
https://github.com/pipitone/qbatch/blob/master/bin/qbatch#L381-L382

As for the queue, I have an in-progress branch to specify “other” batch
options via an environment variable; see
https://github.com/pipitone/qbatch/pull/80
I just need to solve the append vs. replace issue I’m having with argparse.
Thoughts welcome :)

For bugs/feature requests/modifications, please feel free to request! Our
goal is to provide a maintained tool to support existing workflows and
enable new ones.

--
Gabriel A. Devenyi B.Eng. Ph.D.
Research Computing Associate
Computational Brain Anatomy Laboratory
Cerebral Imaging Center
Douglas Mental Health University Institute
Affiliate, Department of Psychiatry
McGill University
t: 514.761.6131x4781
e: gdevenyi at gmail.com

On Tue, May 10, 2016 at 9:07 AM, Andrew Janke a.janke at gmail.com wrote:

Very tasty.

Was it designed as a drop in replacement for qbatch?

https://github.com/andrewjanke/qbatch/blob/master/qbatch

I note a number of similarities in the command-line arguments. Any plans
for handling dependencies? I see code in there but no arguments to make
use of it. In particular, to get around the hoops of tracking
dependencies via job numbers (which my qbatch returns) in PBS and/or
via names in gridengine? i.e.:

$ for i in *.mnc; do qbatch --name STEP1-$i -- <command>; done
$ qbatch --name STEP2 --depends 'STEP1-*' -- <command>

I ask as I use qbatch internally in things like volgenmodel in order
to abstract away the various qsubs. Currently I handle this via
some pretty horrendous Perl code here:

https://github.com/andrewjanke/volgenmodel/blob/master/volgenmodel#L751

It’d be nice to be able to do away with it via your version!

I also find the notion of being able to define a queue via an ENV var
useful when you have a number of levels of scripts/Perl/etc. that don’t
always pass arguments through, i.e.:

https://github.com/andrewjanke/qbatch/blob/master/qbatch#L69

a

On 10 May 2016 at 03:38, Gabriel A. Devenyi gdevenyi at gmail.com wrote:

We (Gabriel A. Devenyi and Jon Pipitone) would like to announce the 1.0
release (https://github.com/pipitone/qbatch/releases) of qbatch, a
command-line tool for easily running a list of commands in parallel
(serial farming) on a compute cluster. This tool takes the list of
commands, divides them up into batches of arbitrary size, and then
submits each batch as a separate job or as part of an array job. qbatch
also gives you a consistent interface for submitting commands on PBS and
SGE clusters or locally (support for others is planned/in testing, PRs
welcome), while setting requirements for processors, walltime, memory,
and job dependencies. This tool can be used as a quick interface to
spread work out on a cluster, or as the glue for connecting a simple
pipeline to a cluster (see
https://github.com/CobraLab/antsRegistration-MAGeT for a sample
implementation).
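A minimal serial-farming sketch of the workflow described above (the -c chunk-size option name is an assumption from this description, and the actual submission needs a cluster, so it is commented out; check qbatch --help for the real interface):

```shell
# Generate 100 trivial commands, one per line, as a job list.
mkdir -p /tmp/qbatch_quickstart
for i in $(seq 1 100); do
    echo "echo processing item $i"
done > /tmp/qbatch_quickstart/joblist
wc -l < /tmp/qbatch_quickstart/joblist   # 100 commands
# Submit in chunks of e.g. 10 commands per job:
#   qbatch -c 10 /tmp/qbatch_quickstart/joblist
```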

The target audience of qbatch is two-fold: it is immediately available
to users of PBS or SGE clusters to simplify their job construction; in
addition, through the use of environment variables, cluster
administrators can craft a default qbatch deployment which allows new
cluster users to quickly submit jobs that honour the cluster’s
policies.

For more information, check out our GitHub page:
http://github.com/pipitone/qbatch
qbatch is also available on PyPI via pip install qbatch

--
Gabriel A. Devenyi B.Eng. Ph.D.
Research Computing Associate
Computational Brain Anatomy Laboratory
Cerebral Imaging Center
Douglas Mental Health University Institute
Affiliate, Department of Psychiatry
McGill University
t: 514.761.6131x4781
e: gdevenyi at gmail.com

________________________________

MINC-users at bic.mni.mcgill.ca
http://www.bic.mni.mcgill.ca/mailman/listinfo/minc-users

