Grixec: simple user interface for batch computing

Alexandru Dan Corlan 22/06/2011

Persistent link to this document:

http://www.webcitation.org/query?url=http%3A%2F%2Fdan.corlan.net%2Fsoftware%2Fgrixec.html&date=2011-07-02

or cite as:
Corlan, Alexandru Dan. Grixec: simple user interface for batch computing. corlan.net. 2011-07-02.
URL:http://dan.corlan.net/software/grixec.html. Accessed: 2011-07-02.
(Archived by WebCite® at http://www.webcitation.org/5zsJ5uK9k)

Motivation

In contrast with the perspective of the administrator of a large facility, the computational scientist values a specific set of features for a batch execution system:

the most valuable resource is the time of the computational scientist himself, and also that of his coworkers
reproducibility and persistence of the computational results and of the environment (libraries, programming tools and languages, other software)
the ready availability of large computational facilities is good, but most tasks involve rather his own workstation or other local computers to which he has full (user level) access; the same interface must be used both for scheduling jobs locally and remotely
the size of a job (in cpu-hours and/or gigabytes) is not always proportional to its importance; smallish jobs may be more important and largish ones less so, for example when the result is anticipated but large numbers of samples of a parameter space are needed for a convincing confirmation
predictability of the response of the computation facility is important
organisation and referencing of the computation results is one of the main issues impairing productivity
the user tools must be small enough to fully understand and customise, must not depend on sizeable and unstable software infrastructure and must not involve complex experiments in order to achieve configuration

Purpose

The grixec collection of simple tools is a (really) small scale effort for a prototype aiming to capture a functional user interface that meets our needs listed above. It is not planned to evolve beyond a collection of a few straightforward bash scripts that only use the basic textutils from any Linux distribution.

These scripts, however, are fully usable; the 'capturing' mentioned above is achieved through daily use and the polishing of their features.

Your comments

Your comments are highly welcome, please enter them here.

Grixec release 0.2

This new version uses danotation as a log database for all information about jobs. Otherwise the use of the grixec commands is the same as in the previous version. Users' jobs, however, can log any new information in a structured way in the dict files, and it can be extracted with the danotation utility.

If you do not install danotation you can't use some of the new gxac forms, but otherwise the system works the same as before.

Download

Grixec release 0.2, January 24, 2012

Installation

chmod a+x ./grixall0.2
sudo ./grixall0.2 install

This will install this file and a few links to it (gxec, gxac, gxay) in the /usr/local/bin directory. Use the links as the user commands. To install in another directory, change the INSTALLDIR variable in the grixall script.

If you also have the previous version it remains in /usr/local/bin but the links are updated.

Grixec release 0.1

This release consists of a single bash script, which also needs the uuid command to be installed.

Download

Grixec release 0.1, June 27, 2011 (save as the file grixall)

If you prefer a gzipped tar:

Grixec release 0.1, June 27, 2011 (tgz)

It is the same thing.

Installation

chmod a+x ./grixall
sudo ./grixall install

Usage

Directory structure

Grixec runs in user space. You normally cd into a directory called 'the project directory', where you develop your project, and issue commands from there. The first time you call gxec (when you start your first job), it makes a directory, grixome, in your home directory and also a subdirectory, data, in the current (project) directory.

The grixome directory will contain one directory for each job you run, no matter from which project. The name of this job directory is a UUID, such as: 70a52008-a0a3-11e0-bc6c-20cf30bebeca

The job directory is linked from the project's directory under a shorter, easier to use name (usually something like J034).

Each job directory, and the data directory, also contains a file named 'dict' that holds further information about the job or the project. Some of this information is entered by the gxec command, the rest by the user with the gxay command.
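The layout described above can be pictured with a minimal sketch. It is not taken from the grixec script: it only recreates the described directories inside a scratch directory, and it assumes that the short name is a symbolic link placed directly in the project directory. The project name myproject is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: mimics the layout described in the text, not grixec's own code.
set -eu

# A scratch directory stands in for the real home directory.
sandbox=$(mktemp -d)
home_dir="$sandbox/home"
project_dir="$home_dir/myproject"   # hypothetical project directory
mkdir -p "$project_dir"

# First gxec call: grixome in the home directory, data in the project directory.
mkdir -p "$home_dir/grixome" "$project_dir/data"

# One directory per job, named by a UUID (the example id from the text).
job_uuid="70a52008-a0a3-11e0-bc6c-20cf30bebeca"
job_dir="$home_dir/grixome/$job_uuid"
mkdir "$job_dir"

# The job directory and the data directory each hold a file named 'dict'.
touch "$job_dir/dict" "$project_dir/data/dict"

# Short name in the project directory (assumed here to be a symbolic link).
ln -s "$job_dir" "$project_dir/J034"

# List everything that was created.
find "$sandbox" | sort
```

With such a layout, files of a job can be reached from the project directory through the short name, for example J034/dict.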

gxec

The gxec command starts a new job. It currently has the syntax:

gxec -c file* [-m machine] -x command arg*

A new job directory is made, the files are copied there and the command is issued (after being included in a wrapper script). The machine can be another machine where the user has an ssh account and a valid certificate (otherwise the password will be requested a number of times). It may be of the form user@machine.domain. When another machine is specified, the grixome directory is made in the user's account there, the files are copied, the script is started and waited for, and then the result files are copied back.
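The local case (make a directory, copy the files, wrap the command and run it) can be sketched as follows. This is only an illustration of the steps as described, not the grixec implementation: the function run_job, the file names wrapper.sh, stdout.txt, stderr.txt and status.txt, and the mktemp-based directory name are all assumptions made for the example (grixec names job directories by UUID).

```shell
#!/usr/bin/env bash
# Sketch only: a local stand-in for the steps gxec is described as performing.
set -eu

sandbox=$(mktemp -d)
grixome="$sandbox/grixome"     # stands in for the grixome directory
mkdir -p "$grixome"

# A hypothetical input file to be copied into the job directory.
printf 'alpha\nbeta\ngamma\n' > "$sandbox/input.txt"

# run_job FILE COMMAND [ARG...]: copy FILE into a fresh job directory,
# run COMMAND there through a wrapper script, print the job directory.
run_job() {
    local file=$1; shift
    local job_dir
    job_dir=$(mktemp -d "$grixome/job.XXXXXX")   # hypothetical naming
    cp "$file" "$job_dir/"

    # The wrapper changes into the job directory, runs the command with
    # its output captured, and records the command's exit status.
    {
        echo '#!/usr/bin/env bash'
        echo 'cd "$(dirname "$0")" || exit 1'
        printf '%q ' "$@"; echo '> stdout.txt 2> stderr.txt'
        echo 'echo $? > status.txt'
    } > "$job_dir/wrapper.sh"
    chmod +x "$job_dir/wrapper.sh"

    "$job_dir/wrapper.sh"
    echo "$job_dir"
}

job_dir=$(run_job "$sandbox/input.txt" wc -l input.txt)
cat "$job_dir/stdout.txt"
cat "$job_dir/status.txt"
```

Keeping the output and the exit status next to the copied inputs is what makes a finished job directory self-describing. For a remote machine, the same steps would be preceded and followed by copies over ssh, which this sketch leaves out.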

gxac

All the information available about jobs is seen as a hierarchical structure, each component and subcomponent being identified by a name or a number. The gxac command is followed by a series of such identifiers, each more detailed than the previous. Currently implemented forms of the command are:

gxac jobid [ls|status|machine|id|prjid|syno|file]

Here the jobid may be the UUID form of a job identifier, the short name of a job, 'last' (meaning the last created job) or 'all' (meaning all jobs in the current project, in reverse chronological order). The second level components (tags) are: the list of files of the job, its status, the machine where it runs, the UUID form of the job identifier, the project id form, a synopsis of the job, its machine and status, or the contents of any file belonging to the job.
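How a short name and a UUID can designate the same job may be illustrated with a small sketch. It assumes, as one plausible reading of the directory structure, that the short name is a symbolic link whose target is the job directory; the function resolve_jobid is hypothetical and is not part of grixec. The 'last' and 'all' forms are not covered.

```shell
#!/usr/bin/env bash
# Sketch only: maps a short job name to the UUID form, assuming symbolic links.
set -eu

# Recreate a minimal layout in a scratch directory.
sandbox=$(mktemp -d)
job_uuid="70a52008-a0a3-11e0-bc6c-20cf30bebeca"
mkdir -p "$sandbox/grixome/$job_uuid" "$sandbox/project"
ln -s "$sandbox/grixome/$job_uuid" "$sandbox/project/J034"

# resolve_jobid PROJECT_DIR JOBID: print the UUID form of JOBID.
resolve_jobid() {
    local project_dir=$1 jobid=$2
    if [ -L "$project_dir/$jobid" ]; then
        # Short name: the last component of the link target is the UUID.
        basename "$(readlink "$project_dir/$jobid")"
    else
        # Anything else is passed through as if it already were a UUID.
        echo "$jobid"
    fi
}

resolve_jobid "$sandbox/project" J034
resolve_jobid "$sandbox/project" "$job_uuid"
```

Both calls print the same identifier, the UUID that names the job directory.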

gxac jobid format tag*

In this form, a set of tags, as in the first form, may be requested for the jobid. For example:

gxac J023 format id machine status

will return the id, machine and status of job J023.

gxac dict key1 key2

The information stored with the two keys, in the project dictionary, is printed on standard output (see gxay below).

gxac jobid dict key1 [key2]

The information stored with the keys, in the job dictionary, is returned. Each time gxec is called, the following (single) keys are used to store information about the job: id, projdir, call, created, by, machine, runfrom. Here projdir is the path of the project directory, call is the gxec list of arguments, created is the date gxec was issued, by is the user name, machine is the target machine and runfrom is the name of the machine where the project resides (the user's machine).

gxay

This command enters information into the project and the job dictionaries.

gxay    *

where the first argument, dict, can be "dict" (the project dictionary), "last" (the last created job) or the project id of a job (such as J032).