Prepare and Submit a Job

Continuing with the previous example of kallisto (see Find an Application), we know there are three required input files, one required input parameter, and two optional input parameters for this specific application.

Build a job template file

To run an instance of this application with our data (called a “job”), we must first assemble a JSON description of the job we would like to run. The simplest way to do this is to use the Agave jobs-template command:

% jobs-template kallisto-0.43.1u1

{
  "name": "kallisto test-1506708374",
  "appId": "kallisto-0.43.1u1",
  "archive": true,
  "inputs": {
    "transcripts": "read1.fastq",
    "fastq1": "",
    "fastq2": ""
  },
  "parameters": {
    "output": "output"
  }
}

By default, the template is printed to the screen. To store it in a file instead, redirect the output:

% jobs-template kallisto-0.43.1u1 > kallisto_job.json

The template contains only basic information: an identifying name for the job, the ID of the app, and whether the results should be archived. It also lists only the required inputs and parameters. (To include all optional inputs and parameters automatically, use the -A flag. See jobs-template -h for more details.)

Add input files to job template

If you staged your own data to your private STORAGE system, now is the time to provide the path to your data using Agave URIs. For example, the following URIs point to publicly accessible data for this kallisto job:

 "inputs": {
   "transcripts": "agave://data-sd2e-community/sample/kallisto/test/transcripts.fasta.gz",
   "fastq1": "agave://data-sd2e-community/sample/kallisto/test/reads_1.fastq.gz",
   "fastq2": "agave://data-sd2e-community/sample/kallisto/test/reads_2.fastq.gz"
 },

An Agave URI always begins with the prefix agave://, followed by the name of the STORAGE system, followed by the complete path to the file relative to your root directory on that system, including the file name. Modify your kallisto_job.json file to point either to this public data or to the data that you have staged on your private STORAGE system.
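
The URI layout described above (prefix, then STORAGE system, then path) can be sketched in a few lines of Python. This is only an illustration: the helper name build_agave_uri is hypothetical and is not part of the Agave tooling.

```python
def build_agave_uri(system: str, path: str) -> str:
    """Join a STORAGE system name and a file path into an Agave URI.

    Hypothetical helper for illustration; not part of the Agave CLI.
    """
    if not system:
        raise ValueError("a STORAGE system name is required")
    # Drop leading slashes so exactly one separator sits between
    # the system name and the path.
    return f"agave://{system}/{path.lstrip('/')}"


uri = build_agave_uri(
    "data-sd2e-community", "sample/kallisto/test/reads_1.fastq.gz"
)
print(uri)
```

The resulting string can be pasted into the "inputs" section of the job file.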

Add parameters to job template

Initially, only the required parameter, output, is listed. To use non-default values for the unlisted optional parameters, bootstrap and seed, add them to the job template now:

 "parameters": {
   "output": "output",
   "bootstrap": 100,
   "seed": 1
 }
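
If you prefer not to edit the file by hand, the same changes can be made programmatically. The following is a minimal sketch, assuming the template has the "inputs" and "parameters" keys shown in this guide. The function name fill_job_template is hypothetical, and the template is written inline here rather than read from kallisto_job.json.

```python
import copy
import json


def fill_job_template(template: dict, inputs: dict, parameters: dict) -> dict:
    """Return a copy of the job template with inputs and parameters filled in.

    Hypothetical helper for illustration; not part of the Agave CLI.
    """
    job = copy.deepcopy(template)  # leave the caller's template untouched
    job.setdefault("inputs", {}).update(inputs)
    job.setdefault("parameters", {}).update(parameters)
    return job


# Template as produced by jobs-template in this guide.
template = {
    "name": "kallisto test-1506708374",
    "appId": "kallisto-0.43.1u1",
    "archive": True,
    "inputs": {"transcripts": "read1.fastq", "fastq1": "", "fastq2": ""},
    "parameters": {"output": "output"},
}

base = "agave://data-sd2e-community/sample/kallisto/test"
job = fill_job_template(
    template,
    inputs={
        "transcripts": f"{base}/transcripts.fasta.gz",
        "fastq1": f"{base}/reads_1.fastq.gz",
        "fastq2": f"{base}/reads_2.fastq.gz",
    },
    parameters={"bootstrap": 100, "seed": 1},
)
print(json.dumps(job, indent=2))
```

Writing the result with json.dumps also guarantees that the file is syntactically valid JSON.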

The JSON format is unforgiving of typos. If your job file is not accepted, consider running it through an external JSON validator.
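
One way to check a job file locally is with Python's standard json module. The sketch below only checks JSON syntax. It does not check that the keys are ones Agave expects. The function name check_json is hypothetical.

```python
import json


def check_json(text: str) -> str:
    """Report whether text parses as JSON, with the error position if not."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    return "valid JSON"


good = '{"parameters": {"output": "output", "seed": 1}}'
bad = '{"parameters": {"output": "output", "seed": 1,}}'  # trailing comma

print(check_json(good))
print(check_json(bad))
```

To check the job file itself, pass its contents, for example check_json(open("kallisto_job.json").read()).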

Submit a job

Once you are satisfied that your data is staged and the job template file contains the instructions you want to use to run the job, use the jobs-submit command to submit the job:

% jobs-submit -F kallisto_job.json
Successfully submitted job 833421020533756391-242ac11b-0001-007

If there are no errors, you will see a success message along with a long unique identifier (UID) for your job. You can monitor the progress of the job with the jobs-list command and, optionally, the job UID:

% jobs-list
833421020533756391-242ac11b-0001-007 PENDING
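
If you want to check the status from a script, the listing can be parsed into a lookup table. This is a rough sketch that assumes each line has the two-column form shown above (job UID, then status). The real output may contain other columns or formats, and the function name parse_jobs_list is hypothetical.

```python
def parse_jobs_list(output: str) -> dict:
    """Map each job UID to its status, assuming 'UID STATUS' lines."""
    statuses = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:  # skip blank or unexpected lines
            statuses[fields[0]] = fields[1]
    return statuses


# Sample text copied from the listing above.
sample = "833421020533756391-242ac11b-0001-007 PENDING\n"
statuses = parse_jobs_list(sample)
print(statuses["833421020533756391-242ac11b-0001-007"])
```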

Download the results

Once the job status is FINISHED, you can list what output is available:

% jobs-output-list 833421020533756391-242ac11b-0001-007
.agave.archive
.agave.log
_util
app.yml
kallisto-job.json
kallisto-tester.sh
kallisto.json
kallisto.sh.template
kallisto_test-833421020533756391-242ac11b-0001-007.err
kallisto_test-833421020533756391-242ac11b-0001-007.out
kallisto_test.ipcexe
output
reads_1.fastq.gz
reads_2.fastq.gz
test
transcript.idx
transcripts.fasta.gz

All of the important output for this job is located in the output directory. Download just the output directory using the following command:

% jobs-output-get -r 833421020533756391-242ac11b-0001-007 /output/
Downloading output/abundance.h5 ...
######################################################################## 100.0%
Downloading output/abundance.tsv ...
######################################################################## 100.0%
Downloading output/run_info.json ...
######################################################################## 100.0%

Alternatively, you can download all of the job files, including output, logs, and other runtime files, using the following command:

% jobs-output-get -r 833421020533756391-242ac11b-0001-007
