Scaling Jobs with Tapis#

Moving beyond interactive job submission

At some point, the Web Portal, interactive environments like JupyterHub, and even scripts run on your own computer stop being the best tools: not because they are limited, but because your workflow has grown.

These are excellent for development, testing, and small-to-moderate runs, but they are not designed to manage large numbers of long-running, resource-intensive jobs.

You may be ready to:

  • run many jobs (parameter studies, ensembles, uncertainty quantification, sensitivity analyses)

  • increase model size and runtime

  • eliminate repetitive clicking and manual bookkeeping

  • build repeatable, scriptable workflows that you can rerun, share, and extend

This is the point where you move beyond purely interactive execution—whether from the Web Portal, JupyterHub, or your local machine—and begin using Tapis as an automation layer to reliably launch, monitor, and manage jobs on HPC systems.


You now have two complementary paths:#

1. Write your own SLURM scripts#

This path gives you direct, low-level control over how jobs run on the scheduler.

  • Start from existing SLURM scripts used inside Tapis apps.

  • Customize nodes, cores, MPI layout, job arrays, and I/O strategies.

  • Chain jobs manually using dependencies or wrapper scripts.

  • Best suited for users who want the scheduler itself to be part of the workflow logic.

Best for: large MPI jobs, job arrays, tightly managed scratch workflows, advanced scheduler usage.
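Chaining jobs with dependencies, for example, can be scripted with a small wrapper. The sketch below builds `sbatch` command lines and chains each job on the previous one with `--dependency=afterok:`; the `--parsable` flag makes `sbatch` print just the job ID. The script names are hypothetical placeholders, and in practice you would run this on a login node where `sbatch` is available.

```python
import subprocess

def build_sbatch_cmd(script, dep_job_id=None):
    """Build an sbatch command line; --parsable makes sbatch print only the job ID."""
    cmd = ["sbatch", "--parsable"]
    if dep_job_id is not None:
        # afterok: start only if the named job finished successfully
        cmd.append(f"--dependency=afterok:{dep_job_id}")
    cmd.append(script)
    return cmd

def chain_jobs(scripts, submit=subprocess.check_output):
    """Submit each script so it waits on the previous one in the list."""
    job_id = None
    for script in scripts:
        out = submit(build_sbatch_cmd(script, job_id))
        job_id = out.decode().strip()
    return job_id  # ID of the final job in the chain

# On a login node this would actually submit (script names are placeholders):
# chain_jobs(["preprocess.slurm", "run_model.slurm", "postprocess.slurm"])
```

The same pattern extends to job arrays (`sbatch --array=0-99`) when the cases differ only by an index.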


2. Automate jobs using Tapis#

This path focuses on workflow automation, while letting Tapis handle the HPC details.

Step 1 — Use existing Tapis apps programmatically#

Your first step is not to write new apps.

Instead, you:

  • reuse existing, production-tested Tapis apps (e.g., OpenSeesMP, OpenSeesSP, OpenSeesPy)

  • submit jobs via Tapipy (Python) or tapis-cli

  • pass inputs and parameters programmatically

  • launch many jobs with structured variations (loops, tables, JSON configs)

At this stage, you are using the exact same execution environment as the Web Portal — just without the clicking.
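As a sketch, a structured parameter sweep might look like the following. The app id, version, and argument names here are hypothetical placeholders rather than real DesignSafe identifiers, and the commented-out Tapipy calls (`Tapis(...)`, `t.jobs.submitJob(...)`) follow the general shape of the Tapis v3 Python client; check your tenant's documentation for the exact job-attribute schema.

```python
# Sketch: build Tapis v3 job requests for a parameter sweep.
# App id, version, and argument names are hypothetical placeholders.

def make_job_request(app_id, app_version, mesh_size, case_name):
    """Build one job request dict with a structured parameter variation."""
    return {
        "name": f"sweep-{case_name}",
        "appId": app_id,
        "appVersion": app_version,
        "parameterSet": {
            "appArgs": [{"name": "meshSize", "arg": str(mesh_size)}],
        },
    }

cases = {"coarse": 10, "medium": 50, "fine": 100}
requests = [make_job_request("opensees-mp", "latest", n, name)
            for name, n in cases.items()]

# With an authenticated client (credentials assumed), submission would be:
# from tapipy.tapis import Tapis
# t = Tapis(base_url="https://designsafe.tapis.io", username=..., password=...)
# t.get_tokens()
# jobs = [t.jobs.submitJob(**req) for req in requests]
```

The `cases` table could equally come from a JSON config file or a CSV of study parameters; the point is that each variation becomes a job request you can generate, inspect, and resubmit in code.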

Step 2 — Build automation around those apps#

Once programmatic submission is in place, you can:

  • generate inputs automatically

  • track job IDs and states

  • link pre-processing and post-processing steps

  • re-run failed cases or extend studies incrementally

This is often where Jupyter becomes a workflow controller, not the execution environment.
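One common pattern is to keep a table of job IDs and states and collect the failed cases for resubmission. The sketch below uses a stand-in `get_status` function; in a real workflow this would wrap a Tapipy status query, and the state names follow typical Tapis terminal states (`FINISHED`, `FAILED`, `CANCELLED`), which you should verify against your tenant.

```python
# Sketch: track job states and collect failed cases for resubmission.
# `get_status` is a stand-in for a real Tapipy status lookup.

TERMINAL_OK = {"FINISHED"}
TERMINAL_BAD = {"FAILED", "CANCELLED"}

def partition_jobs(job_ids, get_status):
    """Split job IDs into done / failed / still-running buckets."""
    done, failed, running = [], [], []
    for jid in job_ids:
        status = get_status(jid)
        if status in TERMINAL_OK:
            done.append(jid)
        elif status in TERMINAL_BAD:
            failed.append(jid)
        else:
            running.append(jid)
    return done, failed, running

# Example with a fake status lookup standing in for the Tapis jobs service:
statuses = {"job-1": "FINISHED", "job-2": "FAILED", "job-3": "RUNNING"}
done, failed, running = partition_jobs(statuses, statuses.get)
# `failed` now lists the cases whose inputs to regenerate and resubmit
```

From a Jupyter notebook acting as the controller, `failed` feeds straight back into the submission loop from Step 1, which is what makes incremental study extension practical.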

Step 3 — Write your own Tapis apps (when needed)#

Only after the above do you typically need to:

  • create a new app wrapper

  • define a new input schema

  • support a custom executable or workflow pattern

By then, you already understand:

  • what the app needs to expose

  • how users will automate it

  • how it fits into larger pipelines
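When you do reach this point, an app is described by a definition document. The dict below follows the general shape of a Tapis v3 app definition, but every name in it (the id, container image, and execution system) is a hypothetical placeholder, and the commented registration call is an assumption; consult the Tapis apps documentation for the authoritative schema.

```python
# Sketch of a Tapis v3 app definition as a Python dict.
# All identifiers (id, image, system) are hypothetical placeholders.

app_definition = {
    "id": "my-custom-solver",
    "version": "0.1",
    "description": "Custom solver wrapped as a Tapis batch app",
    "jobType": "BATCH",
    "runtime": "SINGULARITY",
    "containerImage": "docker://example/my-solver:0.1",
    "jobAttributes": {
        "execSystemId": "my-hpc-system",
        "parameterSet": {
            # The input schema you expose to users of the app:
            "appArgs": [{"name": "inputFile", "arg": "model.tcl"}],
        },
    },
}

# Registration would go through the apps service (assumed client call):
# t.apps.createAppVersion(**app_definition)
```

Because you have already automated against existing apps in Steps 1 and 2, you know which fields users will actually vary, which is exactly what should surface in `parameterSet`.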


Why this matters for scientific workflows#

Using Tapis apps as reusable templates is a cornerstone of scalable research computing.

They:

  • enforce consistent input/output handling

  • reduce user error and environment drift

  • enable the same workflow to run from the Web Portal, Jupyter, or scripts

  • make collaboration and reproducibility practical

Most importantly, this approach lets researchers scale their science without rebuilding infrastructure each time.