Paths in Tapis Applications

Paths in Tapis Applications#

How Storage Is Referenced in Jobs

Tapis provides a unified way to access storage systems, but paths are never standalone. Every path is interpreted relative to a system.

System + Path (The Core Concept)#

In Tapis, file locations are defined using:

systemId → where the file lives
path → where it is on that system

Conceptually:

(systemId, path)

Documentation often shows this as:

tapis://systemId/path/to/file

…but this is shorthand. APIs and job definitions always separate the two.

Common DesignSafe Storage Systems#

Storage Location	Typical System ID
MyData	designsafe.storage.default
Work	designsafe.work.stampede3
Community	designsafe.storage.community
Scratch (HPC)	designsafe.compute.stampede3

Example: Input File Reference#

{
  "sourceSystemId": "designsafe.storage.default",
  "sourcePath": "/myfolder/input/model.tcl"
}

This tells Tapis exactly where to fetch the file from.

Output Locations Matter#

If your app writes output to:

/scratch/05072/silvia/run01

Those files will not automatically appear in JupyterHub or the Data Depot.

Best practice:

Write final outputs to WORK or MyData
Or explicitly copy them at the end of your job

Why Tapis Paths Feel Strict (and That’s Good)#

Tapis enforces clarity:

No ambiguous relative paths
No implicit working directories
No guessing which system you meant

This discipline is what allows:

Web-submitted jobs
Automated pipelines
Reproducible workflows

Summary: Tapis Path Rules of Thumb#

Always think system first, path second
Scratch is fast but temporary
Work is the bridge
Final results belong in Work or MyData

Designing Path-Safe, Reproducible Workflows#

From Interactive Exploration to Scalable Runs

Paths are not just technical details—they encode workflow intent.

A good path strategy answers:

Where do inputs live?
Where do intermediate results go?
Which outputs should persist?

Recommended Directory Roles#

Location	Purpose
MyData	Inputs, scripts, notebooks, long-term storage
Work	Active runs, shared staging, job outputs
Scratch	High-performance temporary computation

Example Workflow Pattern#

MyData/
 └── projectA/
     ├── inputs/
     ├── scripts/
     └── notebooks/

Work/
 └── projectA/
     ├── run01/
     ├── run02/
     └── results/

Scratch/
 └── projectA/
     └── run01_tmp/

Portability Principle#

Your code should not change when moving between:

Jupyter exploration
Single-node tests
Large-scale HPC runs

Only paths and environment variables should differ.

Common Anti-Patterns to Avoid#

Hard-coding /home/jupyter/… in batch jobs
Writing final results only to Scratch
Relying on implicit working directories
Mixing inputs and outputs in the same folder

The Big Picture#

Path-safe workflows enable:

Debugging in Jupyter
Scaling on HPC
Automation through Tapis
Sharing with collaborators

They are foundational—not optional—for serious computational research on DesignSafe.