<a class="reference external" href="https://jupyter.designsafe-ci.org/hub/user-redirect/lab/tree/CommunityData/OpenSees/TrainingMaterial/training-OpenSees-on-DesignSafe/Jupyter_Notebooks/tapisConnect_tapisPaths.ipynb" target="_blank">
<img alt="Try on DesignSafe" src="https://raw.githubusercontent.com/DesignSafe-Training/pinn/main/DesignSafe-Badge.svg" /></a>

# Tapis Paths
***How DesignSafe File Storage and Tapis Work Together***

by Silvia Mazzoni, DesignSafe, 2025

Tapis powers file access and job submission on DesignSafe. It provides a consistent interface to interact with **multiple storage systems** and **compute environments**, making it easier to manage data before, during, and after simulation workflows.

With Tapis, you can:

* **List, upload, download, move, and delete files** across storage systems
* **Stage input files** (e.g., move from long-term storage to a compute node)
* **Collect outputs automatically** and return them to Corral (*MyData*)
* Use the same scripting or automation tools across locations

Tapis acts as the **glue** between DesignSafe's storage and compute environments, streamlining data movement and improving reproducibility.

### What is a URI?

A **URI** (Uniform Resource Identifier) is the formal way Tapis identifies the location of your files and directories. Think of it as the "address" for data in the Tapis ecosystem. Instead of just using a simple path (like `/home/user/file.txt`), a Tapis URI encodes both **where** the file lives (which storage system) and **what** the path to the file is on that system.

For example:

```
tapis://designsafe.storage.default/username/project/data/input.txt
```

* **tapis://** → tells us we're using Tapis to access this resource
* **designsafe.storage.default** → identifies the storage system (here MyData; other systems serve Community Data, Work on Stampede3, etc.)
* **/username/project/data/input.txt** → the actual path on that system

This consistent URI format allows you to write scripts or automation that work across **different storage systems** without changing code every time.
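
As a minimal sketch of this two-part structure, Python's standard `urllib.parse.urlsplit` can separate the system ID from the path. The helper name `split_tapis_uri` is hypothetical (it is not part of Tapis or OpsUtils), and the username in the example URI is a placeholder.

```python
from urllib.parse import urlsplit

def split_tapis_uri(uri):
    """Split a tapis:// URI into (system_id, path_on_system)."""
    parts = urlsplit(uri)
    if parts.scheme != "tapis":
        raise ValueError(f"not a Tapis URI: {uri!r}")
    # netloc holds the storage system ID; path is rooted at that system
    return parts.netloc, parts.path

system_id, path = split_tapis_uri(
    "tapis://designsafe.storage.default/username/project/data/input.txt"
)
print(system_id)
print(path)
```

A plain OS path such as `/home/user/file.txt` has no `tapis` scheme, so the helper rejects it rather than guessing a system.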

---

In this notebook, we'll assemble a dictionary of your Tapis paths and save it as JSON to your MyData (**`~/MyData/.tapis_user_paths.json`**) so you can reference it from any script at any time, including new Jupyter sessions.

Make sure you have *Established Your System Credentials* first!


---

## The Two Parts of a Tapis Path

A Tapis path is conceptually:

```
tapis://<SYSTEM_ID>/<RELATIVE_PATH>
```

1. ***SYSTEM_ID***: which storage system you're targeting.
   Examples:
   * *designsafe.storage.default* (MyData)
   * *designsafe.storage.community* (Community)
   * *cloud.data* (Work allocations)
   * a project-scoped system (e.g., *project-<uuid>*)

2. ***RELATIVE_PATH***: the location **inside that system's root**.
   This is *not* relative to your program's current directory; it's relative to the **system root**.

   Tapis ignores CWD entirely; a path is always "(system, path-within-that-system)."


When building your Tapis paths, it's a good idea to define these two components separately and join them only where a full path is required as input. This makes your script more portable and reusable.
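
One way to keep the two components separate is sketched here. The helper `build_tapis_uri` and the variable names are hypothetical, and `username` is a placeholder rather than a real account.

```python
def build_tapis_uri(system_id, relative_path):
    """Join a system ID and a path within that system into a tapis:// URI."""
    # Strip slashes so the parts join with exactly one "/" between them
    return f"tapis://{system_id.strip('/')}/{relative_path.lstrip('/')}"

SYSTEM_ID = "designsafe.storage.default"     # which storage system
RELATIVE_PATH = "username/inputs/model.tcl"  # path inside that system

input_uri = build_tapis_uri(SYSTEM_ID, RELATIVE_PATH)
print(input_uri)
```

Pointing the same script at another system then means changing `SYSTEM_ID` only.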

---

## How Tapis Paths Differ from Jupyter Paths

* **Jupyter/OS paths** (e.g., */home/jovyan/...* or *data/run1.csv*) resolve relative to your **current working directory (CWD)** on that machine.
* **Tapis paths** ignore CWD. They always mean **(system, path-within-that-system)**, no matter where your code runs.
* For batch/automation (Tapis jobs, SLURM), Tapis paths are **more portable** than CWD-dependent relative paths.
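
The difference can be seen in a small experiment. This sketch resolves the same relative OS path from two working directories, while the Tapis URI (a plain string with a placeholder username) stays the same.

```python
import os
import tempfile

relative_os_path = "data/run1.csv"
tapis_uri = "tapis://designsafe.storage.default/username/data/run1.csv"

start_dir = os.getcwd()
resolved = []
with tempfile.TemporaryDirectory() as tmp:
    for cwd in (start_dir, tmp):
        os.chdir(cwd)
        # The OS path resolves against whatever the CWD is right now
        resolved.append(os.path.abspath(relative_os_path))
    os.chdir(start_dir)  # restore before the temporary directory is removed

print(resolved[0])
print(resolved[1])
print(tapis_uri)  # unchanged: it names (system, path), not a CWD-relative location
```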

---
## Obtaining File-System Paths

### Start with the Easy, Fixed Bases

These don't depend on an HPC system or allocation. Once you know your username, the bases are stable. Because you can obtain your username programmatically from Tapis, you may not need to know it.

| Storage       | Typical Base (Tapis)                             | Notes                                  |
| ------------- | ------------------------------------------------ | -------------------------------------- |
| **MyData**    | *tapis://designsafe.storage.default/<username>/* | Your personal storage (aka Corral)     |
| **Community** | *tapis://designsafe.storage.community/*          | Public community content (read-mostly) |
| **Published** | *tapis://designsafe.storage.published/*          | Published content (read-only) |

**Examples**


* **MyData**:
  
    MyData/inputs/model.tcl
    → **tapis://designsafe.storage.default/username**/inputs/model.tcl
  
    ***Don't forget your username!***


* **CommunityData**:
  
    CommunityData/Records/ATC-63/groundmotion.at2
    → **tapis://designsafe.storage.community**/Records/ATC-63/groundmotion.at2

* **Published**:
   
    Published/Records/ATC-63/groundmotion.at2
    → **tapis://designsafe.storage.published**/Records/ATC-63/groundmotion.at2

  You can find the relative path in the Data Depot.


Use these as **bases**, then append project/job-specific subpaths.
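
The mapping in the examples above can be sketched as a small translation helper. The function names are hypothetical, the top-level folder names (`MyData`, `CommunityData`, `Published`) are taken from the examples, and `username` is a placeholder.

```python
def tapis_bases(username):
    """Fixed bases from the table above, keyed by the folder name used in the examples."""
    return {
        "MyData": f"tapis://designsafe.storage.default/{username}",
        "CommunityData": "tapis://designsafe.storage.community",
        "Published": "tapis://designsafe.storage.published",
    }

def jupyter_to_tapis(jupyter_path, username):
    """Translate e.g. 'MyData/inputs/model.tcl' into a Tapis URI."""
    top, _, rest = jupyter_path.strip("/").partition("/")
    bases = tapis_bases(username)
    if top not in bases:
        raise KeyError(f"unknown storage folder: {top!r}")
    # Append the remainder to the base; a bare folder name maps to the base itself
    return f"{bases[top]}/{rest}" if rest else bases[top]

mydata_uri = jupyter_to_tapis("MyData/inputs/model.tcl", "username")
community_uri = jupyter_to_tapis("CommunityData/Records/ATC-63/groundmotion.at2", "username")
print(mydata_uri)
print(community_uri)
```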


In [1]:
# Local Utilities Library
import os
import sys
PathOpsUtils = os.path.expanduser('~/CommunityData/OpenSees/TrainingMaterial/training-OpenSees-on-DesignSafe/OpsUtils')
if PathOpsUtils not in sys.path:
    sys.path.append(PathOpsUtils)
from OpsUtils import OpsUtils

In [2]:
# Connect to Tapis
t=OpsUtils.connect_tapis()

 -- Checking Tapis token --
 Token loaded from file. Token is still valid!
 Token expires at: 2025-09-05T23:57:32+00:00
 Token expires in: 3:39:50.881346
-- LOG IN SUCCESSFUL! --


In [3]:
# Initialize the dictionary that will later be saved as JSON
TapisPaths = {}

---
### Obtain your username programmatically
Using a utility function

In [4]:
username = OpsUtils.get_tapis_username(t)
print('username:',username)

username: silvia


In [5]:
# Use lowercase keys; they are easier to match
TapisPaths['mydata'] = f'tapis://designsafe.storage.default/{username}'
TapisPaths['community'] = 'tapis://designsafe.storage.community'
TapisPaths['published'] = 'tapis://designsafe.storage.published'

---

### Next: Work (User & System Dependent)

**Work** is the shared, high-performance project area mounted on both **JupyterHub** and **HPC**, which makes it ideal for staging inputs and storing outputs for jobs. Its base includes your **allocation and username**, and differs by system (e.g., Stampede3 vs. LS6).

**Typical form:**

```
tapis://cloud.data/work/<allocation>/<username>/<system>/
```

Because this base is user/system-specific, it's best to **discover it once** and **save it**.

#### One-Time Setup (Recommended)

Use a utility function (here, `get_user_work_tapis_uri`) to fetch the base for each system you use (Stampede3, LS6, Frontera). Run it **once**, cache the results in a small file (this notebook uses `~/MyData/.tapis_user_paths.json`), and reuse them in later sessions. Because this utility submits a job, it may take a while to run.
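
The discover-once-then-cache pattern can be sketched as follows. This is an illustration, not the OpsUtils implementation: `load_or_discover` and `fake_discover` are hypothetical names, the stored value is a placeholder, and a temporary directory stands in for the real cache location.

```python
import json
import tempfile
from pathlib import Path

def load_or_discover(cache_file, discover):
    """Return cached paths if the file exists; otherwise discover and save them."""
    cache_file = Path(cache_file).expanduser()
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    paths = discover()  # the slow step, e.g. a call that submits a job
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(paths, indent=2))
    return paths

# Stand-in for the real discovery call; it records how often it runs
calls = []
def fake_discover():
    calls.append(1)
    return {"work/stampede3": "tapis://cloud.data/work/<allocation>/<username>/stampede3/"}

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "user_paths.json"
    first = load_or_discover(cache, fake_discover)   # discovers and writes the file
    second = load_or_discover(cache, fake_discover)  # reads the file only

print(len(calls))
```

The second call never reaches the slow discovery step, which is the point of saving the bases.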

In [6]:
OpsUtils.show_text_file_in_accordion(PathOpsUtils, ['get_user_path_tapis_uri.py'])

In [7]:
from pathlib import Path
systems = ["stampede3", "ls6", "frontera"]

for s in systems:
    work_base = OpsUtils.get_user_work_tapis_uri(t, system_id=s)    
    print(f'{s}: {work_base}')
    TapisPaths[f'work/{s}'] = work_base


stampede3: tapis://cloud.data/work/05072/silvia/stampede3/
ls6: tapis://cloud.data/work/05072/silvia/ls6/
frontera: tapis://cloud.data/work/05072/silvia/frontera/


In [14]:
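# `s` still holds the last value from the loop above ("frontera"),
# so this returns the Frontera Work base, not a MyData path
work_base = OpsUtils.get_user_work_tapis_uri(t, system_id=s)
print(work_base)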

tapis://cloud.data/work/05072/silvia/frontera/


## Save to a File
This is the default path (`~/MyData/.tapis_user_paths.json`) used by the utility functions (OpsUtils) in this training module.

In [8]:
print('TapisPaths',TapisPaths)

TapisPaths {'mydata': 'tapis://designsafe.storage.default/silvia', 'community': 'tapis://designsafe.storage.community', 'published': 'tapis://designsafe.storage.published', 'work/stampede3': 'tapis://cloud.data/work/05072/silvia/stampede3/', 'work/ls6': 'tapis://cloud.data/work/05072/silvia/ls6/', 'work/frontera': 'tapis://cloud.data/work/05072/silvia/frontera/'}


In [9]:
import json
Path("~/MyData/.tapis_user_paths.json").expanduser().write_text(json.dumps(TapisPaths, indent=2))

368

---
## Use Utility Function
We have put the above steps into a utility function, `get_user_path_tapis_uri`, that you can call at the beginning of your script.

In [10]:
OpsUtils.show_text_file_in_accordion(PathOpsUtils, ['get_user_path_tapis_uri.py'])

In [16]:
allPaths = OpsUtils.get_user_path_tapis_uri(t,force_refresh=True)
for key,value in allPaths.items():
    print(key,value)

saved data to /home/jupyter/MyData/.tapis_user_paths.json
mydata tapis://designsafe.storage.default/silvia
community tapis://designsafe.storage.community
published tapis://designsafe.storage.published
work/stampede3 tapis://cloud.data/work/05072/silvia/stampede3
work/ls6 tapis://cloud.data/work/05072/silvia/ls6
work/frontera tapis://cloud.data/work/05072/silvia/frontera


In [12]:
# Request the base for a single file system
thisPath = OpsUtils.get_user_path_tapis_uri(t,file_system='MyData')
print('thisPath:',thisPath)

found paths file: /home/jupyter/MyData/.tapis_user_paths.json
thisPath: tapis://designsafe.storage.default/silvia


In [13]:
# Request the base for a single file system
thisPath = OpsUtils.get_user_path_tapis_uri(t,file_system='Work/stampede3')
print('thisPath:',thisPath)

found paths file: /home/jupyter/MyData/.tapis_user_paths.json
thisPath: tapis://cloud.data/work/05072/silvia/stampede3
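
A lookup of this kind, followed by appending a sub-path, might look like the sketch below. It is not the OpsUtils code: `lookup_base` and `append_subpath` are hypothetical helpers, the demo file lives in a temporary directory, and the allocation `00000` and `username` are placeholders. Lowercasing the requested name lets `Work/stampede3` match the saved key `work/stampede3`.

```python
import json
import tempfile
from pathlib import Path

def lookup_base(paths_file, file_system):
    """Return the saved base for one file system, ignoring case in the key."""
    paths = json.loads(Path(paths_file).expanduser().read_text())
    key = file_system.lower()
    if key not in paths:
        raise KeyError(f"{file_system!r} not in {sorted(paths)}")
    return paths[key]

def append_subpath(base, subpath):
    # Saved bases may or may not end in "/", so normalize before joining
    return f"{base.rstrip('/')}/{subpath.lstrip('/')}"

with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "paths.json"
    demo_file.write_text(json.dumps({
        "mydata": "tapis://designsafe.storage.default/username",
        "work/stampede3": "tapis://cloud.data/work/00000/username/stampede3/",
    }))
    base = lookup_base(demo_file, "Work/stampede3")
    model_uri = append_subpath(base, "run1/model.tcl")

print(model_uri)
```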


---
## Alternate Semi-Automatic Method: Copy from the Web Portal

If you don't have a base yet (or you're exploring):

1. Open an app page (e.g., OpenSeesMP on Stampede3).
2. Click the **folder** icon to browse.
3. Navigate to the target directory and select it.
4. Copy the displayed Tapis URI, e.g.

   ```
   tapis://cloud.data/work/05072/jdoe/stampede3/somefolder
   ```
5. Keep the **base portion** for reuse:

   ```
   tapis://cloud.data/work/05072/jdoe/stampede3/
   ```

Once you get the hang of this, it's quick. Then move that base into your saved JSON so you never have to browse again.
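
Trimming a copied URI down to its base can also be scripted. The sketch below assumes the typical Work layout shown earlier (`work/<allocation>/<username>/<system>/`); the helper name `work_base_from_uri` is hypothetical.

```python
from urllib.parse import urlsplit

def work_base_from_uri(copied_uri):
    """Keep tapis://<system_id>/work/<allocation>/<username>/<system>/ from a longer URI."""
    parts = urlsplit(copied_uri)
    segments = [s for s in parts.path.split("/") if s]
    if parts.scheme != "tapis" or len(segments) < 4 or segments[0] != "work":
        raise ValueError(f"not a Work URI: {copied_uri!r}")
    # The first four path segments form the reusable base
    return f"tapis://{parts.netloc}/" + "/".join(segments[:4]) + "/"

work_base = work_base_from_uri("tapis://cloud.data/work/05072/jdoe/stampede3/somefolder")
print(work_base)
```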

---

## Quick Reminders

* Tapis paths are **system-rooted**, not CWD-rooted (different from Jupyter).
* Use **Work** for HPC I/O; copy anything you want to keep from **Scratch** into **Work** or **MyData**.
* Save your bases once; **append relative subpaths** in all your scripts and job submissions.
