generate_task_commands()#
Generating PyLauncher Task Commands for Parameter Sweeps
This utility provides a general, reusable way to generate PyLauncher tasklists (for example, a runsList.txt file) by expanding a single command template into all combinations of parameter values.
Instead of manually writing dozens or hundreds of nearly identical command lines, you define:
one base command, and
a dictionary of parameters to sweep,
and the utility produces one shell command per parameter combination, ready to be executed by PyLauncher.
Why This Utility Exists#
Large parametric studies are common in HPC workflows, but manually managing tasklists is:
error-prone,
hard to scale, and
difficult to maintain.
This utility:
keeps sweeps explicit and readable,
guarantees collision-free task generation, and
integrates cleanly with SLURM + PyLauncher + Tapis-style workflows.
Function Overview#
generate_task_commands(base_command, sweep, *, placeholder_style="token")
Purpose#
Expand a command template into a list of fully resolved shell commands, one for each combination of sweep parameters.
Parameters#
base_command : str#
A command template containing placeholders for parameters to be swept.
Example:
python3 -u simulate.py --alpha ALPHA --beta BETA --gamma GAMMA --output "$WORK/sweep_$SLURM_JOB_ID/line_$LAUNCHER_JID/slot_$LAUNCHER_TSK_ID"
The placeholders (ALPHA, BETA, GAMMA) must match the keys in the sweep dictionary.
Environment variables such as $WORK, $SLURM_JOB_ID, and $LAUNCHER_TSK_ID are intentionally left unresolved here and will be expanded by the shell at runtime.
sweep : Mapping[str, Sequence[Any]]#
A dictionary defining the parameter sweep.
Each key corresponds to a placeholder in base_command, and each value is a sequence of values to sweep over.
Example:
{
"ALPHA": [0.3, 0.5, 3.7],
"BETA": [1.1, 2, 3],
"GAMMA": ["a", "b", "c"],
}
All combinations are generated using a Cartesian product.
placeholder_style : {"token", "braces"}, optional#
Controls how placeholders are written in base_command.
"token"(default): Placeholders appear as bare tokens:--alpha ALPHA
"braces": Placeholders appear inside braces:--alpha {ALPHA}
The brace style can be safer if parameter names might overlap with other text.
Returns#
List[str]#
A list of fully expanded command strings.
Each element corresponds to one unique combination of sweep parameters, in deterministic order based on the insertion order of the sweep dictionary.
Example Usage#
inputFilename = "simulate.py"
base_command = (
f'python3 -u {inputFilename} '
f'--alpha ALPHA --beta BETA --gamma GAMMA '
f'--output "$WORK/sweep_$SLURM_JOB_ID/line_$LAUNCHER_JID/slot_$LAUNCHER_TSK_ID"'
)
sweep_params = {
"ALPHA": [0.3, 0.5],
"BETA": [1, 2],
"GAMMA": ["a", "b"],
}
commands = generate_task_commands(base_command, sweep_params)
for cmd in commands:
print(cmd)
This produces one command per parameter combination, suitable for use in a PyLauncher tasklist.
Writing a PyLauncher Tasklist#
The companion helper function can write the generated commands directly to a file:
write_tasklist(commands, "runsList.txt")
Each command is written on its own line, matching PyLauncher’s expected format.
Notes on Environment Variables and Output Paths#
Environment Variables#
Variables such as:
$WORK$SLURM_JOB_ID$LAUNCHER_JID$LAUNCHER_TSK_ID
are not interpreted by Python. They are expanded by the shell at runtime when PyLauncher executes each command. This makes them ideal for constructing unique, collision-free output directories per task.
Output Location Strategy#
The example command writes outputs to:
$WORK/sweep_<jobid>/line_<launcher_jid>/slot_<launcher_task_id>
This path is outside the job execution directory. This design choice:
keeps the execution directory lightweight,
reduces I/O and packaging overhead at job completion, and
avoids unnecessarily archiving large sweep results when using Tapis-based workflows.
When to Use This Utility#
Use this pattern when you need to:
run large parametric sweeps,
generate PyLauncher
runsList.txtfiles programmatically,keep job outputs well-organized and scalable, and
minimize archiving overhead in HPC and Tapis workflows.
Files#
You can find these files in Community Data.
flatten_dict.py
from __future__ import annotations
from itertools import product
from pathlib import Path
from typing import Any, Dict, Iterable, List, Mapping, Sequence
import pandas as pd
def generate_task_commands(
base_command: str,
sweep: Mapping[str, Sequence[Any]],
*,
placeholder_style: str = "token",
) -> List[str]:
"""
Expand a command template into a list of commands for all combinations.
Parameters
----------
base_command
Command template containing placeholders for parameters. Example:
'python3 -u simulate.py --alpha ALPHA --beta BETA --gamma GAMMA --output ".../slot_$LAUNCHER_TSK_ID"'
Placeholders must match keys in `sweep`.
sweep
Mapping of placeholder -> list/tuple of values to sweep over.
Example: {"ALPHA": [0.3, 0.5], "BETA": [1, 2]}
placeholder_style
How placeholders appear in `base_command`:
- "token": placeholders are bare tokens like ALPHA, BETA (default)
- "braces": placeholders are in braces like {ALPHA}, {BETA}
Returns
-------
list of str
One command per combination of values, in deterministic order based
on the insertion order of `sweep`.
Notes
-----
- This function does *string substitution only*; it does not validate that
the command is runnable on your system.
- Environment variables such as $WORK or $SLURM_JOB_ID are left untouched.
"""
if not sweep:
return [base_command]
keys = list(sweep.keys())
value_lists = [sweep[k] for k in keys]
# Basic validation
for k, vals in sweep.items():
if not isinstance(vals, Sequence) or isinstance(vals, (str, bytes)):
raise TypeError(f"sweep[{k!r}] must be a non-string sequence of values.")
if len(vals) == 0:
raise ValueError(f"sweep[{k!r}] is empty; provide at least one value.")
commands: List[str] = []
for combo in product(*value_lists):
cmd = base_command
for k, v in zip(keys, combo):
if placeholder_style == "token":
cmd = cmd.replace(k, str(v))
elif placeholder_style == "braces":
cmd = cmd.replace("{" + k + "}", str(v))
else:
raise ValueError("placeholder_style must be 'token' or 'braces'.")
commands.append(cmd)
return commands
def write_tasklist(commands: Iterable[str], outfile: str | Path) -> None:
"""
Write commands to a PyLauncher tasklist file (one command per line).
"""
outpath = Path(outfile)
outpath.parent.mkdir(parents=True, exist_ok=True)
outpath.write_text("\n".join(commands) + "\n", encoding="utf-8")
# # ---------------------------------------------------------------------
# # Example usage (edit these for your sweep)
# # ---------------------------------------------------------------------
# inputFilename = "simulate.py"
# base_command = (
# f'python3 -u {inputFilename} '
# f'--alpha ALPHA --beta BETA --gamma GAMMA '
# f'--output "$WORK/sweep_$SLURM_JOB_ID/line_$LAUNCHER_JID/slot_$LAUNCHER_TSK_ID"'
# )
# # Define parameters for dynamic generation (add/remove keys freely)
# sweep_params: Dict[str, Sequence[Any]] = {
# "ALPHA": [0.3, 0.5, 3.7],
# "BETA": [1.1, 2, 3],
# "GAMMA": ["a", "b", "c"],
# }
# generated_tasks = generate_task_commands(base_command, sweep_params, placeholder_style="token")
# print(f"Generated {len(generated_tasks)} task commands:")
# print("-" * 60)
# for cmd in generated_tasks:
# print(cmd)
# # Optional: write a runsList file for PyLauncher
# # write_tasklist(generated_tasks, "runsList.txt")
def preview_sweep_table(sweep: Mapping[str, Sequence[Any]]) -> pd.DataFrame:
"""
Create a preview table of all parameter combinations in a sweep.
Parameters
----------
sweep
Mapping of parameter name -> sequence of values to sweep over.
Example:
{"ALPHA": [0.3, 0.5], "BETA": [1, 2], "GAMMA": ["a", "b"]}
Returns
-------
pandas.DataFrame
A table with one row per parameter combination. Column order follows
the insertion order of `sweep`.
Notes
-----
- This is intended for interactive notebooks. For large sweeps, consider
displaying only `.head()` or sampling rows.
- Numeric formatting (e.g., 3 significant figures) is handled by display
settings; see example below.
"""
if not sweep:
return pd.DataFrame()
keys = list(sweep.keys())
value_lists = [sweep[k] for k in keys]
# Basic validation
for k, vals in sweep.items():
if not isinstance(vals, Sequence) or isinstance(vals, (str, bytes)):
raise TypeError(f"sweep[{k!r}] must be a non-string sequence of values.")
if len(vals) == 0:
raise ValueError(f"sweep[{k!r}] is empty; provide at least one value.")
rows = [dict(zip(keys, combo)) for combo in product(*value_lists)]
return pd.DataFrame(rows)
# # Example
# df = preview_sweep_table(sweep_params)
# print(f"Total runs: {len(df)}")
# df.head(10)
# # Optional: Display Numeric Values with 3 Significant Figures
# # If you want the notebook preview to show numbers with 3 significant figures, you can set a pandas display format:
# import pandas as pd
# pd.options.display.float_format = "{:.3g}".format
# df = preview_sweep_table(sweep_params)
# df
# # Tip for Large Sweeps
# # If your sweep is very large, preview only a small portion:
# df.sample(20, random_state=0) # random sample of 20 rows
# # or
# df.head(20) # first 20 rows