Jobs
Module Functions
- dapi.jobs.generate_job_request(tapis_client, app_id, input_dir_uri, script_filename=None, app_version=None, job_name=None, description=None, tags=None, max_minutes=None, node_count=None, cores_per_node=None, memory_mb=None, queue=None, allocation=None, archive_system=None, archive_path=None, extra_file_inputs=None, extra_app_args=None, extra_env_vars=None, extra_scheduler_options=None, script_param_names=['Input Script', 'Main Script', 'tclScript'], input_dir_param_name='Input Directory', allocation_param_name='TACC Allocation')
Generate a Tapis job request dictionary based on app definition and inputs.
Creates a properly formatted job request dictionary by retrieving the specified application details and applying user-provided overrides and additional parameters. The function automatically maps the script filename (if provided) and input directory to the appropriate app parameters. It dynamically reads the app definition to detect parameter names, determines whether to use appArgs or envVariables, and automatically populates all required parameters with default values when available.
- Parameters:
tapis_client (Tapis) – Authenticated Tapis client instance.
app_id (str) – The ID of the Tapis application to use for the job.
input_dir_uri (str) – Tapis URI to the input directory containing job files.
script_filename (str, optional) – Name of the main script file to execute. If None (default), no script parameter is added. This is suitable for apps like OpenFOAM that don’t take a script argument.
app_version (str, optional) – Specific app version to use. If None, uses latest.
job_name (str, optional) – Custom job name. If None, auto-generates based on app ID and timestamp.
description (str, optional) – Job description. If None, uses app description.
tags (List[str], optional) – List of tags to associate with the job.
max_minutes (int, optional) – Maximum runtime in minutes. Overrides app default.
node_count (int, optional) – Number of compute nodes. Overrides app default.
cores_per_node (int, optional) – Cores per node. Overrides app default.
memory_mb (int, optional) – Memory in MB. Overrides app default.
queue (str, optional) – Execution queue name. Overrides app default.
allocation (str, optional) – TACC allocation to charge for compute time.
archive_system (str, optional) – Archive system for job outputs. If “designsafe” is specified, uses “designsafe.storage.default”. If None, uses app default.
archive_path (str, optional) – Archive directory path. Can be a full path or just a directory name in MyData (e.g., “tapis-jobs-archive”). If None and archive_system is “designsafe”, defaults to “${EffectiveUserId}/tapis-jobs-archive/${JobCreateDate}/${JobUUID}”.
extra_file_inputs (List[Dict[str, Any]], optional) – Additional file inputs beyond the main input directory.
extra_app_args (List[Dict[str, Any]], optional) – Additional application arguments. Use for parameters expected in ‘appArgs’ by the Tapis app.
extra_env_vars (List[Dict[str, Any]], optional) – Additional environment variables. Use for parameters expected in ‘envVariables’ by the Tapis app (e.g., OpenFOAM solver, mesh). Each item should be a dict like {“key”: “VAR_NAME”, “value”: “var_value”}.
extra_scheduler_options (List[Dict[str, Any]], optional) – Additional scheduler options.
script_param_names (List[str], optional) – Parameter names/keys to check for script placement if script_filename is provided. Defaults to [“Input Script”, “Main Script”, “tclScript”].
input_dir_param_name (str, optional) – The ‘name’ of the fileInput in the Tapis app definition that corresponds to input_dir_uri. Defaults to “Input Directory”. The function will auto-detect the correct name from the app definition.
allocation_param_name (str, optional) – Parameter name for TACC allocation. Defaults to “TACC Allocation”.
- Returns:
Complete job request dictionary ready for submission to Tapis.
- Return type:
Dict[str, Any]
- Raises:
AppDiscoveryError – If the specified app cannot be found or details cannot be retrieved.
ValueError – If required parameters are missing, invalid, or if script_filename is provided but a suitable placement (matching script_param_names) cannot be found in the app’s parameterSet.
JobSubmissionError – If unexpected errors occur during job request generation.
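The extra_env_vars format described above (a list of dicts with “key” and “value” entries) can be built from an ordinary mapping. The following is a minimal sketch, assuming a hypothetical helper named to_env_vars that is not part of dapi; the variable names and values in the usage lines are placeholders, not parameters of any particular Tapis app.

```python
from typing import Any, Dict, List, Mapping


def to_env_vars(variables: Mapping[str, Any]) -> List[Dict[str, str]]:
    """Hypothetical helper (not part of dapi).

    Converts a plain mapping into the list-of-dicts shape documented
    for the extra_env_vars parameter.
    """
    # Assumption: values are passed as strings, so everything is stringified.
    return [{"key": str(name), "value": str(value)} for name, value in variables.items()]


# Placeholder variable names and values, for illustration only.
env_vars = to_env_vars({"solver": "pisoFoam", "mesh": "On"})
print(env_vars)
```

The resulting list could then be passed as extra_env_vars=env_vars when calling generate_job_request.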
- dapi.jobs.submit_job_request(tapis_client, job_request)
Submit a pre-generated job request dictionary to Tapis.
Takes a complete job request dictionary (typically generated by generate_job_request) and submits it to the Tapis jobs service for execution. Prints the job request details before submission for debugging purposes.
- Parameters:
tapis_client (Tapis) – Authenticated Tapis client instance.
job_request (Dict[str, Any]) – Complete job request dictionary containing all necessary job parameters, file inputs, and configuration.
- Returns:
A SubmittedJob object for monitoring and managing the submitted job.
- Return type:
SubmittedJob
- Raises:
ValueError – If job_request is not a dictionary.
JobSubmissionError – If the Tapis job submission fails, with additional context from the HTTP request and response when available.
Example
>>> job_request = generate_job_request(...)
>>> submitted_job = submit_job_request(client, job_request)
--- Submitting Tapis Job Request ---
{
  "name": "matlab-r2023a-20231201_143022",
  "appId": "matlab-r2023a",
  …
}
Job submitted successfully. UUID: 12345678-1234-1234-1234-123456789abc
- dapi.jobs.get_job_status(t, job_uuid)
Get the current status of a job by UUID.
Standalone convenience function that creates a temporary SubmittedJob instance to retrieve the current status of an existing job.
- Parameters:
t (Tapis) – Authenticated Tapis client instance.
job_uuid (str) – The UUID of the job to check.
- Returns:
Current job status (e.g., “QUEUED”, “RUNNING”, “FINISHED”, “FAILED”).
- Return type:
str
- Raises:
JobMonitorError – If status retrieval fails.
TypeError – If t is not a Tapis instance.
ValueError – If job_uuid is empty or invalid.
Example
>>> status = get_job_status(client, "12345678-1234-1234-1234-123456789abc")
>>> print(f"Job status: {status}")
- dapi.jobs.get_runtime_summary(t, job_uuid, verbose=False)
Print a runtime summary for a job by UUID.
Standalone convenience function that creates a temporary SubmittedJob instance to analyze and print the runtime summary of an existing job.
- Parameters:
t (Tapis) – Authenticated Tapis client instance.
job_uuid (str) – The UUID of the job to analyze.
verbose (bool, optional) – If True, prints detailed job history events in addition to the runtime summary. Defaults to False.
- Raises:
JobMonitorError – If job details cannot be retrieved.
TypeError – If t is not a Tapis instance.
ValueError – If job_uuid is empty or invalid.
Example
>>> get_runtime_summary(client, "12345678-1234-1234-1234-123456789abc")
Runtime Summary
---------------
QUEUED time: 00:05:30
RUNNING time: 01:23:45
TOTAL time: 01:29:15
---------------
- dapi.jobs.interpret_job_status(final_status, job_uuid=None)
Print a user-friendly interpretation of a job status.
Provides human-readable explanations for various job status values, including both standard Tapis states and special monitoring states.
- Parameters:
final_status (str) – The job status to interpret. Can be a standard Tapis status (“FINISHED”, “FAILED”, etc.) or a special monitoring status (STATUS_TIMEOUT, STATUS_INTERRUPTED, etc.).
job_uuid (str, optional) – The job UUID to include in the message for context. If None, uses generic “Job” in the message. Defaults to None.
Example
>>> interpret_job_status("FINISHED", "12345678-1234-1234-1234-123456789abc")
Job 12345678-1234-1234-1234-123456789abc completed successfully.
>>> interpret_job_status("FAILED")
Job failed. Check logs or job details.
>>> interpret_job_status(STATUS_TIMEOUT, "12345678-1234-1234-1234-123456789abc")
Job 12345678-1234-1234-1234-123456789abc monitoring timed out.
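The status-to-message mapping shown in the example can be pictured as a simple lookup. This is an illustrative stand-in, not dapi's implementation: describe_status is a hypothetical name, it returns the message instead of printing it, the three messages are copied from the example output above, and the fallback wording for other statuses is an assumption.

```python
from typing import Optional

STATUS_TIMEOUT = "TIMEOUT"  # value as documented under Status Constants


def describe_status(final_status: str, job_uuid: Optional[str] = None) -> str:
    """Hypothetical stand-in for interpret_job_status (returns, does not print)."""
    # Use the generic word "Job" when no UUID is supplied.
    subject = f"Job {job_uuid}" if job_uuid else "Job"
    messages = {
        "FINISHED": f"{subject} completed successfully.",
        "FAILED": f"{subject} failed. Check logs or job details.",
        STATUS_TIMEOUT: f"{subject} monitoring timed out.",
    }
    # Assumption: the wording for unlisted statuses is invented for this sketch.
    return messages.get(final_status, f"{subject} ended with status: {final_status}")


print(describe_status("FAILED"))
```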
- dapi.jobs.list_jobs(tapis_client, app_id=None, status=None, limit=100, output='df', verbose=False)
Fetch Tapis jobs with optional filtering.
Retrieves jobs from Tapis ordered by creation date (newest first) and optionally filters by app ID and/or status. Filters are applied client-side after fetching.
- Parameters:
tapis_client (Tapis) – Authenticated Tapis client instance.
app_id (str | None) – Filter by application ID (e.g., “opensees-mp-s3”).
status (str | None) – Filter by job status (e.g., “FINISHED”, “FAILED”). Case-insensitive.
limit (int) – Maximum number of jobs to fetch from Tapis. Defaults to 100.
output (str) – Output format. “df” returns a pandas DataFrame (default), “list” returns a list of dicts, “raw” returns the raw TapisResult objects.
verbose (bool) – If True, prints the number of jobs found.
- Returns:
“df”: pandas DataFrame with formatted datetime columns.
“list”: list of dicts with job metadata.
“raw”: list of TapisResult objects as returned by the API.
- Return type:
Depends on output
- Raises:
JobMonitorError – If the Tapis API call fails.
ValueError – If output format is not recognized.
Example
>>> df = list_jobs(t, app_id="matlab-r2023a", status="FINISHED")
>>> jobs = list_jobs(t, output="list")
>>> raw = list_jobs(t, limit=10, output="raw")
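Because the filters are applied client-side after fetching, the limit bounds how many jobs are fetched, not how many survive filtering. A minimal sketch of that filtering step follows, assuming each job is a dict with “appId” and “status” keys; the function name filter_jobs and the sample records are hypothetical and do not come from dapi.

```python
from typing import Any, Dict, List, Optional


def filter_jobs(
    jobs: List[Dict[str, Any]],
    app_id: Optional[str] = None,
    status: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical client-side filter over already-fetched job records."""
    # Status matching is case-insensitive, as documented for list_jobs.
    wanted_status = status.upper() if status else None
    return [
        job
        for job in jobs
        if (app_id is None or job.get("appId") == app_id)
        and (wanted_status is None or str(job.get("status", "")).upper() == wanted_status)
    ]


# Made-up records standing in for a fetched page of jobs.
fetched = [
    {"uuid": "a", "appId": "matlab-r2023a", "status": "FINISHED"},
    {"uuid": "b", "appId": "opensees-mp-s3", "status": "FAILED"},
    {"uuid": "c", "appId": "matlab-r2023a", "status": "RUNNING"},
]
finished_matlab = filter_jobs(fetched, app_id="matlab-r2023a", status="finished")
print(finished_matlab)
```

If a filter matches few of the most recent jobs, raising limit is the way to look further back.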
SubmittedJob
- class dapi.jobs.SubmittedJob(tapis_client, job_uuid)
Bases: object
Represents a submitted Tapis job with methods for monitoring and management.
This class provides a high-level interface for interacting with Tapis jobs, including status monitoring, output retrieval, job cancellation, and runtime analysis. It caches job details and status to minimize API calls.
- Parameters:
tapis_client (Tapis)
job_uuid (str)
- uuid
The unique identifier of the Tapis job.
- Type:
str
- TERMINAL_STATES
List of job states that indicate completion.
- Type:
List[str]
Example
>>> job = SubmittedJob(client, "12345678-1234-1234-1234-123456789abc")
>>> status = job.status
>>> if status in job.TERMINAL_STATES:
...     print("Job completed")
>>> final_status = job.monitor(timeout_minutes=60)
- TERMINAL_STATES = ['FINISHED', 'FAILED', 'CANCELLED', 'STOPPED', 'ARCHIVING_FAILED']
- __init__(tapis_client, job_uuid)
Initialize a SubmittedJob instance for an existing Tapis job.
- Parameters:
tapis_client (Tapis) – Authenticated Tapis client instance.
job_uuid (str) – The UUID of an existing Tapis job.
- Raises:
TypeError – If tapis_client is not a Tapis instance.
ValueError – If job_uuid is empty or not a string.
- property details: Tapis
Get cached job details, fetching from Tapis if not already cached.
- Returns:
Complete job details object containing all job metadata, configuration, and current state information.
- Return type:
Tapis
- property status: str
Get the current job status, using cached value when appropriate.
For terminal states, returns the cached status without making an API call. For non-terminal states, may fetch fresh status depending on cache state.
- Returns:
Current job status (e.g., “QUEUED”, “RUNNING”, “FINISHED”, “FAILED”). Returns STATUS_UNKNOWN if status cannot be determined.
- Return type:
str
- get_status(force_refresh=True)
Get the current job status from Tapis API.
- Parameters:
force_refresh (bool, optional) – If True, always makes a fresh API call. If False, may return cached status. Defaults to True.
- Returns:
Current job status from Tapis API.
- Return type:
str
- Raises:
JobMonitorError – If status cannot be retrieved from Tapis.
- property last_message: str | None
Get the last status message recorded for the job.
Retrieves the most recent status message from the job details, which typically contains information about the current job state or any errors that have occurred.
- Returns:
The last status message if available and non-empty, otherwise None. Empty strings are treated as None.
- Return type:
str or None
Note
Returns None if job details cannot be retrieved or if no message is available. Does not raise exceptions for retrieval failures.
- monitor(interval=15, timeout_minutes=None)
Monitor job status with progress bars until completion or timeout.
Continuously monitors the job status using tqdm progress bars to show progress through different job phases (waiting, running). Handles interruptions and errors gracefully.
- Parameters:
interval (int, optional) – Status check interval in seconds. Defaults to 15.
timeout_minutes (int, optional) – Maximum monitoring time in minutes. If None, uses the job’s maxMinutes from its configuration. Use -1 or 0 for unlimited monitoring. Defaults to None.
- Returns:
Final job status. Can be a standard Tapis status (“FINISHED”, “FAILED”, etc.) or a special monitoring status:
- STATUS_TIMEOUT: Monitoring timed out
- STATUS_INTERRUPTED: User interrupted monitoring (Ctrl+C)
- STATUS_MONITOR_ERROR: Error occurred during monitoring
- Return type:
str
Example
>>> job = SubmittedJob(client, job_uuid)
>>> final_status = job.monitor(interval=30, timeout_minutes=120)
Monitoring Job: 12345678-1234-1234-1234-123456789abc
Waiting for job to start: 100%|████████| 12 checks
Monitoring job: 100%|████████████| 45/45 checks
Status: FINISHED
>>> if final_status == "FINISHED":
...     print("Job completed successfully!")
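The polling behaviour that monitor describes (check at a fixed interval, stop on a terminal state or when the timeout elapses) can be sketched without Tapis. This is a simplified illustration, not dapi's code: poll_until_done is a hypothetical name, the progress bars and interrupt handling are omitted, and None is treated as unlimited here, whereas monitor falls back to the job's maxMinutes.

```python
import time
from typing import Callable, Optional

# Values as documented for SubmittedJob.TERMINAL_STATES and STATUS_TIMEOUT.
TERMINAL_STATES = ["FINISHED", "FAILED", "CANCELLED", "STOPPED", "ARCHIVING_FAILED"]
STATUS_TIMEOUT = "TIMEOUT"


def poll_until_done(
    get_status: Callable[[], str],
    interval: float = 15,
    timeout_minutes: Optional[float] = None,
) -> str:
    """Hypothetical polling loop: returns a terminal status or STATUS_TIMEOUT."""
    # Assumption for this sketch: None, 0, or a negative value means no deadline.
    deadline = None
    if timeout_minutes is not None and timeout_minutes > 0:
        deadline = time.monotonic() + timeout_minutes * 60
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if deadline is not None and time.monotonic() >= deadline:
            return STATUS_TIMEOUT
        time.sleep(interval)


# A fake status source standing in for repeated API calls.
fake_statuses = iter(["PENDING", "QUEUED", "RUNNING", "FINISHED"])
final = poll_until_done(lambda: next(fake_statuses), interval=0)
print(final)
```

Using time.monotonic for the deadline keeps the timeout unaffected by system clock adjustments.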
- print_runtime_summary(verbose=False)
Print a summary of job runtime phases and total execution time.
Analyzes the job’s execution history to show time spent in different phases (queued, running) and calculates the total runtime from submission to completion.
- Parameters:
verbose (bool, optional) – If True, prints detailed job history events in addition to the runtime summary. Defaults to False.
Example
>>> job.print_runtime_summary()
Runtime Summary
---------------
QUEUED time: 00:05:30
RUNNING time: 01:23:45
TOTAL time: 01:29:15
---------------
>>> job.print_runtime_summary(verbose=True)
Detailed Job History:
Event: JOB_NEW_STATUS, Detail: PENDING, Time: 2023-12-01T14:30:22.123456Z
Event: JOB_NEW_STATUS, Detail: QUEUED, Time: 2023-12-01T14:30:25.234567Z
…
Summary:
QUEUED time: 00:05:30
RUNNING time: 01:23:45
TOTAL time: 01:29:15
---------------
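The phase times in the summary are differences between consecutive status-change timestamps. The sketch below shows one way to derive them, assuming events are available as (status, timestamp) pairs in the ISO format seen in the verbose output; the helper names and the sample timestamps are hypothetical, chosen only so the results match the durations in the example.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def parse_ts(ts: str) -> datetime:
    # Matches timestamps such as 2023-12-01T14:30:22.123456Z.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def format_hms(delta: timedelta) -> str:
    # Render a duration as HH:MM:SS, dropping fractional seconds.
    hours, rem = divmod(int(delta.total_seconds()), 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def phase_durations(events: List[Tuple[str, str]]) -> Dict[str, timedelta]:
    """Time spent in each status: from its event until the next event."""
    durations: Dict[str, timedelta] = {}
    for (status, start), (_, end) in zip(events, events[1:]):
        durations[status] = durations.get(status, timedelta()) + (parse_ts(end) - parse_ts(start))
    return durations


# Made-up history, ordered by time.
history = [
    ("QUEUED", "2023-12-01T14:30:25.000000Z"),
    ("RUNNING", "2023-12-01T14:35:55.000000Z"),
    ("FINISHED", "2023-12-01T15:59:40.000000Z"),
]
phases = phase_durations(history)
total = parse_ts(history[-1][1]) - parse_ts(history[0][1])
for name, spent in phases.items():
    print(f"{name} time: {format_hms(spent)}")
print(f"TOTAL time: {format_hms(total)}")
```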
- property archive_uri: str | None
Get the Tapis URI of the job’s archive directory.
Returns the URI where job outputs are stored after completion. The archive directory contains all job outputs, logs, and metadata.
- Returns:
Tapis URI of the archive directory if available, otherwise None if archive information is not set.
- Return type:
str or None
Example
>>> uri = job.archive_uri
>>> if uri:
...     print(f"Job outputs at: {uri}")
...     files = client.files.list(uri)
- list_outputs(path='/', limit=100, offset=0)
List files and directories in the job’s archive directory.
- Parameters:
path (str, optional) – Relative path within the job archive to list. Defaults to “/” (archive root).
limit (int, optional) – Maximum number of items to return. Defaults to 100.
offset (int, optional) – Number of items to skip for pagination. Defaults to 0.
- Returns:
List of file and directory objects in the specified path.
- Return type:
List[Tapis]
- Raises:
FileOperationError – If archive information is not available, the path cannot be accessed, or listing fails.
Example
>>> outputs = job.list_outputs()
>>> for item in outputs:
...     print(f"{item.name} ({item.type})")
tapisjob.out (file)
tapisjob.err (file)
results/ (dir)
>>> results = job.list_outputs(path="results/")
- download_output(remote_path, local_target)
Download a specific file from the job’s archive directory.
- Parameters:
remote_path (str) – Relative path to the file within the job archive.
local_target (str) – Local filesystem path where the file should be saved.
- Raises:
FileOperationError – If archive information is not available or download fails.
Example
>>> job.download_output("tapisjob.out", "/local/job_output.txt")
>>> job.download_output("results/data.txt", "/local/results/data.txt")
- get_output_content(output_filename, max_lines=None, missing_ok=True)
Retrieve the content of a specific output file from the job’s archive.
Fetches and returns the content of a file from the job’s archive directory as a string. Useful for examining log files, output files, and error files.
- Parameters:
output_filename (str) – Name of the file in the job’s archive root (e.g., “tapisjob.out”, “tapisjob.err”, “results.txt”).
max_lines (int, optional) – If specified, returns only the last N lines of the file. Useful for large log files. Defaults to None (full file).
missing_ok (bool, optional) – If True and the file is not found, returns None. If False and not found, raises FileOperationError. Defaults to True.
- Returns:
Content of the file as a string, or None if the file is not found and missing_ok=True.
- Return type:
str or None
- Raises:
FileOperationError – If the job archive is not available, the file is not found (and missing_ok=False), or if there’s an error fetching the file.
Example
>>> # Get job output log
>>> output = job.get_output_content("tapisjob.out")
>>> if output:
...     print(output)
>>> # Get last 50 lines of error log
>>> errors = job.get_output_content("tapisjob.err", max_lines=50)
>>> # Require file to exist (raise error if missing)
>>> results = job.get_output_content("results.txt", missing_ok=False)
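The max_lines behaviour (keep only the last N lines) can be illustrated on a plain string. This is a sketch of the documented semantics, not dapi's code; tail_lines is a hypothetical name, and returning an empty string for non-positive values is an assumption made for the sketch.

```python
from typing import Optional


def tail_lines(content: str, max_lines: Optional[int] = None) -> str:
    """Hypothetical helper: return the last max_lines lines of content."""
    if max_lines is None:
        return content  # full file, as with max_lines=None
    if max_lines <= 0:
        return ""  # assumption: nothing requested, nothing returned
    # A [-0:] slice would return every line, hence the guard above.
    return "\n".join(content.splitlines()[-max_lines:])


log_text = "step 1\nstep 2\nstep 3\nerror: solver diverged\n"
print(tail_lines(log_text, max_lines=2))
```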
- cancel()
Attempt to cancel the job execution.
Sends a cancellation request to Tapis. Note that cancellation may not be immediate and depends on the job’s current state and the execution system.
- Raises:
JobMonitorError – If the cancellation request fails or encounters an error.
Note
Jobs that are already in terminal states cannot be cancelled. The method will print the current status if cancellation is not possible.
Example
>>> job.cancel()
Attempting to cancel job 12345678-1234-1234-1234-123456789abc...
Cancel request sent for job 12345678-1234-1234-1234-123456789abc. Status may take time to update.
Status Constants
- dapi.jobs.STATUS_TIMEOUT = 'TIMEOUT'
Special monitoring status returned by SubmittedJob.monitor when monitoring times out.
- dapi.jobs.STATUS_INTERRUPTED = 'INTERRUPTED'
Special monitoring status returned by SubmittedJob.monitor when the user interrupts monitoring (Ctrl+C).
- dapi.jobs.STATUS_MONITOR_ERROR = 'MONITOR_ERROR'
Special monitoring status returned by SubmittedJob.monitor when an error occurs during monitoring.
- dapi.jobs.STATUS_UNKNOWN = 'UNKNOWN'
Status returned by the SubmittedJob.status property when the job status cannot be determined.
- dapi.jobs.TAPIS_TERMINAL_STATES = ['FINISHED', 'FAILED', 'CANCELLED', 'STOPPED', 'ARCHIVING_FAILED']
List of Tapis job states that indicate completion.