Writing action commands in Python
In row, actions execute arbitrary shell commands. When your action is Python code, you must wrap it with command line parsing that takes directories as arguments. There are many ways you can achieve this goal.
This guide shows you how to structure all of your actions in a single file, actions.py. This layout is inspired by row's predecessor signac-flow and its project.py.
tip
If you are familiar with signac-flow, see migrating from signac-flow for many helpful tips.
To demonstrate the structure of a project, let's build a workflow that computes the sum of squares. The focus of this guide is on structure and best practices. You need to think about how your simulation, analysis, data processing, or other code will fit into this structure.
Create the project
First, create the row project:
row init sum_squares --signac
cd sum_squares
Then, create a file populate_workspace.py in the same directory as workflow.toml
with the contents:
"""Populate the workspace."""
import signac
N = 10
project = signac.get_project()
for x in range(N):
    job = project.open_job({'x': x}).init()
Execute:
signac init
python populate_workspace.py
to initialize the signac workspace and populate it with directories.
note
If you are not familiar with signac, then go read the basic tutorial. Come back to the row documentation when you get to the section on workflows. For extra credit, reimplement the signac tutorial workflow in row after you finish reading this guide.
Write actions.py
Now, create a file actions.py with the contents:
"""Implement actions."""
import argparse
import os
import signac
def square(*jobs):
    """Implement the square action.

    Squares the value `x` in each job's statepoint and writes the output to
    `square.out` when complete.
    """
    for job in jobs:
        # If the product already exists, there is no work to do.
        if job.isfile('square.out'):
            continue

        # Open a temporary file so that the action is not completed early or on error.
        with open(job.fn('square.out.in_progress'), 'w') as file:
            x = job.cached_statepoint['x']
            file.write(f'{x**2}')

        # Done! Rename the temporary file to the product file.
        os.rename(job.fn('square.out.in_progress'), job.fn('square.out'))
def compute_sum(*jobs):
    """Implement the compute_sum action.

    Prints the sum of `square.out` from each job directory.
    """
    total = 0
    for job in jobs:
        with open(job.fn('square.out')) as file:
            total += int(file.read())

    print(total)
if __name__ == '__main__':
    # Parse the command line arguments: python actions.py --action <ACTION> [DIRECTORIES]
    parser = argparse.ArgumentParser()
    parser.add_argument('--action', required=True)
    parser.add_argument('directories', nargs='+')
    args = parser.parse_args()

    # Open the signac jobs
    project = signac.get_project()
    jobs = [project.open_job(id=directory) for directory in args.directories]

    # Call the action
    globals()[args.action](*jobs)
This file defines each action as a function with the same name. Each function accepts any number of jobs as positional arguments: def square(*jobs) and def compute_sum(*jobs). The if __name__ == '__main__': block parses the command line arguments, builds a list of signac jobs, and calls the requested action function.
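The square action writes to square.out.in_progress and renames the file only after the write succeeds, so an interrupted run never leaves a partial square.out behind. The following is a minimal sketch of that pattern in isolation. It assumes plain directory paths instead of signac jobs, and write_product is a hypothetical helper name that is not part of row or signac. It also swaps os.rename for os.replace, which overwrites an existing destination on every platform.

```python
import os
import tempfile


def write_product(directory, name, text):
    """Write `text` to `directory/name` via a temporary file and a rename.

    Hypothetical helper for illustration; not part of row or signac.
    """
    final_path = os.path.join(directory, name)
    temporary_path = final_path + '.in_progress'

    # If the write raises, only the temporary file exists and the product
    # is still missing, so the action remains incomplete.
    with open(temporary_path, 'w') as file:
        file.write(text)

    # Publish the finished file under its final name.
    os.replace(temporary_path, final_path)
    return final_path


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as directory:
        path = write_product(directory, 'square.out', str(3**2))
        with open(path) as file:
            print(file.read())
```

Because the product name appears only after the rename, a check such as job.isfile('square.out') can be trusted to mean that the work finished.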
note
This example demonstrates looping over directories in serial. However, this structure also gives you the ability to choose serial or parallel execution. Grouping many directories into a single cluster job submission will increase your workflow's throughput.
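As one illustration of the parallel option, the sketch below processes directories with a pool of threads from the standard library. To stay self-contained it makes two assumptions that differ from the guide: it works on plain directory paths, and it reads x from a hypothetical x.txt file rather than from a signac statepoint. The names square_directory and square_parallel are illustrative only.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def square_directory(directory):
    """Square the value stored in `directory/x.txt` and write `square.out`.

    Assumption: `x.txt` is a stand-in for the signac statepoint.
    """
    product = os.path.join(directory, 'square.out')
    if os.path.isfile(product):
        return

    with open(os.path.join(directory, 'x.txt')) as file:
        x = int(file.read())

    with open(product + '.in_progress', 'w') as file:
        file.write(f'{x**2}')
    os.rename(product + '.in_progress', product)


def square_parallel(directories, max_workers=4):
    """Run square_directory on every directory using a pool of threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # list() forces iteration so that any exception raised by a task
        # propagates to the caller.
        list(executor.map(square_directory, directories))


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        directories = []
        for x in range(4):
            directory = os.path.join(root, str(x))
            os.mkdir(directory)
            with open(os.path.join(directory, 'x.txt'), 'w') as file:
                file.write(str(x))
            directories.append(directory)

        square_parallel(directories)
```

Threads suit this example because the work is dominated by file I/O. For CPU-bound actions, separate processes or MPI are likely a better fit.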
Write workflow.toml
Next, replace the contents of workflow.toml with the corresponding workflow:
[workspace]
value_file = "signac_statepoint.json"
[default.action]
command = "python actions.py --action $ACTION_NAME {directories}"
[[action]]
name = "square"
products = ["square.out"]
resources.walltime.per_directory = "00:00:01"
[[action]]
name = "compute_sum"
previous_actions = ["square"]
resources.walltime.per_directory = "00:00:01"
group.submit_whole = true
Both actions have the same command, set once by the default action:
[default.action]
command = "python actions.py --action $ACTION_NAME {directories}"
python actions.py executes the actions.py file above. The argument --action $ACTION_NAME selects the Python function to call; $ACTION_NAME is an environment variable that row sets in job scripts. The remaining arguments are given by {directories}. Unlike {directory} shown in previous tutorials, {directories} expands to all directories in the submitted group. actions.py is executed once and is free to process the list of directories in any way it chooses (e.g. in serial, with multiprocessing parallelism, with multiple threads, or using MPI parallelism).
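To see how these arguments arrive in Python, the command can be reproduced by hand. The sketch below parses an example argument list the same way actions.py does, but looks the function up in an explicit dictionary instead of globals(), so a mistyped action name produces an argparse error rather than a KeyError. The ACTIONS dictionary, the placeholder functions, and the directory names dir_a and dir_b are assumptions made for this illustration.

```python
import argparse


def square(*directories):
    """Placeholder action that reports the directories it was given."""
    return ('square', directories)


def compute_sum(*directories):
    """Placeholder action that reports the directories it was given."""
    return ('compute_sum', directories)


# Explicit registry of the functions that may be called from the command line.
ACTIONS = {'square': square, 'compute_sum': compute_sum}


def run(argv):
    """Parse `argv` and call the selected action with all directories."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--action', required=True, choices=sorted(ACTIONS))
    parser.add_argument('directories', nargs='+')
    args = parser.parse_args(argv)
    return ACTIONS[args.action](*args.directories)


if __name__ == '__main__':
    # Equivalent to: python actions.py --action square dir_a dir_b
    print(run(['--action', 'square', 'dir_a', 'dir_b']))
```

The globals() lookup in the guide needs no registry to maintain, while an explicit dictionary limits which names can be invoked. Either approach works with the same workflow.toml.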
Execute the workflow
Now, submit the square action:
row submit --action square
and you should see:
Submitting 1 job that may cost up to 0 CPU-hours.
Proceed? [Y/n]: y
[1/1] Submitting action 'square' on directory 04bb77c1bbbb40e55ab9eb22d4c88447 and 9 more.
Next, submit the compute_sum action:
row submit --action compute_sum
and you should see:
Submitting 1 job that may cost up to 0 CPU-hours.
Proceed? [Y/n]: y
[1/1] Submitting action 'compute_sum' on directory 04bb77c1bbbb40e55ab9eb22d4c88447 and 9 more.
285
It worked! compute_sum printed the result 285.
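As a quick sanity check that does not depend on row or signac, the same total can be computed directly from the statepoint values x = 0 through 9 created by populate_workspace.py:

```python
N = 10

# Sum of x**2 for the same values of x used to populate the workspace.
total = sum(x**2 for x in range(N))
print(total)  # 285
```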
note
If you are on a cluster, use --cluster=none or wait for jobs to complete
after submitting.
Applying this structure to your workflows
With this structure in place, you can add new actions to your workflow following these steps:
- Write a function def action(*jobs) in actions.py.
- Add:
      [[action]]
      name = "action"
      # And other relevant options
  to your workflow.toml file.
note
You may write functions that take only one job, def action(job), without modifying the given implementation of __main__. However, you will need to set action.group.maximum_size = 1 or use {directory} to ensure that actions.py is given a single directory.
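If you would rather keep single-job functions and still pass whole groups with {directories}, one option is a small adapter that loops over the jobs for you. The per_job decorator below is a hypothetical helper, not part of row or signac, and the sketch uses plain strings as stand-ins for job objects.

```python
import functools


def per_job(function):
    """Adapt `function(job)` to the `function(*jobs)` calling convention."""

    @functools.wraps(function)
    def wrapper(*jobs):
        # Call the single-job function once per job, in serial.
        return [function(job) for job in jobs]

    return wrapper


@per_job
def describe(job):
    """Example single-job action; `job` is a plain string in this sketch."""
    return f'processed {job}'


if __name__ == '__main__':
    print(describe('job_a', 'job_b'))
```

Because the decorated function keeps its module-level name, a lookup by action name such as globals()[args.action] would still find it.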
Next steps
In this guide, you learned how to write workflow action commands in Python. Now, you should know everything you need to build complex workflows with row and deploy them on HPC resources.
Development of row is led by the Glotzer Group at the University of Michigan.
Copyright © 2024-2025 The Regents of the University of Michigan.