API Reference
The recommended way to use the API is to start by looking at the final function distribute_compute_config.write_config_to_file() and work backwards
to all the constituent functions.
- distribute_compute_config.apptainer_config(meta: Meta, description: Description, slurm: Slurm | None = None) ApptainerConfig
assemble the outputs of metadata() and description() into a config file that can be written to disk
- Parameters:
meta – output of the metadata() function
description – output of the description() function
slurm – global-level attributes to pass to Slurm, created from slurm(). These values will be overridden by job-level Slurm attributes specified in job(). If you do not intend to run this job set on a Slurm cluster, you may ignore this value.
You can write the resulting config object to disk with write_config_to_file().
- distribute_compute_config.metadata(namespace: str, batch_name: str, capabilities: List[str], matrix_username: str | None = None) Meta
construct the metadata Meta object for information about what this job batch will run
- Parameters:
namespace (str) – namespace where several batch_name runs may live
batch_name (str) – the name of this batch of jobs
capabilities (List[str]) – required capabilities for your job
matrix_username (Optional[str]) – a matrix username in the format @your_username:homeserver_url
for an apptainer job, capabilities is simply ["apptainer"]. If you need GPU capabilities, use ["apptainer", "gpu"].
Example:
    import distribute_compute_config as distribute

    matrix_user = "@matrx_id:matrix.org"
    batch_name = "test_batch"
    namespace = "test_namespace"
    capabilities = ["apptainer"]

    meta = distribute.metadata(namespace, batch_name, capabilities, matrix_user)
- distribute_compute_config.description(initialize: Initialize, jobs: List[Job]) Description
combines initialization information with a list of jobs to describe the solver files, how they should be started, and what they will run
- Parameters:
initialize – you can create an Initialize struct with initialize()
jobs – information for each job can be created from the job() function
Example:
    import distribute_compute_config as distribute

    initialize = distribute.initialize(
        sif_path="./some/path/to/file.sif",
        required_files=[],
        required_mounts=[]
    )

    job_1_config_file = distribute.file(
        "./path/to/config1.json",
        alias="config.json",
        relative=True
    )
    job_1 = distribute.job("job_1", [job_1_config_file])

    description = distribute.description(initialize, jobs=[job_1])
- distribute_compute_config.file(path: str, relative=False, alias=None) File
create a file to appear in the /input directory of the solver
- Parameters:
path – a path to the file on disk. path should be an absolute path, unless relative=True is also specified. If relative=True is not specified, the full directory structure to the file must already exist.
relative – a flag for whether the path specified is a relative path. Defaults to False.
alias – how the file should be renamed when it appears in the /input directory of your solver. If no alias is specified, the current name of the file will be used.
For example, a file with the path ./path/to/config_1.json will appear as /input/config_1.json. With alias="config.json", this file will appear as /input/config.json.
Example:
    import distribute_compute_config as distribute

    # config1.json appears as `/input/config.json`
    config_file_1 = distribute.file(
        "./path/to/config1.json",
        alias="config.json",
        relative=True
    )

    # config2.json appears as `/input/config2.json`
    config_file_2 = distribute.file(
        "./path/to/config2.json",
        relative=True
    )

    # config3.json appears as `/input/config3.json`, folder structure
    # `/root/path/to/` must already exist
    config_file_3 = distribute.file("/root/path/to/config3.json")
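The naming rule for files under /input can be sketched in plain Python. This only illustrates the rule as documented here (the alias wins, otherwise the file keeps its current name). `input_name` is a hypothetical helper written for this illustration and is not part of distribute_compute_config.

```python
import posixpath
from typing import Optional


def input_name(path: str, alias: Optional[str] = None) -> str:
    """Return where a file is expected to appear inside the solver.

    Hypothetical helper: mirrors the documented rule and nothing more.
    """
    # Use the alias when one is given, otherwise the file's current name.
    name = alias if alias is not None else posixpath.basename(path)
    return posixpath.join("/input", name)


print(input_name("./path/to/config_1.json"))
print(input_name("./path/to/config_1.json", alias="config.json"))
```

A check like this can be handy when several jobs rely on the same aliased name, such as config.json, inside the container.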
- distribute_compute_config.initialize(sif_path: str, required_files: List[File], required_mounts: List[str]) Initialize
create the initialize section for loading apptainer .sif files, required container mounts, and input files present in all runs.
Also see the documentation for creating a File with the file() function
- Parameters:
sif_path – the path to the .sif file produced by apptainer build
required_files – these files appear in the /input directory of every job, in addition to the File specified in each Job
required_mounts – a list of paths (as strings) inside the container that should be mutable
Example:
    import distribute_compute_config as distribute

    sif_path = "./path/to/some/container.sif"

    initial_condition = distribute.file(
        "./path/to/some/file.h5",
        relative=True,
        alias="initial_condition.h5"
    )
    required_files = [initial_condition]

    required_mounts = ["/solver/extra_mount"]

    initialize = distribute.initialize(sif_path, required_files, required_mounts)
- distribute_compute_config.job(name: str, required_files: List[File], slurm: Slurm | None = None) Job
creates a job from its name and the required input files
once you have a list of jobs that should be run in the batch, move on to creating a description()
Also see the documentation for creating a File with the file() function
- Parameters:
name – the name of the job. It should be unique in combination with the batch_name of this job.
required_files – a Python list of files that should appear in the /input directory when this job is run, along with the required_files specified in initialize().
slurm – job-level attributes to pass to Slurm, created in slurm(). These values will override global Slurm attributes specified in apptainer_config(). If you do not intend to run this job set on a Slurm cluster, you may ignore this value.
name should therefore be unique within this batch, since batch_name remains constant. The name should be slightly descriptive of what the job will be running. This will make it easier to use distribute pull to download the files later.
File types can be constructed with the file() function.
Example:
    import distribute_compute_config as distribute

    job_1_config_file = distribute.file(
        "./path/to/config1.json",
        alias="config.json",
        relative=True
    )
    job_1_required_files = [job_1_config_file]

    job_1 = distribute.job("job_1", job_1_required_files)
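Since job names should be both unique within a batch and somewhat descriptive, one possible approach is to derive them from the parameters being swept. The sketch below is plain Python and does not call the library. `sweep_job_names` and the parameter names `grid` and `steps` are hypothetical, chosen only for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence


def sweep_job_names(params: Dict[str, Sequence[object]]) -> List[str]:
    """Build one descriptive name per parameter combination.

    Raises ValueError if two combinations would produce the same name.
    """
    keys = list(params)
    names = []
    for values in product(*(params[k] for k in keys)):
        # e.g. ("grid", 64), ("steps", 1000) -> "grid64_steps1000"
        names.append("_".join(f"{k}{v}" for k, v in zip(keys, values)))
    if len(set(names)) != len(names):
        raise ValueError("job names must be unique within a batch")
    return names


names = sweep_job_names({"grid": [64, 128], "steps": [1000, 2000]})
print(names)
```

Each resulting string could then be passed as the name argument of job(), together with the files for that combination.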
- distribute_compute_config.write_config_to_file(config: ApptainerConfig, path: str)
write an apptainer_config() to a path
- Parameters:
config – output of the apptainer_config() function
path – the path that the config file should be written to, usually with the name distribute-jobs.yaml
Example:
See the User Documentation page on the Python API for a worked example
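As a rough orientation, the functions on this page can be chained as sketched below. The sketch uses only the signatures documented here. The paths and names are placeholders, and `build_and_write` is a hypothetical wrapper that takes the imported module as an argument so the snippet can be loaded without the package installed. It is not a replacement for the worked example in the User Documentation.

```python
def build_and_write(distribute, path="distribute-jobs.yaml"):
    """Assemble a one-job batch and write its config file.

    `distribute` is expected to be the imported distribute_compute_config
    module. All paths below are placeholders.
    """
    meta = distribute.metadata("test_namespace", "test_batch", ["apptainer"])

    initialize = distribute.initialize(
        sif_path="./path/to/some/container.sif",
        required_files=[],
        required_mounts=[],
    )

    config_file = distribute.file(
        "./path/to/config1.json", alias="config.json", relative=True
    )
    job_1 = distribute.job("job_1", [config_file])

    description = distribute.description(initialize, jobs=[job_1])

    # No global Slurm attributes are passed here (slurm defaults to None).
    config = distribute.apptainer_config(meta, description)
    distribute.write_config_to_file(config, path)
    return config
```

With the package installed, this would be called as build_and_write(distribute) after `import distribute_compute_config as distribute`.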
- distribute_compute_config.slurm(job_name: str | None = None, output: str | None = None, nodes: int | None = None, ntasks: int | None = None, cpus_per_task: int | None = None, mem_per_cpu: str | None = None, hint: str | None = None, time: str | None = None, partition: str | None = None, account: str | None = None, mail_user: str | None = None, mail_type: str | None = None) Slurm
Configure the attributes that will be passed to Slurm if you intend to run the job set on a cluster.
- Parameters:
job_name – the name of the job as it will appear in Slurm. Defaults to the job name chosen in the distribute configuration file.
output – file name where the stdout of the process will be written.
nodes – the number of Slurm nodes to use. This essentially corresponds to the number of physical CPU units you would like your job to run across. On pronghorn, with no multithreading, more than 32 ntasks will require more than 1 node.
ntasks – the number of tasks to use in Slurm. This should correspond to the number of MPI processes you would like to use. If you would execute your job with mpirun -np 4, then this value would be 4.
cpus_per_task – The number of CPUs that each task will use. Most likely, this parameter should be set to 1. If you have 16 physical cores on a CPU, and ntasks=4, and you want to fully utilize the CPU, this number would be set to 4.
mem_per_cpu – The amount of memory that each CPU should be allocated. To specify an amount in gigabytes (megabytes), use G (M) as the suffix. For example, requesting 100 megabytes of memory for each CPU would be 100M.
hint – Any hints you want to pass to Slurm. This can be blank, or possibly nomultithread as well as any other valid Slurm hint.
time – The amount of time your job will require, in the format HH:MM:SS. For a 1 hour and 20 minute job, this would be 01:20:00.
partition – The Slurm partition you wish to run on. This is probably cpu-core-0 on pronghorn for CPU tasks.
account – The billing account attached to this job.
mail_user – An email address to send mail to after the job completes.
mail_type – Email type. Possibly ALL.
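The string formats and the cpus_per_task arithmetic described in the parameter list can be sketched with a few small helpers. These are hypothetical convenience functions written for this illustration. They are not provided by distribute_compute_config and only reproduce the examples given in the list.

```python
def slurm_time(hours: int = 0, minutes: int = 0, seconds: int = 0) -> str:
    """Format a duration as HH:MM:SS, carrying overflowing minutes/seconds."""
    total = hours * 3600 + minutes * 60 + seconds
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def cpus_per_task_for(physical_cores: int, ntasks: int) -> int:
    """Cores each task gets when ntasks tasks fully share one CPU."""
    if ntasks <= 0 or physical_cores % ntasks != 0:
        raise ValueError("ntasks must be positive and divide physical_cores")
    return physical_cores // ntasks


def mem_megabytes(amount: int) -> str:
    """Memory request in megabytes, using the M suffix."""
    return f"{amount}M"


print(slurm_time(hours=1, minutes=20))   # 1 hour 20 minutes
print(cpus_per_task_for(16, 4))          # 16 cores shared by 4 tasks
print(mem_megabytes(100))                # 100 megabytes per CPU
```

The returned values are intended for the time, cpus_per_task, and mem_per_cpu arguments of slurm().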
When generating Slurm routines, distribute will default all values to the root-level slurm value specified in apptainer_config(), and then override any of these values with the values of job(). With this, you may set global attributes (such as the number of tasks to use, the number of nodes to request, your email, etc), and then override them at the job level with specifics that each job requires.
If all jobs that you are submitting in this batch are homogeneous (for example, same grid size, time step, etc), then there is little need to specify job-level Slurm attributes.
Example:
    import distribute_compute_config as distribute

    slurm = distribute.slurm(
        output = "output.txt",
        nodes = 1,
        ntasks = 4,
        cpus_per_task = 1,
        # 10 megabytes of memory allocated
        mem_per_cpu = "10M",
        hint = "nomultithread",
        # 30 minutes of runtime
        time = "00:30:00",
        partition = "cpu-core-0",
        account = "my_account"
    )
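The override behavior between global and job-level Slurm attributes can be pictured with a small dictionary merge. This is a sketch of the documented semantics only, not the library's implementation. It assumes that an attribute left as None means "not set", and `merge_slurm` is a hypothetical name.

```python
from typing import Any, Dict, Optional


def merge_slurm(
    global_attrs: Dict[str, Optional[Any]],
    job_attrs: Dict[str, Optional[Any]],
) -> Dict[str, Any]:
    """Combine global and job-level attributes; job-level values win.

    Assumption: attributes equal to None are treated as unset and skipped.
    """
    merged = {k: v for k, v in global_attrs.items() if v is not None}
    merged.update({k: v for k, v in job_attrs.items() if v is not None})
    return merged


global_attrs = {"nodes": 1, "ntasks": 4, "time": "00:30:00", "mail_user": None}
job_attrs = {"ntasks": 8, "time": None}

# prints {'nodes': 1, 'ntasks': 8, 'time': '00:30:00'}
print(merge_slurm(global_attrs, job_attrs))
```

Here the job keeps the global node count and time limit and only replaces ntasks, which matches the case where most jobs in a batch are homogeneous.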