Computing Fast at Scale

Running Julia on a Slurm Cluster

Here is a simple example of how to run a Julia script on a Slurm cluster. If you want to run a Julia script with multiple workers, you allocate some nodes and then have ClusterManagers use srun to start Julia worker processes on those nodes.
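
At its core this is one call to the SlurmManager from ClusterManagers; a minimal sketch of the idea (the full scripts follow below):

using Distributed, ClusterManagers

# launch 4 workers via srun inside the current Slurm allocation
addprocs(SlurmManager(4))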

See the main.sh script for an example.

main.sh


#!/bin/sh

# start an allocation with 4 nodes and 2 CPUs per node, then run the batch
# script, which will start multiple Julia processes in a Julia cluster
salloc --nodes=4 --cpus-per-task=2 sbatch julia.sbatch

This runs the following batch script, which acquires resources within the allocation and starts the main Julia process.

julia.sbatch

#!/bin/sh
#SBATCH --time=00:15:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1

# the resources requested above must fit within the allocation

# load the julia module so that the paths are set up correctly
module load julia

# start the Julia script, which will srun its own worker processes
julia slurm.jl

Julia is responsible for starting the workers with srun.

slurm.jl


#= Julia code for launching jobs on the Slurm cluster.

This code is expected to be run from an sbatch script after a
`module load julia` command has been run.
It starts the remote processes with srun within an allocation.
If you get an error, make sure the ClusterManagers package is
installed and up to date.
=#
using Distributed  # provides addprocs, workers, @spawnat, and rmprocs

try
	using ClusterManagers
catch
	# install the package on first use
	using Pkg
	Pkg.add("ClusterManagers")
end

using ClusterManagers
# Arguments to the Slurm srun(1) command can be given as keyword
# arguments to addprocs.  The argument name and value is translated to
# a srun(1) command line argument as follows:
# 1) If the length of the argument is 1 => "-arg value",
#    e.g. t="0:1:0" => "-t 0:1:0"
# 2) If the length of the argument is > 1 => "--arg=value"
#    e.g. time="0:1:0" => "--time=0:1:0"
# 3) If the value is the empty string, it becomes a flag value,
#    e.g. exclusive="" => "--exclusive"
# 4) If the argument contains "_", they are replaced with "-",
#    e.g. mem_per_cpu=100 => "--mem-per-cpu=100"
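
# For example, a hypothetical call combining these rules:
#   addprocs(SlurmManager(4), t="00:05:00", exclusive="", mem_per_cpu=100)
# would pass srun the flags: -t 00:05:00 --exclusive --mem-per-cpu=100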

np = 4  # number of workers, one per allocated node
addprocs(SlurmManager(np), t="00:05:00")

hosts = []
pids = []
println("We are all connected and ready.")
# ask each worker for its hostname and process id
for i in workers()
	host, pid = fetch(@spawnat i (gethostname(), getpid()))
	println(host, pid)
	push!(hosts, host)
	push!(pids, pid)
end

# The Slurm resource allocation is released when all the workers have
# exited
for i in workers()
	rmprocs(i)
end
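
Between starting the workers and removing them you can, of course, dispatch real work to the pool. Here is a minimal sketch; the pmap call and the squaring function are illustrative additions, not part of the script above:

# distribute a toy computation across the Slurm workers
squares = pmap(x -> x^2, 1:2*np)
println(squares)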

You can print the output with head *.out, which will look something like this:

head *.out

==> julia/job0000.out <==
julia_worker:9009#172.30.0.146

==> julia/job0001.out <==
julia_worker:9009#172.30.0.147

==> julia/job0002.out <==
julia_worker:9009#172.30.0.148

==> julia/job0003.out <==
julia_worker:9009#172.30.0.149

==> julia/slurm-10495.out <==
connecting to worker 1 out of 4
connecting to worker 2 out of 4
connecting to worker 3 out of 4
connecting to worker 4 out of 4
We are all connected and ready.
node1.domain.tld146677
node2.domain.tld109050
node3.domain.tld140934
node4.domain.tld48648