Introduction to SLURM
Ismael Fernández Pavón Cristian Gomollón Escribano
08 / 10 / 2019
What is SLURM?
A cluster manager and job scheduler for large and small Linux clusters. SLURM:
• Allocates access to resources for some duration of time.
• Provides a framework for starting, executing, and monitoring work (normally a parallel job).
• Arbitrates contention for resources by managing a queue of pending work.
What is SLURM?
[Diagram: SLURM compared with other resource managers and schedulers, including LoadLeveler (IBM), LSF, PBS Pro, ALPS (Cray), Torque, Maui and Moab.]
What is SLURM?
✓ Open source
✓ Fault-tolerant
✓ Highly scalable
[Same diagram of resource managers and schedulers as above.]
SLURM: Resource Management
[Diagram: a node contains CPUs, counted either as cores or as hardware threads.]
SLURM: Resource Management
Nodes:
• Baseboards, Sockets,
Cores, Threads
• CPUs (Core or thread)
• Memory size
• Generic resources
• Features
• State
− Idle
− Mix
− Alloc
− Completing
− Drain / Draining
− Down
SLURM: Resource Management
Partitions:
• Associated with a specific set of nodes
• Nodes can be in more
than one partition
• Job size and time limits
• Access control list
• State information
− Up
− Drain
− Down
[Diagram: partitions defined over the cluster's nodes.]
SLURM: Resource Management
[Diagram: a job's allocated cores and allocated memory.]
Jobs:
• ID (a number)
• Name
• Time limit
• Size specification
• Node features required
• Other Jobs Dependency
• Quality Of Service (QoS)
• State (Pending, Running,
Suspended, Canceled,
Failed, etc.)
SLURM: Resource Management
[Diagram: the cores and memory actually used by each job step within a job's allocation.]
Job Steps:
• ID (a number)
• Name
• Time limit (maximum)
• Size specification
• Node features required
in allocation
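To make the job vs. job step distinction concrete, here is a minimal sketch of a batch script that runs several steps inside one allocation (not taken from the slides; preprocess.sh is a hypothetical script, and depending on the SLURM version extra srun options may be needed for steps to run side by side):
#!/bin/bash
#SBATCH -J steps_demo
#SBATCH -n 8
#SBATCH -t 00:10:00
srun -n 8 hostname            # step 0: uses the whole allocation
srun -n 4 ./preprocess.sh &   # step 1: half of the tasks (hypothetical script)
srun -n 4 ./preprocess.sh &   # step 2: runs concurrently on the other half
wait                          # wait for both concurrent steps to finish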
SLURM: Resource Management
FULL CLUSTER!
✓ Job scheduling
SLURM: Job Scheduling
Scheduling: The process of determining the next job to run and on which resources.
[Diagram: FIFO scheduler vs. backfill scheduler, with jobs placed on a resources-vs-time grid.]
SLURM: Job Scheduling
Scheduling: The process of determining the next job to run and on which resources.
Backfill Scheduler:
• Based on the job request, resources available, and
policy limits imposed.
• Starts with job priority.
• Results in a resource allocation over a period.
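One way to check which scheduler a cluster is actually running (a quick sketch; the keys come from slurm.conf and the output varies by site):
scontrol show config | grep -i scheduler
# SchedulerType = sched/backfill  -> backfill scheduling
# SchedulerType = sched/builtin   -> strict priority (FIFO-like) order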
SLURM: Job Scheduling
Backfill Scheduler:
• Starts with job priority.
Job_priority = site_factor +
(PriorityWeightAge) * (age_factor) +
(PriorityWeightAssoc) * (assoc_factor) +
(PriorityWeightFairshare) * (fair-share_factor) +
(PriorityWeightJobSize) * (job_size_factor) +
(PriorityWeightPartition) * (partition_factor) +
(PriorityWeightQOS) * (QOS_factor) +
SUM(TRES_weight_cpu * TRES_factor_cpu,
TRES_weight_<type> * TRES_factor_<type>,
...) - nice_factor
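To see how the weights and factors combine, here is a toy calculation with invented numbers (the real weights live in slurm.conf; sprio shows the actual per-job factors):
# Toy numbers only: each factor is between 0.0 and 1.0, each weight is an integer
PriorityWeightAge=1000;        age_factor=0.5
PriorityWeightFairshare=10000; fairshare_factor=0.2
PriorityWeightQOS=2000;        qos_factor=1.0
job_priority=$(echo "$PriorityWeightAge*$age_factor + $PriorityWeightFairshare*$fairshare_factor + $PriorityWeightQOS*$qos_factor" | bc)
echo "toy job priority: $job_priority"   # 500 + 2000 + 2000 = 4500.0
sprio -j <jobid>                         # on a real cluster: the actual factors for one job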
• sbatch – Submit a batch script to SLURM.
• salloc – Request resources from SLURM for an interactive job.
• srun – Start a new job step.
• scancel – Cancel a job.
SLURM: Commands
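A quick usage sketch of these four commands (job.slm is a hypothetical script name and the resource options are illustrative):
sbatch job.slm               # submit a batch script; prints the assigned job ID
salloc -n 4 -t 01:00:00      # interactive allocation: 4 tasks for one hour
srun -n 4 hostname           # inside the allocation (or a script): launch a 4-task job step
scancel <jobid>              # cancel a pending or running job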
• sinfo – Report system status (nodes, queues, etc.).
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
rest up infinite 3 idle~ pirineusgpu[1-2],pirineusknl1
rest up infinite 1 idle canigo2
std* up infinite 11 idle~ pirineus[14,19-20,23,25-26,29-30,33-34,40]
std* up infinite 18 mix pirineus[13,15-16,18,21-22,27-28,35,38-39,41-45,48-49]
std* up infinite 7 alloc pirineus[17,24,31,36-37,46-47]
gpu up infinite 2 alloc pirineusgpu[3-4]
knl up infinite 3 idle~ pirineusknl[2-4]
mem up infinite 1 mix canigo1
class_a up infinite 8 mix canigo1,pirineus[1-7]
class_a up infinite 1 alloc pirineus8
class_b up infinite 8 mix canigo1,pirineus[1-7]
class_b up infinite 1 alloc pirineus8
class_c up infinite 8 mix canigo1,pirineus[1-7]
class_c up infinite 1 alloc pirineus8
std_curs up infinite 5 idle~ pirineus[9-12,50]
gpu_curs up infinite 2 idle~ pirineusgpu[1-2]
SLURM: Commands
• sinfo – Report system status (nodes, queues, etc.).
sinfo -Np class_a -O "Nodelist,Partition,StateLong,CpusState,Memory,Freemem"
NODELIST PARTITION STATE CPUS(A/I/O/T) MEMORY FREE_MEM
canigo1 class_a mixed 113/79/0/192 3094521 976571
pirineus1 class_a mixed 20/28/0/48 191904 120275
pirineus2 class_a mixed 24/24/0/48 191904 185499
pirineus3 class_a mixed 46/2/0/48 191904 54232
pirineus4 class_a mixed 38/10/0/48 191904 58249
pirineus5 class_a mixed 38/10/0/48 191904 58551
pirineus6 class_a mixed 36/12/0/48 191904 114986
pirineus7 class_a mixed 38/10/0/48 191904 58622
pirineus8 class_a allocated 48/0/0/48 191904 165682
SLURM: Commands
1193936 std g09d1 upceqt04 PD 0:00 1 16 32G (Priority)
1195916 gpu A2B2_APO_n ubator01 PD 0:00 1 24 3900M (Priority)
1195915 gpu A2B2_APO_n ubator01 PD 0:00 1 24 3900M (Priority)
1195920 gpu A2B2_APO_n ubator01 PD 0:00 1 24 3900M (Priority)
1195927 gpu uncleaved_ ubator02 PD 0:00 1 24 3900M (Priority)
1195928 gpu uncleaved_ ubator02 PD 0:00 1 24 3900M (Priority)
1195929 gpu cleaved_wt ubator02 PD 0:00 1 24 3900M (Priority)
1138005 std U98-CuONN1 imoreira PD 0:00 1 12 3998M (Priority)
1195531 std g09d1 upceqt04 PD 0:00 1 16 32G (Priority)
1195532 std g09d1 upceqt04 PD 0:00 1 16 32G (Priority)
1195533 std g09d1 upceqt04 PD 0:00 1 16 32G (Priority)
1195536 std g09d1 upceqt04 PD 0:00 1 16 32G (Priority)
1195597 std sh gomollon R 20:04:04 4 24 6000M pirineus[31,38,44,47]
1195579 class_a rice crag49366 R 6:44:45 1 8 3998M pirineus5
1195576 class_a rice crag49366 R 6:36:48 1 8 3998M pirineus2
1195578 class_a rice crag49366 R 6:37:53 1 8 3998M pirineus4
• squeue – Report job and job step status.
SLURM: Commands
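Some handy squeue variants (a sketch using standard squeue options; the columns roughly match the listing above):
squeue -u $USER                  # only your own jobs
squeue -p std -t PENDING         # pending jobs in the std partition
squeue -o "%.10i %.9P %.10j %.8u %.2t %.10M %.6D %R"   # pick your own columns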
• scontrol – Administrator tool to view and/or update
system, job, step, partition or reservation status.
scontrol hold <jobid>
scontrol release <jobid>
scontrol show job <jobid>
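A few more scontrol queries that are often useful (a sketch; job updates are normally restricted, and users can usually only lower their own limits):
scontrol show partition std                          # limits and node list of one partition
scontrol show node pirineus31                        # detailed state of one node
scontrol update JobId=<jobid> TimeLimit=1-00:00:00   # change a job's time limit (if allowed)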
SLURM: Commands
JobId=1195597 JobName=sh
UserId=gomollon(80128) GroupId=csuc(10000) MCS_label=N/A
Priority=100176 Nice=0 Account=csuc QOS=test
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
RunTime=20:09:58 TimeLimit=5-00:00:00 TimeMin=N/A
SubmitTime=2019-10-07T12:21:29 EligibleTime=2019-10-07T12:21:29
StartTime=2019-10-07T12:21:29 EndTime=2019-10-12T12:21:30 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=std AllocNode:Sid=login2:20262
ReqNodeList=(null) ExcNodeList=(null)
NodeList=pirineus[31,38,44,47]
BatchHost=pirineus31
NumNodes=4 NumCPUs=24 NumTasks=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=24,mem=144000M,node=4
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryCPU=6000M MinTmpDiskNode=0
Features=(null) Gres=(null) Reservation=(null)
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=(null)
WorkDir=/home/gomollon
Power=
SLURM: Commands
Jobs: State Information
Enjoy SLURM!
How to launch jobs?
Login to the CSUC infrastructure
• Login
ssh -p 2122 username@hpc.csuc.cat
• Transfer files
scp -P 2122 local_file username@hpc.csuc.cat:[path to your folder]
sftp -oPort=2122 username@hpc.csuc.cat
• Useful paths
Name Variable Availability Quota/project Time limit Backup
/home/$user $HOME global 4 GB unlimited Yes
/scratch/$user $SCRATCH global unlimited 30 days No
/scratch/$user/tmp/jobid $TMPDIR Local to each node job file limit 1 week No
/tmp/$user/jobid $TMPDIR Local to each node job file limit 1 week No
• Get HC consumption
consum -a <year> (group consumption)
consum -a <year> -u <username> (user consumption)
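rsync also works through the same SSH port, which is handy for large or repeated transfers (a sketch; the local and remote paths are illustrative):
rsync -avz -e "ssh -p 2122" ./my_inputs/ username@hpc.csuc.cat:~/my_inputs/
# -a keep permissions/times, -v verbose, -z compress, -e use ssh on port 2122 as transport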
Batch job submission: Default settings
• 4 GB/core (except on the mem partition).
• 24 GB/core on the mem partition.
• 1 core on the std and mem partitions.
• 24 cores on the gpu partition.
• The whole node on the knl partition.
• Non-exclusive, multi-node jobs.
• Scratch and output directories default to the submit directory.
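Any of these defaults can be overridden with scheduler directives in the job script header; a minimal sketch with illustrative values (my_app is a hypothetical binary):
#!/bin/bash
#SBATCH -J defaults_demo
#SBATCH -p std                 # choose the partition explicitly
#SBATCH -n 4                   # 4 tasks instead of the 1-core default
#SBATCH --mem-per-cpu=8000M    # override the 4 GB/core default
#SBATCH -t 02:00:00            # explicit wall-time limit
srun ./my_app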
Batch job submission
• Basic Linux commands:
Description Command Example
List files ls ls /home/user
Make a folder mkdir mkdir /home/prova
Change folder cd cd /home/prova
Copy files cp cp file1 file2
Move a file mv mv /home/prova.txt /cescascratch/prova.txt
Delete a file rm rm filename
Print file content cat cat filename
Search for a string in files grep grep 'word' filename
Print the last lines of a file tail tail filename
• Text editors: vim, nano, emacs, etc.
• More detailed info and options for each command:
command --help
man command
Scheduler directives/Options
• -c, --cpus-per-task=ncpus number of cpus required per task
• --gres=list required generic resources
• -J, --job-name=jobname name of job
• -n, --ntasks=ntasks number of tasks to run
• --ntasks-per-node=n number of tasks to invoke on each node
• -N, --nodes=N number of nodes on which to run (N = min[-max])
• -o, --output=out file for batch script's standard output
• -p, --partition=partition partition requested
• -t, --time=minutes time limit (format: dd-hh:mm)
• -C, --constraint=list specify a list of constraints (mem, vnc, ...)
• --mem=MB minimum amount of total real memory
• --reservation=name allocate resources from named reservation
• -w, --nodelist=hosts... request a specific list of hosts
• --mem-per-cpu=MB amount of real memory per allocated core
Scheduler directives/Options
Batch job submission
#!/bin/bash
# Scheduler directives
#SBATCH -J treball_prova
#SBATCH -o treball_prova.log
#SBATCH -e treball_prova.err
#SBATCH -p std
#SBATCH -n 48
# Setting up the environment
module load mpi/intel/openmpi/3.1.0
# Move the input files to the working directory
cp -r $input $SCRATCH
cd $SCRATCH
# Launch the application (srun is similar to mpirun)
srun $APPLICATION
# Create the output folder and move the outputs
mkdir -p $OUTPUT_DIR
cp -r * $OUTPUT_DIR
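Submitting and following the script above might look like this (a sketch; treball_prova.slm is a hypothetical filename for the script):
sbatch treball_prova.slm       # returns the assigned job ID
squeue -u $USER                # watch the job go from PD (pending) to R (running)
scontrol show job <jobid>      # full details while it is queued or running
less treball_prova.log         # the output file named in #SBATCH -o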
Gaussian 16 Example
#!/bin/bash
#SBATCH -J gau16_test
#SBATCH -o gau_test_%j.log
#SBATCH -e gau_test_%j.err
#SBATCH -p std
#SBATCH -n 1
#SBATCH -c 16
module load gaussian/g16b1
INPUT_DIR=$HOME/gaussian_test/inputs
OUTPUT_DIR=$HOME/gaussian_test/outputs
cd $SCRATCH
cp -r $INPUT_DIR/* .
g16 < input.gau > output.out
mkdir -p $OUTPUT_DIR
cp -r * $OUTPUT_DIR
Vasp 5.4.4 Example
#!/bin/bash
#SBATCH -J vasp_test
#SBATCH -o vasp_test_%j.log
#SBATCH -e vasp_test_%j.err
#SBATCH -p std
#SBATCH -n 24
module load vasp/5.4.4
INPUT_DIR=$HOME/vasp_test/inputs
OUTPUT_DIR=$HOME/vasp_test/outputs
cd $SCRATCH
cp -r $INPUT_DIR/* .
srun `which vasp_std`
mkdir -p $OUTPUT_DIR
cp -r * $OUTPUT_DIR
Gromacs Example
#!/bin/bash
#SBATCH --job-name=gromacs
#SBATCH --output=gromacs_%j.out
#SBATCH --error=gromacs_%j.err
#SBATCH -n 24
#SBATCH --gres=gpu:2
#SBATCH -N 1
#SBATCH -p gpu
#SBATCH -c 2
#SBATCH --time=00:30:00
module load gromacs/2018.4_mpi
cd $SHAREDSCRATCH
cp -r $HOME/SLMs/gromacs/CASE/* .
srun `which gmx_mpi` mdrun -v -deffnm input_system -ntomp $SLURM_CPUS_PER_TASK \
     -nb gpu -npme 12 -dlb yes -pin on -gpu_id 01
cp -r * /scratch/$USER/gromacs/CASE/output/
ANSYS Fluent Example
#!/bin/bash
#SBATCH -J truck.cas
#SBATCH -o truck.log
#SBATCH -e truck.err
#SBATCH -p std
#SBATCH -n 16
module load toolchains/gcc_mkl_ompi
INPUT_DIR=$HOME/FLUENT/inputs
OUTPUT_DIR=$HOME/FLUENT/outputs
cd $SCRATCH
cp -r $INPUT_DIR/* .
/prod/ANSYS16/v162/fluent/bin/fluent 3ddp -t$SLURM_NTASKS -mpi=hp -g -i input1_50.txt
mkdir -p $OUTPUT_DIR
cp -r * $OUTPUT_DIR
Best Practices
• Use $SCRATCH as the working directory.
• Move only the necessary files (not all files in the folder each time).
• Try to keep important files only in $HOME.
• Try to choose the partition and resources that best fit your job.
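A staging pattern that follows these practices (a sketch; the file and directory names are illustrative):
cd $SCRATCH
cp $HOME/project/input.dat $HOME/project/params.cfg .   # copy only what the run needs
srun ./my_app input.dat                                  # do all the work on scratch
mkdir -p $HOME/project/results
cp output.dat run.log $HOME/project/results/             # keep only the important results in $HOME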