This video is about the installation and configuration of the Slurm queuing system on Linux (Ubuntu here). You'll also learn how to set up Slurm on a cluster computer and submit jobs through it.
In this playlist, I talk about how to set up a cluster computing system using Ubuntu (Linux) and how to set up a queuing system for submitting calculations.
The commands described in this video are given below:
### Installing Slurm on the Compute Nodes ###
export MUNGEUSER=1001
sudo groupadd -g $MUNGEUSER munge
sudo useradd -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGEUSER -g munge -s /sbin/nologin munge
export SLURMUSER=1002
sudo groupadd -g $SLURMUSER slurm
sudo useradd -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUSER -g slurm -s /bin/bash slurm
sudo apt-get install -y munge
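The munge and slurm accounts are given fixed IDs (1001 and 1002) so that they come out the same on every node. As a minimal sketch, the function below can be run on each node to confirm that. The helper name check_id is hypothetical and is not part of Munge or Slurm; it only wraps the standard id command.

```shell
# check_id NAME EXPECTED: succeed only if account NAME exists and both its
# UID and primary GID equal EXPECTED (the commands above use one number for both).
check_id() {
    local name="$1" expected="$2" uid gid
    if ! uid="$(id -u "$name" 2>/dev/null)"; then
        echo "MISSING: $name"
        return 1
    fi
    gid="$(id -g "$name")"
    if [ "$uid" = "$expected" ] && [ "$gid" = "$expected" ]; then
        echo "OK: $name uid=$uid gid=$gid"
    else
        echo "MISMATCH: $name uid=$uid gid=$gid, expected $expected"
        return 1
    fi
}

# Example calls on a compute node:
#   check_id munge 1001
#   check_id slurm 1002
```

If either call reports MISMATCH, it is usually easier to fix the account now than after the services have created files owned by the wrong ID.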
Now copy the munge authentication key from /nfs/slurm/ to /etc/munge/ on every node.
sudo scp /nfs/slurm/munge.key /etc/munge/
sudo chown munge:munge /etc/munge/munge.key
sudo chmod 400 /etc/munge/munge.key
sudo systemctl enable munge
sudo systemctl start munge
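Munge only authenticates between nodes when every node holds the same key, so it can be worth checking the copy before moving on. The sketch below compares the installed key with the shared one and confirms the 400 mode set above. The function name check_munge_key is hypothetical, and stat -c assumes GNU coreutils, as shipped with Ubuntu.

```shell
# check_munge_key SRC DST: succeed if DST is byte-identical to SRC
# and DST has mode 400.
check_munge_key() {
    local src="$1" dst="$2" mode
    if ! cmp -s "$src" "$dst"; then
        echo "FAIL: $dst differs from $src (or one is missing)"
        return 1
    fi
    mode="$(stat -c '%a' "$dst")"
    if [ "$mode" != "400" ]; then
        echo "FAIL: $dst has mode $mode, expected 400"
        return 1
    fi
    echo "OK: $dst matches $src and has mode 400"
}

# Example call (run as root, since the key is only readable by munge):
#   check_munge_key /nfs/slurm/munge.key /etc/munge/munge.key
```

Once the service is running, the usual round-trip test is munge -n | unmunge, which should report a successful decode if the local daemon and key are working.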
Install Slurm (on all compute nodes)
sudo apt-get install slurm-wlm
Copy slurm.conf and slurmdbd.conf from the login node to /etc/slurm on each compute node.
sudo scp /nfs/slurm/slurm.conf /etc/slurm
sudo scp /nfs/slurm/slurmdbd.conf /etc/slurm
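The directories and files created in the next step have to match the paths that slurm.conf expects, which slurm.conf sets through parameters such as SlurmdSpoolDir, SlurmdLogFile and SlurmdPidFile. As a rough sketch, the function below prints the value of one key from the copied file so the paths can be compared. The name conf_value is hypothetical, and the parsing is deliberately simple (first uncommented KEY=value line, case-insensitive key, trailing # comment removed).

```shell
# conf_value KEY FILE: print the value of the first uncommented
# "KEY=value" line in FILE. Prints nothing if KEY is not set.
conf_value() {
    local key="$1" file="$2"
    awk -F= -v k="$key" '
        /^[ \t]*#/ { next }                  # skip comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)          # trim blanks around the key
            if (tolower(name) == tolower(k)) {
                sub(/^[^=]*=/, "")           # drop "KEY=" from the line
                sub(/[ \t]*(#.*)?$/, "")     # drop trailing blanks/comment
                print
                exit
            }
        }
    ' "$file"
}

# Example calls:
#   conf_value SlurmdSpoolDir /etc/slurm/slurm.conf
#   conf_value SlurmdPidFile  /etc/slurm/slurm.conf
#   conf_value SlurmdLogFile  /etc/slurm/slurm.conf
```

If a key prints nothing, it is not set in the file and Slurm falls back to its built-in default for that path.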
On the compute nodes (log in as root, then run all of the commands below):
mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmd
chmod 755 /var/spool/slurmd
mkdir /var/log/slurm/
touch /var/log/slurm/slurmd.log
chown -R slurm:slurm /var/log/slurm/slurmd.log
chmod 755 /var/log/slurm
mkdir /run/slurm
touch /run/slurm/slurmd.pid   # PID file for the compute node
chown slurm:slurm /run/slurm
chmod -R 770 /run/slurm
nano /usr/lib/systemd/system/slurmd.service
echo CgroupMountpoint=/sys/fs/cgroup >> /etc/slurm/cgroup.conf
slurmd -C
systemctl enable slurmd.service
systemctl start slurmd.service
systemctl status slurmd.service
If the service is active, the node is set up correctly. Otherwise, reboot the node, reconnect the NFS, and check the status of slurmd.service again. Either way, a reboot at this stage is necessary.
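Because a reboot is part of this step, it can help to have the directory and cgroup.conf commands in a form that is safe to run more than once: plain mkdir fails when the directory already exists, and >> adds a duplicate line on every run. On Ubuntu, /run is normally a tmpfs, so /run/slurm would typically have to be recreated after the reboot. The sketch below is one way to do that; the helper names ensure_dir and append_once are hypothetical.

```shell
# ensure_dir DIR MODE [OWNER]: create DIR if needed, then set its mode
# and (optionally) its owner. Safe to run repeatedly.
ensure_dir() {
    local dir="$1" mode="$2" owner="${3:-}"
    mkdir -p "$dir" || return 1
    chmod "$mode" "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# append_once LINE FILE: append LINE to FILE unless an identical
# line is already present.
append_once() {
    local line="$1" file="$2"
    touch "$file" || return 1
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example calls (as root), mirroring the commands above:
#   ensure_dir /var/spool/slurmd 755 slurm:slurm
#   ensure_dir /var/log/slurm    755
#   ensure_dir /run/slurm        770 slurm:slurm
#   append_once 'CgroupMountpoint=/sys/fs/cgroup' /etc/slurm/cgroup.conf
```

Running the same calls a second time leaves the directories and cgroup.conf unchanged, which makes them convenient to repeat after the reboot.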
The following command can check the connectivity with the controller node:
scontrol ping
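To use this check in a script, the output of scontrol ping can be tested instead of read by eye. The sketch below assumes the output contains a line of the form "Slurmctld(primary) at HOSTNAME is UP", which may differ between Slurm versions, so treat the pattern as an assumption. The function name controller_up is hypothetical.

```shell
# controller_up: read `scontrol ping` output on stdin and succeed only if
# the primary controller is reported as UP.
# Assumed line format: "Slurmctld(primary) at HOSTNAME is UP"
controller_up() {
    grep -Eq '^Slurmctld\(primary\).* is UP[ \t]*$'
}

# Example call on a compute node:
#   if scontrol ping | controller_up; then
#       echo "controller reachable"
#   else
#       echo "controller not reachable, check munge, slurm.conf and the network"
#   fi
```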
For more informative videos about computational chemistry and other important software tools like Gaussian, MS Word, Excel, PowerPoint, Endnote, ChemDraw etc., subscribe to my channel. / wisdomcenter
Facebook Page / muhammadali.hashmi.33
Instagram / hashmi_photography
Email: muhammad.hashmi [at sign] ue.edu.pk
Video: How to Make a Cluster Computer | Part 06 - Installing Slurm on Computer Nodes/Worker Nodes. Added by Wisdom Center on 20 January 2024.