*** Under construction ***
Resources on HPC clusters and supercomputers are accounted in either core-hours or node-hours (more generally, CPU-hours). One core-hour is one core used for one wall-clock hour, measured from the time the core is allocated to the time it is deallocated.
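The definition can be sketched in a few lines of Python. This is only an illustration: the function name `core_hours` and the example job are hypothetical and are not taken from any accounting tool.

```python
from datetime import datetime


def core_hours(cores: int, allocated: datetime, deallocated: datetime) -> float:
    """Core-hours charged: number of cores times wall-clock hours held."""
    wall_hours = (deallocated - allocated).total_seconds() / 3600.0
    return cores * wall_hours


# Hypothetical job: 16 cores held for 2 h 30 min of wall-clock time.
start = datetime(2024, 1, 1, 8, 0)
end = datetime(2024, 1, 1, 10, 30)
print(core_hours(16, start, end))  # 40.0
```

Note that the charge depends on how long the cores were held, not on how busy they were during that time.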
The total resource available is computed by multiplying the number of nodes available for computations by the total allocation time in hours. For instance, a 400-node cluster provides 400*24*365 = 3'504'000 node-hours over one year.
Users are typically given a percentage of the total resource, which has a node-hours equivalent. For instance, 10% of the 400-node cluster corresponds to 40 nodes for one year, i.e. 40*24*365 = 350'400 node-hours.
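The arithmetic above can be reproduced with a short Python sketch. The helper names (`total_node_hours`, `share_node_hours`, `swiss`) are made up for this example, and a non-leap year of 365 days is assumed, as in the figures quoted in the text.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: non-leap year, matching the text


def total_node_hours(nodes: int, hours: int = HOURS_PER_YEAR) -> int:
    """Node-hours offered by `nodes` nodes over `hours` hours."""
    return nodes * hours


def share_node_hours(nodes: int, percent: float, hours: int = HOURS_PER_YEAR) -> float:
    """Node-hours corresponding to `percent` percent of the cluster."""
    return total_node_hours(nodes, hours) * percent / 100


def swiss(n: float) -> str:
    """Format a number with apostrophes as thousands separators."""
    return f"{n:,.0f}".replace(",", "'")


print(swiss(total_node_hours(400)))      # 3'504'000
print(swiss(share_node_hours(400, 10)))  # 350'400
```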
HPC clusters use a workload manager, such as SLURM, to dispatch jobs.
The fair-share algorithm in SLURM is described at http://slurm.schedmd.com/fair_tree.html.
To see the share for your group you can use the "sshare" command:
$ sshare
Account    User      Raw Shares  Norm Shares  Raw Usage  Norm Usage  Effectv Usage  FairShare     Level FS
---------  --------  ----------  -----------  ---------  ----------  -------------  ---------  -----------
scitas-ge                     1     0.007752       1376    0.000003       0.000005             1468.763590
scitas-ge  aubort             1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  clemenco           1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  cubuk              1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  culpo              1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  degiorgi           1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  eroche             1     0.043478        344    0.000001       0.250000   0.253333     0.173913
scitas-ge  nvarini            1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  qubit              1     0.043478        351    0.000001       0.255072   0.250000     0.170455
scitas-ge  rezzonic           1     0.043478        681    0.000001       0.494928   0.246667     0.087848
scitas-ge  richart            1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  rmsilva            1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  sue                1     0.043478          0    0.000000       0.000000   0.290000          inf
scitas-ge  topf               1     0.043478          0    0.000000       0.000000   0.290000          inf
The "Norm Shares" value on the first line (the account row, which has no user) is the proportion of the cluster allocated to the account. Shares are expressed in terms of cores. Within a group all users have equal weight, so each has 1 share.
The value used to decide the priority of a job is the "Level FS", which is calculated as the ratio of the "Norm Shares" value to the "Effectv Usage" value. The higher the Level FS, the higher the priority. A user with no recorded usage has a Level FS of "inf".
A Level FS of less than 1 represents overconsumption. More than 1 means you are underconsuming.
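As a rough check, the Level FS values shown by sshare can be recomputed by hand. The sketch below assumes that Level FS is the ratio of Norm Shares to Effectv Usage, as in the Fair Tree description linked above. The function name `level_fs` is hypothetical, and because the inputs are the rounded values printed by sshare, the results can differ from the displayed ones in the last digit.

```python
import math


def level_fs(norm_shares: float, effective_usage: float) -> float:
    """Assumed Level FS: normalized shares divided by effective usage."""
    if effective_usage == 0:
        return math.inf  # no usage recorded: shown as "inf"
    return norm_shares / effective_usage


# (Norm Shares, Effectv Usage) copied from the sshare output for three users.
rows = {
    "aubort": (0.043478, 0.000000),
    "eroche": (0.043478, 0.250000),
    "rezzonic": (0.043478, 0.494928),
}

for user, (shares, usage) in rows.items():
    fs = level_fs(shares, usage)
    status = "underconsuming" if fs > 1 else "overconsuming"
    print(f"{user:10s} Level FS = {fs:.6f} ({status})")
```

With these inputs, eroche and rezzonic come out well below 1 (overconsuming relative to their share within the group), while aubort, with no usage, gets an infinite Level FS.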