Resources on HPC clusters and supercomputers are accounted for in either core-hours or node-hours (more generally, CPU-hours). One core-hour corresponds to one core used for one wall-clock hour, measured from the time the core is allocated to the time it is deallocated.
The total resource available is computed by multiplying the number of nodes available for computation by the total allocation time in hours. For instance, a 400-node cluster provides 400*24*365 = 3'504'000 node-hours over one year.
Users are typically given a percentage of the total resource, which has a node-hour equivalent. For instance, 10% of this 400-node cluster corresponds to 40 nodes for one year, i.e. 40*24*365 = 350'400 node-hours.
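The arithmetic above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, a year is taken as 365 days as in the text, and the cores-per-node value used for the core-hour conversion is a hypothetical figure, not a property of any particular cluster.

```python
HOURS_PER_YEAR = 24 * 365  # non-leap year, as in the worked example


def node_hours(nodes, hours=HOURS_PER_YEAR):
    """Node-hours delivered by `nodes` nodes over `hours` wall-clock hours."""
    return nodes * hours


def share_node_hours(total_nodes, percent, hours=HOURS_PER_YEAR):
    """Node-hours corresponding to `percent` percent of the whole cluster."""
    return node_hours(total_nodes, hours) * percent / 100


def to_core_hours(n_node_hours, cores_per_node):
    """Convert node-hours to core-hours for a given node size."""
    return n_node_hours * cores_per_node


total = node_hours(400)            # whole 400-node cluster for one year
share = share_node_hours(400, 10)  # a 10% allocation of that cluster

print(total)  # 3504000
print(share)  # 350400.0

# Hypothetical node size of 16 cores, chosen only for illustration.
print(to_core_hours(share, cores_per_node=16))
```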
The clusters use a workload manager to dispatch jobs.
The fair-share algorithm in SLURM is described at http://slurm.schedmd.com/fair_tree.html.
On SCITAS machines, recorded usage decays with a half-life of one week.
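In fair-share scheduling, a half-life generally means that the weight of past usage is halved after each half-life period, so older jobs count progressively less against a group. The sketch below illustrates this with an idealized continuous exponential decay. The function name and the sample usage value are assumptions made for this example, and SLURM itself applies the decay in periodic steps rather than continuously.

```python
HALF_LIFE_DAYS = 7.0  # one week, as stated for SCITAS machines


def decayed_usage(raw_usage, age_days, half_life_days=HALF_LIFE_DAYS):
    """Weight that usage recorded `age_days` ago still carries today."""
    return raw_usage * 0.5 ** (age_days / half_life_days)


# A hypothetical 1000 units of usage, seen 0, 1, 2 and 4 weeks later.
for age in (0, 7, 14, 28):
    print(age, decayed_usage(1000.0, age))
# 0 1000.0
# 7 500.0
# 14 250.0
# 28 62.5
```

With a one-week half-life, usage from a month ago contributes only about 6% of its original weight.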
To see the share for your group you can use the "Sshare" command:
$ Sshare
Account              User       Raw Shares Norm Shares   Raw Usage  Norm Usage Effectv Usage  FairShare    Level FS
-------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- -----------
scitas-ge                                1    0.007752        1376    0.000003      0.000005            1468.763590
scitas-ge            aubort              1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            clemenco            1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            cubuk               1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            culpo               1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            degiorgi            1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            eroche              1    0.043478         344    0.000001      0.250000   0.253333    0.173913
scitas-ge            nvarini             1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            qubit               1    0.043478         351    0.000001      0.255072   0.250000    0.170455
scitas-ge            rezzonic            1    0.043478         681    0.000001      0.494928   0.246667    0.087848
scitas-ge            richart             1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            rmsilva             1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            sue                 1    0.043478           0    0.000000      0.000000   0.290000         inf
scitas-ge            topf                1    0.043478           0    0.000000      0.000000   0.290000         inf
The "Norm Shares" value on the first line is the proportion of the cluster allocated to the account; shares are expressed in terms of cores. Within a group, all users have equal weight and therefore hold 1 share each.
The value used to decide the priority of a job is the "Level FS", which is calculated as the ratio of the "Norm Shares" value to the "Effectv Usage" value (for example, 0.043478 / 0.250000 ≈ 0.1739 for user eroche). Users with no recorded usage have a Level FS of inf. The higher the Level FS, the higher the priority.
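The figures in the Sshare output above are consistent with Level FS being "Norm Shares" divided by "Effectv Usage", which is how the Fair Tree algorithm defines the level fair-share value. The sketch below recomputes it for a few users from that output. The function name is an assumption made for this example, and the results match the printed column only to about six decimal places because the inputs are already rounded.

```python
import math


def level_fs(norm_shares, effective_usage):
    """Level FS as shares divided by usage; no usage yields infinity."""
    if effective_usage == 0:
        return math.inf
    return norm_shares / effective_usage


# (Norm Shares, Effectv Usage) pairs copied from the Sshare output.
users = {
    "aubort": (0.043478, 0.000000),
    "eroche": (0.043478, 0.250000),
    "qubit": (0.043478, 0.255072),
    "rezzonic": (0.043478, 0.494928),
}

# Rank users from highest to lowest Level FS, i.e. highest priority first.
ranked = sorted(users, key=lambda u: level_fs(*users[u]), reverse=True)
for user in ranked:
    print(user, level_fs(*users[user]))
```

Users who have consumed less than their share end up at the top of the ranking, and users with no usage at all rank first.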