
*** Under construction ***


Resources on HPC clusters and supercomputers are accounted in either core-hours or node-hours (more generally, CPU-hours). One core-hour is equal to one core used for one wall-clock hour, measured from the time the core is allocated to the time it is deallocated.
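As a minimal sketch of the core-hour definition, the following computes the charge for a single allocation. The function name `core_hours` and the example dates are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime


def core_hours(cores: int, allocated: datetime, released: datetime) -> float:
    """Core-hours for one allocation: number of cores times wall-clock hours."""
    wall_hours = (released - allocated).total_seconds() / 3600.0
    return cores * wall_hours


# Hypothetical job: 16 cores held from 08:00 to 10:30 (2.5 wall-clock hours).
start = datetime(2024, 1, 1, 8, 0)
end = datetime(2024, 1, 1, 10, 30)
print(core_hours(16, start, end))  # 40.0
```

Note that the clock runs for the whole allocation, whether or not the cores are busy.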

The total resource available is computed by multiplying the number of nodes available for computation by the total allocation time in hours. For instance, a 400-node cluster provides 400*24*365 = 3'504'000 node-hours over one year.

Users are typically given a percentage of the total resource, which has a node-hour equivalent. For instance, 10% of the 400-node cluster corresponds to 40 nodes for one year, i.e. 40*24*365 = 350'400 node-hours.
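The arithmetic above can be reproduced with a short sketch. The helper names `node_hours` and `allocation` are hypothetical, and a year is taken as 365 days, as in the text.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years as in the text


def node_hours(total_nodes: int, hours: int = HOURS_PER_YEAR) -> int:
    """Total node-hours offered by a cluster over the given period."""
    return total_nodes * hours


def allocation(total_nodes: int, percent: int, hours: int = HOURS_PER_YEAR) -> int:
    """Node-hours corresponding to a percentage share of the cluster."""
    return node_hours(total_nodes, hours) * percent // 100


print(node_hours(400))      # 3504000
print(allocation(400, 10))  # 350400
```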



HPC clusters use a workload manager to dispatch jobs.




The fair-share algorithm in SLURM is described at http://slurm.schedmd.com/fair_tree.html.

On SCITAS machines, recorded usage decays with a half-life of one week: usage from one week ago counts half as much as current usage.
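To illustrate what a one-week half-life means for past usage, here is a minimal sketch assuming a simple exponential decay. SLURM applies its decay in periodic steps, so the values it reports may differ slightly. The function name `decayed_usage` is hypothetical.

```python
def decayed_usage(raw_usage: float, age_days: float, half_life_days: float = 7.0) -> float:
    """Weight of past usage after age_days, halving every half_life_days."""
    return raw_usage * 0.5 ** (age_days / half_life_days)


# 1000 units of usage, seen one and two weeks later.
print(decayed_usage(1000.0, 7.0))   # 500.0
print(decayed_usage(1000.0, 14.0))  # 250.0
```

In practice this means that heavy usage stops penalising a group after a few weeks of lower activity.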

To see the share for your group you can use the "Sshare" command:


$ Sshare 
             Account       User Raw Shares Norm Shares   Raw Usage  Norm Usage Effectv Usage  FairShare   Level FS  
-------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ---------- 
scitas-ge                                1    0.007752        1376    0.000003      0.000005           1468.763590
 scitas-ge               aubort          1    0.043478           0    0.000000      0.000000   0.290000        inf 
 scitas-ge             clemenco          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge                cubuk          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge                culpo          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge             degiorgi          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge               eroche          1    0.043478         344    0.000001      0.250000   0.253333   0.173913
 scitas-ge              nvarini          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge                qubit          1    0.043478         351    0.000001      0.255072   0.250000   0.170455
 scitas-ge             rezzonic          1    0.043478         681    0.000001      0.494928   0.246667   0.087848
 scitas-ge              richart          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge              rmsilva          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge                  sue          1    0.043478           0    0.000000      0.000000   0.290000        inf
 scitas-ge                 topf          1    0.043478           0    0.000000      0.000000   0.290000        inf

 

The "Norm Shares" value on the first line is the proportion of the cluster allocated to the account; shares are expressed in terms of cores. Within a group all users have equal weight, so each user has 1 share.


The value used to decide the priority of a job is the "Level FS", which is calculated as the ratio of "Norm Shares" to "Effectv Usage". The higher the Level FS, the higher the priority. Users with no recorded usage have a Level FS of "inf".

A Level FS of less than 1 represents overconsumption: you have used more than your share. A Level FS of more than 1 means you are underconsuming.
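The Level FS values in the output above are consistent with dividing "Norm Shares" by "Effectv Usage". The sketch below recomputes them for a few users from the table. The function name `level_fs` is hypothetical, and because the displayed columns are rounded, the results only match the "Level FS" column to within rounding.

```python
import math


def level_fs(norm_shares: float, effective_usage: float) -> float:
    """Ratio of share to usage; infinite when nothing has been consumed."""
    if effective_usage == 0:
        return math.inf
    return norm_shares / effective_usage


# (Norm Shares, Effectv Usage) copied from the Sshare output above.
users = {
    "aubort": (0.043478, 0.000000),
    "eroche": (0.043478, 0.250000),
    "qubit": (0.043478, 0.255072),
    "rezzonic": (0.043478, 0.494928),
}

for name, (shares, usage) in users.items():
    fs = level_fs(shares, usage)
    if fs < 1:
        status = "overconsuming"
    elif fs > 1:
        status = "underconsuming"
    else:
        status = "on share"
    print(f"{name:10s} {fs:10.6f}  {status}")
```

Here the three users with recorded usage fall below 1 because, within the group, each has consumed far more than its 1/23 share of the group's usage.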


