Maintenance Windows
20th to 22nd of July
This year our annual maintenance is limited to work on the central storage and its associated network.
This means it will be shorter than usual, only 3 days.
Since all the shared filesystems (/home, /work, /ssoft) will be unavailable, all the clusters will be unavailable as well.
The /scratch filesystems will not be affected by the work, but they will be inaccessible because the login nodes will be closed.
No jobs will run during this period. Jobs already in the queues will stay queued and will be scheduled once the maintenance is over, as no user-visible changes will take place (except for the Node Allocation Policy change, see below).
Changes
Node Allocation Policy change (parallel partitions - ALL clusters)
Until now, if you ran on less than a full node on the parallel partitions, the rest of the node would be available only to you. Unfortunately, we found that this heavily distorts the fairshare mechanism.
The possibility to do this will be removed after the Summer Maintenance. This means that on the parallel partitions you will always be allocated/charged the whole node.
If your jobs use less than a full node, please move to the serial partition, which has been available on the Fidis cluster since the end of June. In this partition you can easily be allocated less than a full node (a single CPU core, for example).
There are currently 60 nodes in the serial partition, but its size will be adjusted according to demand.
This table presents the impact on some typical jobs:
Job Type | Partition | Allocation Before | Allocation Now | Action required? |
---|---|---|---|---|
1 node, 1 core | parallel | 1 core | 1 node | Yes, move to serial partition. |
1 node, 14 cores | parallel | 14 cores | 1 node | Yes, move to serial partition. |
1 node, 28 cores | parallel | 28 cores | 1 node | No. |
2 nodes, 4 cores-per-node | parallel | 2 * 4 cores | 2 nodes | Maybe, evaluate if you really need the full resources of the node (memory, etc). |
2 nodes, 24 cores-per-node | parallel | 2 * 24 cores | 2 nodes | No. |
2 nodes, 18 cores-per-node | parallel | 2 * 18 cores | 2 nodes | No. |
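The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration, not code used by the scheduler: the function name `allocated_cores` is hypothetical, and the node size of 28 cores is an assumption inferred from the "1 node, 28 cores" row.

```python
CORES_PER_NODE = 28  # assumption: inferred from the "1 node, 28 cores" row


def allocated_cores(nodes: int, cores_per_node: int, whole_node: bool) -> int:
    """Cores allocated to a job on a parallel partition.

    whole_node=False models the old policy (only the requested cores),
    whole_node=True models the policy after the maintenance (full nodes).
    """
    if not 1 <= cores_per_node <= CORES_PER_NODE:
        raise ValueError("cores_per_node must be between 1 and CORES_PER_NODE")
    per_node = CORES_PER_NODE if whole_node else cores_per_node
    return nodes * per_node


# Compare the two policies for a few of the job shapes listed in the table.
for nodes, cpn in [(1, 1), (2, 4), (2, 24)]:
    before = allocated_cores(nodes, cpn, whole_node=False)
    after = allocated_cores(nodes, cpn, whole_node=True)
    print(f"{nodes} node(s) x {cpn} cores-per-node: {before} -> {after} cores")
```

The gap between the two numbers is largest for jobs that request few cores per node, which is why those jobs are the ones that should move to the serial partition.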
Job Submit plugin (Fidis)
On Fidis, single-node jobs that use less than 75% of the cores of the node are now automatically redirected to the serial partition at submission time.
For example, in the table above, the first two jobs will automatically be moved to the serial partition at submission time (unless you explicitly request that the job has exclusive access to the node).
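The redirection rule described above can be modelled roughly as follows. This is a sketch of the rule as stated in this announcement, not the actual plugin code: the function name `target_partition` and its parameters are hypothetical, and the 28 cores per node is an assumption inferred from the table.

```python
CORES_PER_NODE = 28        # assumption: inferred from the table above
REDIRECT_THRESHOLD = 0.75  # "less than 75% of the cores of the node"


def target_partition(nodes: int, cores: int, exclusive: bool = False) -> str:
    """Return the partition a job submitted to 'parallel' would end up in.

    Only single-node jobs that did not request exclusive access and that
    use less than 75% of the node's cores are redirected to 'serial'.
    """
    uses_small_share = cores < REDIRECT_THRESHOLD * CORES_PER_NODE
    if nodes == 1 and not exclusive and uses_small_share:
        return "serial"
    return "parallel"


print(target_partition(nodes=1, cores=1))                   # serial
print(target_partition(nodes=1, cores=14))                  # serial
print(target_partition(nodes=1, cores=28))                  # parallel
print(target_partition(nodes=1, cores=14, exclusive=True))  # parallel
```

Under these assumptions the cut-off is 21 cores: a single-node job requesting 20 cores or fewer would be redirected, while multi-node jobs are never redirected, whatever their cores-per-node count.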
Questions?
Contact us via email: 1234@epfl.ch