
Main Memory

The amount of main memory (RAM) a job requires must be specified in the job control script with the option --mem-per-cpu (memory per allocated CPU) or --mem (memory per node). Unless a unit suffix (K, M, G or T) is given, Slurm interprets the value in megabytes. Specify the value as accurately as possible.
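As a minimal sketch of how such a request might look, the job script below asks for 2048 MB per CPU for 4 tasks. The job name, task count, memory value and time limit are illustrative assumptions, not site recommendations. The total assumes one CPU per task.

```shell
#!/bin/bash
# Hypothetical job name, chosen only for this example.
#SBATCH --job-name=mem_example
# Four tasks, one CPU each (Slurm's default when --cpus-per-task is not set).
#SBATCH --ntasks=4
# Memory per allocated CPU; no suffix means megabytes.
#SBATCH --mem-per-cpu=2048
# Assumed time limit of ten minutes.
#SBATCH --time=00:10:00

# Inside a job, Slurm exports the request as environment variables.
# The defaults after ":-" only apply when the script runs outside Slurm.
ntasks=${SLURM_NTASKS:-4}
mem_per_cpu=${SLURM_MEM_PER_CPU:-2048}

# Total memory reserved for the job: 4 x 2048 MB = 8192 MB.
total_mb=$((ntasks * mem_per_cpu))
echo "Total memory requested: ${total_mb} MB"
```

With --mem instead of --mem-per-cpu, the value would apply to each node rather than to each CPU.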

If the value given is too high, then

  • the job may have to wait longer than necessary before enough memory is available.
  • once the job starts, other jobs may have to wait: the memory may be unused, but it is reserved for the running job and cannot be given to other jobs. In the illustration below, enough cores are available, but the job shown cannot start because the queueing system cannot reserve enough memory on the node.

memory overestimation.png
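The effect of an overestimate can be worked through with simple arithmetic. The sketch below uses the core count and memory of node c005 shown further down this page. The two jobs and their requests are made-up numbers, and the check is a simplification of what the scheduler actually does.

```shell
#!/bin/bash
# Node capacity, taken from the c005 example (CPUTot and RealMemory in MB).
node_cores=32
node_mem=95000

# Hypothetical running job A: 16 cores, overestimated at 5000 MB per CPU.
a_cores=16
a_mem_per_cpu=5000

# Hypothetical waiting job B: 16 cores, 1000 MB per CPU.
b_cores=16
b_mem_per_cpu=1000

# Resources left on the node while job A is running.
free_cores=$((node_cores - a_cores))
free_mem=$((node_mem - a_cores * a_mem_per_cpu))

# Job B's total memory requirement.
b_mem=$((b_cores * b_mem_per_cpu))

# Job B can only start if both cores and memory fit.
if [ "$b_cores" -le "$free_cores" ] && [ "$b_mem" -le "$free_mem" ]; then
    verdict="can start"
else
    verdict="must wait"
fi

echo "free cores: ${free_cores}, free memory: ${free_mem} MB"
echo "job B needs ${b_cores} cores and ${b_mem} MB: ${verdict}"
```

Here 16 cores are free, but only 15000 MB of memory remain unreserved, so job B with its 16000 MB request must wait. Had job A requested 4000 MB per CPU, job B could start immediately.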

If the value is too low, then

  • the job will be terminated as soon as its memory usage exceeds the amount of memory requested.

The memory available on each node can be seen in the column MEMORY with the command:

snodes

This value is also displayed as RealMemory (in megabytes) by the command scontrol show node <node name>, e.g.

$ scontrol show node c005
NodeName=c005 Arch=x86_64 CoresPerSocket=16
   CPUAlloc=32 CPUTot=32 CPULoad=32.40
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=c005 NodeHostName=c005 Version=18.08
   OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
   RealMemory=95000 AllocMem=65536 FreeMem=69249 Sockets=2 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=main
   BootTime=2019-07-12T13:23:09 SlurmdStartTime=2019-07-25T15:36:37
   CfgTRES=cpu=32,mem=95000M,billing=32
   AllocTRES=cpu=32,mem=64G
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
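
To judge what a reasonable per-CPU request on a node might be, the output above can be reduced to the memory available per core. The following is a sketch only: the helper name mem_per_core is hypothetical, and it assumes the RealMemory= and CPUTot= fields appear exactly as in the output shown.

```shell
#!/bin/bash
# mem_per_core (hypothetical helper): reads `scontrol show node` output
# on standard input and prints RealMemory divided by CPUTot, in MB,
# rounded down to a whole number.
mem_per_core() {
    awk '{
        # Scan every whitespace-separated KEY=VALUE field on the line.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^RealMemory=/) { split($i, m, "="); mem = m[2] }
            if ($i ~ /^CPUTot=/)     { split($i, c, "="); cpus = c[2] }
        }
    }
    END {
        # Print nothing if the core count was not found.
        if (cpus > 0) print int(mem / cpus)
    }'
}

# Usage on the cluster:
#   scontrol show node c005 | mem_per_core
```

For the node c005 above (RealMemory=95000, CPUTot=32) this prints 2968, so a job that wants to use all 32 cores cannot request more than 2968 MB per CPU.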
