Accelerators and special compute nodes

The standard compute node type of DAS-5 sites has a dual 8-core 2.4 GHz (Intel Haswell E5-2630-v3) CPU configuration and 64 GB memory. In addition, several DAS-5 sites include non-standard node types for specific research purposes.

To get a quick overview of the queues/partitions and node properties on a site, use the following commands:

[fs0]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite     68   idle node[001-068]
das4         up   infinite      2   idle node[069-070]
longq        up   infinite      4   idle node[072-075]
knlq         up   infinite      2   idle node[076-077]

[fs0]$ sinfo -o "%40N  %40f"
NODELIST                                  FEATURES                 
node[008-023,053-068]                     cpunode                  
node[030-045]                             cpunode,ssd              
node[069-070]                             cpunode,das4             
node[001-007,024-025,046-052]             gpunode,TitanX           
node026                                   gpunode,Titan,TitanX-Pascal            
node027                                   gpunode,K40              
node028                                   gpunode,K20,XeonPhi,michost             
node029                                   gpunode,GTX980,TitanX-Pascal
node071                                   gpunode                                 
node[072-075]                             fatnode                       
node[076-077]                             knlnode

Compute nodes are by default allocated from queue/partition "defq" containing the standard node types. To allocate a special resource in SLURM or prun, a so-called "constraint" for a required node property should be specified as follows:

  • -C GTX980
    nodes with an Nvidia GTX980 (with 4 GB onboard memory)
  • -C Titan
    nodes with an Nvidia GTX Titan (with 6 GB onboard memory)
  • -C K20
    nodes with an Nvidia Tesla K20 (with 6 GB onboard memory)
  • -C K40
    nodes with an Nvidia Tesla K40 (with 12 GB onboard memory)
  • -C TitanX
    nodes with an Nvidia GTX TitanX, Maxwell generation (with 12 GB onboard memory)
  • -C TitanX-Pascal
    nodes with an Nvidia GTX TitanX, Pascal generation (with 12 GB onboard memory)
  • -C cpunode
    regular node type that only offers CPUs (ASUS RS720Q)
  • -C gpunode
    regular node type that potentially offers GPUs (ASUS ESC4000); initially only a few GPUs are available on DAS-5, and CPU performance is very similar for the "cpunode" and "gpunode" types, but the property can still be used to enforce runs on identical hardware.
  • -C fatnode
    a CPU node with a non-default CPU type and often extra memory; these are typically placed in a separate partition to avoid mixing them with other node types.
  • -C knlnode
    Intel Knights Landing node; these are in a separate queue "knlq". See below for how to use non-default queues.
  • -C ssd
    node with an additional SSD device, mounted as /local-ssd.

This resource selector should be added as

#SBATCH -C resource

in a SLURM job script, or passed to prun/preserve as

-native '-C resource'
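
For example, the following minimal job script is a sketch of requesting one node with the "cpunode" property ("./myprog", the single node and the 15-minute time limit are just placeholders), followed by a roughly equivalent prun invocation:

#!/bin/bash
#SBATCH -N 1
#SBATCH -t 15:00
#SBATCH -C cpunode
srun ./myprog

[fs0]$ prun -np 1 -native '-C cpunode' ./myprog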

To allocate a GPU on a node, besides specifying the GPU type, the option "--gres=gpu:1" should be added as well. Examples can be found on the DAS-5 GPU page.
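
As an illustration (a sketch only, with "./my_gpu_prog" as a placeholder), requesting a node with a Maxwell TitanX plus one GPU on it could look like:

[fs0]$ prun -np 1 -native '-C TitanX --gres=gpu:1' ./my_gpu_prog

or, in a job script:

#SBATCH -C TitanX
#SBATCH --gres=gpu:1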

Nodes that have a different CPU or node architecture than the default dual 8-core 2.4 GHz (E5-2630-v3) are typically placed in a different queue (SLURM calls these "partitions") to avoid unpredictable performance. To run a job on a node in partition "part", add the following:

#SBATCH -p part

in a SLURM job script, or pass

-native '-p part'

to prun/preserve. When specifying multiple constraints or partitions, group them all together as the argument of a single -native option in prun, as follows:

-native '-p part -C resource1,resource2'
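
For instance, a sketch of allocating one of the Knights Landing nodes in queue "knlq" mentioned above ("./myprog" is a placeholder):

[fs0]$ prun -np 1 -native '-p knlq -C knlnode' ./myprog

or, in a job script:

#SBATCH -p knlq
#SBATCH -C knlnode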

VU University

At fs0.das5.cs.vu.nl the following special-purpose equipment is available for various experiments:

  • 16 of the GPU nodes have an Nvidia GTX TitanX Maxwell GPU;
  • 2 of the GPU nodes have two (!) Nvidia GTX TitanX Pascal GPUs;
  • 1 of the GPU nodes has an Nvidia GTX980 GPU;
  • 1 of the GPU nodes has an Nvidia GTX Titan GPU;
  • 1 of the GPU nodes has an Nvidia Tesla K20 GPU;
  • 1 of the GPU nodes has an Nvidia Tesla K40 GPU;
  • 16 of the regular CPU nodes have a 240 GB SSD drive mounted as /local-ssd;
  • node069 and node070: two previous-generation DAS-4 nodes, with extra disk capacity for specific experiments;
  • node076 and node077: Intel Knights Landing nodes.

See the "sinfo" overview for fs0.das5.cs.vu.nl at the top of this page.
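
As a sketch, a job that wants to use the local SSD scratch on one of these nodes could be submitted as follows ("./myscript.sh" is a placeholder for a script that keeps its temporary files under /local-ssd):

[fs0]$ prun -np 1 -native '-C ssd' ./myscript.sh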

Leiden University

fs1.das5.liacs.nl has 24 regular CPU nodes:

[fs1]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite     24   idle node[101-124]

University of Amsterdam

fs2.das5.science.uva.nl has 14 regular nodes, two nodes with two TitanX GPUs each, and in addition 2 fat nodes with dual 16-core CPUs (E5-2698-v3) and 256 GB memory that are in partition "fatq":

[fs2]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite     16   idle node[201-202,205-218]
fatq         up   infinite      2   idle node[203-204]

[fs2]$ sinfo -o "%40N  %25f %25G"
NODELIST                                  FEATURES                  GRES                     
node[205-206]                             gpunode,TitanX            gpu:2                    
node[201-202]                             gpunode                   (null)                   
node[207-218]                             cpunode                   (null)                   
node[203-204]                             fatnode                   (null)                

Delft University of Technology

fs3.das5.tudelft.nl has 48 regular nodes:

[fs3]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite     48   idle node[301-348]

University of Amsterdam - MultiMediaN

fs4.das5.science.uva.nl has 25 regular cpunodes, 5 GPU-capable nodes, and in addition one fat node with dual 16-core CPUs (E5-2698-v3) and 512 GB memory that is in partition "fatq":

[fs4]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite     30   idle node[401-405,407-431]
fatq         up   infinite      1   idle node406

[fs4]$ sinfo -o '%40N  %25f'
NODELIST                                  FEATURES                 
node[401-405]                             gpunode                  
node[407-431]                             cpunode                  
node406                                   fatnode                  

ASTRON

fs5.das5.astron.nl has 9 nodes, all with non-default CPU types and a 480 GB SSD:

  • 3 GPU-capable nodes (ASUS ESC8000) with dual 10-core 2.6 GHz CPUs (E5-2660-v3) and 128 GB memory; NOTE: these 3 nodes do not include the regular dual 4 TB disks on /local, just the SSD;
  • 2 GPU-capable nodes (SuperMicro SYS-7048GR-TR) with dual 10-core 2.6 GHz CPUs (E5-2660-v3) and 128 GB memory;
  • 2 nodes with dual 14-core 2.6 GHz CPUs (E5-2697-v3) and 512 GB memory;
  • 2 nodes with dual 8-core 2.6 GHz CPUs (E5-2640-v3) and 128 GB memory.

[fs5]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite      2   idle node[508-509]
fat28        up   infinite      2   idle node[506-507]
fat20        up   infinite      5   idle node[501-505]

[fs5]$ sinfo -o '%40N %80f'
NODELIST                                 FEATURES                                
gpu01                                    gpunode,E5-2630,K10,K20                                     
node[508-509]                            cpunode,E5-2640-v3                                          
node[501-503]                            gpunode,E5-2660-v3,ESC8000                                  
node504                                  gpunode,E5-2660-v3,SYS-7048GR-TR                            
node505                                  gpunode,E5-2660-v3,SYS-7048GR-TR,TitanX,TitanX-Pascal       
node[506-507]                            fatnode,E5-2697-v3                    

NOTE: hyperthreading is enabled on the ASTRON nodes, so the dual-10-core machines node501..node505 are in queue "fat20" (because there are 20 true cores), but the kernel will expose 40 cores due to hyperthreading. Similarly, the dual-14-core machines are in queue "fat28". The two dual-8-core machines are in "defq", but note that, at 2.6 GHz, they are slightly faster than the default 2.4 GHz dual-8-core nodes on the other DAS-5 clusters.
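
As a sketch ("./myprog" is a placeholder), reserving one of these dual-10-core machines from the fs5 front-end could look like this; a multi-threaded program on such a node will see 40 logical cores due to hyperthreading:

[fs5]$ prun -np 1 -native '-p fat20' ./myprog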