The next-generation DAS, DAS-6, will receive funding! For details, see the DAS Achievements page. DAS-6 is expected to become operational in the second half of 2020.
DAS-5/VU has been extended with 4 TitanX-Pascal GPUs.
The ASTRON cluster is a hybrid cluster that meets specific research needs (e.g., accelerated computing, energy-efficient computing, high-bandwidth networking). Most nodes are therefore "special" nodes that contain powerful CPUs with large amounts of memory, GPUs, or other accelerators. The ASTRON cluster complements the larger, regular clusters of the other DAS-5 sites.
The following table summarizes the properties of the ASTRON DAS-5 nodes. Note that the transition from DAS-4 to DAS-5 is not yet complete, and that changes to the configuration are to be expected.
| name | must reserve using slurm/prun | type | CPUs | cores/threads | nom. freq. (GHz) | RAM (GB) | HDD/SSD (TB) | GPUs etc. |
|---|---|---|---|---|---|---|---|---|
| fs5 | | SuperMicro SC846 | 2x Xeon E5-2660v3 | 20/40 | 2.6 | 128 | 96/2 | |
| node501 | * | ASUS ESC8000 G3 | 2x Xeon E5-2660v3 | 20/40 | 2.6 | 128 | 0/0.48 | |
| node502 | * | ASUS ESC8000 G3 | 2x Xeon E5-2660v3 | 20/40 | 2.6 | 128 | 0/0.48 | Radeon R9 nano |
| node503 | * | ASUS ESC8000 G3 | 2x Xeon E5-2660v3 | 20/40 | 2.6 | 128 | 0/0.48 | 8x Xeon Phi (31S1P) |
| node504 | * | SuperMicro 7048GR-TR | 2x Xeon E5-2660v3 | 20/40 | 2.6 | 128 | 8/0.48 | FirePro S10000, W9100, W8100 |
| node505 | * | SuperMicro 7048GR-TR | 2x Xeon E5-2660v3 | 20/40 | 2.6 | 128 | 8/0.48 | Titan X |
| node506 | * | ASUS RS500 | 2x Xeon E5-2697v3 | 28/56 | 2.6 | 512 | 8/0.48 | |
| node507 | * | ASUS RS500 | 2x Xeon E5-2697v3 | 28/56 | 2.6 | 512 | 8/0.48 | |
| node508 | * | ASUS RS720Q | 2x Xeon E5-2640v3 | 16/32 | 2.6 | 128 | 8/0.48 | |
| node509 | * | ASUS RS720Q | 2x Xeon E5-2640v3 | 16/32 | 2.6 | 128 | 8/0.48 | |
| gpu01 | * | SuperMicro 7047GR-TRF | 2x Xeon E5-2630 | 12/24 | 2.3 | 64 | 1/0 | Tesla K10, Tesla K20 |
| dsp01 | | TI EVMK2H | 66AK2H14 | 4 ARM + 8 DSP | 1.4 + 1.2 | 2 | 0/0 | |
| jetson01 | | NVIDIA Jetson TK1 | Tegra K1 | 4 ARM + 192 GPU | 2.3 + 0.852 | 2 | 0/0 | |
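The nodes marked with an asterisk must be reserved through the scheduler rather than logged into directly. As a minimal sketch (the node name and program are just examples; prun accepts equivalent reservations), a one-node SLURM reservation looks like this:

    srun -N 1 --nodelist=node502 ./my_program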
node502 contains an AMD Radeon R9 nano, a highly energy-efficient GPU; see the section Using AMD GPUs below for more information. The instantaneous power consumption of the R9 nano can be monitored with the PowerSensor device via /dev/ttyUSB0.
node503 contains 8 mid-range Xeon Phis (31S1P), each providing 2 TFLOPS (single precision) or 1 TFLOPS (double precision) processing power. Each board has 8 GB of fast memory.
node504 contains three AMD FirePro GPUs: a FirePro S10000, a W9100, and a W8100 (make sure that your program selects the GPU(s) that you really want to use); see the section Using AMD GPUs below for more information. The instantaneous power consumption of the FirePro S10000 can be monitored with the PowerSensor device via /dev/ttyUSB0.
node505 contains a Titan X. The instantaneous power consumption of the Titan X can be monitored with the PowerSensor device via /dev/ttyUSB0.
node506 and node507 are "supernodes"; they contain CPUs with many cores and have 512 GB RAM.
node508 and node509 are regular nodes for less demanding tasks.
All DAS-5 nodes at ASTRON are powered through Power Distribution Units (PDUs) that measure the power drawn from their outlets. Using the web interfaces of these PDUs, one can monitor the power consumption at system level.
There are two web interfaces: one for the Schleifenbauer PDUs (pdu01) and one for the Racktivity PDU (pdu02). To use the web interface of the Schleifenbauer PDUs, create an ssh tunnel as follows:

    ssh -L 8888:pdu01:80 fs5.das5.astron.nl

and open http://localhost:8888/ in your local web browser.
To use the web interface of the Racktivity PDU, start firefox --no-remote http://pdu02/ on the fs5 head node itself. Note that the web interface of this PDU uses an ancient security protocol; you first have to open about:config and set security.tls.version.min to 0.
The PDUs are connected to the machines as follows (note that many machines have multiple (redundant) power supplies):
| PDU | Address | outlet 1 | outlet 2 | outlet 3 | outlet 4 | outlet 5 | outlet 6 | outlet 7 | outlet 8 | outlet 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| pdu01 | 1 | node503 A | node503 B | node503 C | | | | | | |
| pdu01 | 2 | node502 A | node502 B | node502 C | | | | | | |
| pdu01 | 3 | node501 A | node501 B | node501 C | | | | | | |
| pdu01 | 4 | Eth Switch A | node506 | node504 A | node504 B | ? | node508/node509 A | node508/node509 B | IB Switch B | |
| pdu01 | 5 | Eth Switch B | ? | node507 | node505 A | node505 B | fs5 A | fs5 B | IB Switch A | |
| pdu01 | 6 | fs5.das4.astron.nl | agc001 A | r815 A | phi | agc001 B | r815 B | | | |
| pdu02 | | e6 | e7v3 | e7v3 | e5v3 | gpu01 | dsp01 | jetson01 | | |
AMD GPUs can be programmed in OpenCL; the SDK is in /opt/AMDAPPSDK-2.9-1.
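Since some nodes (e.g., node504) contain several different GPUs, it is worth verifying which OpenCL device your program selects before running experiments. The sketch below is our own illustration, not part of the SDK; it enumerates all platforms and devices. It should compile with something like g++ -std=c++11 list_devices.cc -I/opt/AMDAPPSDK-2.9-1/include -lOpenCL (the exact library path within the SDK is an assumption; check the SDK directory).

    #include <CL/cl.h>
    #include <cstdio>
    #include <vector>

    int main()
    {
      // enumerate the OpenCL platforms
      cl_uint nrPlatforms;
      clGetPlatformIDs(0, nullptr, &nrPlatforms);
      std::vector<cl_platform_id> platforms(nrPlatforms);
      clGetPlatformIDs(nrPlatforms, platforms.data(), nullptr);

      for (cl_uint p = 0; p < nrPlatforms; p ++) {
        // enumerate the devices of this platform
        cl_uint nrDevices;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 0, nullptr, &nrDevices);
        std::vector<cl_device_id> devices(nrDevices);
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, nrDevices, devices.data(), nullptr);

        for (cl_uint d = 0; d < nrDevices; d ++) {
          char name[256];
          clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof name, name, nullptr);
          std::printf("platform %u, device %u: %s\n", p, d, name);
        }
      }
    }

Select the device you really want by index (or cl_device_id), rather than blindly taking the first one.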
You can also use the CodeXL visual profiler, but to do so, you must install CodeXL 1.7 on your local machine, create an ssh tunnel and start CodeXLRemoteAgent as follows:
    ssh -t -X -o ProxyCommand="ssh fs5.das5.astron.nl ncat %h %p" -L 27015:node504:27015 node504 \
        /opt/AMD_CodeXL_Linux_x86_64_1.7.7300/CodeXLRemoteAgent
Then, on your local machine, start CodeXL, create a project, and in the project settings tick "Remote host" as the target host and enter "localhost" as the remote host address. Running the graphical part of the profiler through X11 forwarding does NOT work, and CodeXL 1.5 and 1.6 are broken. We also found that device buffer allocation is NOT thread safe while the application is being profiled, so buffer allocations must be made mutually exclusive if your program is multi-threaded, as sketched below.
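For example, a multi-threaded OpenCL program can guard its buffer allocations with a single mutex. This is a sketch; the wrapper function and mutex are our own illustration, not part of CodeXL or OpenCL:

    #include <CL/cl.h>
    #include <mutex>

    static std::mutex allocationMutex;

    // serialize clCreateBuffer calls, which are not thread safe
    // while CodeXL profiles the application
    cl_mem allocateBuffer(cl_context context, size_t size, cl_int *error)
    {
      std::lock_guard<std::mutex> lock(allocationMutex);
      return clCreateBuffer(context, CL_MEM_READ_WRITE, size, nullptr, error);
    }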
Some of the accelerators are attached to a custom-built, microcontroller-based device that monitors their power consumption at high time resolution. It monitors both the current drawn from the external PCIe power cables and the current drawn from the PCIe slot (through a PCIe riser card); only the current is actually measured and, assuming a constant voltage, converted to the instantaneous power consumption. The microcontroller reports the measurements to the host via USB, and an application can monitor the power consumption using a simple host library (found in ~romein/projects/PowerSensor).
The PowerSensor library can be used in C++ code as follows:
    #include <libPowerSensor.h>

    ...

    PowerSensor powerSensor("/dev/ttyUSB0");
    PowerSensor::State startState = powerSensor.read();

    ... // offload work to the GPU/accelerator and wait

    PowerSensor::State stopState = powerSensor.read();
    std::clog << "average power consumption is "
              << PowerSensor::Watt(startState, stopState) << 'W' << std::endl;
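To build an application against the library, something along these lines should work (the include/library paths and the library name are assumptions; check the contents of ~romein/projects/PowerSensor):

    g++ -I ~romein/projects/PowerSensor measure.cc -L ~romein/projects/PowerSensor -lPowerSensor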