Announcements

6 Jul 2015

DAS-5 is now fully operational! To make room for DAS-5, DAS-4/UvA and DAS-4/ASTRON have been decommissioned; only their headnodes remain available.

28 Oct 2014

The Hadoop setup on DAS-4/VU has been updated to version 2.5.0.

30 Jan 2014

The Intel OpenCL package for Intel CPUs and Xeon Phi has been updated to version 3.2.1.

3 Sep 2013

The Nvidia CUDA development kit has been updated to version 5.5.

25 Apr 2013

Slides of the DAS-4 workshop presentations are now available.

14 Jan 2013

DAS-4/VU now has 8 new nodes with the latest Nvidia K20 GPUs.


Hadoop on DAS-4

Hadoop is an open-source implementation of the popular MapReduce paradigm for scalable distributed computing.

Hadoop performance and scaling experiments

Users on DAS-4 can run their own Hadoop setup inside regular DAS-4 jobs, for example to experiment with performance and scaling. For large-scale experiments involving lots of data, it is advisable to configure Hadoop so that HDFS data is written to the local disks (the /local/$USER directory), in combination with a fixed node allocation, so that multiple experiments can reuse the same dataset.
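As a minimal sketch of the local-disk setup described above, the script below generates a private hdfs-site.xml whose storage directories live under /local/$USER. The property names dfs.namenode.name.dir and dfs.datanode.data.dir are standard Hadoop 2.x settings. The helper name write_hdfs_site, the hadoop-conf output directory and the hdfs/name and hdfs/data subdirectories are assumptions chosen for illustration, not part of the DAS-4 setup.

```shell
#!/bin/sh
# Sketch: generate a private hdfs-site.xml that keeps HDFS data on node-local disk.
# Assumptions (not from the DAS-4 documentation): the helper name, the output
# directory ./hadoop-conf, and the hdfs/name and hdfs/data subdirectory names.

write_hdfs_site() {
    # $1 = configuration directory to create
    # $2 = base directory on the local disk, e.g. /local/$USER
    conf_dir=$1
    local_base=$2
    mkdir -p "$conf_dir"
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://$local_base/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://$local_base/hdfs/data</value>
  </property>
</configuration>
EOF
}

# Fall back to `id -un` when USER is not set in the job environment.
user_name=${USER:-$(id -un)}
out_dir=${1:-$PWD/hadoop-conf}

write_hdfs_site "$out_dir" "/local/$user_name"
echo "wrote $out_dir/hdfs-site.xml"
```

This covers only the HDFS storage locations. A complete private setup also needs the other Hadoop configuration files in the same directory, and the daemons have to be started with HADOOP_CONF_DIR pointing at it.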

General purpose Hadoop usage

For users on DAS-4 who want to use Hadoop as a basic tool to get their MapReduce jobs done, DAS-4/VU currently provides a Hadoop 2.5.0 configuration that is available for production use. This configuration runs on nodes node086-node093; see the special nodes page for details.

To use the DAS-4/VU Hadoop cluster, ask das-account@cs.vu.nl to create an HDFS home directory for you. The necessary settings can then be loaded with

module load hadoop
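A quick check that the module is active might look like the sketch below. It assumes that loading the module puts the hadoop and hdfs commands on the PATH, and that the HDFS home directory follows the Hadoop default of /user/<name>. Neither is confirmed here, and the helper name hdfs_home is hypothetical.

```shell
#!/bin/sh
# Sketch: first steps after "module load hadoop".
# Assumptions: the module adds `hadoop` and `hdfs` to PATH, and the HDFS home
# directory uses the Hadoop default location /user/<name>.

hdfs_home() {
    # Print the conventional HDFS home directory for user $1.
    printf '/user/%s\n' "$1"
}

user_name=${USER:-$(id -un)}

if command -v hdfs >/dev/null 2>&1; then
    # Show which Hadoop release is on PATH.
    hadoop version | head -n 1
    # List the home directory; this fails until the directory has been created.
    hdfs dfs -ls "$(hdfs_home "$user_name")"
else
    echo "hdfs not found on PATH; run 'module load hadoop' first" >&2
fi
```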

From fs0.das4.cs.vu.nl, the job tracker is available at fs0.das4.cs.vu.nl:8088, while the HDFS namenode is available at master.ib.cluster:50270. The other Hadoop settings used in the DAS-4/VU setup can be found in

/cm/shared/package/hadoop/hadoop-2.5.0/etc/hadoop
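To get an overview of what is configured there, one option is to list the property names in each site file, as in the sketch below. It assumes the directory contains the standard Hadoop 2.x file names (core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml) and that each <name> element sits on a single line. Files that are missing are skipped. The helper name list_property_names is hypothetical.

```shell
#!/bin/sh
# Sketch: list the properties set in the shared DAS-4/VU Hadoop configuration.
# Assumptions: the standard Hadoop 2.x site file names, and one <name>...</name>
# element per line (the usual layout of these files).

conf_dir=/cm/shared/package/hadoop/hadoop-2.5.0/etc/hadoop

list_property_names() {
    # Print the <name> of every property defined in Hadoop XML file $1.
    sed -n 's:.*<name>\(.*\)</name>.*:\1:p' "$1"
}

for f in core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml; do
    if [ -f "$conf_dir/$f" ]; then
        echo "== $f"
        list_property_names "$conf_dir/$f"
    fi
done
```

With the module loaded, the value of a single property can also be queried with `hdfs getconf -confKey <name>`.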