Alcyone Cluster

Introduction

The Alcyone cluster has 860 processor cores (i.e. 138 Xeon X5650 and 4 Xeon X7550 processors), 28 NVIDIA Tesla 2070 GPU cards and 12 NVIDIA Tesla 2090 GPU cards. The node with the X7550 processors has a breathtaking 1 TB of memory.

Account allocation

Accounts are allocated to research scientists on a per-group basis.

To apply for an account, the group leader should fill in this form, sign it, and send it to Antti Kuronen (the address is in the document). The form is also available in LaTeX format.

Note that the desired username should be the same as your University AD account name (the one you use e.g. with the University webmail).

The user list should only contain scientists who need massive computational capacity.

After the initial application has been accepted, new users can be added to the group by a simple e-mail request from the group leader to Antti Kuronen.

The group leader has the responsibility to ensure that the cluster users in the group are aware of the rules of usage (listed below), and that they know enough about Unix systems to be able to follow the rules.

Advice on usage

Basic usage instructions are on the FGI documentation page at https://confluence.csc.fi/display/fgi/Local+cluster++access+and+use .

Log in by ssh'ing to alcyone.grid.helsinki.fi. Ssh connections to alcyone are only allowed from the helsinki.fi domain. However, if you have an account on the machines kruuna.helsinki.fi, ruuvi.it.helsinki.fi or punk.it.helsinki.fi, you can use one of the following ssh commands to connect to alcyone:

ssh -l USERNAME -tY kruuna.helsinki.fi /usr/bin/ssh -l ALCYONEUSERNAME -tX alcyone.grid.helsinki.fi
ssh -l USERNAME -tY ruuvi.it.helsinki.fi ssh -l ALCYONEUSERNAME -tY alcyone.grid.helsinki.fi
ssh -l USERNAME -tY punk.it.helsinki.fi ssh -l ALCYONEUSERNAME -tY alcyone.grid.helsinki.fi

The Alcyone cluster currently has the GNU and Intel compilers installed. The program development environment is selected with the module command, which is also used on the CSC machines. The command module avail lists the available modules. For more information, see the FGI documentation.
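For example, a typical session first checks what is available and then loads a compiler environment (the module names below are only illustrations; check module avail for the names actually installed on alcyone):

module avail            # list all available modules
module load intel       # load a compiler environment (example name)
module list             # show the currently loaded modules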

The batch job system is SLURM, which is also in use on vuori.csc.fi.

A sample script for a serial job can be obtained here. If you want to use OpenMP, use the script alcyone_submit_openmp. (Note that OpenMP is a thread-based parallelization method. It has nothing to do with the MPI implementation OpenMPI that is used in the package mentioned below.)
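The downloadable scripts are the authoritative versions; purely as an illustration, a minimal serial submit script might look like the following (the partition name, time limit and program name are placeholders):

#!/bin/bash
#SBATCH -p <queue_name>        # partition (queue); mandatory on alcyone
#SBATCH -n 1                   # one task, i.e. a serial job
#SBATCH -t 01:00:00            # time limit (hh:mm:ss)
#SBATCH -J serial_test         # job name
#SBATCH -o serial_test.out     # standard output file
srun ./my_program              # my_program is a placeholder for your executable

Submit the script with sbatch, e.g. sbatch serial_test.sh. For an OpenMP job the essential differences are reserving several cores for a single task (e.g. #SBATCH --cpus-per-task=6) and setting OMP_NUM_THREADS accordingly; see the alcyone_submit_openmp script for the settings actually recommended here.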

For parallel runs using MPI there is a tgz package containing a small example program and a submit script.
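The package contains the actual example; in outline, an MPI submit script differs from the serial one mainly in requesting several tasks and launching the program with srun (the partition, task count, module name and program name below are placeholders):

#!/bin/bash
#SBATCH -p <queue_name>        # partition (queue)
#SBATCH -n 24                  # number of MPI tasks (at most 24 cores per job, see the rules below)
#SBATCH -t 04:00:00            # time limit (hh:mm:ss)
#SBATCH -J mpi_test            # job name
#SBATCH -o mpi_test.out        # standard output file
module load openmpi            # example module name; check "module avail"
srun ./my_mpi_program          # srun starts one MPI process per requested task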

SLURM provides a means for the job script to detect that the time limit is about to be exceeded (or that the user has issued the scancel command). This means that even if the time limit is exceeded, the output files of your job are copied back to the login node. A version of the serial submit script that uses this feature can be downloaded here.
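The downloadable script is the reference for how this is done on alcyone; the general mechanism is that SLURM signals the job before killing it (normally with SIGTERM when the time limit is reached or scancel is issued), and the batch script can trap that signal. A rough sketch of the idea, with placeholder paths and file names and no claim about the alcyone defaults:

#!/bin/bash
#SBATCH -p <queue_name>
#SBATCH -t 01:00:00
#SBATCH -o serial_test.out

# Run on the node-local /tmp disk; paths and file names are placeholders.
WORKDIR=/tmp/$USER/$SLURM_JOB_ID
mkdir -p "$WORKDIR" && cd "$WORKDIR"

# If SLURM is about to kill the job, copy the results back to the submit directory.
trap 'cp output.dat "$SLURM_SUBMIT_DIR"/' TERM

srun "$SLURM_SUBMIT_DIR"/my_program &   # run in the background so the shell can react to the signal
wait
cp output.dat "$SLURM_SUBMIT_DIR"/      # normal completion: copy the results back

Newer SLURM versions also offer the sbatch option --signal for requesting a warning signal a fixed time before the limit; check the downloaded script and man sbatch for the exact mechanism used on alcyone.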

Note also that SLURM (or at least the alcyone configuration) requires the queue (or partition, as it is called in SLURM) name to be given in the submit script, i.e. you should give the option:

#SBATCH -p <queue_name>

in the submit script.

NOTE: SLURM’s sinfo command has stupid default settings: the queue names are shown with only 9 characters. To output the whole names use e.g. the following command:

sinfo -o "%13P %5a %.10l %.5D %6t %N"

You can, of course, define this as a shell alias. For bash, the following aliases might be useful (put them in your .bashrc file):

# SLURM-related aliases
alias mysqueue="squeue -o \"%7i %14P %20j %8u %12T %12M %9l %.6D %.R  %.10C\""
alias mysinfo="sinfo -o \"%16P %5a %.12l %.5D %6t %6s %N\""
alias myjobs="mysqueue -u $USER"

A two-page cheat sheet (by Andrey Ilinov) for running parallel jobs on alcyone can be found here.

Storage space

Each user has storage space on two disks:

/home/$USER
/scratch/$USER

Quotas apply to both places. Home is intended only for program source codes, scripts and such. Simulation output files should be stored on scratch.
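In practice this means, for example, keeping sources and scripts under /home and creating a run directory under /scratch for the simulation output (the directory and script names below are placeholders):

mkdir -p /scratch/$USER/myrun          # run directory on the scratch disk
cd /scratch/$USER/myrun
sbatch ~/scripts/serial_test.sh        # submit from here so the job output lands on scratch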

Rules of usage for the alcyone computer cluster

The cluster is intended for the use of personnel of the Department of Physics, Department of Chemistry, Department of Computer Science, and HIP.

The cluster is administered by Administrators appointed by the Heads of the Departments.

Allowed use is research and education in physics, chemistry and computer science utilizing efficient simulation and numerical codes. Any large-scale simulations should be run with compiled software; that is, extensive runs using interpreted programs such as Matlab, Mathematica, etc. or scripting languages such as awk and perl are not allowed unless an explicit exception is granted by one of the Administrators. Running password cracking, cryptography, and “seti@home”-type programs on alcyone is naturally strictly prohibited.

Research use accounts are given on a per-group basis. Eligible groups are those working at the owning institutions. In unclear cases, the Head of the respective owning institution decides whether a group is eligible for an account.

To open a group research account, the group leader should fill in the initial application form, and send it to the Account Administrator. After the initial application has been accepted, new users can be added to the group by a simple e-mail request from the group leader.

Educational accounts may be allocated to the lecturer of a computational physics course requiring parallel computing resources for the period of the course, according to a separate agreement with one of the Administrators. The lecturer is responsible for ensuring that the educational accounts are used only for proper course work, and for guiding the course students in the proper use of the cluster.

As of now, there are no pre-defined limits for usage. Groups are expected to use the machine in a gentlemanly manner, not attempting to hoard as much computer capacity for themselves as possible at the expense of other groups. All CPU use of each group is logged, and if a single group has used what seems like an obviously unreasonable share of the cluster for a long period of time, the Administrators have the right to ask them to limit their use in the future. If after several warnings the group still uses unreasonable amounts of capacity, the group accounts can be closed for a fixed period of time.

The use of the machine should take into account hardware limitations such as memory and hard drive space limitations. Hard disk space for public use is allocated on the /home and /tmp disks. Each user should keep their disk space usage to a reasonable minimum, and clean out stuff they no longer need. All long jobs should write their output to the /tmp disks, which are not backed up and are not intended for long-term storage. Old files from the /tmp disks may be removed without prior warning to the user.

The cluster is intended for serial jobs and parallel runs using up to 24 processor cores. Detailed information on the parallel environment will be given separately. Running embarrassingly (i.e. trivially) parallel jobs using scripts is allowed within the limits set by the batch queue system on the number of jobs.

The group leader has the responsibility to ensure that the users in the group are aware of these rules, and that they know enough about Unix systems to be able to follow these rules.

Any cluster user is allowed and indeed encouraged to report clear violations of these rules to the Administrators.

In case of clear violations of these rules, whether intentional or due to negligence or poor understanding of the system, the Administrators can issue formal warnings to the group leader or course lecturer. If after two warnings the group still does not comply with the rules, the group account on the cluster will be closed for a fixed amount of time, or permanently.

Naturally you should also follow the University of Helsinki general rules of computer usage.

Persons in charge

Administrators of the cluster are:
  • Pekko Metsä, Tomas Lindén
  • Kai Ruusuvuori (batch queues)
  • Antti Kuronen (user accounts)

Email addresses of the administrators are of the form firstname.lastname@helsinki.fi with diacritics removed. However, note that in technical matters it is best to send the message to the mailing list alcyone-admin[at]helsinki.fi.

Latest update: 16 Aug 2013, A. Kuronen