Quick Guide to Setting Up OpenPBS
Contents
- Before You Start
- Setting Up The Cluster
- Compiling the Code
- Setting Up the Execution Nodes
- Setting Up the Scheduler
- Setting Up the Queue Server
Before You Start
The PBS manual defines a number of terms. I'll only go into those terms that are relevant to this discussion.
Definitions
- Server: The server is the machine that the job queue will be running on. This is the machine where jobs are submitted. It's good practice to make this machine the only one in the cluster that is connected to the outside world; people shouldn't be able to access the cluster nodes from the outside. We'll get back to this a little later.
- MOM: The MOM is just another term for an execution node; in other words, MOMs are your compute nodes. I'll use the terms MOM and execution node interchangeably. If your cluster is small and you don't want to waste resources, it's possible to have the server machine act as an execution node.
- Scheduler: This is a program that runs on the server machine and decides which node will get a job. OpenPBS supplies several schedulers, including one that can be scripted to fit your purposes.
- Virtual processors: A node can be declared to have more than one virtual processor, which tells PBS how many jobs it may place on that node at once.
- Time shared host: This term is used to indicate that a host can run multiple jobs. In such a case, specifying virtual processors is meaningless.
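To make the distinction concrete: for cluster (non-time-shared) nodes, virtual processors are declared in the server's nodes file. A minimal sketch (the np values and node names here are just an illustration; the setup below actually uses time-shared nodes created through qmgr):
# $PBS_HOME/server_priv/nodes -- one line per node
# np is the number of virtual processors PBS may schedule on the node
zeus2 np=2
zeus3 np=2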
Paths & Variables
For the purpose of this article we'll decide on some variables (like host names, home directories, etc).
- The server will be called zeus
- The internal interface of zeus will be 192.168.1.1 and will be called zeus1.
- The nodes will be called zeus2, zeus3, etc.
- The variable $PBS_HOME will be set to /usr/local/spool/pbs which is where the various configuration and log files will be stored.
- OpenPBS binaries will be installed under /usr/local/bin and /usr/local/sbin.
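None of these internal names need to exist in DNS; one way to make them resolvable is to list them in /etc/hosts on the server and every node. A minimal sketch (the addresses beyond 192.168.1.1 are assumptions following the pattern above):
# /etc/hosts (same entries on the server and all nodes)
192.168.1.1 zeus1
192.168.1.2 zeus2
192.168.1.3 zeus3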
Setting Up The Cluster
- The node machines should be placed on an internal network (192.168.1.*) and connected to the server by a switch.
- The server's internal interface (192.168.1.1) should be specified as the gateway for the node machines. The gateway for the internal interface of the server should be set to the external interface of the server.
- On the server, set up ipchains to allow the internal machines to connect to the rest of the world. Place the following ipchains commands in /etc/rc.d/rc.local on the server:
/sbin/modprobe ipchains
/sbin/ipchains -F forward
/sbin/ipchains -P forward DENY
/sbin/ipchains -A forward -s 192.168.1.0/24 -j MASQ
/sbin/ipchains -A forward -i eth1 -j MASQ
echo 1 > /proc/sys/net/ipv4/ip_forward
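To verify that the rules are in place, you can list the forward chain (a quick sanity check, not a required step):
/sbin/ipchains -L forward -n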
- Export the /home directory on the server by editing /etc/exports (the nodes are on the 192.168.1.* net):
/home 192.168.1.*(rw)
- Export the directory to the nodes with
exportfs -vr
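You can check from a node that the export is visible (assuming the showmount utility from the NFS tools is installed):
showmount -e zeus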
- Mount the home directory from zeus on the nodes with
mount -t nfs zeus:/home /home
- Next add the following line to /etc/fstab on
each node:
zeus:/home /home nfs defaults 0 0
- Finally, make sure that the programs that will be run on the cluster are available to all the nodes, either by copying them onto the nodes or by making the binary directory on the server available to the nodes (see the sketch below).
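If you go the NFS route for the binaries, a minimal sketch follows; the /usr/local/apps path is a hypothetical location for the cluster programs, chosen so the mount does not shadow each node's own /usr/local (where PBS itself will be installed):
# on zeus: add to /etc/exports (read-only is enough) and run exportfs -vr
/usr/local/apps 192.168.1.*(ro)
# on each node: add to /etc/fstab
zeus:/usr/local/apps /usr/local/apps nfs ro,defaults 0 0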
Compiling the Code
- Get the tarballs from OpenPBS
- It seems gcc 3.2 cannot compile OpenPBS. Until it can, you will need gcc 2.95 or gcc 2.96. If you have both (i.e., 2.9x and 3.2) on your system, make sure that /usr/bin/gcc refers to version 2.95 or 2.96.
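To check what /usr/bin/gcc resolves to, and to point configure at a different compiler without touching the symlink, you can set the CC environment variable (the path to the 2.9x binary below is an assumption; adjust it for your system):
# show the version /usr/bin/gcc resolves to
gcc -v
# hypothetical path to a 2.9x compiler, followed by the usual flags shown below
CC=/usr/local/bin/gcc-2.95 ./configure ...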
- Compile PBS for the server machine with
./configure --set-default-server=zeus --enable-syslog --with-scp \
    --set-server-home=/usr/local/spool/pbs --enable-clients
This will:
- Set the default server name to zeus (the hostname of the server machine without the domain)
- All the PBS files, such as logs and configuration, will be under /usr/local/spool/pbs. It's a good idea to make this path the same on all the machines. Also remember that each machine should have its own PBS home directory; that is, it shouldn't be mounted from the server.
- Since we decided that the server machine will also act as an execution node, we compile the MOM as well.
- The system will log both in the PBS home directory as well as to syslog (usually /var/log/messages on a RedHat system).
- All the clients (like qsub, qdel, pbsnodes, etc.) will be compiled. This also includes the GUI programs, which need Tcl/Tk to compile.
- scp (which runs over ssh) will be used for file transfers rather than rcp
- Do the actual compile and install by
make && make install
- Next, copy the source tarball to each node machine and run configure with the following command:
./configure --disable-gui --disable-server --set-default-server=zeus1 \
    --enable-syslog --with-scp --set-server-home=/usr/local/spool/pbs \
    --enable-mom --enable-clients
This configure command will:
- Prevent compilation of the PBS server
- Prevent compilation of the GUI clients
- Enable logging to syslog and enable the use of scp rather than rcp
- The other switches mean the same as they did for the server.
- Once again do the compile and install with
make && make install
Setting Up the Execution Nodes
- On each execution node open up the file $PBS_HOME/mom_priv/config. (The first time round it won't exist, so just make a file with that name.) Add the following lines:
$logevent 0x1ff
$clienthost zeus1
$clienthost zeus.chem.psu.edu
$max_load 2.0
$ideal_load 1.0
$usecp zeus.chem.psu.edu:/home /home
A quick description of the keywords:
- $logevent: This indicates the level of logging that the MOM should do.
- $clienthost: The value of this keyword indicates which machines in the cluster are allowed to contact the MOM (to schedule a job, for example). In our case, the only machine that is meant to contact a MOM is the server node, so we specify the names of the internal and external interfaces of the server machine (zeus1 and zeus.chem.psu.edu respectively).
- $max_load: The value of this keyword plays a role in scheduling which I'll talk about in more detail in the section discussing the scheduler. When the system load average goes above this value the MOM will refuse any new jobs.
- $ideal_load: The value of this keyword also plays a role in scheduling. After the load average has crossed $max_load, the MOM will only take on new jobs after the load average has gone below this value.
- $usecp: This is a very important keyword for the setup described here. After a job has finished, the MOM will transfer the output files back to the host that supplied it the job (in this case the server, zeus). Since we compiled the code with the --with-scp flag, it will try to use scp. However, if you have not set up passwordless logins, the scp will fail and you will not get back your output files (one way to set up passwordless logins is sketched after this list). This keyword instructs the MOM that if the host to which it is trying to transfer files matches the host given above (the matching rule is described in a little more detail in the manual), it should use cp instead, which is fine for us since our home directories are mounted via NFS. The paths specified indicate that anything under /home on the execution node should be copied under /home on the remote host (i.e., zeus.chem.psu.edu in this case).
It is important to specify the FQDN for the server. This is because the server gets its hostname with the gethostbyname() C library call and uses that in communications with the MOMs. Hence if you just gave zeus as the host, it would never match the server-supplied hostname, so the MOM would use scp rather than cp and you would be left scratching your head! Also note that if the MOM detects that it is going to be copying files to the local machine (i.e., itself), it will use cp by default. Thus if you run a MOM on the server machine you can skip this line.
- The values for $max_load and $ideal_load given above are examples. You'll probably want to tweak the values.
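For transfers that do fall through to scp (hosts not matched by a $usecp line), passwordless logins are needed. A minimal sketch using OpenSSH RSA keys, run once as the user who will own the jobs; the empty passphrase is what makes scp non-interactive:
# generate a key pair with an empty passphrase
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# authorize the key; since /home is NFS-mounted, this one file
# covers logins between all machines in the cluster
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys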
- After you have written the configuration file, you can start the MOM with
/usr/local/sbin/pbs_mom
- Remember that you should start the MOM as root. It will become a daemon and drop root privileges. To make sure that the MOM restarts automatically in the event of a reboot, execute it from /etc/rc.d/rc.local, as sketched below.
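For example, the end of /etc/rc.d/rc.local on each node could look like this:
# start the PBS MOM at boot; it daemonizes and drops root privileges
/usr/local/sbin/pbs_mom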
Setting Up the Scheduler
I've only played with the default C scheduler, so that's what I'll be discussing. I hope to play with the Python scheduler and I'll put up notes when I do.
- To set up the C scheduler to do load balancing, edit the file $PBS_HOME/sched_priv/sched_config and add:
load_balancing: true ALL
- Start the scheduler with
/usr/local/sbin/pbs_sched
Setting Up the Queue Server
- Start the server with
/usr/local/sbin/pbs_server -t create
- Make the server active by
qmgr -c "set server scheduling=true"
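The commands in the remaining steps are all qmgr directives; you can pass each one with qmgr -c as above, or enter them at the interactive qmgr prompt. For example, to inspect the current configuration:
qmgr
Qmgr: print server
Qmgr: quit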
- Create the queue by
create queue qsar queue_type=execution
set server default_queue=qsar
- Configure the server
set server default_node=zeus2
set server acl_hosts=zeus.chem.psu.edu,ra.chem.psu.edu
set server acl_host_enable=true
set server managers=rajarshi@zeus.chem.psu.edu
set server query_other_jobs=true
- Set some default values for the server with
set server resources_default.cput=1:00:00
set server resources_default.mem=4mb
- Configure the queue with
set queue qsar resources_min.cput=1, resources_max.cput=240:00:00
set queue qsar resources_default.cput=120:00:00
set queue qsar enabled=true, started=true
- Add the nodes using the following
create node zeus2 ntype=time-shared,properties="fast,big"
Just enter as many create node statements as you have nodes. Note that this will set all nodes as time-shared (which is OK if you want to load balance over all the nodes).
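Once the server, scheduler, and MOMs are all running, a quick sanity check (not part of the setup proper) is to list the nodes and push a trivial job through the queue:
pbsnodes -a
# submit a do-nothing job from an ordinary user account
echo "sleep 30" | qsub
# the job should appear queued, then running on one of the nodes
qstat -a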