Admin How-To Guide
NOTE: This is an archived version of the admin how-to guide. It is out of date and no longer valid for our current setup, but is kept for reference purposes. Please access the new guide here. This guide remains valid only for the old configuration it describes, which is also archived.
This guide is meant for UMD admins and as a thoroughly documented single use case of a USCMS Tier 3 site. The solutions used here are not recommended to other sites, as each site will have different needs; they are presented for example purposes only. File paths, hardware configuration, cluster management software (we use Rocks), etc., are UMD-specific and will need to be adapted to your site's needs.
These instructions do not have to be followed sequentially; dependencies are listed at the beginning of every set of instructions. We update the instructions every time we install a new software release, but some upgrades are performed rarely. Links are given to the original source of the instructions and should be consulted for any changes. If any links have expired, any errors are found, or some points are unclear, please notify Marguerite Tonjes.
Last edited August 20, 2011
Table of Contents
- Connect to the switch
- Install Rocks 4.3 with SL4.5
- Modify Rocks
- Upgrade RAID firmware & drivers
- Configure the big disk array
- Instrument & monitor
- Install CMSSW
- Install CRAB
- Install OSG
- Install PhEDEx
- Install/configure other software
- Backup critical files
- Recover from HN failure
- Solutions to encountered errors
Connect to the switch
This is a guide intended for basic setup in a Rocks cluster. The Dell 6224 (a rebranded Cisco) is a fully managed switch, meant for use in a larger switching fabric, so it has many powerful features (most of which will not be covered here). The specific configuration details may vary, depending on your local environment. If in doubt, please consult your local network administrator. We first connect to the switch via a direct serial connection to get it to issue DHCP requests. We then get Rocks to listen to the DHCP request and assign an IP address, then do final configuration via a web browser.
In addition to the information we provide, all of the Dell 6224 manuals can be downloaded here.
Direct serial connection
1. The VT100 emulator:
First, connect the switch and headnode (or computer of choice) using the manufacturer-supplied serial cable. A terminal program, such as 'minicom' (available in most Linux distros), can be used to talk to the switch. Note that we were unable to get our headnode to communicate with the switch over the serial console using minicom, so a laptop with a serial port running Linux was used instead (this is a local anomaly and should not be considered the default).
Alternative terminal programs for serial console:
- Windows = HyperTerminal (bundled with Windows up through XP)
- Linux w/ GUI = gtkterm (available in most distros, though not SL; if missing, it is easily found)
2. Settings for serial console:
The most common configuration for asynchronous mode is used: 8-N-1.
8 = 8 data bits
N = no parity bits
1 = 1 stop bit
Most console programs will default to these settings. Additionally, the communication speed should be set to at least 9600 baud.
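For example, with minicom the port settings can be given on the command line (a sketch; it assumes the serial port shows up as /dev/ttyS0 and that your minicom version supports these flags):
minicom -D /dev/ttyS0 -b 9600
(8-N-1 is minicom's default framing, so only the device and baud rate need to be specified.)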
3. Initial setup:
Power on the switch and wait for startup to complete. The Easy Setup Wizard will display on an unconfigured switch. These are the important points:
- Would you like to set up the SNMP management interface now? [Y/N] N
Choose no (unless you have centralized Dell OpenManage or other management).
- To set up a user account: the default account is 'admin', but any name may be used.
- To set up an IP address: Choose 'DHCP', as Rocks will handle address assignments in the cluster.
- Select 'Y' to save the config and restart.
We also experimented with dividing certain types of traffic into separate VLANs. It was deemed unnecessary, given the present size of our cluster, but may be revisited should we add considerably more nodes, or if network traffic control proves problematic.
4. Network connections:
Now get Rocks to recognize the DHCP request issued by the switch by proceeding with step 9 of the Rocks installation instructions. In short, after Rocks has been installed on the HN:
insert-ethers
Select 'Ethernet switches'
Wait at least 30 mins after powering the switch for it to issue the DHCP request
After Rocks assigns an IP to the switch, it can be configured over telnet, SSH, and HTTP, from the headnode. The default name for the switch is network-0-0.
Using a graphical browser:
As outlined in step 9 of the Rocks installation instructions, the Spanning Tree Protocol (STP) must be disabled. STP is often recommended, and we initially enabled it, but then could not get the worker nodes to pull an address from DHCP: links go up and down a few times during the DHCP request, and STP will not activate a port until it has been up for several seconds, so Rocks never saw the end nodes. After some experimentation, setting all switch ports to 'portfast' mode solved the problem, but since that is essentially equivalent to turning STP off completely (which also works fine), we simply disable it. Disabling STP can be done from the command line, but it is simpler to use the web-enabled interface from a browser on the headnode (or over X forwarding from the command line).
From the head node, open a graphical browser and enter the IP address: 10.255.255.254. The user name and password can be given by Marguerite Tonjes. This is a semi-dynamically allocated IP, so in rare cases, the IP may be re-assigned. If this IP does not connect you to the switch, issue the command 'dbreport dhcpd' and look for the network-0-0.local bracket, where the local IP address will be listed. If the network-0-0.local bracket does not exist, a portion of the Rocks install must be redone (see "Install Rocks" below, instruction 9). Under Switching->Spanning Tree->Global Settings, select Disable from the "Spanning Tree Status" drop down menu. Click "Apply Changes" at the bottom.
If, for some reason, the browser method doesn't work, type these commands at the VT100 console provided by minicom or similar software:
console#config
console(config)#spanning-tree disable <port number>
(this will have to be done for all 24 ports!)
console(config)#exit
console#show spanning-tree
Spanning tree Disabled mode rstp
console#quit
Install Rocks
These instructions are for installing Rocks 4.3 using Scientific Linux 4.5, x86_64 architecture, adding the condor roll. Rocks downloads are available here; SL4.5 is available here. The Rocks 4.3 user's guide is available here.
- Download the Rocks Kernel/Boot roll
- Download the Rocks Core roll (includes the required rolls of base, hpc, and web-server, with a few nice extras)
- Download the Rocks Condor roll
- Download Scientific Linux 4.5 (all disks)
- Download a special SL4.5 patch for Rocks, labeled the comps roll.
- Burn all the .iso images to disks (a Windows .iso burner: BurnCDCC).
- Follow the Rocks 4.3 user's guide to install Rocks on the head node. Additions to the guide:
- Our network configuration is detailed here. The initial boot phase is on a timer and will terminate if you do not enter the network information quickly enough.
- We selected the base, ganglia, hpc, java, and web-server rolls from the Core CD. We believe the grid roll may actually be counter-productive, as it attempts to set up the cluster as a certificate authority, which may interfere with the OSG configuration.
- Be sure to add the kernel, comps and condor rolls.
- Insert each SL4.5 disk in turn and select the LTS roll listed.
- As far as we know, the questions about certificate information on the "Cluster Information" screen are not used by any applications that we install. We entered the following, which may or may not be correct:
FQHN: HEPCMS-0.UMD.EDU
Name: UMD HEP CMS T3
Certificate Organization: DOEgrids
Certificate Locality: College Park
Certificate State: Maryland
Certificate Country: US
Contact: mtonjes@nospam.umd.edu (w/o the nospam)
URL: http://hep-t3.physics.umd.edu
Latitude/Longitude: 38.98N -76.92W
- Select manual partitioning and allocate the following partition table (if you wish to preserve existing data, be sure to restore the partition table and don't modify any you wish to keep):
/dev/sda (sizes in MB):
/ 8189 sda1 ext3
swap 8189 sda2 swap
/var 4095 sda3 ext3
(sda4 is the extended partition, which contains sda5)
/scratch 48901 sda5 ext3 (fill to max available size)
/dev/sdb (418168 MB; RAID-5, 408.38 GB, physical disks 0:0:2, 0:0:3, 1:0:4, 1:0:5):
/export 418168 sdb1 ext3 (fill to max available size)
Leave /dev/sdc (the big disk array) alone, as it is a logical volume and Rocks cannot handle logical volumes at the install stage.
- In some cases, Rocks does not properly eject the boot disk before restarting. Be sure to eject the disk after Rocks is done installing, but before the reboot sequence completes and goes to the CD boot.
- On your first login to the HN, you will be prompted to generate rsa keys. You should do so (the default file is fine, and reusing the same password is also fine).
- Read the Rocks 4.3 user's guide on how to change the partition tables on the worker nodes. Note that the code below may not work if you have existing partitions on any of the WNs, since Rocks tries to preserve existing partitions when it can. If the code below does not work (symptoms include pop-ups during install complaining about FDiskWindow, and kernel panics after install from incorrectly synced configs), try forcing the default partitioning scheme and modifying the Rocks WN partitions after install. In this case, you will probably lose any existing data on the WNs and should use the Rocks boot disk rather than PXE boot. Additionally, our setup somehow causes LABEL synchronization issues on subsequent calls to shoot-node; we must add some commands to extend-compute.xml to fix this issue. The necessary commands to set the WN partitions prior to the first WN Rocks installation:
- cd /home/install/site-profiles/4.3/nodes/
- cp skeleton.xml replace-auto-partition.xml
- Edit the <main> section of replace-auto-partition.xml:
<main>
<part> / --size 8192 --ondisk sda </part>
<part> swap --size 8192 --ondisk sda </part>
<part> /var --size 4096 --ondisk sda </part>
<part> /scratch --size 1 --grow --ondisk sda </part>
<part> /tmp --size 1 --grow --ondisk sdb </part>
</main>
- cp skeleton.xml extend-compute.xml
- Edit the <post> section of extend-compute.xml and add:
e2label /dev/sda1 /
cat /etc/fstab | sed -e s_LABEL=/1_LABEL=/_ > /tmp/fstab
cp -f /tmp/fstab /etc/fstab
cat /boot/grub/grub-orig.conf | sed -e s_LABEL=/1_LABEL=/_ > /tmp/grub.conf
cp -f /tmp/grub.conf /boot/grub/grub-orig.conf
- cd /home/install
- rocks-dist dist
- Follow the Rocks 4.3 user's guide to set up the worker nodes.
Additions to the guide:
- If you have not already done so, be sure to configure the switch via the serial cable to get its IP via DHCP and set a login name and password (for internet management).
- We do have a managed switch, so the first task, done by selecting 'Ethernet switches' in the insert-ethers menu, should be performed. The switch takes a long time to issue DHCP requests after powering up; wait at least 30 mins.
- Quit insert-ethers using the F11 key, not F10.
- Once insert-ethers has detected the switch, open an internet browser and log into the switch (typically 10.255.255.254, but dbreport dhcpd lists the switch's local IP inside the network-X-Y bracket). The user name and password can be provided to you by Marguerite Tonjes.
- Under Switching->Spanning Tree->Global Settings, select Disable from the "Spanning Tree Status" drop down menu. Click "Apply Changes" at the bottom.
- Continue with the remainder of the Rocks WN instructions.
- PXE boot can be initiated on all the WNs by striking the F12 key at the time of boot. Alternatively, insert the Rocks Kernel/Boot CD into each WN shortly after pressing the power button.
- Security needs to be configured (quickly). Instructions to do so are located in the file ~root/security.txt, readable only by root. If this file was lost during the Rocks install, contact Marguerite Tonjes for the backup. If you are another site following these instructions, you can contact Marguerite Tonjes for a copy on which to base your own site security configuration (security tends to be site-specific, and we don't claim ours is foolproof). Your identity will need to be confirmed by Marguerite Tonjes.
Modify Rocks
- Modify cluster database
- Prevent automatic WN re-install
- WN re-installation
- Modify WN partitions
- Configure WN external network
- Add new users
- Modify users
- Add rolls
Modify cluster database
The information stored in the Rocks cluster database can be viewed and edited here; the user name and password can be obtained from Marguerite Tonjes. The MySQL DB can be restarted by issuing the command /etc/init.d/mysqld restart from the HN as root (su -).
Prevent automatic WN re-install
Rocks will automatically re-install WNs after they have experienced a hard reboot (such as power failure). This is a useful feature during installation stages, but can be a performance issue once the cluster is in a stable configuration. Simply follow the instructions in this Rocks FAQ to disable this feature. Be sure to re-install the WNs to get the changes to propagate. After removing this feature, shoot-node and cluster-kickstart commands will issue the error:
cannot remove lock: unlink failed: No such file or directory
error reading information on service rocks-grub: No such file or directory
cannot reboot: /sbin/chkconfig failed: Illegal seek
which can be safely ignored.
WN re-installation
Some modifications will require the WNs to be reinstalled. This tends to be true in cases which require you to issue the command 'rocks-dist dist,' typically because you edited an .xml file. In most cases, this involves simply issuing:
ssh-agent $SHELL
ssh-add
shoot-node compute-0-0
(repeat to compute-0-7)
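The repetition can be scripted with a shell loop (a sketch; node names assume our eight nodes compute-0-0 through compute-0-7):
for i in `seq 0 7`
do
shoot-node compute-0-$i
done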
An alternative method of re-shooting the nodes is shown below. It is not clear which approach is superior.
ssh-agent $SHELL
ssh-add
ssh compute-0-0 'sh /home/install/sbin/nukeit.sh'
(repeat through compute-0-7 or use cluster-fork)
ssh compute-0-0 '/boot/kickstart/cluster-kickstart'
(repeat through compute-0-7 or use cluster-fork)
If you have not yet made nukeit.sh, see the instructions to modify WN partitions.
Since Rocks requires a reinstall of the WNs every time a change is made to the kickstart files and our WNs are also interactive nodes, you may want to wait until a scheduled maintenance time to reinstall. The cluster-fork command is useful to get the desired functionality prior to reinstall:
ssh-agent $SHELL
ssh-add
cluster-fork "command"
"command" can be anything you'd like run on each WN individually, which could include a network-mounted shell script.
At some point, our WNs stopped getting the ssh key at the appropriate point in the Rocks Kickstart. Thus, shoot-node will no longer pop the installation monitoring window (and will give permission denied errors). To use cluster-fork easily after WN reinstall, the ssh key must be copied to the WNs. After every WN reinstall, copy the ssh key to all the WNs, manually supplying the password for each one (so that we don't have to manually supply the password in future cluster-fork calls):
ssh-agent $SHELL
ssh-add
cluster-fork "scp HEPCMS-0:/root/.ssh/* /root/.ssh"
OMSA cannot be fully installed as a part of the Rocks Kickstart. Be sure to follow the instructions for OMSA WN installation in step 3 after every WN reinstall.
PhEDEx services must be restarted after shoot-node phedex-node-0-7.
After every major WN reinstall, in addition to testing whatever changes were made, we like to test a few basic capabilities to make sure nothing was broken. A general outline of the tests we perform:
- We first check that the nodes are reporting to Ganglia. Failure to do so indicates a serious problem, which will probably only be resolved by going to the RDC to examine the hardware and perform another WN reinstall (after tracking the problem down and fixing it).
- We have a "Hello World" C++ program which we compile and run. Failure typically indicates some sort of endemic, low-level problem, which will probably only be solved by another WN reinstall (after you've tracked the problem down and fixed it). Note we do need additional C++ compilers as a part of the Rocks kickstart.
- We have a "vanilla" Condor .jdl file which simply executes sleep. We check both that it ran and that it was submitted to nodes other than the submitting node (submit more than 8 jobs - if they all run simultaneously, the jobs were successfully submitted to more than one node). Failure typically indicates a problem with the condor configuration, controlled via a Rocks kickstart file. It may also indicate an error with the network configuration.
- We have a very simple CMSSW config that generates a handful of events using only Configuration/StandardSequences (no custom C++ code). This CMSSW program uses Frontier conditions to test our Squid server. It also sends output to a variety of locations to test disk mounts. CMSSW and Squid are installed only on the HN, so WN reinstall should not damage the installations. Failure of the CMSSW program may be indicative of a problem with network disk mounts or PATHs. Failure of Squid during cmsRun (which typically prints errors, but does not quit) typically indicates a network problem.
- We have a very simple CMSSW program that analyzes DBS events hosted on the cluster (a basic EDAnalyzer). We do not run the CMSSW program locally. Instead we run the CMSSW job via CRAB, which will test a number of important services all at once. Failure could be due to any number of issues including, but not limited to, gLite, CRAB, or OSG interaction with the WNs. We set the following values in our crab.cfg file:
- pset, output_file : the CMSSW config and name(s) of output file(s)
- se_white_list = UMD.EDU, ce_white_list = UMD.EDU : this tests that we can run jobs in addition to submitting them
- datasetpath : any DBS dataset known to be hosted at the cluster; primarily tests that nodes can access files in the 'file catalog'
- scheduler : we use condor_g the first time for rapid-response debugging. Once the condor_g jobs have completed successfully, we sometimes submit a second CRAB job with glite as the scheduler, particularly if we've made any changes to the gLite-UI install.
- return_data = 0, copy_data = 1, storage_element = hepcms-0.umd.edu, storage_path = /srm/v2/server?SFN=/data/users/srm-drop : tests both our ability to stage-out files using the srm-client from Fermilab and to receive files using the BeStMan server.
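Assembled from the values above, a minimal crab.cfg for this test might look like the following sketch (the dataset path and file names are placeholders, and the section placement of some keys varies between CRAB versions):
[CRAB]
jobtype = cmssw
scheduler = condor_g

[CMSSW]
datasetpath = /SomePrimaryDataset/hosted-at-UMD/RECO
pset = testAnalyzer_cfg.py
output_file = test.root

[USER]
return_data = 0
copy_data = 1
storage_element = hepcms-0.umd.edu
storage_path = /srm/v2/server?SFN=/data/users/srm-drop

[GRID]
se_white_list = UMD.EDU
ce_white_list = UMD.EDU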
Modify WN partitions
These instructions are based on this Rocks guide. You will probably lose any existing data on the WNs. Additionally, our setup somehow causes LABEL synchronization issues on subsequent calls to shoot-node; we must add some commands to extend-compute.xml to fix this issue.
As root (su -) on the HN:
cd /home/install/site-profiles/4.3/nodes/
cp skeleton.xml replace-auto-partition.xml
If extend-compute.xml does not yet exist:
cp skeleton.xml extend-compute.xml
Edit the <main> section of replace-auto-partition.xml:
<main>
<part> / --size 8192 --ondisk sda </part>
<part> swap --size 8192 --ondisk sda </part>
<part> /var --size 4096 --ondisk sda </part>
<part> /scratch --size 1 --grow --ondisk sda </part>
<part> /tmp --size 1 --grow --ondisk sdb </part>
</main>
Edit the <post> section of extend-compute.xml and add:
e2label /dev/sda1 /
cat /etc/fstab | sed -e s_LABEL=/1_LABEL=/_ > /tmp/fstab
cp -f /tmp/fstab /etc/fstab
cat /boot/grub/grub-orig.conf | sed -e s_LABEL=/1_LABEL=/_ > /tmp/grub.conf
cp -f /tmp/grub.conf /boot/grub/grub-orig.conf
cd /home/install
rocks-dist dist
rocks remove host partition compute-0-0
(repeat through compute-0-7)
Create the /home/install/sbin/nukeit.sh script:
#!/bin/sh
# Remove the .rocks-release marker from every mounted filesystem so the
# Rocks installer will repartition rather than preserve them.
for i in `df | awk '{print $6}'`
do
if [ -f $i/.rocks-release ]
then
rm -f $i/.rocks-release
fi
done
ssh compute-0-0 'sh /home/install/sbin/nukeit.sh'
(repeat through compute-0-7 or use cluster-fork)
ssh compute-0-0 '/boot/kickstart/cluster-kickstart'
(repeat through compute-0-7 or use cluster-fork)
In some cases the partitions aren't done properly; it is unclear why. Kernel panics when the node attempts to boot are an indicator of this issue (the node will never reconnect, you must physically go to the node to ascertain this condition). In such a case, it is best to force the default partitioning scheme on these nodes, install, then try again with the preferred partitioning scheme. Use the Rocks boot disk as PXE boot does not seem sufficient. To do the default partitioning scheme, simply replace all the <part> lines in replace-auto-partition.xml with:
<part> force-default </part>
You will lose all data on the WNs for which you force the default scheme.
Configure external network for the WNs:
- Follow this Rocks guide for activating and configuring the second WN ethernet interface.
- Use our network configuration to determine the appropriate values to enter. Alternatively, call the script /root/configure-external-network.sh.
- These instructions state how to re-install the WNs.
Add new users
First set the default shell for all new users to tcsh. Edit /etc/default/useradd and change the line SHELL to:
SHELL=/bin/tcsh
This is optional, but commands in this guide assume that root uses a bash shell and all other users use a c-shell.
useradd -c "Full Name" -n username
passwd username (select an initial password)
chage -d 0 username
ssh-agent $SHELL
ssh-add (enter the root password)
rocks sync config
rocks sync users
If the big disk array has already been mounted, give the user their own directory:
mkdir /data/users/username
chown username:users /data/users/username
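The sequence above can be wrapped in a small helper script (a sketch assembled from the commands in this section; run it inside an ssh-agent session, via ssh-agent $SHELL then ssh-add, so the rocks sync commands can reach the WNs):
#!/bin/sh
# add-user.sh (hypothetical helper): create a cluster account.
# Usage: ./add-user.sh username "Full Name"
user=$1
fullname=$2
useradd -c "$fullname" -n "$user"
passwd "$user"        # select an initial password
chage -d 0 "$user"    # force a password change on first login
rocks sync config
rocks sync users
# If the big disk array is already mounted, give the user a /data area:
mkdir "/data/users/$user"
chown "$user:users" "/data/users/$user"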
Some notes:
- User instructions for first-time logging in are given here.
- All the files inside /etc/skel (such as .bashrc & .cshrc) are copied to each new user's /home area. If files in /etc/skel are modified after users have already been made, the existing users need to be informed of the changes. This guide puts environment variables and aliases in /etc/skel so that users can see where important programs are located. Alternatively, environment variables can be placed in /etc/profile (for bash) and /etc/csh.login (for c-shells) and aliases can be placed in /etc/bashrc (for bash) and /etc/csh.cshrc (for c-shells).
Modify users
As root (su -), first utilize standard Linux commands to modify the user (system-config-users provides a GUI if desired). Then update Rocks:
ssh-agent $SHELL
ssh-add
rocks sync config
rocks sync users
Note that to delete a user's home area, you must remove it from /export/home manually. You must also remove the relevant lines in /etc/auto.home:
chmod 744 /etc/auto.home
remove the line with the user's name
chmod 444 /etc/auto.home
make -C /var/411
You should also remove their space in /data:
rm -rf /data/users/username
Add rolls
Download the appropriate .iso file from Rocks. We'll call it rollFile.iso which corresponds to rollName.
As root (su -):
mount -o loop rollFile.iso /mnt/cdrom
cd /home/install
rocks-dist --install copyroll
umount /mnt/cdrom
rocks-dist dist
kroll rollName | bash
init 6
You can check that the roll installed successfully:
dbreport kickstart HEPCMS-0 > /tmp/ks.cfg
Look in /tmp/ks.cfg for something like:
# ./nodes/somefiles.xml (rollName)
While documentation on this is poor, it seems wisest to re-install the WNs to ensure the changes are propagated.
Upgrade RAID firmware & drivers
Updating firmware will require shutdown of various services as well as reboot of the HN. Be sure to schedule all firmware and driver updates in advance. The instructions below provide details for handling the big disk array (/data), but do not require that it be configured properly before upgrade; indeed, it is recommended that the RAID firmware and drivers be upgraded before mounting the big disk.
- Go to www.dell.com
- Under support, enter the HN Dell service tag
- Select drivers & downloads
- Choose RHEL4.5 for the OS
- Select SAS RAID Controller for the category
- Select the drivers and firmware for PERC 6/E Adapter and PERC 6/i Integrated for download.
- Follow the PhEDEx instructions to stop all PhEDEx services.
- Stop OSG services:
/etc/rc3.d/S97bestman stop
cd /share/apps/osg
. setup.sh
vdt-control --off
- Stop what file services we can (OSG services were already turned off in the previous step):
omconfig system webserver action=stop
/etc/init.d/dataeng stop
/etc/rc.d/init.d/nfs stop
umount /data
cluster-fork "umount /data"
- As root (su -) on the HN, install the firmware:
- The firmware download link should go to an executable, which is not the right file to install in Linux. From the executable name and location and by browsing the ftp server, you can extrapolate the location of the READMEs, e.g.:
wget "ftp://ftp.us.dell.com/SAS-RAID/R216021.txt"
wget "ftp://ftp.us.dell.com/SAS-RAID/R216024.txt" - By reading the READMEs, you can extrapolate the location of the correct binaries, e.g.:
wget "ftp://ftp.us.dell.com/SAS-RAID/RAID_FRMW_LX_R216021.BIN"
wget "ftp://ftp.us.dell.com/SAS-RAID/RAID_FRMW_LX_R216024.BIN" - Make the binaries executable:
chmod +x RAID_FRMW_LX_R216021.BIN
chmod +x RAID_FRMW_LX_R216024.BIN
- Follow the instructions in the READMEs.
- Reboot after each firmware upgrade is complete, stopping all relevant services each time the HN comes back up.
- As root (su -) on the HN, install the driver:
- The driver download link should go to the README. From the README name and location and by browsing the ftp server, you can extrapolate the location of the tarball, e.g.:
wget "ftp://ftp.us.dell.com/SAS-RAID/megaraid_sas-v00.00.03.21-4-R193772.tar.gz" - Unpack the tarball:
tar -zxvf megaraid_sas-v00.00.03.21-4-R193772.tar.gz
- Print the current status:
modinfo megaraid_sas
- Install the appropriate rpms:
rpm -ivh dkms-2.0.19-1.noarch.rpm
rpm -ivh megaraid_sas-v00.00.03.21-4.noarch.rpm
- Print the new status (output should have changed):
modinfo megaraid_sas
dkms status
- Reboot the HN:
reboot
- Reboot all the WNs, as they may have difficulties accessing the network mounted files on the HN:
ssh-agent $SHELL
ssh-add
cluster-fork "reboot" - Be sure to restart the PhEDEx services after WN reboot.
Configure the big disk array
It is recommended, but not required, that the RAID firmware and drivers be updated prior to configuring the disk array. We chose to use LVM2 on a single partition for the large data array. This allows for future expansion and simple repartitioning as the need arises. While it is possible to use 'fdisk' to partition the array, it is not advisable: 'fdisk' does not play nicely with LVM, and our total volume size exceeds its 2TB limit. It is also possible to create several smaller partitions and group them together with the 'vgcreate' command, but we considered that solution overly complicated. We also used the XFS disk format, as it is optimized for large disks and works well with BeStMan.
Create, format & mount the disk array on the HN:
As root (su -):
- Install XFS:
rpm -ivh "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/contrib/RPMS/xfs/kernel-module-xfs-2.6.9-55.ELsmp-0.4-1.x86_64.rpm"
rpm -ivh "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/contrib/RPMS/xfs/xfsprogs-2.6.13-1.SL.x86_64.rpm" - Identify the array's hardware designation with fdisk:
fdisk -l
Our disk array is currently /dev/sdc.
- Use GNU Parted to create the partition:
parted /dev/sdc
At the parted command prompt:
mklabel gpt
(This changes the partition label to type GUID Partition Table.)
mkpart primary 0 9293440M
(This creates a primary partition which starts at 0 and ends at 9293440 MB.)
print
(This confirms the creation of our new partition; output should look similar to:)
Disk geometry for /dev/sdc: 0.000-9293440.000 megabytes
Disk label type: gpt
Minor Start End Filesystem Name Flags
1 0.017 9293439.983
quit
- Assign the physical volumes (PV) for a new LVM volume group (VG):
pvcreate /dev/sdc1
- Create a new VG container for the PV. Our VG is named 'data' and contains one PV:
vgcreate data /dev/sdc1
- Create the logical volume (LV) with a desired size. The command takes the form:
lvcreate -L (size in KB,MB,GB,TB,etc) (VG name)
So, in our case:
lvcreate -L 9293440MB data
On this command, we receive the error message: Insufficient free extents (2323359) in volume group data: 2323360 required. Sometimes, it is simpler to enter the value in extents (the smallest logical units LVM uses to manage volume space). We will use a '-l' instead of '-L':
lvcreate -l 2323359 data
- Confirm the LV details:
vgdisplay
The output should look like:
--- Volume group ---
VG Name               data
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  2
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                1
Open LV               0
Max PV                0
Cur PV                1
Act PV                1
VG Size               8.86 TB
PE Size               4.00 MB
Total PE              2323359
Alloc PE / Size       2323359 / 8.86 TB
Free PE / Size        0 / 0
VG UUID               tcg3eq-cG1z-czIn-7j5a-YVM1-MT70-sqKAUY
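As a sanity check on the extent arithmetic: with a 4.00 MB physical extent size, 2323359 extents x 4 MB = 9293436 MB, or roughly 8.86 TB, matching the VG Size above. The 9293440 MB we originally requested corresponds to 2323360 extents, one more than the volume group actually has, which is why the lvcreate -L form failed.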
- After these commands, the location of the volume is /dev/mapper/data-lvol0 (ascertain by examining the contents of /dev/mapper). Create a filesystem:
mkfs.xfs /dev/mapper/data-lvol0
- Create a mount point, edit /etc/fstab, and mount the volume:
mkdir /data
Add the following line to your /etc/fstab:
/dev/mapper/data-lvol0 /data xfs defaults 1 2
And mount:
mount /data
- Confirm the volume and size:
df -h
Output should look like:
/dev/mapper/data-lvol0 8.9T 528K 8.9T 1% /data
- Create subdirectories and set permissions:
mkdir /data/se
mkdir /data/se/store
cd /
ln -s /data/se/store
mkdir /data/users
For all currently existing users:
mkdir /data/users/username
chown username:users /data/users/username
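If several users already exist, a short loop saves some typing (a sketch; it assumes every directory name under /export/home is a username):
for u in `ls /export/home`
do
mkdir -p /data/users/$u
chown $u:users /data/users/$u
done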
Network mount the disk array on all the nodes
These commands network-mount /data on all nodes. As root (su -) on the HN:
- Edit /etc/exports:
chmod +w /etc/exports
Add this line to /etc/exports: /data 10.0.0.0/255.0.0.0(rw,async)
chmod -w /etc/exports
- Restart the HN NFS service:
/etc/init.d/nfs restart
- Edit /home/install/site-profiles/4.3/nodes/extend-compute.xml and place the following commands inside the <post></post> brackets:
<file name="/etc/fstab" mode="append">
HEPCMS-0:/data /data nfs rw 0 0
</file>
mkdir /data
mount /data
cd /
ln -s /data/se/store
cd -
- Create the new distribution:
cd /home/install
rocks-dist dist
- Re-shoot the nodes following these instructions.
Instrument & monitor
We configure the Baseboard Management Controllers (BMCs) on the WNs to respond to manual ipmish calls from the HN. However, for automation, we opted to have every node self-monitor, so every node also installs Dell's Open Manage Server Administrator (OMSA). We configured the BMCs on the WNs and IPMI on the HN prior to installing OMSA, but BMC documentation suggests it should be possible to configure the BMCs via OMSA. So the steps we took to configure the BMCs, including machine reboot and changing the settings manually on every node, may not be required. We installed OMSA 5.5 on all nodes and OpenIPMI on the HN from the disk which came with our system, packaged with OpenManage 5.3. Dell has not released OpenIPMI specifically for OpenManage 5.5, but we have not experienced any version mismatches by using the older OpenIPMI client.
- Install OpenIPMI on the HN
- Configure the WN BMCs
- Install & configure OMSA on the HN
- Install & configure OMSA on the WNs
Install OpenIPMI on the HN:
At the RDC, start the OS GUI on the HN as root (startx). Insert the Dell OpenManage DVD into the HN drive (labelled Systems Management Tools and Documentation). Install Dell's management station software:
- Navigate to /media/cdrecorder/SYSMGMT/ManagementStation/linux/bmc
- Install:
rpm -Uvh osabmcutil9g-RHEL-3.0-11.i386.rpm
- Navigate to /media/cdrecorder/SYSMGMT/ManagementStation/linux/bmc/ipmitool/RHEL4_x86_64
- Install:
rpm -Uvh *rpm
- Start OpenIPMI:
/etc/init.d/ipmi start
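To have OpenIPMI start automatically at boot as well (optional; standard chkconfig usage):
/sbin/chkconfig ipmi on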
Configure the WN BMCs:
To configure the BMCs to respond to ipmish command-line calls from the HN, reboot each WN and configure the BIOS and remote access settings.
At boot time, press F2 to enter the BIOS configuration. Set the following:
- Serial communication: On with console redirection via COM2
- External Serial Communication: leave as COM1
- Failsafe Baud Rate: 57600
- Remote Terminal Type: leave as VT100/VT200
- Redirection after Boot: leave enabled
Enter the remote access setup shortly after BIOS boot by typing Ctrl-E. Set the following:
- IPMI Over Lan: On
- NIC Selection: Failover
- LAN Parameters:
- RMCP + Encryption Key: leave
- IP Address Source: DHCP
- DHCP host name: hepcms-0
- VLAN Enable: leave off
- LAN Alert Enabled: on
- Alert Policy Entry 1: 10.1.1.1
- Host Name String: compute-x-y bmc
- LAN User Configuration: see /root/bmc.txt on the HN (hidden for security)
Before exiting the remote access setup, or as soon as possible afterwards, tell the HN to listen for DHCP requests coming from the BMC. As root (su -) on the HN:
- insert-ethers
- Select Remote Management
- After Rocks recognizes the BMC, exit with the F11 key.
You may need to reboot the WN to get all the new settings to work. To test that it worked, execute from the HN:
ipmish -ip manager-x-y -u ... -p ... sensor temp
Install & configure OMSA on the HN
Install Dell OpenManage Server Administrator:
- Set up the environment:
mkdir /share/apps/OpenManage-5.5
cd /share/apps/OpenManage-5.5
- Download OMSA:
wget "http://ftp.us.dell.com/sysman/OM_5.5.0_ManNode_A00.tar.gz"
tar -xzvf OM_5.5.0_ManNode_A00.tar.gz
- Fool OpenManage into thinking we have a valid OS (which we do):
echo Nahant >> /etc/redhat-release
- Install OMSA:
cd linux/supportscripts
./srvadmin-install.sh
Choose "Install all" - Start OMSA:
srvadmin-services.sh start
- Check it's running and reporting:
omreport system summary
Navigate to https://hepcms-0.umd.edu:1311
- The files created from unpacking the tarball can be deleted if desired; they were needed only for installation.
Create the executables which will be called in the event of OMSA detected warnings and failures. We issue notifications via email, including cell phone emails (which can be looked up on your cell phone provider's website):
- Create /share/apps/OpenManage-5.5/warningMail.sh:
#!/bin/sh
echo "Dell OpenManage has issued a warning on" `hostname` > /tmp/OMwarning.txt
echo "If HN: https://hepcms-0.umd.edu:1311" >> /tmp/OMwarning.txt
echo "If WN: use ipmish from HN or omreport from WN" >> /tmp/OMwarning.txt
mail -s "hepcms warning" email1@domain1.com email2@domain2.net </tmp/OMwarning.txt>/share/apps/OpenManage-5.5/warningMailFailed.txt 2>&1 - Create /share/apps/OpenManage-5.5/failureMail.sh:
#!/bin/sh
echo "Dell OpenManage has issued a failure alert on" `hostname` > /tmp/OMfailure.txt
echo "Immediate action may be required." >> /tmp/OMfailure.txt
echo "If HN: https://hepcms-0.umd.edu:1311" >> /tmp/OMfailure.txt
echo "If WN: use ipmish from HN or omreport from WN" >> /tmp/OMfailure.txt
mail -s "hepcms failure" email1@domain1.com email2@domain2.net </tmp/OMfailure.txt>/share/apps/OpenManage-5.5/failureMailFailed.txt 2>&1 - Make them executable and create the error log files:
chmod +x /share/apps/OpenManage-5.5/warningMail.sh
chmod +x /share/apps/OpenManage-5.5/failureMail.sh
touch /share/apps/OpenManage-5.5/warningMailFailed.txt
touch /share/apps/OpenManage-5.5/failureMailFailed.txt
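It is worth running each script once by hand to confirm that mail actually goes out before relying on OMSA to call them (the email addresses above are placeholders for your own):
sh /share/apps/OpenManage-5.5/warningMail.sh
sh /share/apps/OpenManage-5.5/failureMail.sh
Check your inbox and the *MailFailed.txt logs for errors.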
Configure OMSA to handle warnings and failures:
- Navigate to https://hepcms-0.umd.edu:1311 and log in
- To configure the HN to automatically shutdown in the event of temperature warnings:
- Select the Shutdown tab and the "Thermal Shutdown" subtab
- Select the Warning option and click the "Apply Changes" button
- Under the "Alert Management" tab, we set the following warning actions to execute application /share/apps/OpenManage-5.5/warningMail.sh:
- Temperature Probe Warning
- Memory Pre-failure
- Processor Warning
- Power Supply Warning
- Battery Probe Warning
- Storage Controller Warning
- Physical Disk Warning
- Storage Controller Battery Warning
- Under the "Alert Management" tab, we set the following failure actions to execute application /share/apps/OpenManage-5.5/failureMail.sh:
- Power Supply Critical
- Temperature Probe Detects a Failure
- Fan Probe Detects a Failure
- Voltage Probe Detects a Failure
- Memory Failure
- Hardware Log Failure
- Processor Failure
- Battery Probe Detects a Failure
- Storage System Failure
- Storage Controller Failure
- Physical Disk Failure
- Virtual Disk Failure
- Enclosure Failure
- Storage Controller Battery Failure
Install & configure OMSA on the WNs:
We install and configure OMSA via Rocks Kickstart. As root (su -) on the HN:
- Place the appropriate installation files to be served from the HN:
cd /home/install/contrib/4.3/x86_64/RPMS
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/compat-libstdc++-33-3.2.3-47.3.i386.rpm"
wget "http://ftp.us.dell.com/sysman/OM_5.5.0_ManNode_A00.tar.gz" - Add the text in this xml fragment to the <post></post> section of /home/install/site-profiles/4.3/nodes/extend-compute.xml. If you are performing the OMSA install manually from the command line, you can reference the text in the xml fragment to see the commands executed to perform the install. The xml fragment is effectively a shell script, with & characters replaced by & and > by > .
- The OMSA install cannot be completed entirely in the Rocks Kickstart.
- Create a shell script which will complete the installation, /home/install/sbin/OMSAinstall.sh:
cd /scratch/OpenManage-5.5/linux/supportscripts
./srvadmin-install.sh -b
srvadmin-services.sh start &
- And a shell script which will configure OMSA, /home/install/sbin/OMSAconfigure.sh.
- Make them executable:
chmod +x /home/install/sbin/OMSAinstall.sh
chmod +x /home/install/sbin/OMSAconfigure.sh
- And execute them after every WN reinstall:
ssh-agent $SHELL
ssh-add
cluster-fork "/home/install/sbin/OMSAinstall.sh
cluster-fork "/home/install/sbin/OMSAconfigure.sh
Install CMSSW
Production releases of CMSSW can be installed automatically via OSG tools (email Bockjoo Kim to do so). We choose not to since the size of all the CMSSW production releases plus any additional releases we would need to install manually for our users could exceed the available disk space for CMSSW. These instructions are taken from this guide. Instructions below may be out of date, so please check it for the most recent details.
Prepare the environment:
- As root (su -), install apt-rpm (apt-get):
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/apt-0.5.15cnc6-9.SL.x86_64.rpm"
rpm -i -v apt-0.5.15cnc6-9.SL.x86_64.rpm
- Create a user specifically for CMSSW installs, whom we will call cmssoft, following the instructions for adding new users.
- As root (su -), create and network mount the appropriate directories:
- Create /scratch/cmssw and cede control to cmssoft:
mkdir /scratch/cmssw
chown -R cmssoft:users /scratch/cmssw
- While we're at it, create /scratch/other:
mkdir /scratch/other
- Edit /etc/exports:
chmod 744 /etc/exports
Add the line: /scratch 10.0.0.0/255.0.0.0(rw,async)
chmod 444 /etc/exports
- Create /etc/auto.software file with the content:
cmssw HEPCMS-0.local:/scratch/cmssw
other HEPCMS-0.local:/scratch/other
And change the permissions:
chmod 444 /etc/auto.software
- Edit /etc/auto.master:
chmod 744 /etc/auto.master
Add the line: /software /etc/auto.software --timeout=1200
chmod 444 /etc/auto.master
- Restart NFS:
/etc/rc.d/init.d/nfs restart
/etc/rc.d/init.d/portmap restart
service autofs reload
- Inform 411 of the change:
cd /var/411
make clean
make
- Tell WNs to restart their own NFS service:
ssh-agent $SHELL
ssh-add
cluster-fork '/etc/rc.d/init.d/autofs restart'
Note: Some directory restarts may fail because they are in use. However, /software should get mounted regardless.
- As cmssoft, prepare for CMSSW installation:
chmod 755 /scratch/cmssw
setenv VO_CMS_SW_DIR /software/cmssw
setenv SCRAM_ARCH slc4_ia32_gcc345
setenv LANG "C"
wget -O $VO_CMS_SW_DIR/bootstrap.sh http://cmsrep.cern.ch/cmssw/cms/bootstrap.sh
mkdir /tmp/cmssoft
sh -x $VO_CMS_SW_DIR/bootstrap.sh setup -path $VO_CMS_SW_DIR -arch $SCRAM_ARCH >& $VO_CMS_SW_DIR/bootstrap_$SCRAM_ARCH.log
- Add the following lines to ~cmssoft/.cshrc (get the correct apt version sub-directory):
setenv VO_CMS_SW_DIR /software/cmssw
setenv SCRAM_ARCH slc4_ia32_gcc345
source $VO_CMS_SW_DIR/$SCRAM_ARCH/external/apt/0.5.15lorg3.2-cms3/etc/profile.d/init.csh
apt-get update
source $VO_CMS_SW_DIR/cmsset_default.csh
- As root, edit /etc/skel/.cshrc to include the lines:
# CMSSW
setenv VO_CMS_SW_DIR /software/cmssw
source $VO_CMS_SW_DIR/cmsset_default.csh
- And as root, edit ~root/.bashrc & /etc/skel/.bashrc:
# CMSSW
export VO_CMS_SW_DIR=/software/cmssw
. $VO_CMS_SW_DIR/cmsset_default.sh
- If OSG has been installed (instructions below are repeated under OSG installation):
- Inform BDII that we have the slc4_ia32 environment. Edit /share/apps/osg-app/etc/grid3-locations.txt to include the line:
VO-cms-slc4_ia32_gcc345
- Create a link to CMSSW in the osg-app directory:
cd /share/apps/osg-app
mkdir cmssoft
ln -s /software/cmssw cmssoft/cms
Install Squid
The conditions database is managed by Frontier, which requires a Squid web proxy to be installed. We choose to install it on the HN. These instructions are based on these two (1, 2) Squid for CMS guides; be sure to check them for the most recent details.
As root (su -) on the HN:
- First create the Frontier user and give it ownership of the Squid installation and cache directory:
useradd -c "Frontier Squid" -n dbfrontier -s /bin/bash
passwd dbfrontier
ssh-agent $SHELL
ssh-add
rocks sync config
rocks sync users
mkdir /scratch/squid
chown dbfrontier:users /scratch/squid
- Login as the Frontier user (su - dbfrontier).
- Download and unpack Squid for Frontier (check this link for the latest version):
wget "http://frontier.cern.ch/dist/frontier_squid-4.0rc6.tar.gz"
tar -xvzf frontier_squid-4.0rc6.tar.gz
cd frontier_squid-4.0rc6
- Configure Squid by calling the configuration script:
./configure
providing the following answers:
- Installation directory: /scratch/squid
- Network & netmask: 128.8.164.0/255.255.255.192 10.0.0.0/255.0.0.0
- Cache RAM (MB): 256
- Cache disk (MB): 5000
- Install:
make
make install
- Start the Squid server:
/scratch/squid/frontier-cache/utils/bin/fn-local-squid.sh start
- You can start the Squid server at boot time. As root (su -):
cp /scratch/squid/frontier-cache/utils/init.d/frontier-squid.sh /etc/init.d/.
/sbin/chkconfig --add frontier-squid.sh
- Create a cron job to rotate the logs:
crontab /scratch/squid/frontier-cache/utils/cron/crontab.dat
- We choose to restrict Squid access to CMS Frontier queries, since the IPs allowed by Squid include addresses not in our cluster. Edit /scratch/squid/frontier-cache/squid/etc/squid.conf and add the line:
http_access deny !CMSFRONTIER
which should be placed immediately before the line:
http_access allow NET_LOCAL
Then tell Squid to use the new configuration:
/scratch/squid/frontier-cache/squid/sbin/squid -k reconfigure
- Test Squid with Frontier
- Register your server
To provide new configuration options, call make clean before make to get a fresh install. Be sure to stop the Squid server first (/scratch/squid/frontier-cache/utils/bin/fn-local-squid.sh stop).
We create the site-local-config.xml file as a part of the PhEDEx installation, but it can be created right away. It should be stored in /software/cmssw/SITECONF/T3_US_UMD/JobConfig and /software/cmssw/SITECONF/local/JobConfig. Links provided as a part of the PhEDEx instructions:
- All CMS sites SITECONF directory
- The T3_US_UMD site-local-config.xml
- Twiki about site-local-config.xml
Install a CMSSW release:
- Login as cmssoft to the HN.
- The available CMSSW releases can be listed by:
apt-cache search cmssw | grep CMSSW
- Follow these instructions; some notes:
- Due to /software being an auto-network-mounted directory, RPM queries to see available disk space will fail. To turn off RPM disk space queries via apt-get:
apt-get -o RPM::Install-Options::="--ignoresize" install cms+cmssw+CMSSW_X_Y_Z
Since RPM won't check if the space is available first, we must do so manually:
df -h
Check the space available on /scratch (/dev/sda5). Keep in mind that Squid is also on /scratch and needs 5GB of space reserved for its use at all times. To determine the space Squid is currently using:
du -hs /scratch/squid
- This process takes about an hour, depending on the quantity of data you'll need to download.
- For CMSSW_2_0_0_pre1 and older:
apt-get --reinstall install cms+cms-common+1.0-cms2
- Due to /software being an auto-network-mounted directory, RPM queries to see available disk space will fail. To turn off RPM disk space queries via apt-get:
- If OSG has been installed:
- Inform BDII that this release of CMSSW is available. As root (su -), edit /share/apps/osg-app/etc/grid3-locations.txt to include the line:
VO-cms-CMSSW_X_Y_Z CMSSW_X_Y_Z /software/cmssw
- Edit the grid policy and home page and add the version installed.
Uninstall a CMSSW release
- Login as cmssoft to the HN.
- List the currently installed CMSSW versions:
scramv1 list | grep CMSSW
- If OSG has been installed:
- Inform BDII that this release of CMSSW is no longer available. As root (su -), edit /share/apps/osg-app/etc/grid3-locations.txt and remove the line:
VO-cms-CMSSW_X_Y_Z CMSSW_X_Y_Z /software/cmssw
- Edit the grid policy page to remove the version and the home page to announce its removal.
- Remove a CMSSW release:
apt-get remove cms+cmssw+CMSSW_X_Y_Z
Install CRAB
We install CRAB with gLite-UI on all the WNs, but not on the HN. This is because the gLite-UI services and software conflict with the HN configuration. CRAB can be installed without gLite-UI and may even be able to submit jobs to EDG (European) sites via CrabServer. However, this is not confirmed and may be unstable/impossible. Although we install CRAB on the WNs, the commands below are still performed from the HN because Rocks manages the WNs from the HN. The XML code and instructions are adapted from four (1, 2, 3, 4) gLite guides, this YAIM guide, and this CRAB guide. Be sure to check them for the latest details. Additionally, this guide on the Rocks Kickstart XML syntax is helpful. It's also useful to keep in mind that XML has a few reserved characters.
On the HN as root (su -):
- Navigate to the gLite-UI tarball repository and select your desired version of gLite-UI. These instructions are for 3.1.28-0, though they can be adapted for other releases.
- Download the lcg-CA yum repo and tarballs where they can be served from the HN:
cd /home/install/contrib/4.3/x86_64/RPMS
wget "http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.1/lcg-CA.repo"
wget "http://grid-deployment.web.cern.ch/grid-deployment/download/relocatable/glite-UI/SL4_i686/glite-UI-3.1.28-0.tar.gz"
wget "http://grid-deployment.web.cern.ch/grid-deployment/download/relocatable/glite-UI/SL4_i686/glite-UI-3.1.28-0-external.tar.gz" - Navigate to the CRAB download page and select your desired version of CRAB. These instructions are for 2_5_0, though they can be adapted for other releases.
- Download the tarball where it can be served from the HN:
cd /home/install/contrib/4.3/x86_64/RPMS
wget --no-check-certificate "http://cmsdoc.cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_5_0.tgz"
- To install on the WNs via Rocks, add the text in this xml fragment to the <post></post> section of /home/install/site-profiles/4.3/nodes/extend-compute.xml. If you are performing the gLite-UI & CRAB installs manually from the command line, you can reference the text in the xml fragment to see the commands executed to perform the install. The xml fragment is effectively a shell script, with > characters replaced by &gt;, & replaced by &amp;, and <file> syntax used to edit grid-env.sh and create site-info.def.
- Create the new distribution:
cd /home/install
rocks-dist dist
- Check that the new XML code can be read successfully:
rocks list appliance xml compute
If it prints and does not throw an exception, the code is up to XML spec, although the install itself could still fail for other reasons.
- Reinstall the WNs.
User instructions for getting the gLite-UI & CRAB environment are here.
Install OSG
These instructions assume you have already installed Pacman and have a personal grid certificate.
Request host certificates:
Follow these instructions. Some notes:
- Our full hostname is hepcms-0.umd.edu
- Enter osg as the registration authority
- Enter cms as our virtual organization (VO)
- Be sure to run the second request for the http certificate
- Once you've received your certificates, copy them to the appropriate directories:
cp ~root/hepcms-0cert.pem /etc/grid-security/hostcert.pem
cp ~root/hepcms-0key.pem /etc/grid-security/hostkey.pem
cp ~root/hepcms-0cert.pem /etc/grid-security/containercert.pem
cp ~root/hepcms-0key.pem /etc/grid-security/containerkey.pem
cp ~root/http-hepcms-0cert.pem /etc/grid-security/http/httpcert.pem
cp ~root/http-hepcms-0key.pem /etc/grid-security/http/httpkey.pem
chown daemon:daemon /etc/grid-security/containercert.pem
chown daemon:daemon /etc/grid-security/containerkey.pem
chown daemon:daemon /etc/grid-security/http/httpcert.pem
chown daemon:daemon /etc/grid-security/http/httpkey.pem
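Globus is picky about certificate and key permissions; following the usual convention (certs world-readable, keys readable only by their owner), set:
chmod 444 /etc/grid-security/hostcert.pem /etc/grid-security/containercert.pem /etc/grid-security/http/httpcert.pem
chmod 400 /etc/grid-security/hostkey.pem /etc/grid-security/containerkey.pem /etc/grid-security/http/httpkey.pem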
Install the OSG WN client, CE, and BeStMan
The OSG installation and configuration is based on this OSG guide. OSG is built on top of services provided by VDT, so VDT documentation may be helpful to you. These instructions are for OSG 1.0. We originally installed OSG 0.8 and have archived the instructions. Our BeStMan instructions are for a very old BeStMan release (before BeStMan-Gateway was developed independent of BeStMan). To use a much more modern version of BeStMan-Gateway, follow the instructions in the OSG Twiki. As root on the HN (su -):
- Prepare for install (the version numbers provided here are as of Oct. 27, 2008, except for BeStMan):
- Create these special users:
useradd -c "Monitoring information service" -n mis -s /bin/true
useradd -c "CMS grid jobs" -n uscms01 -s /bin/true
useradd -c "Monitoring from ops" -n ops -s /bin/true
useradd -c "RSV monitoring user" -n rsvuser
passwd rsvuser
ssh-agent $SHELL
ssh-add
rocks sync config
rocks sync users
Setting their shell to true is a security measure, as these user accounts should never actually ssh in.
- Create these needed directories:
- Software directories:
cd /share/apps
mkdir wnclient-1.0.1
ln -s wnclient-1.0.1 wnclient
mkdir bestman-2.2.0.11
ln -s bestman-2.2.0.11 bestman
mkdir osg-1.0.0
ln -s osg-1.0.0 osg
mkdir -p osg-app-1.0.0/etc
chmod 775 osg-app-1.0.0 osg-app-1.0.0/etc
ln -s osg-app-1.0.0 osg-app
- Storage directories:
cd /data/se
mkdir -p osg/vos
chown root:users osg osg/vos
chmod 775 osg osg/vos
mkdir log
chown root:users log
chmod 775 log
mkdir -p replica/uscms01
chown root:users replica
chown uscms01:users replica/uscms01
chmod 775 replica replica/uscms01
mkdir -p custodial/uscms01
chown root:users custodial
chown uscms01:users custodial/uscms01
chmod 775 custodial
cd /data/users
mkdir srm-drop
chown uscms01:users srm-drop
chmod 775 srm-drop
- Create a cron job to clean files from /data/users/srm-drop and /data/se/osg on a regular basis. If you haven't done so already, configure cron to garbage collect /tmp on all of the nodes. Edit /var/spool/cron/root and add the line:
49 02 * * * find /data/users/srm-drop -mtime +7 -type f -exec rm -f {} \;
This will remove week-old files from /data/users/srm-drop every day at 2:49am.
06 05 * * 0 find /data/se/osg -mtime +7 -type f -exec rm -f {} \;
This will remove week-old files in /data/se/osg every Sunday at 5:06am.
- Set a variable needed during VDT install:
export VDTSETUP_CONDOR_LOCATION=/opt/condor
- WN client:
Note: Because we install the WN client on the same network mount as the CE, we have the CE handle certificates. This is option 2 in the Twiki. Therefore, we must wait on turning on the WN client services until after the CE install.
cd /share/apps/wnclient
pacman -get OSG:wn-client
Answers to questions:
Add to trusted.caches? yall
Agree to license? y
Cron rotation of VDT logs? y
Update CRLs automatically? n
Update CA certificates automatically? n
Where to store CA files? l (lowercase L, local)
- Install the CE:
- Download:
cd /share/apps/osg
pacman -get OSG:ce
Answer yall when asked if you want to add sites to trusted.caches
- Have users get the OSG environment:
Add to /etc/skel/.bashrc:
. /share/apps/osg/setup.sh
Add to /etc/skel/.cshrc:
source /share/apps/osg/setup.csh
- Source the OSG CE environment:
. /share/apps/osg/setup.sh
Note: Don't add this to root's .bashrc. Depending on the task at hand, root will either source /share/apps/osg/setup.sh or /share/apps/wnclient/setup.sh, setting $VDT_LOCATION to /share/apps/osg for CE upgrades and installs or /share/apps/wnclient for the WN client.
- Get the OSG-CondorG package:
pacman -get OSG:Globus-Condor-Setup
- Get and turn on ManagedFork for Condor:
pacman -get OSG:ManagedFork
$VDT_LOCATION/vdt/setup/configure_globus_gatekeeper --managed-fork y --server y
- srm is hard-coded in many places (some of which we do not control) to be on port 8443. Change the port that CEmon listens on by replacing 8443 in $VDT_LOCATION/tomcat/v55/conf/server.xml with 7443. We must do this since our HN is both our CE and SE. The line:
enableLookups="false" redirectPort="8443" debug="0"
should become:
enableLookups="false" redirectPort="7443" debug="0" - Similarly, change the ports that Apache listens on by replacing 8080 in $VDT_LOCATION/apache/conf/httpd.conf with 6060. The line:
Listen 8080
should become:
Listen 6060
And edit the file $VDT_LOCATION/apache/conf/extra/httpd-ssl.conf to change port 8443 to port 7443. The lines:
Listen 8443
RewriteRule (.*) https://%{SERVER_NAME}:8443$1
<VirtualHost _default_:8443>
ServerName www.example.com:8443
should become:
Listen 7443
RewriteRule (.*) https://%{SERVER_NAME}:7443$1
<VirtualHost _default_:7443>
ServerName www.example.com:7443
- Configure for Grid3 authorization mode (local grid-mapfile):
- Edit the sudoers file for WS-GRAM services:
- Copy the Twiki text under "WS-GRAM services sudoers file" in the code box (with text such as daemon ALL=(GLOBUSUSERS)...).
- Edit the sudo file:
visudo
a (enter vi append mode)
- Paste the text by right clicking.
- Write the file and quit:
Esc
:wq!
- Edit $VDT_LOCATION/edg/etc/edg-mkgridmap.conf and remove all lines but those for the mis, uscms01, and ops users.
- Have the service run automatically:
vdt-register-service --name edg-mkgridmap --enable
(vdt-control --list will state which services want to run, although it does not show which services are currently running)
- Call it once to create some files needed for OSG configuration:
cd $VDT_LOCATION/edg/sbin
./edg-mkgridmap --output=test.out
This will print
WARNING: Could not locate
/share/apps/osg-1.0.0/monitoring/gip-attributes.conf
which can be safely ignored as we have not yet configured OSG. As long as the file $VDT_LOCATION/monitoring/osg-user-vo-map.txt has been made, this command has accomplished its intended purpose for the time being.
- Configure OSG:
- Edit $VDT_LOCATION/monitoring/config.ini (our config.ini). Backing up this file before you edit it may be advisable. For a SE, you will need additional parameters specified in full-config.ini (and our config.ini).
- Execute the configuration script:
$VDT_LOCATION/monitoring/configure-osg.py -c
- Some helpful material on how to use the various services is in $VDT_LOCATION/post-install/README and is worth reading. In particular, it directs us to download certs:
- Edit $VDT_LOCATION/vdt/etc/vdt-update-certs.conf and remove the # (comment) in front of:
cacerts_url = http://software.grid.iu.edu/pacman/cadist/ca-certs-version
- Execute:
. $VDT_LOCATION/vdt-questions.sh; $VDT_LOCATION/vdt/sbin/vdt-setup-ca-certificates
vdt-control --on vdt-update-certs
- This will create a cron job that will run every hour. For the impatient:
crontab -l
find the call to vdt-update-certs-wrapper and call it yourself.
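For example, a minimal sketch of doing so:
crontab -l | grep vdt-update-certs-wrapper
Then copy the command portion (everything after the five time fields) of the matching entry and run it by hand as root.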
- Place the grid user map where it will be needed later (this is typically done automatically at regularly scheduled intervals, but it will be needed by several services right away):
$VDT_LOCATION/edg/sbin/edg-mkgridmap --output /etc/grid-security/grid-mapfile
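Each line of the resulting grid-mapfile maps a certificate DN to a local account, e.g. (the DN below is hypothetical):
"/DC=org/DC=doegrids/OU=People/CN=Jane Doe 123456" uscms01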
- Download:
- BeStMan:
*Note: Our BeStMan instructions are for a very old BeStMan release (before BeStMan-Gateway was developed independently of BeStMan). To use a much more modern version of BeStMan-Gateway, follow the instructions in the OSG Twiki.
- Download & install BeStMan:
cd /share/apps
wget "http://datagrid.lbl.gov/bestman/pkg/bestman-2.2.0.11.tar.gz"
tar -xzvf bestman-2.2.0.11.tar.gz
To get the latest release of BeStMan, simply use bestman-latest.tar.gz. Note that the configuration options have evolved significantly in recent releases, which also require JDK 1.6.
- Configure BeStMan:
cd bestman/setup
./configure \
--with-replica-storage-path=/data/se/replica \
--with-replica-storage-size=1572864 \
--with-custodial-storage-path=/data/se/custodial \
--with-custodial-storage-size=5767168 \
--with-eventlog-path=/data/se/log \
--with-cachelog-path=/data/se/log \
--with-http-port=7070 \
--with-https-port=8443 \
--with-globus-tcp-port-range=20000,25000 \
--with-globus-tcp-source-range=20000,25000 \
--enable-srmcache-keyword yes \
--with-srm-name=server \
--with-globus-location=/share/apps/osg/globus \
--with-java-home=/share/apps/osg/jdk1.5 \
--enable-sudofsmng yes
Both the enable-srmcache-keyword and with-srm-name options are already set to these values by default; we simply point them out as worthy of further reading in the BeStMan admin manual.
- Make BeStMan start whenever the HN starts:
ln -s /share/apps/bestman/sbin/SXXbestman /etc/rc3.d/S97bestman
- Download & install BeStMan:
- BeStMan SEs need some values set in the $SRM_CONFIG file used by the FNAL srm-client. We edit the $SRM_CONFIG file in the OSG CE installation and point the WN client's $SRM_CONFIG to the OSG CE's. If your OSG CE software isn't network mounted, both files will need to be edited:
- Edit the file:
. /share/apps/osg/setup.sh
echo $SRM_CONFIG
In $SRM_CONFIG (/share/apps/osg/srm-client-fermi/etc/config-2.xml):
<pushmode>: change false to true
<access_latency>: change null to ONLINE
pushmode is needed for third party transfers (srmcp srm://... srm://...) between BeStMan & dCache SEs. access_latency is needed for getting files hosted on a BeStMan SE.
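After the edit, the relevant elements of config-2.xml read:
<pushmode>true</pushmode>
<access_latency>ONLINE</access_latency>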
- Tell the WN client to grab the srm configuration from the OSG CE:
rm /share/apps/wnclient/srm-client-fermi/etc/config-2.xml
ln -s /share/apps/osg/srm-client-fermi/etc/config-2.xml /share/apps/wnclient/srm-client-fermi/etc/config-2.xml
- Tell the WN client to grab certificates from the OSG CE:
cd /share/apps/wnclient/globus
unlink TRUSTED_CA
ln -s /share/apps/osg/globus/share/certificates TRUSTED_CA
- Set up rsvuser's account (we use a personal cert, but RSV allows a service cert):
- Place your personal usercert.pem and userkey.pem files into ~rsvuser/.globus and give rsvuser ownership:
chown rsvuser:users .globus/*
- As rsvuser (su - rsvuser), edit ~/.cshrc:
# RSV
source /share/apps/osg/setup.csh
source $VDT_LOCATION/vdt/etc/condor-cron-env.csh
- Create the proxy as rsvuser:
source ~/.cshrc
voms-proxy-init -voms cms -out /home/rsvuser/x509up_rsv -hours 1000
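You can check the remaining lifetime of the proxy at any time with:
voms-proxy-info -file /home/rsvuser/x509up_rsv -all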
Make note of the expiration date and be sure to log back on as rsvuser and renew the proxy whenever it is about to expire. Note that 1000 hours may be too long for your existing certificate; reduce the number until you no longer receive an error about the proxy expiring after the lifetime of the certificate.
- If CMSSW is installed (instructions below are repeated in the CMSSW installation):
- Add a link to the CMSSW installation in the osg-app directory:
cd /share/apps/osg-app
mkdir cmssoft
ln -s /software/cmssw cmssoft/cms
- Inform BDII which versions of CMSSW are installed and that we have the slc4_ia32 environment. Edit /share/apps/osg-app/etc/grid3-locations.txt to include the lines:
VO-cms-slc4_ia32_gcc345
VO-cms-CMSSW_X_Y_Z CMSSW_X_Y_Z /software/cmssw
(modify X_Y_Z and add a new line for each release of CMSSW installed)
- Start the OSG CE:
cd /share/apps/osg
. setup.sh
vdt-control --off
vdt-control --on
- Start the OSG WN client:
In a fresh shell to avoid PATH problems due to sourcing a different setup.sh:
cd /share/apps/wnclient
. setup.sh
vdt-control --off
vdt-control --on
- Start BeStMan:
In a fresh shell to avoid PATH problems due to sourcing a different setup.sh:
/etc/rc3.d/S97bestman start
- The cemon log is kept at $VDT_LOCATION/glite/var/log/glite-ce-monitor.log.
- The globus log is kept in $GLOBUS_LOCATION/var/container.log.
- Results of the RSV probes will be visible at https://hepcms-0.umd.edu:7443/rsv in 15-30 mins. Further information can be found in $VDT_LOCATION/osg-rsv/logs/probes.
Register with the Grid Operations Center (GOC):
This should be done only once per site (we have already done so).
- Navigate to the OSG Information Management web portal.
- Register as a new user.
- Under the Registrations navigation bar, select Resources->Add New Resource
- Fill in the following values for our CE:
Facility: My Facility Is Not Listed (now that we have registered, we select University of Maryland for any new resources we might add later)
Site: My Site Is Not Listed (again, now that we have registered, we select UMD-CMS)
Resource Name: umd-cms
Resource Services: Compute Element, Bestman-Xrootd Storage Element (note: this text may change soon; select whatever is designated as BeStMan, or select SRM V2 if BeStMan is removed)
Fully Qualified Domain Name: hepcms-0.umd.edu
Resource URL: http://hep-t3.physics.umd.edu
OSG Grid: OSG Production Resource
Interoperability: Select WLCG Interoperability BDII (Published to WLCG); do not select WLCG Interoperability Monitoring (SAM)
GOC Logging: Do not select Publish Syslogng
Resource Description: Tier-3 computing center. Priority given to local users, but opportunistic use by CMS VO allowed.
- Add the primary and secondary system and security admins.
- You will receive emails with further instructions.
Once VORS registration has completed, monitoring info will be here. Once BDII registration has completed, monitoring info will be here.
Install PhEDEx
We install PhEDEx on a WN for two reasons: (1) we wish to perform some load balancing to prevent the HN from servicing all site requests and (2) we may want to use PhEDEx with gLite at some later date, and gLite is known not to cooperate with the other services provided by the Rocks HN. PhEDEx does support transfers without gLite; however, it then relies on srmcp, which is less reliable than PhEDEx's default FTS, which uses gLite. Additionally, PhEDEx without gLite may not be supported in the future. Keep in mind that if you install PhEDEx on top of gLite-UI, it may conflict with the existing CRAB gLite-UI install. Regardless, our current installation of PhEDEx does not use gLite. We install PhEDEx on compute-0-7 (later named phedex-node-0-7). These instructions are adapted from this PhEDEx and this Rocks guide.
These instructions assume you have already done all the major tasks except for the CRAB install. Namely, you need to have configured the big disk, configured the external network for the WNs, and installed BeStMan (which requires globus tools provided by OSG). You will also need to have Kerberos configured & CVS installed and configured (which requires CMSSW) or have the ability to log into a site where you can copy your PhEDEx configuration and which allows you to get Kerberos tickets from CERN and commit to the CMS CVS repository.
- Site registration
- Preparation & manual installation on the WN
- Get proxy & start services
- Clean logs
- Commission links
- Rocks-controlled installation (kickstart)
Site registration
Site registration is done only once for a site. This has already been done for the UMD HEP T3, so do not register again. These instructions are based on this PhEDEx guide; be sure to consult it for the most recent details. You can register your site in SiteDB prior to OSG GOC registration; however, once OSG GOC registration is complete, you should change your SAM name to your OSG GOC name by filing a new Savannah ticket.
- Create a Savannah ticket with your user public key (usercert.pem) and with the information:
- Site name: UMD
- CMS name: T3_US_UMD
- SAM name: umd-cms (our OSG GOC registration name)
- City/Country: College Park, MD, USA
- Site tier: Tier 3
- SE host: hepcms-0.umd.edu
- SE kind: disk
- SE technology: BeStMan
- CE host: hepcms-0.umd.edu
- Associate T1: FNAL
- Grid type: OSG
- Data manager: Marguerite Tonjes
- PhEDEx contact: Marguerite Tonjes
- Site admin: Marguerite Tonjes
- Site executive: Nick Hadley
- Email the persons listed here and ask them to add our site to the PhEDEx database, including a link to the Savannah ticket (CERN phonebook).
- Once someone has responded to say UMD has been put into SiteDB, go to https://cmsweb.cern.ch/sitedb/sitedb/sitelist/
- Log in with your CERN hypernews user name and password
- Under Tier 3 centres, click on the T3_US_UMD link
- Click on "Edit site information" and specify OSG as our Grid Middleware, our site home page as http://hep-t3.physics.umd.edu and our site logo URL as http://hep-t3.physics.umd.edu/images/umd-logo.gif
- We can also add/edit user information by clicking on "Edit site contacts":
- Click on "edit" to edit an existing user's info
- Click on "Add a person with a hypernews account to site" to add someone new
- Then click on the first letter of the user's last name. Note that many users are listed by their middle name instead of their last.
- Find the user in the list, and click "edit"
- A new page will appear. Click on appropriate values ("Site Admin", "Data Manager",etc.) in the last row of the new page (for the Tier 3), and click "Edit these details" to save.
- Under Site Configuration, select "Edit site configuration":
- CE FQDN: hepcms-0.umd.edu
- SE FQDN: hepcms-0.umd.edu
- PhEDEx node: T3_US_UMD
- GOCDB ID: leave blank
- Install development CMSSW releases?: Do not check
- Site installs software manually?: Check
Preparation & manual installation on the WN
We first install PhEDEx manually on compute-0-7 (hepcms-8) as the user phedex. We then create a tarball of the installation directory and place it on the HN to serve via a new Rocks XML kickstart for a PhEDEx appliance, which we create. These instructions are for PhEDEx 3.1.2.
Prepare for the PhEDEx install. On the HN as root (su -):
- Create the PhEDEx user:
useradd -c "PhEDEx" -n phedex -s /bin/bash
passwd phedex
ssh-agent $SHELL
ssh-add
rocks sync config
rocks sync users
- Change ownership of the directory on /data which PhEDEx will use:
chown phedex:users /data/se/store
chmod 775 /data/se/store
Install PhEDEx manually on compute-0-7 (hepcms-8):
- Set up the environment:
ssh phedex@hepcms-8.umd.edu
cd /scratch
mkdir -p phedex/3.1.2
ln -s 3.1.2 phedex/current
cd phedex/3.1.2
- We'll need a couple of packages to have calls to srmcp make the directories automatically if they don't exist. As root (su -):
- Get the rpm to access the Nebraska yum repository:
rpm -i "http://t2.unl.edu/store/rpms/SL4/x86_64/Nebraska-repo-0.1-1.noarch.rpm" - Get the srmcpT3 package:
yum install srmcpT3
- Get some needed utilities:
yum install setuptools
- Remove the dcache-srmclient installed by srmcpT3; we want to use the dcache-srmclient already provided and updated by OSG on the CE, which is network mounted:
rpm -e --nodeps dcache-srmclient
- You can test that the install worked by calling /usr/bin/srmcp_wrapper just as you would call srmcp.
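For example, a minimal smoke test (illustrative only; assumes a valid grid proxy, a small local test file, and the BeStMan endpoint installed earlier):
/usr/bin/srmcp_wrapper -2 file:////tmp/srmtest.txt "srm://hepcms-0.umd.edu:8443/srm/v2/server?SFN=/data/se/store/srmtest.txt"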
- Get the rpm to access the Nebraska yum repository:
- As root (su -), install apt-rpm (apt-get):
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/apt-0.5.15cnc6-9.SL.x86_64.rpm"
rpm -i -v apt-0.5.15cnc6-9.SL.x86_64.rpm
- Install PhEDEx following these instructions. Some notes:
- Our arch is slc4_amd64_gcc345
- We set version=3_1_2
- We use the srm client already installed and network mounted on the OSG CE (we tell PhEDEx to grab the environment in the ConfigPart.Common file).
- We use the JDK already installed and network mounted on the OSG CE. No special modifications to PhEDEx to use it were required.
- Configure PhEDEx following these (1, 2) instructions. Examples of site configuration can be found here. Our local site configuration can be found here.
Some notes:
- Our site name is T3_US_UMD, so our configuration directories are
$PHEDEX_BASE/SITECONF/T3_US_UMD/PhEDEx
and
$PHEDEX_BASE/SITECONF/T3_US_UMD/JobConfig
- We had to modify more than just storage.xml, so be sure to check all the files in the directories for differences from the default template.
- The JobConfig directory is not actually needed by PhEDEx; it's needed by CMSSW jobs submitted via CRAB. We choose to put it in our PhEDEx installation area as well (it's harmless).
- CRAB CMSSW jobs also need the files in your SITECONF directory. Copy the entire SITECONF directory to the $CMS_PATH directory:
- CRAB CMSSW jobs also need the files in your SITECONF directory. Copy the entire SITECONF directory to the $CMS_PATH directory:
su -
cp -r /scratch/phedex/current/SITECONF /software/cmssw/.
cp -r /software/cmssw/SITECONF/T3_US_UMD /software/cmssw/SITECONF/local
logout
Some sites use different storage.xml files in their $PHEDEX_BASE and $CMS_PATH directories to handle CRAB stage-out of files without a locally installed storage element. Since we have a storage element, ours are the same. A good example of this case can be seen in UCR's storage.xml.
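For orientation, a storage.xml trivial file catalog looks roughly like the following sketch (illustrative only, not our exact file, which is in the site configuration linked above):
<storage-mapping>
  <lfn-to-pfn protocol="direct" path-match="/+(.*)" result="/data/se/$1"/>
  <lfn-to-pfn protocol="srmv2" chain="direct" path-match="(.*)" result="srm://hepcms-0.umd.edu:8443/srm/v2/server?SFN=$1"/>
  <pfn-to-lfn protocol="direct" path-match="/data/se/(.*)" result="/$1"/>
  <pfn-to-lfn protocol="srmv2" chain="direct" path-match=".*\?SFN=(.*)" result="$1"/>
</storage-mapping>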
- You can test your storage.xml file by:
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Prod environ`
Test srmv2 mapping from LFN to PFN:
/scratch/phedex/current/sw/slc4_amd64_gcc345/cms/PHEDEX/PHEDEX_3_1_2/Utilities/TestCatalogue -c storage.xml -p srmv2 -L /store/testfile
Test srmv2 mapping from PFN to LFN:
/scratch/phedex/current/sw/slc4_amd64_gcc345/cms/PHEDEX/PHEDEX_3_1_2/Utilities/TestCatalogue -c storage.xml -p srmv2 -P srm://hepcms-0.umd.edu:8443/srm/v2/server?SFN=/data/se/store/testfile
Other transfer types can be tested by changing the protocol tag srmv2 to direct, srm, or gsiftp and changing the PFN or LFN argument passed to match.
- Submit a Savannah ticket for a CVS space under /COMP/SITECONF named T3_US_UMD. Once you receive the space, upload your site configuration to CVS:
kinit_cern mkirn@CERN.CH
cvs co COMP/SITECONF/T3_US_UMD
cp -r /scratch/phedex/current/SITECONF/T3_US_UMD/* COMP/SITECONF/T3_US_UMD/.
cd COMP/SITECONF/T3_US_UMD
cvs add PhEDEx
cvs add PhEDEx/*
cvs commit -R -m "T3_US_UMD PhEDEx site configuration" PhEDEx
- Once your initial registration request is satisfied, you will receive three emails titled "PhEDEx authentication role for Prod (Debug, Dev)/UMD." Copy and paste the commands in the email to the command line. Copy the text output for each into the file /scratch/phedex/current/gridcert/DBParam. Each text output should look something like (exact values removed for security):
Section Prod/UMD
Interface Oracle
Database db_not_shown_here
AuthDBUsername user_not_shown_here
AuthDBPassword LettersAndNumbersNotShownHere
AuthRole role_not_shown_here
AuthRolePassword LettersAndNumbersNotShownHere
ConnectionLife 86400
LogConnection on
LogSQL off
Get proxy & start services
After reboot of the phedex node, the grid certificate and proxy should still be valid. After reinstall, you will need to get the certificate and proxy again. On the phedex node (ssh hepcms-8.umd.edu):
- Copy your personal usercert.pem and userkey.pem grid certificate files into ~phedex/.globus and give the phedex user ownership:
chown phedex:users ~phedex/.globus/*
- As phedex, create your grid proxy:
voms-proxy-init -voms cms -hours 1000 -out /scratch/phedex/current/gridcert/proxy.cert
Be sure to make note of when the proxy will expire and log on to renew it before then. Note that 1000 hours may be too long for your existing certificate; reduce the number until you no longer receive an error about the proxy expiring after the lifetime of the certificate. Some sites will not accept proxies older than a week, so if you have many links, you will probably need to renew your proxy every week.
- Now start the services. To be extra safe, each service should be started in a new shell, though in most cases, executing the following in sequence should be OK:
- Start the Dev service instance:
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Dev environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Dev start
This service can be stopped by changing the command start to stop.
- Start the Debug service instance:
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Debug environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Debug start
This service can be stopped by changing the command start to stop.
- Start the Prod service instance:
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Prod environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Prod start
This service can be stopped by changing the command start to stop.
Clean Logs:
PhEDEx does not clean up its own logs. The first time you start the PhEDEx services, they will create the log files. We use logrotate in cron to clean them monthly, as well as to retain two months of old logs. After starting the PhEDEx services at least once, on the phedex node (ssh phedex@hepcms-8.umd.edu):
- Create the backup directories:
mkdir /scratch/phedex/current/Dev_T3_US_UMD/logs/old
mkdir /scratch/phedex/current/Debug_T3_US_UMD/logs/old
mkdir /scratch/phedex/current/Prod_T3_US_UMD/logs/old
- Create the file /home/phedex/phedex.logrotate with the contents (this logrotate guide was helpful):
rotate 2
monthly
olddir old
nocompress
/scratch/phedex/current/Dev_T3_US_UMD/logs/* {
prerotate
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Dev environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Dev stop
endscript
postrotate
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Dev environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Dev start
endscript
}
/scratch/phedex/current/Debug_T3_US_UMD/logs/* {
prerotate
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Debug environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Debug stop
endscript
postrotate
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Debug environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Debug start
endscript
}
/scratch/phedex/current/Prod_T3_US_UMD/logs/* {
prerotate
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Prod environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Prod stop
endscript
postrotate
cd /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx
eval `/scratch/phedex/current/PHEDEX/Utilities/Master -config /scratch/phedex/current/SITECONF/T3_US_UMD/PhEDEx/Config.Prod environ`
/scratch/phedex/current/PHEDEX/Utilities/Master -config Config.Prod start
endscript
}
- Run logrotate from the command line to check that it works:
/usr/sbin/logrotate -f /home/phedex/phedex.logrotate -s /home/phedex/logrotate.state
- As root (su -), automate by editing /var/spool/cron/phedex and adding the line:
52 01 * * 0 /usr/sbin/logrotate /home/phedex/phedex.logrotate -s /home/phedex/logrotate.state
This directs logrotate to run every Sunday at 1:52 as the user phedex.
- Additionally, the Prod download-remove agent doesn't clean up its job logs. As root, edit /var/spool/cron/phedex and add the line:
02 00 * * 0 find /scratch/phedex/current/Prod_T3_US_UMD/state/download-remove/*log -mtime +7 -type f -exec rm -f {} \;
Commission links:
To download data using PhEDEx, a site must have a Production link originating from one of the nodes hosting the dataset. To create each link, sites must go through a LoadTest/link commissioning process. Our Production links to download to our site are listed here. These instructions are adapted from this Twiki.
- The first link you'll want to commission is from the T1_US_FNAL_Buffer. To commission from FNAL, send a request to begin the link commissioning process to hn-cms-ddt-tf@cern.ch. To commission links from other sites, contact the PhEDEx admins for that site as listed in SiteDB (requires Firefox). Ask them if a link is OK and if so, to please create a LoadTest.
- For non-FNAL sites, create a Savannah ticket requesting that the Debug link be made from the other site to T3_US_UMD. Select the data transfers category, set the severity as 3-Normal, the privacy as public and T3_US_UMD as the site.
- PhEDEx or originating-site admins may create the transfer request for you. If they do, follow the link in the PhEDEx transfer request email sent to you to approve the request. If they do not, create the transfer request yourself:
- Go to the PhEDEx LoadTest injection page and under the link "Show Options," click the "Nodes Shown" tab, then select the source node.
- Find T3_US_UMD in the "Destination node" column and copy the "Injection dataset" name.
- Create a transfer request and copy the dataset name into the "Data Items" box. Select T3_US_UMD as the destination. The DBS is typically LoadTest07, but some sites may create the subscription under LoadTest. You will receive an error if you select the wrong one - simply go back and select the other DBS. Leave the drop down menus as-is (replica, growing, low priority, non-custodial, undefined group). Enter as a comment something to the effect of "Commissioning link from T1_US_FNAL_Buffer to T3_US_UMD," then click the "Submit Request" button.
- As administrator for the site, you should be able to approve the request right away, simply select the "Approve" radio button and submit the change.
- Files created by load tests should be removed shortly after they are created. As root (su -) on the HN, create a cron job that will clean up files automatically. Edit /var/spool/cron/root and add the line:
07 * * * * find /data/se/store/PhEDEx_LoadTest07 -mmin +180 -type f -exec rm -f {} \;
This will remove three-hour-old PhEDEx load test files every hour at the 7th minute.
- Once load tests have been successful at a rate of >5 MB/sec for one day, the link qualifies as commissioned and PhEDEx admins will create the Production link. If PhEDEx admins don't take note of the successful tests within a week, you can send a reminder to hn-cms-ddt-tf@cern.ch or reply to the Savannah ticket that the link passes commissioning criteria and that you'd like the Prod link to be created.
Rocks-controlled installation (kickstart)
Since we install PhEDEx on only one of the WNs, we create a special Rocks appliance, based on the instructions in this Rocks guide. Additionally, this guide on the Rocks Kickstart XML syntax is helpful. It's also useful to keep in mind that XML has a few reserved characters.
First, as the phedex user on the PhEDEx node, create the tarball:
cd /scratch/phedex
tar --preserve-permissions -czvf phedex.tgz 3.1.2
cp phedex.tgz /home/phedex/.
As the root user on the HN:
- Copy the tarball and the apt rpm to the directory where it will be served from at install time:
cd /home/install/contrib/4.3/x86_64/RPMS
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/apt-0.5.15cnc6-9.SL.x86_64.rpm"
wget "http://t2.unl.edu/store/rpms/SL4/x86_64/Nebraska-repo-0.1-1.noarch.rpm"
cp -p /home/phedex/phedex.tgz .
- Download the files phedex-node.xml to /home/install/site-profiles/4.3/nodes and phedex-appliance.xml to /home/install/site-profiles/4.3/graphs/default.
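Should the downloads become unavailable, phedex-appliance.xml is a short graph file of roughly the following shape (a sketch modeled on the Rocks 4.3 appliance guide, not our exact file; phedex-node.xml is the companion node file whose <post> section unpacks phedex.tgz and installs the rpms above):
<?xml version="1.0" standalone="no"?>
<graph>
<description>PhEDEx appliance</description>
<edge from="phedex-node">
  <to>compute</to>
</edge>
</graph>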
- Create the new Rocks distribution:
cd /home/install
rocks-dist dist
- Add an entry for the new PhEDEx appliance to the Rocks MySQL database:
rocks add appliance phedex-node membership='PhEDEx' short-name='ph' node='phedex-node'
- Verify that the new XML code is correct:
rocks list appliance xml phedex-node
If this throws an exception, the last line states where the syntax problem is.
- Replace compute-0-7 with the new phedex-node-0-7 and install it:
insert-ethers --replace compute-0-7
Select PhEDEx in the menu and leave insert-ethers running.
- In a separate terminal, ssh directly to hepcms-8 as root to kickstart it (compute-0-7 has been removed from the rocks database):
ssh hepcms-8.umd.edu
Create /tmp/nukeit.sh:
#!/bin/bash
for i in `df | awk '{print $6}'`
do
  if [ -f $i/.rocks-release ]
  then
    rm -f $i/.rocks-release
  fi
done
Execute it:
chmod +x /tmp/nukeit.sh
/tmp/nukeit.sh
And kickstart the node:
/boot/kickstart/cluster-kickstart
You will lose your connection to hepcms-8.
- Once insert-ethers shows phedex-node-0-7 (*), you can exit it (using the F11 key). The newly-named phedex-node-0-7 should be available within an hour. You can watch its install progress:
rocks-console phedex-node-0-7
The post-installation script now performs a lot of work, so expect it to take some time.
- Restart the Ganglia monitor:
/etc/init.d/gmond restart
/etc/init.d/gmetad restart
- You will need to reconfigure the node's external network connection, which requires another reinstall:
rocks set host interface ip phedex-node-0-7 eth1 128.8.164.20
rocks set host interface gateway phedex-node-0-7 eth1 128.8.164.1
rocks set host interface name phedex-node-0-7 eth1 HEPCMS-8.UMD.EDU
rocks set host interface subnet phedex-node-0-7 eth1 public
ssh-agent $SHELL
ssh-add
shoot-node phedex-node-0-7
- Once you've reinstalled the node, you will need to restart the PhEDEx services.
Install/configure other software
Software which must be usable by the worker nodes should be installed in the head node /export/apps directory. /export/apps is cross mounted across all nodes and is visible to all nodes as the /share/apps directory.
RPMforge:
RPMforge helps to resolve package dependencies when installing new software. It enables RPMforge repositories in smart, apt, yum, and up2date. We use yum. Packages are installed both on the HN and on the WNs, so RPMforge needs to be installed for both. These instructions are adapted from RPMforge and Rocks.
- To install RPMforge on the HN:
cd /home/install/contrib/4.3/x86_64/RPMS
wget "http://packages.sw.be/rpmforge-release/rpmforge-release-0.3.6-1.el4.rf.x86_64.rpm"
rpm -Uhv rpmforge-release-0.3.6-1.el4.rf.x86_64.rpm
- To install RPMforge on the WNs:
Edit /home/install/site-profiles/4.3/nodes/extend-compute.xml and add the following line:
<package>rpmforge-release</package>
Make a new Rocks kickstart distribution:
cd /home/install
rocks-dist dist
Reinstall the WNs.
xemacs/emacs:
Rocks does not install xemacs on any nodes, nor emacs on the WNs. The installation instructions below assume that you have installed RPMforge on the HN to resolve package dependencies. Instructions to install on the WNs are adapted from this Rocks guide.
- Install xemacs on the HN:
cd /home/install/contrib/4.3/x86_64/RPMS
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/xemacs-common-21.4.15-10.EL.1.x86_64.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/xemacs-21.4.15-10.EL.1.x86_64.rpm"
yum localinstall xemacs-common-21.4.15-10.EL.1.x86_64.rpm
yum localinstall xemacs-21.4.15-10.EL.1.x86_64.rpm
- Install xemacs and emacs on the WNs:
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/apel-xemacs-10.6-5.noarch.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/FreeWnn-libs-1.10pl020-5.x86_64.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/Canna-libs-3.7p3-7.EL4.x86_64.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/xemacs-sumo-20040818-2.noarch.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/emacs-common-21.3-19.EL.4.x86_64.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/emacs-21.3-19.EL.4.x86_64.rpm"
Edit /home/install/site-profiles/4.3/nodes/extend-compute.xml by adding the following <package> lines:
<package>Canna-libs</package>
<package>FreeWnn-libs</package>
<package>apel-xemacs</package>
<package>xemacs-sumo</package>
<package>xemacs-common</package>
<package>xemacs</package>
<package>emacs-common</package>
<package>emacs</package>
Create the new Rocks kickstart distribution:
cd /home/install
rocks-dist dist
Re-shoot the WNs.
It is not entirely clear if all these rpm files really must be downloaded (they should come with the SL4.5 release), but the instructions above have been verified to work.
g++ & g77:
Rocks does not install g++ or g77 on any nodes. Instructions to install on the WNs are adapted from this Rocks guide. As root (su -) on the HN:
- Install on the HN:
cd /home/install/contrib/4.3/x86_64/RPMS
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/libstdc++-devel-3.4.6-8.x86_64.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/gcc-c++-3.4.6-8.x86_64.rpm"
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/gcc-g77-3.4.6-8.x86_64.rpm"
rpm -ivh "libstdc++-devel-3.4.6-8.x86_64.rpm"
rpm -ivh "gcc-c++-3.4.6-8.x86_64.rpm"
rpm -ivh "gcc-g77-3.4.6-8.x86_64.rpm" - Install g++ on the WNs:
Edit /home/install/site-profiles/4.3/nodes/extend-compute.xml by adding the following <package> lines:
<package>libstdc++-devel</package>
<package>gcc-c++</package>
<package>gcc-g77</package>
Create the new Rocks kickstart distribution:
cd /home/install
rocks-dist dist
Re-shoot the WNs.
It is not entirely clear if all these rpm files really must be downloaded (they should come with the SL4.5 release), but the instructions above have been verified to work.
LaTeX:
LaTeX is not automatically installed on the WNs, although it is on the HN. These instructions assume you have installed RPMforge on the WNs, a useful dependency resolver.
- As root (su -) on the HN, edit /home/install/site-profiles/4.3/nodes/extend-compute.xml and add to the <post> section:
yum -y install tetex-latex
yum -y install tetex-afm
yum -y install tetex-xdvi
yum -y install tetex-doc
Various image viewers are also useful:
yum -y install ggv
yum -y install gpdf
yum -y install ImageMagick
- Create new distribution:
cd /home/install
rocks-dist dist
- Reinstall the WNs.
Pacman:
We install Pacman only on the HN. As root (su -):
- Download the latest Pacman:
wget "http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-latest.tar.gz" - Unzip to /usr:
tar xzvf pacman-latest.tar.gz -C /usr
- Source the setup script for the first time:
cd /usr
. /usr/pacman-3.25/setup.sh
- Edit ~root/.bashrc to include the source:
. /usr/pacman-3.25/setup.sh
Kerberos:
These instructions enable getting Kerberos tickets from FNAL and from CERN. User instructions for Kerberos authentication are given here.
Configure Kerberos on the HN. As root (su -) on the HN:
- To enable FNAL tickets, save this file as /etc/krb5.conf.
- To enable CERN tickets, add to /etc/krb.conf:
CERN.CH
CERN.CH afsdb1.cern.ch
CERN.CH afsdb3.cern.ch
CERN.CH afsdb2.cern.ch
- And add to /etc/krb.realms:
.cern.ch CERN.CH
- Configure ssh to use Kerberos tickets:
Make the appropriate file writeable:
chmod +w /etc/ssh/ssh_config
Add the lines to /etc/ssh/ssh_config:
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
Remove writeability:
chmod -w /etc/ssh/ssh_config
Restart the ssh service:
/etc/init.d/sshd restart
- Add to /etc/skel/.cshrc:
# Kerberos
alias kinit_fnal '/usr/kerberos/bin/kinit -A -f'
alias kinit_cern '/usr/kerberos/bin/kinit -5'
- Add to /etc/skel/.bashrc and to ~root/.bashrc:
# Kerberos
alias kinit_fnal='/usr/kerberos/bin/kinit -A -f'
alias kinit_cern='/usr/kerberos/bin/kinit -5'
Configure Kerberos on the WNs. As root (su -) on the HN:
- Copy krb5.conf to where it can be served from the HN during WN install:
cp /etc/krb5.conf /home/install/contrib/4.3/x86_64/RPMS/krb5.conf
- Edit /home/install/site-profiles/4.3/nodes/extend-compute.xml and add to the <post> section:
wget -P /etc http://<var name="Kickstart_PublicHostname"/>/install/rocks-dist/lan/x86_64/RedHat/RPMS/krb5.conf
<file name="/etc/ssh/ssh_config" mode="append">
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
</file>
<file name="/etc/krb.conf" mode="append">
CERN.CH
CERN.CH afsdb1.cern.ch
CERN.CH afsdb3.cern.ch
CERN.CH afsdb2.cern.ch
</file>
<file name="/etc/krb.realms" mode="append">
.cern.ch CERN.CH
</file>
- Create the new Rocks distribution:
cd /home/install
rocks-dist dist
- Reinstall the WNs.
CVS:
CVS needs to be configured to automatically contact the CMSSW repository using Kerberos-enabled authentication. A Kerberos-enabled CVS client is already installed on the HN, but the WNs use a version of CVS distributed by Rocks, which needs to be updated. While this is a one-time install, we believe it must be done after at least one version of CMSSW has been installed on your system. At the very least, it must be done after the one-time CMSSW install commands. Of course, Kerberos authentication to CERN must also be configured. These instructions also assume that RPMforge is installed on the WNs. These instructions are based on this FAQ.
On the HN as root (su -):
- Configure CVS to contact the Kerberos-enabled CMSSW CVS repository:
source /software/cmssw/slc4_ia32_gcc345/external/apt/0.5.15lorg3.2-cms3/etc/profile.d/init.sh
apt-get install cms+cms-cvs-utils+1.0-cms
- Download the Kerberos-enabled CVS client for the WNs:
cd /home/install/contrib/4.3/x86_64/RPMS
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/cvs-1.11.17-9.RHEL4.x86_64.rpm" - Install CVS on the WNs. Edit /home/install/site-profiles/4.3/nodes/extend-compute.xml and add to the <post> section:
wget "http://<var name="Kickstart_PublicHostname"/>/install/rocks-dist/lan/x86_64/RedHat/RPMS/cvs-1.11.17-9.RHEL4.x86_64.rpm"
yum -y localinstall cvs-1.11.17-9.RHEL4.x86_64.rpm
- Create the new distribution:
cd /home/install
rocks-dist dist
- Reinstall the WNs.
User instructions for CVS checkout are given here.
Subversion:
We already had RPMforge (to resolve dependencies) installed at the time we installed subversion. A dependency resolver such as RPMforge may be required to install subversion.
yum install subversion
cron garbage collection:
These instructions provide the cron and Rocks kickstart cron commands to add garbage collection of /tmp for all nodes.
First, create cron jobs on the HN. As root (su -) on the HN, edit /var/spool/cron/root and add the lines:
6 * * * * find /tmp -mtime +1 -type f -exec rm -f {} \;
36 2 * * 6 find /tmp -depth -mtime +7 -type d -exec rmdir --ignore-fail-on-non-empty {} \;
This will remove day-old files in /tmp on the HN every hour on the 6th minute and week-old empty directories in /tmp on the HN every Saturday at 2:36.
Now create the cron job on the WNs:
- Edit /home/install/site-profiles/4.3/nodes/extend-compute.xml and place the following commands inside the <post></post> brackets:
<!-- Create a cron job that garbage-collects /tmp -->
<file name="/var/spool/cron/root" mode="append">
6 * * * * find /tmp -mtime +1 -type f -exec rm -f {} \;
36 2 * * 6 find /tmp -depth -mtime +7 -type d -exec rmdir --ignore-fail-on-non-empty {} \;
</file>
- Create the new distribution:
cd /home/install
rocks-dist dist
- Re-install the WNs.
Condor
We install Condor using the Rocks roll, then modify it to add Condor_G as a part of the OSG installation. To be safe, you should configure condor after you've installed OSG. These instructions are based on the very complete guide provided by Condor.
- Due to a domain-mismatch between internal and external node names in Rocks, Condor must be configured to trust the domain advertised by the submitting node. This must be done on both the HN and WNs.
On the HN as root (su -):
- Edit /opt/condor/etc/condor_config.local and add the line:
TRUST_UID_DOMAIN = True
- Replace the original Rocks Condor roll xml file that creates the condor_config.local file on the WNs:
cp /home/install/rocks-dist/lan/x86_64/build/nodes/condor-client.xml /home/install/site-profiles/4.3/nodes/replace-condor-client.xml - Edit /home/install/site-profiles/4.3/nodes/replace-condor-client.xml and add the following line after the call to CondorConf:
echo "TRUST_UID_DOMAIN = True" >> /opt/condor/etc/condor_config.local
- Restart the Condor service on the HN:
/etc/init.d/rocks-condor restart
- If OSG is installed, the OSG condor-devel service (as well as RSV, which uses condor-devel) needs to be restarted:
cd /share/apps/osg
vdt-control --off osg-rsv condor-devel
vdt-control --on condor-devel osg-rsv
- Create the new Rocks distribution:
cd /home/install
rocks-dist dist
- Reinstall the WNs.
- Create a simple condor monitoring script that will route output to the web server, to be viewed by users:
- Create the file /root/condor-status-script.sh with the contents:
#!/bin/bash
. /root/.bashrc
OUTPUT=/var/www/html/condor_status.txt
echo -e " \n\n" >$OUTPUT
echo -e "As of `date` \n">>$OUTPUT
/opt/condor/bin/condor_status -submitters >>$OUTPUT
/opt/condor/bin/condor_userprio -all >>$OUTPUT
/opt/condor/bin/condor_status -run >>$OUTPUT
- Run it every 10 minutes by editing /var/spool/cron/root and adding the line:
1,11,21,31,41,51 * * * * /root/condor-status-script.sh
- Output will be here.
Condor keeps logs in /var/opt/condor/log; StartLog & StarterLog are particularly useful. Generally, the most information can be found on the node which serviced (not submitted) the job you are attempting to get info on.
Backup critical files
The files below should be backed up to a secure non-cluster location. Marguerite Tonjes currently maintains the backup of these files. Users can use /data as a backup location, but this is not a sufficient backup location for these critical admin files. Note that many of these files are readable only by root.
- In ~root:
- network-ports.txt
- configure-external-network.sh
- security.txt
- hepcms-0cert.pem
- hepcms-0key.pem
- http-hepcms-0cert.pem
- http-hepcms-0key.pem
- OSG (directory and contents)
- condor-status-script.sh
- In /etc:
- krb5.conf
- krb.conf
- krb.realms
- fstab
- exports
- auto.master
- auto.home
- auto.software
- skel (directory and contents)
- sysconfig/iptables
- In /home/install/site-profiles/4.3/nodes:
- extend-compute.xml
- replace-auto-partition.xml
- replace-auto-kickstart.xml (if it exists)
- replace-condor-client.xml
- phedex-node.xml
- /home/install/site-profiles/4.3/graphs/default/phedex-appliance.xml
- /home/phedex/phedex.tgz
- /var/www/html/index.html
- /var/spool/cron (directory and contents)
- /share/apps/osg/monitoring/config.ini
- /software/cmssw/SITECONF (directory and contents)
We have a backup script, /root/backup-script.sh, which is run by cron on a weekly basis. It will copy all the needed files to /root/backup, which should then be manually copied from the cluster to a different machine on a regular basis.
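The script is essentially a series of copies of the files listed above; a simplified sketch (the real script may differ in detail):
#!/bin/bash
# Gather the critical admin files into /root/backup for off-cluster copying.
BACKUP=/root/backup
mkdir -p $BACKUP/root $BACKUP/etc $BACKUP/site-profiles
cp -p /root/network-ports.txt /root/security.txt /root/condor-status-script.sh $BACKUP/root/
cp -p /etc/krb5.conf /etc/krb.conf /etc/krb.realms /etc/fstab /etc/exports $BACKUP/etc/
cp -rp /etc/skel $BACKUP/etc/
cp -p /home/install/site-profiles/4.3/nodes/*.xml $BACKUP/site-profiles/
tar -czf $BACKUP/cron.tgz /var/spool/cron
tar -czf $BACKUP/siteconf.tgz /software/cmssw/SITECONF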
Note: /share/apps/osg cannot realistically be used to recover from total HN failure. But since OSG software is regularly updated and the pacman installer is good about touching only files inside the install/update directory, in the event of a bad or failed update, it's usually safe to recover from a backup of this directory.
Recover from HN failure
A HN failure which requires HN reboot is relatively easy to deal with and simply involves manually starting a few services. A HN failure which requires reinstall is difficult because the WNs must be reinstalled as well. Instructions are also provided to power down the entire cluster and turn it back on. This Rocks guide can help to upgrade or reconfigure the HN with minimal impact - you may want to append the files listed here to the FILEs directive in version.mk (files in /home/install/site-profiles are saved automatically).
Power down and up procedures
Before powering down, make sure you have a recent copy of the critical files to back up. Our backup script places all the needed critical files in /root/backup on a weekly basis. To power down, log in to the HN as root (su -):
- ssh-agent $SHELL
- ssh-add
- cluster-fork "poweroff"
- poweroff
If you are concerned about the possibility of power spikes, go to the RDC:
- Flip both power switches on the back of the big disk array.
- Turn the UPS off by pressing the O (circle) button.
- Flip the power switch on the back of the UPS.
- Flip the power switches on both large PDUs, in the middle of the rack. Each large PDU has two switches.
- Remove the floor tile directly behind the cluster.
- If possible without undue strain to the connectors, unplug both power cables from their sockets.
- Replace the floor tile.
To power up, go to the RDC:
If applicable:
- Remove the floor tile directly behind the cluster.
- Plug in power cables in the floor.
- Replace the floor tile.
- Flip UPS and big PDU unit power switches.
- Turn UPS on by pressing | / Test button on the front.
- Turn the big disk array on by flipping both switches in the back. Flip one switch, wait for the disks and fans to spin up, then spin down. Then flip the second switch.
Once the big disk array fans and disks have spun down from their initial spin up:
- Press power button on HN. Wait for it to boot completely.
- Power cycle the switch using its power cable (the switch has no switch, hardy har har).
- Log in on the HN as root, start the GUI environment (startx).
- Open an internet browser and enter the address 10.255.255.254. If you don't get a response, wait a few more minutes for the switch to complete its startup, diagnoses, and configuration.
- Log into the switch (user name and password can be obtained from Marguerite Tonjes).
- Under Switching->Spanning Tree->Global Settings, select Disable from the "Spanning Tree Status" drop down menu. Click "Apply Changes" at the bottom.
- Press the power buttons on all eight WNs. Wait a few seconds between each one.
- Follow the procedure below to recover from HN reboot.
Note: While our cluster has the ability to be powered up completely from a network connection, it has not yet been configured. At the present time, powering up requires a visit to the RDC.
Recover from HN reboot
BeStMan & OSG should be started automatically at boot time. As root (su -) on the HN:
- Check RSV probes.
If any probes are failing, it may be due to cron maintenance jobs for OSG which haven't run yet. Issue the command:
crontab -l | grep osg
and scan for any jobs with names that are similar to the failing probe. Execute the command manually and wait for the next RSV probe to run.
- If you rebooted the phedex node (phedex-node-0-7), you must restart the PhEDEx services following these instructions.
- PhEDEx, if still running, will reconnect with the BeStMan service automatically. You can verify that the instances are still running by checking the files on phedex-node-0-7:
/scratch/phedex/current/Debug_T3_US_UMD/logs/download-srm
/scratch/phedex/current/Prod_T3_US_UMD/logs/download-srm
If PhEDEx does not reconnect, follow these instructions to stop and start the PhEDEx services.
- Check Ganglia. All nodes should be reporting; it is highly unlikely that HN reboot alone would cause WNs to stop reporting. However, if they are not reporting, try restarting the Ganglia service:
/etc/init.d/gmond restart
/etc/init.d/gmetad restart
If a node is still not reporting, you can attempt to reboot the WN:
ssh-agent $SHELL
ssh-add
ssh compute-x-y 'reboot'
or, to reboot all WNs:
cluster-fork "reboot"
Recover from HN reinstall
- Install the HN and WNs following the Rocks installation instructions. Using the Rocks boot disk instead of PXE boot for the WNs has a higher probability of success. Forcing the default partitioning scheme for the WNs also has a higher probability of success. Don't forget to power cycle the switch and configure it via a web browser.
- Copy the backed up critical files to /root/backup. Make sure the read/write permissions are set correctly for each file. As root (su -):
cd /root/backup
- Create at least one new user.
- Configure security following the instructions in security.txt.
- Copy the info and certificate files to the correct directory:
cp security.txt ../.
cp network-ports.txt ../.
cp configure-external-network.sh ../.
cp hepcms-0cert.pem ../.
cp hepcms-0key.pem ../.
cp http-hepcms-0cert.pem ../.
cp http-hepcms-0key.pem ../.
- Follow the instructions in this How-To guide to change the WN partitions (if necessary), mount the big disk, place the WNs on the external network and install xemacs and emacs on both the HN and WNs. Instructions which call for rocks-dist dist (and the accompanying shoot-node) can be stacked. Shoot the nodes (re-install the WNs) once after configuring Rocks for the disks, network and emacs. Then install all the software (the CRAB & PhEDEx nodes must be shot one more time). A few notes:
- Backed up copies of many of the modified files should already be made, so there should be very few manual file edits. Be sure to save the original files in case of failure.
- The boot order of the WNs may have changed, so the Rocks name assignment may correspond to a different physical node. The external IP addresses map to an exact patch panel port number, so move the network cables to the correct port on the patch panel. Use /root/network-ports.txt as your guide and be sure to modify it with the new switch port numbers (or move the switch port cables if you prefer -- the switch doesn't care). You may also want to modify the LEDs displaying the Rocks internal name, which can be done at boot time (strike F2 during boot to get to setup), under "Embedded Server Management."
Solutions to encountered errors
Errors are organized by the program which caused them:
- RAID
- Rocks
- Condor
- Logical volume (LVM)
- CMSSW
- gcc/g++/gcc4/g++4
- Dell OpenManage
- YUM
- gLite
- srm
- OSG/RSV
- SiteDB/PhEDEx
RAID
- During HN boot:
Foreign configuration(s) found on adapter.
Followed by:
1 Virtual Drive(s) found
1 Virtual Drive(s) offline
3 Virtual Drive(s) handled by BIOS
This Dell troubleshooting guide is a useful resource. In our case, this occurred because we booted the HN before the disk array had fully powered up. We believe this also corrupted the PERC-6/E RAID controller configuration. After shutting the HN down, letting the disk array power up fully, and then powering the HN on again, we loaded the foreign configuration (pressed the f key). The RAID controller can also be configured again using the configuration utility (c or Ctrl+r).
Rocks
- NameError: global name 'FileCopyException' is not defined (inspection of other terminals shows that comps.xml is missing)
Rocks 4.3 needs a special 'comps' roll for SL4.5. It must be downloaded, placed on disk, and selected as a roll at install time.
- An error occurred when attempting to load an installer interface component className=FDiskWindow
Rocks is complaining that the partition table in the kickstart file is incorrect. Check /home/install/site-profiles/4.3/nodes/replace-auto-partition.xml for syntactic problems (Beware! You may lose existing data!). If your system is having very serious partition issues, or this file does not exist, try these instructions to force the default Rocks partitioning scheme. Once replace-auto-partition.xml is repaired, issue the rocks-dist dist command from the /home/install directory. Depending on your situation, you may need to force the nodes to load the new kickstart file. - After a WN installs successfully, it reboots with the error:
mkrootdev: label /1 not found
Mounting root filesystem
mount: error 2 mounting ext3
mount: error 2 mounting none
Switching to new root
switchroot: mount failed: 22
umount /initrd/dev failed: 2
Kernel panic - not syncing: Attempted to kill init!
This error can occur when using non-default partitioning on the WNs and is due to disk LABEL synchronization issues. The Rocks authors have seen this error before, but are unable to reproduce the conditions which cause it to occur. In order to prevent failures of this type from occurring both when first attempting to use non-default partitioning and when calling shoot-node after successful reinstall, add the following to /home/install/site-profiles/4.3/nodes/extend-compute.xml in the <post> section:
e2label /dev/sda1 /
cat /etc/fstab | sed -e s_LABEL=/1_LABEL=/_ > /tmp/fstab
cp -f /tmp/fstab /etc/fstab
cat /boot/grub/grub-orig.conf | sed -e s_LABEL=/1_LABEL=/_ > /tmp/grub.conf
cp -f /tmp/grub.conf /boot/grub/grub-orig.conf
This will force all files which use disk LABELs to be in agreement with one another. Be sure to create the new rocks distribution:
cd /home/install
rocks-dist dist
And use the Rocks boot disk to get the WN to reinstall - we've found that PXE boot is not consistently successful in the event of a Kernel panic.
- shoot-node gives errors:
Waiting for ssh server on [compute-0-1] to start
ssh: connect to host compute-0-1 port 2200: Connection refused
...
Waiting for VNC server on [compute-0-1] to start
Can't connect to VNC server after 2 minutes
ssh: connect to host compute-0-1 port 2200: Connection refused
...
main: unable to connect to host: Connection refused (111)
Exception in thread Thread-1:
Traceback (most recent call last):
File "/scratch/home/build/rocks-release/rocks/src/roll/base/src/foundation-python/foundation-python.buildroot//opt/rocks/lib/python2.4/threading.py", line 442, in __bootstrap
self.run()
File "/opt/rocks/sbin/shoot-node", line 313, in run
os.unlink(self.known_hosts)
OSError: [Errno 2] No such file or directory: '/tmp/.known_hosts_compute-0-1'
and examination of WNs reveals they are trying to install interactively (i.e., requesting language for the install, etc.):
This seems to occur most commonly when there is a problem with the .xml files used for the Rocks distribution. The solution which works most consistently is to remove all your modified .xml files in /home/install/site-profiles/4.3/nodes (leave skeleton.xml) and force default partitioning. Then reinstall the WNs -- you will have to manually restart the WNs as they will remain in the interactive install state until manual intervention. Failing this, reinstall the entire cluster, although this will not guarantee success if you use the same .xml files. - shoot-node & cluster-kickstart give the error:
error reading information on service rocks-grub: No such file or directory
cannot reboot: /sbin/chkconfig failed: Illegal seek
This occurs when the rocks-boot-auto package is removed, which prevents WNs from automatically reinstalling every time they experience a hard boot (such as power failure). This error can be safely ignored as it does not actually prevent the node from rebooting and reinstalling from the kickstart when the reinstall commands are manually issued.
- Wordpress gives the error:
We were able to connect to the database server (which means your username and password is okay) but not able to select the wordpress database.
the MySQL Rocks web interface says on the left-side bar:
No databases
but Rocks commands still work.
We are unsure what caused this error. We attempted various service restarts to no avail. In the end, rebooting the HN solved the issue. We experienced no apparent Rocks DB corruption as a result of this error.
Condor
- Condor job submission works from the HN, but not from any of the WNs. Errors in the Condor output include "Permission denied" and "Command not found." Examination of /var/opt/condor/log/StarterLog shows the error:
ERROR: the submitting host claims to be in our UidDomain (UMD.EDU), yet its hostname (compute-0-1.local) does not match. If the above hostname is actually an IP address, Condor could not perform a reverse DNS lookup to convert the IP back into a name. To solve this problem, you can either correctly configure DNS to allow the reverse lookup, or you can enable TRUST_UID_DOMAIN in your condor configuration.
This occurs because jobs are submitted via the local network, so the submitting node has the name compute-x-y.local, instead of HEPCMS-X.UMD.EDU. The easiest fix is to set TRUST_UID_DOMAIN = True in the /opt/condor/etc/condor_config.local files on both the HN and WNs. Instructions are outlined here.
LVM
- Insufficient free extents (2323359) in volume group data: 2323360 required (error is received on command lvcreate -L 9293440MB data).
Sometimes it is simpler to enter the value in extents (the smallest logical units LVM uses to manage volume space). Use a '-l' instead of '-L' and specify the maximum number of free extents (provided by the error):
lvcreate -l 2323359 data
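You can also query the free extent count up front rather than reading it off the error message:
vgdisplay data | grep Free
and pass the reported 'Free PE' count to lvcreate -l.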
CMSSW
- cmsRun works on the HN, but not on any of the WNs.
Note this error could be caused by any number of issues; when we encountered it, it was because our 'one-time' CMSSW install had the environment variable VO_CMS_SW_DIR set to the real directory where the data resides on the HN, rather than the network mounted directory (with a different name) that 'points' to the real directory. For example, the physical partition where CMSSW is installed on the HN is /scratch/cmssw, but the network mounted directory is named /software/cmssw. Set VO_CMS_SW_DIR to /software/cmssw, rather than /scratch/cmssw. We found removing the contents of /scratch/cmssw prior to complete re-install helped.
- E: Sub-process /software/cmssw/slc4_ia32_gcc345/external/apt/0.5.15lorg3.2-CMS19c/bin/rpm-wrapper returned an error code (100)
This link suggests that it is due to a lack of disk space in the area where you are installing CMSSW. However, because we install in /software and /software is network mounted, the size of /software scales with the files put into it. When RPM checks that there is enough space in /software to install, it fails. When executing apt-get, add the option:
apt-get -o RPM::Install-Options::="--ignoresize" ...
- error: unpacking of archive failed on file /share/apps/cmssw/share/scramdbv0: cpio: mkdir failed - Permission denied
This error occurs because both bootstrap.sh and the CMSSW apt-get install create a soft link to the 'root' directory where CMSSW is being installed. In our case, since we first tried to install CMSSW to /share/apps (automatically network mounted by Rocks), the soft link is named share. However, CMSSW also has a true subdirectory named share and does write files to this directory. The soft link overrides the true directory and, as a result, CMSSW tries to install to /share, where it does not have permission. In short, CMSSW cannot be installed to any directory named /share, /common, /bin, /tmp, /var, or /slc4_XXX. Follow the CMSSW installation guide for directions on network mounting /scratch/cmssw as /software/cmssw.
- apt-get update issues the error:
E: Could not open lock file /var/state/apt/lists/lock - open (13 Permission denied)
E: Unable to lock the list directory
Be sure to first source the scram apt info:
source $VO_CMS_SW_DIR/$SCRAM_ARCH/external/apt/0.5.15lorg3.2-CMS19c/etc/profile.d/init.csh
gcc/g++/gcc4/g++4
- Attempts to compile code gives errors about missing libraries, including:
stddef.h: No such file or directory
bits/c++locale.h: No such file or directory
bits/c++config.h: No such file or directory
This could be caused by any number of issues. In our case, the gcc-c++ and gcc4-c++ packages needed the libstdc++-devel package. To install it:
rpm -ivh "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/libstdc++-devel-3.4.6-8.x86_64.rpm"
Dell OpenManage
- srvadmin-install.sh gives the error:
libstdc++.so.5 is needed by srvadmin-omacore-5.5.0-364.i386
libstdc++.so.5(GLIBCPP_3.2) is needed by srvadmin-omacore-5.5.0-364.i386
libstdc++.so.5(GLIBCPP_3.2.2) is needed by srvadmin-omacore-5.5.0-364.i386
libstdc++.so.5 is needed by srvadmin-rac5-components-5.5.0-364.i386
While we have compat-libstdc++-33-3.2.3-47.3.x86_64 installed, Dell needs the i386 version. Get and install it with:
wget "http://ftp.scientificlinux.org/linux/scientific/45/x86_64/SL/RPMS/compat-libstdc++-33-3.2.3-47.3.i386.rpm"
rpm -ivh compat-libstdc++-33-3.2.3-47.3.i386.rpm
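To confirm that both architectures of the package are now installed, query the RPM database with an explicit arch in the output format:
rpm -q --queryformat '%{NAME}-%{VERSION}.%{ARCH}\n' compat-libstdc++-33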
YUM
- Transaction Check Error: file /etc/httpd/modules from install of httpd-2.0.52-38.sl4.2 conflicts with file from package tomcat-connectors-1.2.20-0
We encountered this error on a call to yum update, which was needed for our gLite UI installation. We removed the tomcat-connectors package and encountered no further issues. We also removed the tomcat5 package for good measure, but that may not be necessary.
yum remove tomcat5
yum remove tomcat-connectors
yum clean all
yum update
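If yum update still complains, first verify that the tomcat packages are really gone:
rpm -qa | grep tomcat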
gLite
- Error: Missing Dependency: perl(SOAP::Lite) is needed by package glite-data-transfer-api-perl
Error: Missing Dependency: perl(SOAP::Lite) is needed by package glite-data-catalog-api-perl
gLite UI requires the SOAP::Lite Perl module. SOAP::Lite is a difficult install due to the sheer number of dependencies on other packages. An excellent dependency resolver is available from RPMforge and makes the SOAP::Lite install a breeze. These instructions are for our particular OS and architecture:
cd /usr/src/redhat/RPMS/x86_64
wget "http://packages.sw.be/rpmforge-release/rpmforge-release-0.3.6-1.el4.rf.x86_64.rpm"
rpm -Uhv rpmforge-release-0.3.6-1.el4.rf.x86_64.rpm
cd ../noarch
wget "http://dag.wieers.com/rpm/packages/perl-SOAP-Lite/perl-SOAP-Lite-0.71-1.el4.rf.noarch.rpm"
yum localinstall perl-SOAP-Lite-0.71-1.el4.rf.noarch.rpm
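To check that the module is usable, load it from a Perl one-liner (it should print the module version):
perl -MSOAP::Lite -e 'print $SOAP::Lite::VERSION, "\n"'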
Note: This error will only occur if you are attempting to do the apt-get style installation of gLite-UI. The tarball installation of gLite-UI is self-contained and you should not encounter this error, nor need RPMforge.
SRM
- srmcp issues the error:
GridftpClient: Was not able to send checksum value: org.globus.ftp.exception.ServerException: Server refused performing the request. Custom message: (error code 1) [Nested exception message: Custom message: Unexpected reply: 500 Invalid command.] [Nested exception is org.globus.ftp.exception.UnexpectedReplyCodeException: Custom message: Unexpected reply: 500 Invalid command.]
but the file transfer is successful.
This error occurs because srmcp is an SRM client developed by dCache with the added feature of sending a checksum. BeStMan, LBNL's SRM implementation, does not support the srmcp checksum feature, and neither does Globus GridFTP. This error can be safely ignored.
OSG/RSV
- After HN boot:
RSV jobs are in the condor production queue instead of the condor development queue.
This bug occurs only when running OSG 0.8.0 with the upgraded RSV V2. For a permanent fix, install RSV V2 as a standalone product (rather than as an upgrade to RSV V1), or install OSG 1.0. Alternatively, you can apply the following fix every time the HN reboots. As root (su -) on the HN:
- Stop the osg-rsv service:
cd /share/apps/osg
vdt-control --off osg-rsv
- Kill the jobs in the condor production queue (a shortcut is given after this list):
condor_q
condor_rm #, where # is the batch number of any jobs being run by rsvuser
- Restart the osg-rsv service and check the queues:
vdt-control --on osg-rsv
condor_q (to check the production queue)
su - rsvuser
condor_q (to check the development queue)
- If jobs are no longer in the production queue, but are in the development queue, the procedure has worked. Otherwise, we have found that repeating these steps several times eventually works.
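As a shortcut for the 'kill' step above, condor_rm also accepts a job owner and removes all of that user's jobs at once (assuming your Condor version supports owner arguments, as ours did):
condor_rm rsvuser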
- The RSV cacert-crl-expiry-probe fails with an error to the effect of:
/share/apps/osg-0.8.0/globus/TRUSTED_CA/1d879c6c.r0 has expired! (nextUpdate=Aug 15 14:28:32 2008 GMT)
and voms-proxy-init fails with the error:
Invalid CRL: The available CRL has expired
This can occur because, for one reason or another, the last cron jobs that should have renewed the certificates did not execute or complete. You can run them manually by first locating them in the crontab:
crontab -l | grep cert
crontab -l | grep crl
then executing them:
/share/apps/osg-0.8.0/vdt/sbin/vdt-update-certs-wrapper --vdt-install /share/apps/osg-0.8.0
/share/apps/osg-0.8.0/fetch-crl/share/doc/fetch-crl-2.6.2/fetch-crl.cron
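Once they complete, you can confirm that the offending CRL was actually refreshed by reading its nextUpdate field with openssl (file name taken from the error above):
openssl crl -in /share/apps/osg-0.8.0/globus/TRUSTED_CA/1d879c6c.r0 -noout -nextupdate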
If fetch-crl.cron prints errors along the lines of "download no data from... persistent errors.... could not download any CRL from...", they can be ignored as long as voms-proxy-init works once fetch-crl.cron completes.
- configure-osg.py -c chokes and vdt-install.log says:
##########
# configure-osg.py invoked at Tue Oct 28 15:36:56 2008
##########
### 2008-10-28 15:36:56,792 configure-osg ERROR In RSV section
### 2008-10-28 15:36:56,792 configure-osg ERROR Invalid domain in gridftp_hosts: UNAVAILABLE
### 2008-10-28 15:36:56,792 configure-osg CRITICAL Invalid attributes found, exiting
########## [configure-osg] completed
This is because gridftp_hosts does not actually accept UNAVAILABLE (despite the comments). Simply set gridftp_hosts=%(localhost)s in config.ini and try running configure-osg.py -c again. RSV will be rolling out a fix for this very soon (as of October 31, 2008).
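For reference, the relevant lines of our config.ini end up looking like this (a sketch; your [RSV] section will contain many other attributes, which we omit here):
[RSV]
...
gridftp_hosts=%(localhost)s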
SiteDB/PhEDEx
- After attempting to log in to PhEDEx via certificate, a window pops up several times requesting your grid certificate (already imported into your browser), and after several OKs, the browser eventually lands on a page with the message:
Have You Signed Up?
You need to sign up with CMS Web Services in order to log in and use privileged features. Signing up can be done via SiteDB.
If you have already signed up with SiteDB, it is possible that your certificate or password information is out of date there. In that case go back to SiteDB and update your information.
For your information, the DN your browser presents is:
/DC=something/DC=something/OU=something/CN=Your Name ID#
This problem occurs when your SiteDB/hypernews account is not linked to your grid certificate. Go to the SiteDB::Person Directory (SiteDB only works in the Firefox browser), log in with your hypernews account, and follow the link titled "Edit your own details here". In the form entry box titled "Distinguished Name", enter the DN info displayed earlier and click the "Edit these details" button. You should then be able to log in to PhEDEx with your grid certificate within 30-60 minutes.
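If you need the DN in exactly the form the browser reported, you can also read it directly off your user certificate with openssl (assuming the certificate is in the usual ~/.globus location):
openssl x509 -in ~/.globus/usercert.pem -noout -subject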