May 2009 log
Installed OpenGL on all nodes, updated RAID firmware and drivers, kickstarted all WNs.
May 30, 2009
MK -- Reinstalled WNs, updated RAID firmware & drivers, re-enabled OMSA storage alerts on the HN
- PhEDEx reinstall has not been fully tested: services started OK, but I haven't verified that transfers succeed. Will need to watch the next PhEDEx transfer closely.
- RAID firmware and drivers were remarkably painless.
May 25, 2009
MK -- Unknown bigdisk1 failure
- OMSA logged a failure of all disks in the MD1000 enclosure, claiming the disks had been removed. After OMSA issued the critical alerts, I could still access /data, although OMSA had issued warnings for the physical disks. I stopped OMSA and attempted to umount /data, which failed. When I restarted OMSA, it issued warnings again and I could no longer access /data. OMSA showed disks 0:0:0-11 in an unknown state and disks 0:0:12-14 as foreign, possibly because of the failed umount attempt. I rebooted the HN via OMSA; after the reboot, /data was accessible and OMSA issued no warnings or alerts about the disks (except for the usual outdated-driver warnings). Further examination of the OMSA alert log did not reveal the source of the problem, but the log is available for reference here. I ultimately suspect a temporary failure from which the system partially recovered, but which probably would have required an HN reboot for a full fix whether or not I had tried the umount.
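For next time, a quick check before attempting umount might show whether /data is actually mounted and which processes are holding it open. This is only a sketch, not what was run during the incident: the is_mounted function is a hypothetical helper, and it assumes a Linux /proc/mounts and that fuser is installed.

```shell
#!/bin/bash
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: succeeds if MOUNTPOINT appears as a mount point
# in MOUNTS_FILE (defaults to /proc/mounts).
is_mounted() {
    local target="$1"
    local mounts_file="${2:-/proc/mounts}"
    # Field 2 of each line in /proc/mounts is the mount point.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}

if is_mounted /data; then
    echo "/data is mounted; processes using it:"
    # fuser -vm lists processes using the mounted filesystem.
    # Root may be needed to see processes owned by other users.
    command -v fuser >/dev/null 2>&1 && fuser -vm /data || true
else
    echo "/data is not mounted"
fi
```

If fuser lists processes, stopping them first may let umount succeed; whether that would have helped on May 25 is unknown.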
May 17, 2009
MT -- Installed CMSSW_3_1_0_pre7
May 11, 2009
MK -- CMSSW releases, OpenGL on all WNs
- Removed CMSSW_2_0_12, re-installed CMSSW_2_1_7 (it appears that SAM tests are now based on 2_1_7; I don't know when SAM will roll over to a newer release, since 2_1_7 is officially deprecated on the 13th).
- Installed various OpenGL utilities on the WNs. Didn't put this into the admin guide because at some point I'm going to have separate interactive nodes that get all of these packages automatically by being declared in the kickstart file as a particular classification of node. However, some of these packages were specifically called for by Frog and may not come even with the special classification as an interactive node, particularly the devel packages. So I've listed the commands here just in case; they may need to be revisited when the interactive nodes are created:
yum -y install freeglut
yum -y install freeglut-devel
yum -y install xorg-x11-Mesa-libGLU
yum -y install xorg-x11-devel
yum -y install curl-devel
I did not add these packages to the Rocks kickstart file for the WNs. On cluster reinstall, they will either have to be installed again or the kickstart will have to be edited first.
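If these have to be installed again after a cluster reinstall, the five commands above could be wrapped in one script to run on each WN. This is only a sketch: the DRY_RUN switch and the install_packages function are made up for illustration and are not part of yum or Rocks. With the default DRY_RUN=1 it only prints the commands it would run.

```shell
#!/bin/bash
# Packages listed in the May 11 entry above.
PACKAGES="freeglut freeglut-devel xorg-x11-Mesa-libGLU xorg-x11-devel curl-devel"

# DRY_RUN is a hypothetical switch for this sketch.
# 1 (default) prints the commands; 0 actually runs yum (needs root).
DRY_RUN="${DRY_RUN:-1}"

install_packages() {
    local pkg
    for pkg in $PACKAGES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "yum -y install $pkg"
        else
            yum -y install "$pkg"
        fi
    done
}

install_packages
```

Running it as root with DRY_RUN=0 performs the installs. Adding the packages to the kickstart file would still be the longer-term fix.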
May 9, 2009
MK -- CMSSW releases
- Removed CMSSW_2_1_7, 2_1_17, 2_2_5, installed 2_2_9
May 5, 2009
MT -- CMSSW releases
- Removed CMSSW_3_1_0_pre1 & pre4, installed pre6.