Open Science Grid

Description Install, configure, and run an Open Science Grid compute and storage element using BeStMan-Gateway on an NFS mounted disk array.
Notes These instructions are for OSG 1.2 (archived versions exist for OSG 0.8 and OSG 1.0). You are better off following the official OSG 3 release documentation.
Last modified July 30, 2013

Install Pacman:

Description Install pacman, the OSG installation manager.
Dependencies None
Notes  
Guides  

As root (su -) on the GN:
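A minimal sketch of a typical pacman setup (the download URL and version are assumptions; check the official OSG pacman instructions for the current cache):

    cd /sharesoft/osg
    wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-latest.tar.gz   # hypothetical URL
    tar xzf pacman-latest.tar.gz
    cd pacman-*
    source setup.sh    # puts pacman in the PATH of this shell only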

Request host certificates:

Description Get site and service certificates.
Dependencies None
Notes These are only additional notes, follow the official OSG guide to get certificates.
Guides - OSG guide to getting site certificates

If you choose to install GUMS, you will need host and http certs for your GUMS host as well. We install GUMS on our HN, so place two more requests using hepcms-hn.umd.edu instead of hepcms-0.umd.edu.
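Once the certificates arrive, a quick sanity check with openssl (assuming they are installed in the standard /etc/grid-security location):

    openssl x509 -in /etc/grid-security/hostcert.pem -noout -subject -dates
    # the two digests below must match, or the key does not belong to the cert
    openssl x509 -noout -modulus -in /etc/grid-security/hostcert.pem | openssl md5
    openssl rsa -noout -modulus -in /etc/grid-security/hostkey.pem | openssl md5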

Install and configure the CE, BeStMan, and the WN client

Description Install all OSG core software, specifically the compute element, a BeStMan-Gateway storage element, and the worker node client.
Dependencies - Site certificates
- pacman
- Disk array network mounted on all nodes
- Installation directory network mounted on all nodes
Notes We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory.
Guides

- OSG release documentation
- VDT documentation
- OSG Tier-3 guide
- OSG ports guide

Prepare the environment:

Description Prepare to install OSG by creating the appropriate directories, network mounting, installing xinetd, and changing the output of hostname.
Dependencies - Base installation directory (we use /sharesoft) network mounted on all nodes
Notes We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory.
Guides - How to change the output of hostname
  1. Create the appropriate directories. As root on the GN (su -):
    mkdir /scratch/osg
    cd /scratch/osg
    mkdir wnclient-1.2 ce-1.2
    ln -s wnclient-1.2 wnclient
    ln -s ce-1.2 ce
    ln -s ce-1.2 se

    mkdir -p app/etc
    chmod 777 app app/etc
    mkdir /hadoop/osg
    chown root:users /hadoop/osg
    chmod 775 /hadoop/osg
  2. Have all nodes (including the GN) mount /scratch/osg on the GN as /sharesoft/osg. Edit /etc/auto.sharesoft on the HN as root (su -) and add the line:
    osg grid-0-0.local:/scratch/osg
  3. We use /tmp on the WNs as the temporary working directory for OSG jobs. If you haven't done so already, configure cron to garbage collect /tmp on all of the nodes (see the tmpwatch sketch after this list).
  4. Neither Rocks 5.4 nor SL 5.4 installs the xinetd service on the GN, which is needed by OSG services. Install it, start it, and add it to the boot sequence:
    yum install xinetd
    /etc/rc.d/init.d/xinetd restart
    chkconfig --add xinetd
    chkconfig xinetd on
  5. On a Rocks appliance, the hostname command outputs the local name (in our case, grid-0-0.local) instead of the FQHN. OSG needs hostname to output the FQHN, so we modify our configuration so that hostname prints hepcms-0.umd.edu, following these instructions. Specifically:
    1. In /etc/sysconfig/network, replace:
      HOSTNAME=grid-0-0.local
      with
      HOSTNAME=hepcms-0.umd.edu
    2. In /etc/hosts, add:
      128.8.164.12 hepcms-0.umd.edu
    3. Then tell hostname to print the true FQHN:
      hostname hepcms-0.umd.edu
    4. And restart the network:
      service network restart
    5. Important: log out of the GN and log back in before proceeding; otherwise the CE install below may not pick up the correct hostname.
  6. Follow the OSG ports guide to open the necessary ports. Additionally, after you have installed the CE, be sure to edit the vdt-local-setup.(c)sh files to set the Globus port ranges (see the sketch after this list).
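For step 6, the port-range settings in vdt-local-setup.sh look like the following (the file typically lives under $VDT_LOCATION/vdt/etc, but check your install; the 40000-49999 range is only an example, use whatever range you opened in the firewall):

    GLOBUS_TCP_PORT_RANGE=40000,49999
    export GLOBUS_TCP_PORT_RANGE
    GLOBUS_TCP_SOURCE_RANGE=40000,49999
    export GLOBUS_TCP_SOURCE_RANGE

The csh twin, vdt-local-setup.csh, uses setenv instead of export.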
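For step 3, a minimal garbage-collection sketch using tmpwatch, which ships with SL5 (the script name and the 240-hour retention are our assumptions; tune to taste):

    # remove anything under /tmp not accessed in the last 240 hours (10 days)
    printf '#!/bin/sh\n/usr/sbin/tmpwatch 240 /tmp\n' > /etc/cron.daily/tmp-gc
    chmod +x /etc/cron.daily/tmp-gc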

Install the compute element:

Description Install the CE, set Condor as the job manager, install ManagedFork, handle port conflicts, download certs, and configure rsvuser.
Dependencies - Base installation directory (we use /sharesoft) network mounted on all nodes
Notes These are only additional notes, follow the official OSG CE release docs, consulting our notes for details. We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory.
Guides - OSG release docs to install the CE
- Using condor as the OSG jobmanager
- Installing ManagedFork
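As root (su -) on the GN, the core of this step is a pacman pull into the CE directory; a sketch, assuming the OSG 1.2 software cache (the cache URL and package labels date from the OSG 1.2 era and may have moved):

    cd /sharesoft/osg/ce
    pacman -get http://software.grid.iu.edu/osg-1.2:ce
    pacman -get http://software.grid.iu.edu/osg-1.2:ManagedFork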

Configure the CE:

Description Configure which services the CE will run and other various settings in config.ini.
Dependencies - OSG CE installed and CE environment sourced
Notes Follow the official OSG CE release docs, our config.ini is available here for reference. In OSG 1.2, config.ini is placed in the $VDT_LOCATION/osg/etc directory instead of $VDT_LOCATION/monitoring. After editing config.ini, be sure to call configure-osg -v to verify your syntax and configure-osg -c to actually configure OSG. Because we install the CE & SE software in the same directory, the config.ini which comes with the SE will overwrite the config.ini from the CE. This may also occur on subsequent updates, so be sure to keep a backed up copy of config.ini in a different location.
Guides - OSG release docs to configure the CE
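A sketch of the edit/verify/apply cycle with the CE environment sourced (the backup destination is our choice; anything outside $VDT_LOCATION works):

    cd $VDT_LOCATION/osg/etc
    cp -a config.ini /root/config.ini.bak.$(date +%Y%m%d)   # survives an SE or update overwrite
    configure-osg -v   # verify syntax
    configure-osg -c   # actually configure OSG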

Get the OSG environment:

Description Edit the .bashrc & .cshrc skeleton files so new users get the OSG environment on login.
Dependencies None
Notes Highly optional - this is just how we do it at UMD. Existing users (such as cmssoft) will have to add the source commands to their ~/.bashrc & ~/.cshrc files.
Guides

As root (su -) on the HN:

  1. Add to /etc/skel/.bashrc:
    . /sharesoft/osg/ce/setup.sh
  2. Add to /etc/skel/.cshrc:
    source /sharesoft/osg/ce/setup.csh

Edit vomses:

Description Edit the vomses file to use one proxy server for cms.
Dependencies - OSG CE installed
Notes This is optional; if users let CRAB initiate getting a proxy, this is not needed. Do NOT remove the mis or ops entries.
Guides

As root (su -) on the GN:

  1. Remove the following line from /sharesoft/osg/ce/glite/etc/vomses:
    "cms" "voms.cern.ch" "15002" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "cms"

Install and configure GUMS:

Description Install the service which maps a user's distinguished name in their certificate to an account on your cluster. We install GUMS on the HN.
Dependencies - OSG CE installed (for /etc/grid-security/certificates).
Notes These are only additional notes, follow the official OSG release docs, consulting our notes for details. The alternative to GUMS is the grid-mapfile service, which is a simpler way to get started. However, GUMS is highly recommended as the permanent authentication mechanism.
Guides - OSG guide to getting site certificates
- OSG GUMS guide

Follow the OSG GUMS guide to install and configure. Notes and additions:
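Once GUMS is up, you can spot-check a DN mapping from the GN; a sketch, assuming the gums-host client tool shipped with the VDT (the DN below is a placeholder):

    source /sharesoft/osg/ce/setup.sh
    gums-host mapUser "/DC=org/DC=doegrids/OU=People/CN=Some User 123456"   # placeholder DN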

Install & configure the gridftp-hdfs service:

Description Install the GridFTP server that reads and writes files directly on Hadoop storage.
Dependencies - GUMS installed, configured, and running
- Hadoop installed and running
Notes These are only additional notes, follow the official OSG release docs, consulting our notes for details. Full testing may only be available after the SE is configured below.
Guides - Hadoop docs for gridftp-hdfs
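A smoke test for later, once the SE pieces are in place (the destination path assumes our /hadoop/osg area; the file names are placeholders):

    source /sharesoft/osg/ce/setup.sh
    grid-proxy-init
    echo "gridftp-hdfs test" > /tmp/gftp-test
    globus-url-copy file:///tmp/gftp-test gsiftp://hepcms-0.umd.edu//hadoop/osg/gftp-test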

Install & configure the storage element:

Description Install BeStMan-Gateway to provide access to disk array via grid utilities.
Dependencies - OSG CE configured
- GUMS installed, configured, and running
- Disk array network mounted on all nodes
Notes These are only additional notes, follow the official OSG release docs, consulting our notes for details. We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory. We run BeStMan as user "best" instead of as "daemon" because we do not allow world-readable access to files on our SE. The daemon user is not in the "users" group, but the best user is.
Guides - OSG release docs for BeStMan-Gateway

Install the worker node client:

Description Install the worker node client to give worker nodes access to certificate information, system configuration, and software binaries.
Dependencies - OSG CE configured
Notes These are only additional notes, follow the official OSG release docs, consulting our notes for details. We install the worker-node client as root (su -) on the GN in a fresh shell in /sharesoft/osg/wnclient.
Guides - OSG release docs for worker node client
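As with the CE, the worker node client comes from the pacman cache; a sketch, assuming the OSG 1.2 cache label (which may have changed):

    cd /sharesoft/osg/wnclient
    pacman -get http://software.grid.iu.edu/osg-1.2:wn-client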

Start the CE & SE

Description Start the compute and storage elements, check RSV tests and service logs, and publish CMSSW software pins if applicable.
Dependencies - OSG CE configured
- OSG SE installed and configured
- OSG WN client installed
Notes
Guides - Globus 2.0 error codes (for debugging)

As root (su -) on the GN:

  1. Start the OSG CE & SE:
    cd /sharesoft/osg/ce
    . setup.sh
    vdt-control --on

    This starts all the services for both the CE & SE because we installed them in the same directory.
    • If you are running gridftp-hdfs, make sure the gsiftp service is disabled so the two GridFTP servers do not conflict.
  2. You can perform a series of simple tests to see if your CE has basic functionality. Login to any user account and:
    source /sharesoft/osg/ce/setup.csh
    grid-proxy-init
    cd /sharesoft/osg/ce/verify
    ./site_verify.pl
  3. The CEmon log is kept at $VDT_LOCATION/glite/var/log/glite-ce-monitor.log.
  4. The GIP logs are kept at $VDT_LOCATION/gip/var/logs.
  5. globus & gridftp logs are kept in $GLOBUS_LOCATION/var and $GLOBUS_LOCATION/var/log.
  6. The BeStMan log is kept in $VDT_LOCATION/vdt-app-data/bestman/logs/event.srm.log.
  7. Results of the RSV probes will be visible at https://hepcms-0.umd.edu:7443/rsv in 15-30 mins. Further information can be found in the CE $VDT_LOCATION/osg-rsv/logs/probes.
  8. You can force the RSV probes to run immediately (as rsvuser on the GN) following these instructions: rsv-control --run --all-enabled.
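Once the services are up, a quick SRM smoke test of BeStMan from any user account (the port and service path below are BeStMan defaults; adjust to whatever you set in config.ini):

    source /sharesoft/osg/ce/setup.sh
    grid-proxy-init
    srm-ping srm://hepcms-0.umd.edu:8443/srm/v2/server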

After starting the CE for the first time, the file /sharesoft/osg/app/etc/grid3-locations.txt is created. This file is used to publish VO software pins and should be edited every time a VO software release is installed or removed. If CMSSW is installed (the instructions below are repeated in the CMSSW installation):

  1. Create a directory for cmssoft in the OSG APP area:
    cd /sharesoft/osg/app
    mkdir cmssoft
    chmod 777 cmssoft
    chown cmssoft:users cmssoft
  2. Give cmssoft ownership of the release file:
    chown cmssoft:users /sharesoft/osg/app/etc/grid3-locations.txt
  3. As cmssoft (su - cmssoft), create the needed symlink in the OSG APP directory to CMSSW:
    cd /sharesoft/osg/app/cmssoft
    ln -s /sharesoft/cmssw cms
  4. As cmssoft (su - cmssoft), inform BDII which versions of CMSSW are installed and that we have the slc4_ia32_gcc345 environment. Edit /sharesoft/osg/app/etc/grid3-locations.txt to include the lines:
    VO-cms-slc4_ia32_gcc345 slc4_ia32_gcc345 /sharesoft/cmssw
    VO-cms-CMSSW_X_Y_Z CMSSW_X_Y_Z /sharesoft/cmssw
    (modify X_Y_Z and add a new line for each release of CMSSW installed)

Register with the Grid Operations Center (GOC):

Description Register site with OSG.
Dependencies None
Notes We used a much older registration process, but provide the options we selected for reference. Follow the OIM registration instructions for details, as the new registration page has changed substantially.
Guides - OIM web portal
- OIM registration instructions
- BDII information about your site, once registered