Open Science Grid
Description | Install, configure, and run an Open Science Grid compute and storage element using BeStMan-Gateway with Hadoop. |
Notes | This page is largely (99%) out of date as of 2015 and will shortly be removed and replaced. ADMINS of hepcms: please consult our private Google pages for documentation. These instructions are for OSG 3.0. Configuration is primarily based on historical choices optimized for older versions of OSG and T3_US_UMD hardware, and is occasionally awkward in the OSG3 setup. |
Warning | Never blindly follow commands from this or any other guide. Some may be optional, some are to be run on different machines as different users, some may be only valid during initial setup and not a re-install. |
Last modified | September 10, 2015 |
Table of Contents
- Setup yum
- Install the OSG Client on the Interactive Nodes
- Request host certificates
- Install and configure the CE
- Install and configure the SE (BeStMan, gridftp-hdfs)
- Install the worker node (WN) client
- Start the CE & SE
- Register with the GOC
Setup yum:
Description | Install EPEL, OSG repositories, and setup yum priorities |
Dependencies | None |
Notes | Each of the OSG3 guides will instruct you to do this as the beginning step to install that particular client |
Guides | OSG3 Using yum and RPM |
As root (su -) on all nodes (this can be done using rocks commands on the HN instead):
- Install EPEL:
rpm -Uvh http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
- Install the yum-priorities plugin:
yum install yum-priorities
- Ensure that /etc/yum.conf has the following line in the [main] section:
plugins=1
- Install the OSG repositories:
rpm -Uvh http://repo.grid.iu.edu/osg-el5-release-latest.rpm
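With the priorities plugin enabled, each repository stanza carries a priority= line (lower number wins), so that OSG packages take precedence over EPEL. An illustrative excerpt — the actual values shipped in the OSG release RPM may differ:

```
# /etc/yum.repos.d/osg.repo (excerpt, illustrative values)
[osg]
name=OSG Software for Enterprise Linux 5
priority=98
enabled=1
```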
Install OSG Client on the Interactive Nodes:
Description | Install the OSG Client on the Interactive Nodes |
Dependencies | None, though the steps shown assume you set up yum as above |
Notes | These are only additional notes, follow the official OSG guide. We choose to have certificates update automatically using osg-ca-scripts. |
Guides | - OSG guide to installing the OSG Client |
Request host certificates:
Description | Get site and service certificates. |
Dependencies | - Request comes from CMS T3 sysadmin with grid-admin privileges |
Notes | These are only additional notes, follow the official OSG guide to get certificates. |
Guides | - OSG guide to getting host and service certificates - OSG guide for certificate request as GridAdmin |
- We obtain service certificates for the grid node CE functionality, and for the SE node to run GUMS. Note that the node named "SE" does not actually run our storage element grid function. That is run on the grid node.
- Make sure to install the OSG pki command line tools
- As root (su - ) on each of the two interactive nodes:
yum install osg-pki-tools
- Request the grid certificates as yourself (being a GridAdmin) on an interactive node after getting your proxy
- Be sure to run the second request for the http certificate which is used for CEMon, as well as for RSV monitoring on a secure port
- Request the certificates as yourself from the command line as follows:
osg-gridadmin-cert-request --hostname=hepcms-0.umd.edu --vo=CMS
osg-gridadmin-cert-request --hostname=rsv/hepcms-0.umd.edu --vo=CMS
osg-gridadmin-cert-request --hostname=http/hepcms-0.umd.edu --vo=CMS
- Be sure to chmod 400 *key.pem, and chmod 444 *edu.pem files
- Since we're going to give the rsvuser ownership of the cert, we create the user account (for new installation). As root (su -) on the HN:
useradd -c "RSV monitoring user" -n rsv
passwd rsv
ssh-agent $SHELL
ssh-add
rocks sync config
rocks sync users
- Once you've received email confirmation that your certificates are approved and you've followed the instructions to retrieve your certificates, copy the files to the appropriate directories on the GN and give them the needed ownerships:
mkdir -p /etc/grid-security/http
cp hepcms-0.umd.edu.pem /etc/grid-security/hostcert.pem
cp hepcms-0.umd.edu-key.pem /etc/grid-security/hostkey.pem
cp hepcms-0.umd.edu.pem /etc/grid-security/containercert.pem
cp hepcms-0.umd.edu-key.pem /etc/grid-security/containerkey.pem
cp http-hepcms-0.umd.edu.pem /etc/grid-security/http/httpcert.pem
cp http-hepcms-0.umd.edu-key.pem /etc/grid-security/http/httpkey.pem
cp http-hepcms-0.umd.edu.pem /etc/grid-security/http/httpcert2.pem
cp http-hepcms-0.umd.edu-key.pem /etc/grid-security/http/httpkey2.pem
cp rsv-hepcms-0.umd.edu.pem /etc/grid-security/rsvcert.pem
cp rsv-hepcms-0.umd.edu-key.pem /etc/grid-security/rsvkey.pem
chown daemon:daemon /etc/grid-security/containercert.pem
chown daemon:daemon /etc/grid-security/containerkey.pem
chown tomcat:tomcat /etc/grid-security/http/httpcert.pem
chown tomcat:tomcat /etc/grid-security/http/httpkey.pem
chown apache:apache /etc/grid-security/http/httpcert2.pem
chown apache:apache /etc/grid-security/http/httpkey2.pem
chown rsv:users /etc/grid-security/rsvcert.pem
chown rsv:users /etc/grid-security/rsvkey.pem
- Note: for BeStMan, install certificates following the instructions below. However, if you are just doing the yearly renewal, you will need to do:
cp /etc/grid-security/hostkey.pem /etc/grid-security/bestmankey.pem
cp /etc/grid-security/hostcert.pem /etc/grid-security/bestmancert.pem
chown best:users /etc/grid-security/bestmancert.pem
chown best:users /etc/grid-security/bestmankey.pem
If you choose to install GUMS, you will need host and http certs for your GUMS host as well. We install GUMS on our SE node, so place two more requests using hepcms-1.umd.edu instead of hepcms-0.umd.edu. Configure the certificates as tomcat owned http certificates (with appropriate permissions as above) following the GUMS guide.
Any changes in the authority which provides the site certificates will require adding them appropriately in GUMS for http and rsv services.
If you are doing a yearly replacement of certificates, be sure to Stop and Start OSG services followed by tests of the grid services. Be sure to test again after the date at which the old certificates expire in case you forgot to replace something.
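When doing the yearly replacement, a quick sanity check with standard openssl can catch a forgotten certificate before it expires (paths as installed above; the 30-day window is an arbitrary choice, not an OSG requirement):

```shell
# Print subject and expiry of each installed cert; warn if any expires
# within 30 days (2592000 seconds).
for cert in /etc/grid-security/hostcert.pem \
            /etc/grid-security/http/httpcert.pem \
            /etc/grid-security/rsvcert.pem; do
  echo "== $cert"
  openssl x509 -noout -subject -enddate -in "$cert"
  if ! openssl x509 -noout -checkend 2592000 -in "$cert"; then
    echo "WARNING: $cert expires within 30 days"
  fi
done
```

Run the same loop over the GUMS host certificates if you installed GUMS on another node.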
Install and configure the CE, BeStMan, and the WN client
Description | Install all OSG core software, specifically the compute element, a BeStMan-Gateway storage element, and the worker node client. |
Dependencies | - Site certificates - RPM setup on the grid node - Disk array network mounted on all nodes |
Notes | We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory. |
Guides | - OSG release documentation |
- Prepare the environment
- Install the compute element
- Configure the CE
- Edit vomses file
- Get the OSG environment
- Install & configure GUMS
- Install & configure gridftp-hdfs
- Install & configure the storage element
- Install the worker node client
Prepare the environment:
Description | Prepare to install OSG by creating the appropriate directories, network mounting, installing xinetd, and changing the output of hostname. |
Dependencies | - Base installation directory (we use /sharesoft) network mounted on all nodes |
Notes | We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory. |
Guides | - How to change the output of hostname |
- Create some directories which are wanted only for historical OSG reasons (for the empty setup.(c)sh script). As root on the GN (su -):
mkdir /scratch/osg
mkdir /scratch/osg/ce
touch /scratch/osg/ce/setup.sh
touch /scratch/osg/ce/setup.csh
- Create directories needed for test jobs:
mkdir /hadoop/osg
chown root:users /hadoop/osg
chmod 775 /hadoop/osg
- Have all nodes (including the GN) mount /scratch/osg on the GN as /sharesoft/osg. Edit /etc/auto.sharesoft on the HN as root (su -) and add the line:
osg grid-0-0.local:/scratch/osg
- We use /tmp on the WNs as the temporary working directory for OSG jobs. If you haven't done so already, configure cron to garbage collect /tmp on all of the nodes.
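One way to do this (illustrative; tmpwatch ships with SL5, and the 10-day window is our assumption, not an OSG requirement) is a daily cron script on each node:

```
# /etc/cron.daily/tmpwatch-osg (illustrative)
/usr/sbin/tmpwatch --mtime 240 /tmp
```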
- Rocks 5.4 / SL 5.4 does not install the xinetd service on the GN, which is needed by OSG services. Install it, start it, and add it to the boot sequence (not certain this is needed in Release 3 of OSG):
yum install xinetd
/etc/rc.d/init.d/xinetd restart
chkconfig --add xinetd
chkconfig xinetd on
- On a Rocks appliance, the command hostname outputs the local name (in our case, grid-0-0.local) instead of the FQHN. OSG needs hostname to output the FQHN, so we modify our configuration such that hostname prints hepcms-0.umd.edu following these instructions. (Note that it is probably better to do this using Rocks tools instead of hard coding.)
- In /etc/sysconfig/network, replace:
HOSTNAME=grid-0-0.local
with
HOSTNAME=hepcms-0.umd.edu
- In /etc/hosts, add:
128.8.164.12 hepcms-0.umd.edu
- Then tell hostname to print the true FQHN:
hostname hepcms-0.umd.edu
- And restart the network:
service network restart
- Important: log out from the GN and log back in again before proceeding (otherwise your CE install below may not pick up the correct hostname).
- Follow the OSG ports guide to open up necessary ports.
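As a starting point, the ports already used elsewhere on this page can be collected into firewall rules. A sketch that prints the iptables rules to review (gsiftp 2811 and gatekeeper 2119 from the /etc/services entries below, 8443 for SRM/GUMS, and the 20000-25000 globus-tcp-port-range set for BeStMan below) — check the OSG ports guide for the full, current list before applying anything as root:

```shell
# Print candidate iptables rules for the OSG services configured on this
# page. Review against the OSG ports guide, then apply as root.
for rule in \
  "-A INPUT -p tcp --dport 2811 -j ACCEPT" \
  "-A INPUT -p tcp --dport 2119 -j ACCEPT" \
  "-A INPUT -p tcp --dport 8443 -j ACCEPT" \
  "-A INPUT -p tcp --dport 20000:25000 -j ACCEPT"
do
  echo "iptables $rule"
done
```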
Install the compute element:
Description | Needs rewriting for Release3! Install the CE, set Condor as the job manager, install ManagedFork, handle port conflicts, download certs, and configure rsvuser. |
Dependencies | - Site certificates obtained |
Notes | These are only additional notes, follow the official OSG CE release docs, consulting our notes for details. We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory. |
Guides | - OSG release docs to install the CE - Using condor as the OSG jobmanager - Installing ManagedFork |
- We install in /sharesoft/osg/ce:
cd /sharesoft/osg/ce
- The pacman CE install:
pacman -get http://software.grid.iu.edu/osg-1.2.18:ce
outputs the messages:
INFO: The Globus-Base-Info-Server package is not supported on this platform
INFO: The Globus-Base-Info-Client package is not supported on this platform
which are safe to ignore.
- Be sure to get the environment after downloading:
. setup.sh
- We use the "root" version of the certificate install area (/etc/grid-security/) as described above.
- We use our existing Condor installation as our jobmanager, so execute:
export VDTSETUP_CONDOR_LOCATION=/opt/condor
pacman -allow trust-all-caches -get http://software.grid.iu.edu/osg-1.2:Globus-Condor-Setup
- We also use ManagedFork:
pacman -allow trust-all-caches -get http://software.grid.iu.edu/osg-1.2:ManagedFork
$VDT_LOCATION/vdt/setup/configure_globus_gatekeeper --managed-fork y --server y
- Since we run our CE & SE on the same node and various CMS utilities assume the SE is on port 8443, we need to change the ports that some CE services run on.
- Replace 8443 in $VDT_LOCATION/tomcat/v55/conf/server.xml with 7443. The line:
enableLookups="false" redirectPort="8443" protocol="AJP/1.3"
should become:
enableLookups="false" redirectPort="7443" protocol="AJP/1.3"
- Edit the file $VDT_LOCATION/apache/conf/extra/httpd-ssl.conf to change port 8443 to port 7443. The lines:
Listen 8443
RewriteRule (.*) https://%{SERVER_NAME}:8443$1
<VirtualHost _default_:8443>
ServerName www.example.com:8443
should become:
Listen 7443
RewriteRule (.*) https://%{SERVER_NAME}:7443$1
<VirtualHost _default_:7443>
ServerName www.example.com:7443
- Don't forget to run the post install:
vdt-post-install
- We download certs to the local directory, which is network mounted and so readable by all nodes in the cluster:
vdt-ca-manage setupca --location local --url osg
The local directory /etc/grid-security/certificates on all nodes which need access to certs should point to the CE $VDT_LOCATION/globus/share/certificates. E.g., as root (su -) on the GN (needed by OSG services) and interactive nodes (needed by CRAB):
mkdir /etc/grid-security
cd /etc/grid-security
ln -s /sharesoft/osg/ce/globus/share/certificates
The WNs will get certificates by following the symlinks we create in the wnclient directory (installation instructions for the WN client are below); they do not assume that certificates are at /etc/grid-security/certificates. Because of the gridftp-hdfs installation, we will have to remove the symlink above before installing gridftp-hdfs and re-add it afterwards.
- Note: this step may no longer be necessary in OSG 1.2. RSV needs to run in the condor-cron queue instead of the global condor pool because it has many lightweight jobs running constantly. Edit ~rsvuser/.cshrc and add:
source /sharesoft/osg/ce/setup.csh
source $VDT_LOCATION/vdt/etc/condor-cron-env.csh
and edit ~rsvuser/.bashrc and add:
. /sharesoft/osg/ce/setup.sh
. $VDT_LOCATION/vdt/etc/condor-cron-env.sh
Configure the CE:
Description | Configure which services the CE will run and other various settings in config.ini. |
Dependencies | - OSG CE installed and CE environment sourced |
Notes | Follow the official OSG CE release docs, our config.ini is available here for reference. In OSG 1.2, config.ini is placed in the $VDT_LOCATION/osg/etc directory instead of $VDT_LOCATION/monitoring. After editing config.ini, be sure to call configure-osg -v to verify your syntax and configure-osg -c to actually configure OSG. Because we install the CE & SE software in the same directory, the config.ini which comes with the SE will overwrite the config.ini from the CE. This may also occur on subsequent updates, so be sure to keep a backed up copy of config.ini in a different location. |
Guides | - OSG release docs to configure the CE |
Get the OSG environment:
Description | Edit the .bashrc & .cshrc skeleton files so new users get the OSG environment on login. |
Dependencies | None |
Notes | Highly optional - this is just how we do it at UMD. Existing users (such as cmssoft) will have to add the source commands to their ~/.bashrc & ~/.cshrc files. |
Guides |
As root (su -) on the HN:
- Add to /etc/skel/.bashrc:
. /sharesoft/osg/ce/setup.sh
- Add to /etc/skel/.cshrc:
source /sharesoft/osg/ce/setup.csh
Edit vomses:
Description | Edit the vomses file to use one proxy server for cms. |
Dependencies | - OSG CE installed |
Notes | This is optional, and if users let CRAB initiate getting a proxy this is not needed. Do NOT remove mis or ops |
Guides |
As root (su -) on the GN:
- Remove the following line from /sharesoft/osg/ce/glite/etc/vomses:
"cms" "voms.cern.ch" "15002" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "cms"
Install and configure GUMS:
Description | Install the service which maps a user's distinguished name in their certificate to an account on your cluster. We install GUMS on the HN. |
Dependencies | - OSG CE installed (for /etc/grid-security/certificates). |
Notes | These are only additional notes, follow the official OSG release docs, consulting our notes for details. The alternative to GUMS is the grid-mapfile service, which is a simpler way to get started. However, GUMS is highly recommended as the permanent authentication mechanism. |
Guides | - OSG guide to getting site certificates - OSG GUMS guide |
Follow the OSG GUMS guide to install and configure. Notes and additions:
- The GUMS host (the SE node) will also need site and http certificates. Follow the instructions for getting site certificates (but obtain them for the FQDN of the SE node) and place them in /etc/grid-security, but only get certificates for the host and http services (don't need rsv certificate for the GUMS host).
- We install on the SE using yum following the OSG Release3 GUMS installation guide
- Follow the directions about being careful about the MySQL password and security
- To get your personal DN to be added to the GUMS admins, you can use grid-cert-info to query your usercert.pem file, typically located in ~yourUsername/.globus:
grid-cert-info -subject -file ~yourUsername/.globus/usercert.pem
More info on getting and importing personal grid certificates can be found in the user guide.
- Add yourself to the GUMS database with:
gums-add-mysql-admin 'YOUR DN'
- To setup the configuration in /etc/gums/gums.config, follow the OSG installation recommendations. Details of our installation: we remove VOs that we don't support. This can be done from the web interface instead -- indeed, changes to the configuration from the web interface will modify gums.config. However, editing the file now is a nice way to start with a cleaner environment. We remove the following:
- All of the groupMapping blocks except for those for the mis, cms, and ops VOs (under name='mis', name='cms', and name='ops'). mis and ops must be supported by all OSG sites.
- Remove the groups in <vomsUserGroup ... name=.../>, <groupAccountMapper ... name=.../>, and <groupToAccountMapping ... name= .../>, except for the following: cmsfrontier, cmsphedex, cmsphedex2, cmsprod, cmsproduction, cmssoft, cmssoft2, cmsuser, cmsuser-null, gums-test, mis, ops, uscmst2admin, uscmsuser
- You will then have a <hostToGroupMapping ... like the following at the end:
<hostToGroupMapping
groupToAccountMappings='localGroupToAccountMapping, uscmsuser, cmsuser, uscmst2admin, uscmssoft, cmssoft, uscmsprod,
uscmsphedex, cmsphedex2, uscmsfrontier, cmsuser-null, cmsproduction, ops, mis, gums-test, rsv, http'
description=''
cn='*/?*.umd.edu'/>
- To start, first fetch the CRLs:
/usr/sbin/fetch-crl3
/sbin/service fetch-crl3-boot start
/sbin/service fetch-crl3-cron start
- Then start the GUMS services:
/sbin/service mysqld start
/sbin/service tomcat5 start
- Have fetch-crl3 keep CRLs up to date after reboots:
/sbin/chkconfig fetch-crl3-boot on
/sbin/chkconfig fetch-crl3-cron on
- Now to add certificates and accounts, go to the GUMS web interface, which will be located at https://yourgumshost:8443/gums. Open "Configuration->Summary" to force GUMS to read your modified gums.config file.
- You need to assign a range of pool accounts to the CMS VO. In "Configuration->Account Mappers->Manage Pool Accounts", select uscmsPool and provide a range. The number of accounts you will require will depend on whether you want to generate a grid-mapfile or not.
- If you generate and keep a grid-mapfile, every user in the VO will be assigned to a specific UNIX account that you will have to make. CMS currently requires on the order of 3000 pool accounts (more may be needed as certificate authorities change).
- Note that in usage, we found the accounts were automatically recycled (a single user coming in on glidein will map to any account, usually uscms0001-uscms0012 at the same time).
- These accounts will need to be made. Supposing you used a pool range of cms0000-cms0500, here is a script to make the accounts, which must be called on the HN as root. Be sure to call the Rocks user update commands to propagate the accounts to the rest of the cluster:
ssh-agent $SHELL
ssh-add
rocks sync users
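The account-creation script itself is not reproduced above; a minimal sketch, assuming the cms0000-cms0500 range from the example and the no-login-shell convention used elsewhere in this guide (run as root on the HN, then propagate with the rocks commands shown):

```shell
# Hypothetical sketch: create pool accounts cms0000 through cms0500 with no
# usable login shell. Adjust the prefix and range to match your GUMS pool.
for i in $(seq 0 500); do
  acct=$(printf 'cms%04d' "$i")
  useradd -c "CMS pool account" -s /bin/true "$acct"
done
```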
- The tests in the OSG GUMS guide are helpful.
- You also must add the GN http and rsv service certs to GUMS. The below process must be repeated for each one, changing the recognized DN and various GUMS object names each time. The following example is for the RSV certificate (you will have to do the same for the http service certificate):
- "Configuration->User Groups", add button at bottom
- Name: rsvGroup
- Description: RSV
- Type: manual
- Persistence factory: mysql
- Members URI: leave blank
- Non-members URI: leave blank
- GUMS Access: read self
- "Configuration->Account Mappers", add button at bottom
- Name: rsvAccountMapper
- Description: RSV
- Type: group
- Account:
- Specify the UNIX user to map to here.
- RSV should map to the user in the OSG CE config.ini, in our case, rsvuser. Make this account if you haven't already.
- For the GN http cert, create some generic account that doesn't have a login shell, e.g.:
useradd -c "CMS grid jobs" -n cms -s /bin/true
- Be sure to call rocks sync users to propagate these new accounts to the rest of the cluster.
- "Configuration->Group to Account Mappings", add button at bottom
- Name: rsvGroupToAccountMapping
- Description: RSV
- User Group(s): rsvGroup, leave second dropdown blank
- Account Mapper(s): rsvAccountMapper, leave second dropdown blank
- Accounting VO Subgroup: leave blank
- Accounting VO: leave blank
- "Configuration->Host To Group Mappings", edit button at left
- Select rsvGroupToAccountMapping for last dropdown
- "User Management->Manual User Group Members", add button at bottom
- User group: rsvGroup
- DN: /DC=com/DC=DigiCert-Grid/O=Open Science Grid/OU=Services/CN=rsv/hepcms-0.umd.edu
- Finally, we map a few individual certificates to specific accounts. We do this because these accounts are expected to need specialized write access to some disks. Since we regularly unassign pool accounts, the UNIX account that these special users get mapped to would change. The below process only needs to be done once for any users in the CMS VO. The final step is then modified to add specific users.
- "Configuration->User Groups", add button at bottom
- Name: localUserGroup
- Description: Users affiliated with UMD
- Type: voms
- VOMS Server: cms
- Remainder URL: leave blank
- Accept non-VOMS certificates: true
- Match VOMS certificate's FQAN as: vo
- VO/Group: /cms
- Role: leave blank
- GUMS Access: read self
- "Configuration->Account Mappers", add button at bottom
- Name: localAccountMapper
- Description: Users affiliated with UMD
- Type: manual
- Persistence factory: mysql
- "Configuration->Group to Account Mappings", add button at bottom
- Name: localGroupToAccountMapping
- Description: Users affiliated with UMD
- User Group(s): localUserGroup, leave second dropdown blank
- Account Mapper(s): localAccountMapper, leave second dropdown blank
- Accounting VO Subgroup: cms
- Accounting VO: CMS
- "Configuration->Host To Group Mappings", edit button at left
- Order matters. Since these users are in the CMS VO, the normal CMS pool account will be assigned to them unless localGroupToAccountMapping is listed first. Rearrange the entire list so that localGroupToAccountMapping is first.
- "User Management->Manual Account Mappings", add button at bottom
- This step is repeated for each individual user.
- We map local users with certificates to their own new grid accounts (username_g). Since grid accounts shouldn't be login accounts, it is not advised to map grid jobs from local users to regular accounts.
- Be sure to make the UNIX accounts as root on the HN. Grid accounts should never need a login shell, so useradd for these accounts should always be called with the option "-s /bin/true", which prevents interactive logins. Also be sure to call rocks sync users to propagate these new accounts to the rest of the cluster.
- Get the user DN, select localAccountMapper, and specify the UNIX account to map the DN to. Here is a DN that is convenient for CMS (for SAM tests):
- For SAM (create an account for SAM): /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=sciaba/CN=430796/CN=Andrea Sciaba
- GUMS log locations are described in the Release3 GUMS installation guide; the logs can be found in /var/log/tomcat5.
Install & configure the gridftp-hdfs service:
Description | Install the gridftp-hdfs service, which provides GridFTP transfer access to files stored on Hadoop. |
Dependencies | - GUMS installed, configured, and running - Hadoop installed and running |
Notes | These are only additional notes, follow the official OSG release docs, consulting our notes for details. Full testing may only be available after the SE is configured below. |
Guides | - Hadoop docs for gridftp-hdfs |
- Turn off running OSG services (if services were started, as root on the GN):
cd /sharesoft/osg/ce
. setup.sh
vdt-control --off
- If the following symlink exists, remove it, because the gridftp-hdfs installed via yum will try to modify it:
cd /etc/grid-security
rm certificates
- If you have already called configure-osg, back up the following files (used in GUMS):
/etc/grid-security/prima-authz.conf
/etc/grid-security/gsi-authz.conf
- In a fresh shell as root (su -) on the GN:
yum install gridftp-hdfs
- Because we install the CE and SE on the same node, we have to do this:
rpm -e --nodeps osg-ca-certs fetch-crl
- Remove the installed certificate directory and replace it with the symlink:
cd /etc/grid-security
rmdir certificates
ln -s /sharesoft/osg/ce/globus/share/certificates
- Restore the prima-authz.conf and gsi-authz.conf files.
- Since we have our CE and SE on the same node, and they require different environment settings for globus_mapping, create an authorization configuration file for the SE containing the mapping callout (referenced via GSI_AUTHZ_CONF in gridftp-hdfs-local.conf in the next step):
globus_mapping /usr/lib64/liblcas_lcmaps_gt4_mapping.so lcmaps_callout
- Configure the service by modifying /etc/gridftp-hdfs/gridftp-hdfs-local.conf; change directory names to those used in our system:
export GRIDFTP_HDFS_MOUNT_POINT=/hadoop
export TMPDIR=/tmp
export GSI_AUTHZ_CONF=/etc/grid-security/gridftp-hdfs-authz.conf
- Modify /etc/gridftp-hdfs/replica-map.conf to comment out the following line (local setting choice):
#/store/user 3
and change the PhEDEx load test directory to that used on our cluster:
/store/PhEDEx_LoadTest07 1
- Modify /etc/lcmaps/lcmaps.db to talk to GUMS following our hostname settings for the scasclient:
"--endpoint https://hepcms-hn.umd.edu:8443/gums/services/GUMSXACMLAuthorizationServicePort"
- Make sure that the gridftp ports are in /etc/services; if not, add them:
gsiftp 2811/tcp # GSI FTP
gsiftp 2811/udp # GSI FTP
globus-gatekeeper 2119/tcp # VDT in /sharesoft/osg/ce-1.2
- Start the gridftp-hdfs service:
service xinetd restart
- For an install of just gridftp-hdfs, start the OSG services now (make sure that gsiftp is disabled). Otherwise, start services later, after the SE & WN client are installed and configured.
cd /sharesoft/osg/ce
. setup.sh
vdt-control --list
vdt-control --disable gsiftp
vdt-control --on
- You can test this service with globus-url-copy (not BeStMan dependent). Depending on your order of installation and services this may not work yet (try again later), and it may not be possible until the SE is configured and running below. As a user with a grid certificate installed, on an interactive node, with a small local file transfer-tests.txt:
globus-url-copy file:///`pwd`/transfer-tests.txt gsiftp://hepcms-0.umd.edu:2811/hadoop/store/user/username/transfer-tests8.txt
Install & configure the storage element:
Description | Install BeStMan-Gateway to provide access to disk array via grid utilities. (NOT fixed for OSG Release3!) |
Dependencies | - OSG CE configured - GUMS installed, configured, and running - Disk array network mounted on all nodes |
Notes | These are only additional notes, follow the official OSG release docs, consulting our notes for details. We install the worker-node client, the CE, and SE all on the same node (the grid node) and the CE & SE in the same directory. We run BeStMan as user "best" instead of as "daemon" because we do not allow world-readable access to files on our SE. The daemon user is not in the "users" group, but the best user is. |
Guides | - OSG release docs for BeStMan-Gateway |
- Do this step in a fresh shell.
- If you want BeStMan to be in the users group, BeStMan can't be run as the daemon user. We make a user account for BeStMan following these instructions. Supposing a username of best, this user will need the host certificate:
cp /etc/grid-security/containercert.pem /etc/grid-security/bestmancert.pem
cp /etc/grid-security/containerkey.pem /etc/grid-security/bestmankey.pem
chown best:users /etc/grid-security/bestmancert.pem
chown best:users /etc/grid-security/bestmankey.pem
- We install in our /sharesoft/osg/se directory, which is a symlink to our ce installation directory. If you're working in a fresh shell, be sure to source the existing OSG installation:
cd /sharesoft/osg/se
. setup.sh
- BeStMan comes with its own config.ini file, which will overwrite your existing config.ini file. Copy your original config.ini to a safe place, then call pacman to get BeStMan:
pacman -get http://software.grid.iu.edu/osg-1.2:Bestman
Then copy your original config.ini back to $VDT_LOCATION/osg/etc/config.ini (we will use configure_bestman, so we don't need the config.ini which comes with the SE). Note that any subsequent update to OSG may cause the same problem, so keep a backed-up copy of your config.ini in another directory.
- Be sure to source the environment again to get the new settings:
. setup.sh
- The certificate updater service is already configured to run via the CE, so we don't need to take any special steps for the SE. This is because we installed the SE on the same node and in the same directory as the CE.
- We use the following configuration settings:
vdt/setup/configure_bestman --server y \
--user best \
--cert /etc/grid-security/bestmancert.pem \
--key /etc/grid-security/bestmankey.pem \
--http-port 7070 \
--https-port 8443 \
--globus-tcp-port-range 20000,25000 \
--enable-gateway \
--with-allowed-paths "/tmp;/home;/data;/hadoop" \
--with-transfer-servers gsiftp://hepcms-0.umd.edu \
--gums-host hepcms-hn.umd.edu \
--gums-port 8443
If you call configure_bestman more than once, it will issue the message:
find: /sharesoft/osg/se-1.2/bestman/bin/sharesoft/osg/se-1.2/bestman/sbin/sharesoft/osg/se-1.2/bestman/setup: No such file or directory
which can be safely ignored.
- Don't forget to edit the sudoers file to give the daemon user the needed permissions. Run visudo, copy and paste the needed lines from the Twiki, comment out the "Defaults requiretty" line, and replace <user_name> with best. (In visudo, press "a" to enter insert mode, Esc when done, then ":wq!" to save and exit.)
- Validation can only be performed after the SE is started. We first examine the test results from RSV on the SE and perform further manual tests as described in the instructions if RSV tests fail. RSV will run once we start services (later).
Install the worker node client:
Description | Install the worker node client to give worker nodes access to certificate information, system configuration, and software binaries. |
Dependencies | - OSG CE configured |
Notes | These are only additional notes, follow the official OSG release docs, consulting our notes for details. |
Guides | - OSG release docs for worker node client |
- Do this step from a fresh shell
- To install:
cd /sharesoft/osg/wnclient
pacman -allow trust-all-caches -get http://software.grid.iu.edu/osg-1.2:wn-client
You can safely ignore the message:
INFO: The Globus-Base-Info-Client package is not supported on this platform - Don't forget to get the new environment:
. setup.sh
- Because we install the WN client on the same network mount as the CE, we have the CE handle certificates. This is option 2 in the Twiki.
- Tell the WN client that we will store certificates in the local directory, specifically /sharesoft/osg/wnclient/globus/TRUSTED_CA:
vdt-ca-manage setupca --location local --url osg - Since we will run our CE on the same node, point the WN client TRUSTED_CA directory to the CE TRUSTED_CA directory:
rm globus/TRUSTED_CA
ln -s /sharesoft/osg/ce/globus/TRUSTED_CA globus/TRUSTED_CA - The original WN client certificate directory can be removed if desired:
rm globus/share/certificates
rm -r globus/share/certificates-1.16
- Since the WN client is on the same node as the CE, no services need to be enabled or turned on. It is purely a passive software directory from which WNs can grab binaries and configuration.
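The TRUSTED_CA symlink swap above can be sketched as follows. This runs in a scratch directory so it can be exercised without the real /sharesoft mount; both paths are stand-ins for the actual CE and WN client locations.

```shell
#!/bin/sh
# Sketch of the TRUSTED_CA symlink swap, run in a scratch directory so it
# can be tested without the real /sharesoft mount (both paths are stand-ins).
ROOT=$(mktemp -d)
mkdir -p "$ROOT/ce/globus/TRUSTED_CA" "$ROOT/wnclient/globus/TRUSTED_CA"
cd "$ROOT/wnclient"
rm -rf globus/TRUSTED_CA                               # drop the WN client's own CA dir
ln -s "$ROOT/ce/globus/TRUSTED_CA" globus/TRUSTED_CA   # point at the CE's copy
LINK=$(readlink globus/TRUSTED_CA)
echo "$LINK"
```

On the real node a plain `rm globus/TRUSTED_CA` suffices when TRUSTED_CA is already a symlink; `readlink` is a quick check that the link points where you expect.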
Start the CE & SE
Description | Start the compute and storage elements, check RSV tests and service logs, and publish CMSSW software pins if applicable. |
Dependencies | - OSG CE configured - OSG SE installed and configured - OSG WN client installed |
Notes | |
Guides | - Globus 2.0 error codes (for debugging) |
As root (su -) on the GN:
- Start the OSG CE & SE:
cd /sharesoft/osg/ce
. setup.sh
vdt-control --on
This starts all the services for both the CE & SE because we installed them in the same directory.
- If running gridftp-hdfs, make sure the plain gsiftp service is disabled.
- You can perform a series of simple tests to see if your CE has basic functionality. Login to any user account and:
source /sharesoft/osg/ce/setup.csh
grid-proxy-init
cd /sharesoft/osg/ce/verify
./site_verify.pl
- The CEmon log is kept at $VDT_LOCATION/glite/var/log/glite-ce-monitor.log.
- The GIP logs are kept at $VDT_LOCATION/gip/var/logs.
- globus & gridftp logs are kept in $GLOBUS_LOCATION/var and $GLOBUS_LOCATION/var/log.
- The BeStMan log is kept in $VDT_LOCATION/vdt-app-data/bestman/logs/event.srm.log.
- Results of the RSV probes will be visible at https://hepcms-0.umd.edu:7443/rsv in 15-30 mins. Further information can be found in the CE $VDT_LOCATION/osg-rsv/logs/probes.
- You can force RSV probes to run immediately (as rsvuser on the GN): rsv-control --run --all-enabled
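A quick way to spot failing probes is to scan the RSV probe logs for CRITICAL results. This is a sketch only: a scratch directory stands in for $VDT_LOCATION/osg-rsv/logs/probes, and the "metricStatus" lines are a simplified stand-in for the real probe log format.

```shell
#!/bin/sh
# Sketch of scanning RSV probe logs for failures. A scratch directory stands in
# for $VDT_LOCATION/osg-rsv/logs/probes, and the "metricStatus" lines are a
# simplified stand-in for the real probe log format.
LOGDIR=$(mktemp -d)
printf 'metricStatus: OK\n'       > "$LOGDIR/ping.log"
printf 'metricStatus: CRITICAL\n' > "$LOGDIR/gram.log"
FAILED=$(grep -l 'CRITICAL' "$LOGDIR"/*.log)   # list probes reporting failure
echo "$FAILED"
```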
After starting the CE for the first time, the file /sharesoft/osg/app/etc/grid3-locations.txt is created. This file is used to publish VO software pins and should be edited every time a new VO software release is installed or removed. If CMSSW is installed (the instructions below are repeated in the CMSSW installation):
- Create a directory for the CMSSW link in the OSG APP directory:
cd /sharesoft/osg/app
mkdir cmssoft
chmod 777 cmssoft
chown cmssoft:users cmssoft
- Give cmssoft ownership of the release file:
chown cmssoft:users /sharesoft/osg/app/etc/grid3-locations.txt
- As cmssoft (su - cmssoft), create the needed symlink in the OSG APP directory to CMSSW:
cd /sharesoft/osg/app/cmssoft
ln -s /sharesoft/cmssw cms
- As cmssoft (su - cmssoft), inform BDII which versions of CMSSW are installed and that we have the slc4_ia32_gcc345 environment. Edit /sharesoft/osg/app/etc/grid3-locations.txt to include the lines:
VO-cms-slc4_ia32_gcc345 slc4_ia32_gcc345 /sharesoft/cmssw
VO-cms-CMSSW_X_Y_Z CMSSW_X_Y_Z /sharesoft/cmssw
(modify X_Y_Z and add a new line for each release of CMSSW installed)
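The pin-publishing step above can be sketched as a small script. A scratch file stands in for /sharesoft/osg/app/etc/grid3-locations.txt, and CMSSW_5_3_11 is a stand-in release name; substitute whichever versions are actually installed.

```shell
#!/bin/sh
# Sketch of publishing a CMSSW software pin. A scratch file stands in for
# /sharesoft/osg/app/etc/grid3-locations.txt, and CMSSW_5_3_11 is a stand-in
# release name; substitute whatever versions are actually installed.
LOC=$(mktemp)
REL=CMSSW_5_3_11
cat >> "$LOC" <<EOF
VO-cms-slc4_ia32_gcc345 slc4_ia32_gcc345 /sharesoft/cmssw
VO-cms-$REL $REL /sharesoft/cmssw
EOF
COUNT=$(grep -c '^VO-cms-' "$LOC")   # one pin line per published tag
echo "$COUNT"
rm -f "$LOC"
```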
Register with the Grid Operations Center (GOC):
Description | Register site with OSG. |
Dependencies | None |
Notes | We used a much older registration process, but provide the options we selected for reference. Follow the OIM registration instructions guide for details as the new registration page has changed substantially. |
Guides | - OIM web portal - OIM registration instructions - BDII information about your site, once registered |
- Facility: My Facility Is Not Listed (now that we have registered, we select University of Maryland for any new resources we might add later)
- Site: My Site Is Not Listed (again, now that we have registered, we select umd-cms)
- Resource Name: umd-cms
- Resource Services: Compute Element, Bestman-Xrootd Storage Element
- Fully Qualified Domain Name: hepcms-0.umd.edu
- Resource URL: http://hep-t3.physics.umd.edu
- OSG Grid: OSG Production Resource
- Interoperability: Select "This is a WLCG resource", "Forward BDII data to WLCG Interop BDII", and "Forward RSV data to WLCG Interop Monitoring".
- GOC Logging: Do not select Publish Syslogng
- Resource Description: Tier-3 computing center. Priority given to local users, but opportunistic use by CMS VO allowed.