Installation of the Glidein Pool Collector
1. Description
The glidein pool Collector node will be the Condor Central Manager for the glidein pool, i.e. it will run the Condor Collector and Negotiator daemons. These daemons define the glidein pool; if this node dies, the pool dies with it.
2. Hardware requirements
CPUs | Memory | Disk |
1 - 2 | 1+GB | 5+GB |
This machine needs one or two fast CPUs (one for the Collector and one for the Negotiator) and a moderate amount of memory (1GB should be enough for most tasks; really big pools may need more).
It must have reliable network connectivity and must be on the public internet, with no firewalls; all worker nodes will be continuously sending UDP packets to the Collector.
The machine must be very stable; if the Collector dies, the glidein pool dies with it. (There are Condor techniques to minimize this damage, but you should still choose the most stable machine you can afford.)
The disk is needed only for the Condor binaries and log files (5GB should be enough).
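As a rough pre-flight check against the figures above, the following shell sketch reports the CPU count, total memory and free disk space of a Linux host. The function names, the /proc defaults and the thresholds in the usage comments are illustrative assumptions; none of this is part of glideinWMS or Condor.

```shell
#!/bin/sh
# Hypothetical helpers for checking a candidate Collector host.

# Total memory in MB, read from a meminfo-style file (default: /proc/meminfo).
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Free space in MB on the filesystem that holds the given directory.
disk_free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Number of CPUs, counted from a cpuinfo-style file (default: /proc/cpuinfo).
cpu_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Example use (1GB of memory and 5GB of disk, as in the table above):
#   [ "$(mem_total_mb)" -ge 1024 ] || echo "less than 1GB of memory"
#   [ "$(disk_free_mb /home/collector)" -ge 5120 ] || echo "less than 5GB free"
#   echo "CPUs: $(cpu_count)"
```

The helpers only print numbers; deciding what counts as enough for a large pool is left to the administrator.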
3. Needed software
Software | Notes | Install Before glideinWMS |
Linux OS | A reasonably recent Linux OS (RH/SL4 and RH/SL5 tested at press time). | X |
Python interpreter | v2.3.4 or above | X |
The OSG client software. | This can be installed before glideinWMS, or the installer can install it inline during the glideinWMS install. | |
The Condor distribution as a tarball. | The installer will use the tarball to install and configure Condor inline | |
The glideinWMS software. | | |
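The minimum versions in the table can be checked with a small dotted-version comparison. This is a minimal sketch in plain POSIX shell; version_ge is a hypothetical helper name, and it assumes purely numeric version fields (a suffix such as "2.7.5+" would need stripping first).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when dotted version $1 is >= dotted version $2.
# Fields are compared numerically, left to right; missing fields count as 0.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}
        y=${b%%.*}
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
        # Drop the field just compared; stop when no dot is left.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Example use (Python 2 prints its version on stderr, hence 2>&1):
#   pyver=$(python -V 2>&1 | awk '{ print $2 }')
#   version_ge "$pyver" 2.3.4 || echo "Python $pyver is older than 2.3.4"
```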
NOTE:
- Condor version v7.3.1 has a known issue with incorrect return/exit codes of condor_status and condor_q
- If you are using Condor version v7.3.2, disable VOMS checking in the condor_config files used by Condor daemons, except the one used by the user schedd. VOMS checking adds unnecessary overhead. To do so, set
USE_VOMS_ATTRIBUTES = False
or, for an individual Condor daemon such as the collector,
COLLECTOR.USE_VOMS_ATTRIBUTES = False
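One way to apply the setting without duplicating it is sketched below. The disable_voms helper is hypothetical, and the choice of which config file to pass (for example a condor_config.local) is an assumption left to the administrator.

```shell
#!/bin/sh
# Hypothetical helper: append USE_VOMS_ATTRIBUTES = False to the given Condor
# config file unless the file already sets the global USE_VOMS_ATTRIBUTES knob.
# An existing setting is left untouched, whatever its value.
disable_voms() {
    cfg=$1
    grep -q '^[[:space:]]*USE_VOMS_ATTRIBUTES[[:space:]]*=' "$cfg" 2>/dev/null ||
        printf 'USE_VOMS_ATTRIBUTES = False\n' >> "$cfg"
}

# Example use:
#   disable_voms /path/to/condor_config.local
```

Running the helper twice on the same file adds the line only once. Per-daemon forms such as COLLECTOR.USE_VOMS_ATTRIBUTES are not detected by the pattern and would have to be handled separately.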
4. Before you begin...
The installer will ask for several DNs for GSI authentication. You have the option of using a service certificate or a proxy. These should be created and put in place before running the installer. The following is a list of DNs the installer will ask for:
- Pool Collector cert/proxy DN
- User Submitter cert/proxy DN
- Glidein Factory cert/proxy DN
- VO Frontend proxy DN (cannot use a cert here)
Note 2: The installer will ask if these are trusted Condor Daemons. Answer 'y'.
5. Installation instructions
The pool Collector can be installed either as root or as a non-privileged user. In either case, make sure that the user has access to the needed GSI credentials. There is no real advantage to installing as root, so a non-privileged installation is recommended if it is installed separately.
The whole process is managed by an install script described below.
Move into
glideinWMS/install
and execute
./glideinWMS_install
You will be presented with the service selection screen. Choose [4] for the user pool collector, then follow the instructions to install all the software components. Further details and a walk-through are presented below:
Field | Installation Text | Description |
Condor | Where do you have the Condor tarball? Where do you want to install it? | The user pool collector is part of the Condor pool that will actually run the user's jobs. This will be the server that you will submit jobs to. This piece of the install will configure the collector to work with the submitted glideins. For this, you will need a Condor distribution and a location to install to. It will also prompt for an administrator email. It is not recommended to install this into a user home directory. |
GSI Security | Where can I find the directory with the trusted CAs? | GSI security is based on x509 certificates. First, you will need a list of trusted certificates. VDT comes with a list of certificates, so if you install it now (or have installed it previously), you can use that list. Note that you may have to update your certificates if you have an old VDT installation. Next, you will need a certificate or proxy for the user pool collector. See the previous section for more information on required certificates and proxies. |
PrivSep | Please insert all such DNs, together with a user nickname. | You will need to provide the DN(s) of the glideins, the DNs of all the submit machines and the DN of the VO frontend. The installer will then configure the condor_mapfile (located in the certs directory for each Condor install). See GSI Reference for more information. |
Condor configuration | What name would you like to use for this pool? How many slave collectors do you want? | You will need to provide a name for your pool, and determine how many slave collectors you will need. The number of slave collectors will vary based on the number of jobs and other factors, and can be tuned later. |
A possible set of answers is shown here; your setup will probably be slightly different:
Welcome to the glideinWMS Installation Helper
What do you want to install?
(May select several options at one, using a , separated list)
[1] glideinWMS Schedds and Collector
[2] Glidein Factory
[3] GCB
[4] User Pool Collector
[5] User Schedd
[6] Condor for VO Frontend
[7] VO Frontend
[8] Components
Please select: 4
The following profiles will be installed:
[4] User Pool Collector
Installing pool collector
Installing condor
You will now need the Condor tarball
You can find it on http://www.cs.wisc.edu/condor/
Versions v7.2.2 and 7.3.1 have been tested, but you should always use the latest one
Where do you have the Condor tarball? /home/collector/downloads/condor-7.4.2-linux-x86_64-rhel5-dynamic.tar.gz
Checking... Seems condor version 7.4.2
Where do you want to install it?: [/home/collector/glidecondor] /home/collector/glidecondor
Directory '/home/collector/glidecondor' does not exist, should I create it?: (y/n) y
Installing condor in '/home/collector/glidecondor'
If something goes wrong with Condor, who should get email about it?: admin@my.org
Extracting from tarball
Running condor_configure
Installing Condor from /home/collector/glidecondor/tar/condor-7.4.2 to /home/collector/glidecondor
Condor has been installed into:
  /home/collector/glidecondor
Configured condor using these configuration files:
  global: /home/collector/glidecondor/etc/condor_config
  local: /home/collector/glidecondor/condor_local/condor_config.local
You should look inside the installation log for some details about how Condor was installed.
Created scripts which can be sourced by users to setup their Condor environment variables.
These are:
  sh: /home/collector/glidecondor/condor.sh
  csh: /home/collector/glidecondor/condor.csh
Do you want to split the config files between condor_config and condor_config.local?: (y/n) [y] y
The Condor config has been put in your login files
Please remember to exit and reenter the terminal after the install
Condor installed
Configuring GSI security
GSI security relies on a list of trusted CAs
Where can I find the directory with the trusted CAs?
Do you want to get it from VDT?: (y/n) y
Do you have already a VDT installation?: (y/n) y
Where is the VDT installed?: /home/collector/vdt
Using VDT installation in /home/collector/vdt
To use the GSI security for Pool Collector, you either need a valid GSI proxy or a valid x509 certificate and relative key.
Its subject (i.e. DN) will be added as the trusted daemon in the condor configuration.
Will you be using a proxy or a cert? (proxy/cert) cert
Where is your certificate located?: /home/collector/grid-security/servicecert.pem
Where is your certificate key located?: /home/collector/grid-security/servicekey.pem
My DN = '/DC=org/DC=doegrids/OU=Services/CN=collector/master1.my.org'
You will most probably need other DNs in the condor grid mapfile.
The User Schedd(s) and Glidein startds will connect to and act as daemons to the Pool Collector.
Any other node or process that needs to talk securely with the Collector (like the VO Frontend) also needs to be authenticated, but not as a daemon.
Finally, if you expect any processes on this node to use condor security toward other nodes (e.g. the VO Frontend talking to the WMS Collector), the remote services will also need to be authenticated.
The subjects (i.e. DNs) for these services will thus most likely be needed.
Please insert all such DNs, together with a user nickname.
An empty DN entry means you are done.
DN: /DC=org/DC=doegrids/OU=Services/CN=schedd1.my.org
nickname: [condor001] submit
Is this a trusted Condor daemon?: (y/n) y
DN: /DC=org/DC=doegrids/OU=Services/CN=gfactory/gfactory1.my.org
nickname: [condor002] pilot
Is this a trusted Condor daemon?: (y/n) y
DN: /DC=org/DC=doegrids/OU=Services/CN=frontend/frontend1.my.org
nickname: [condor002] frontend
Is this a trusted Condor daemon?: (y/n) n
DN: enter
What name would you like to use for this pool?: [My pool] TestPool
How many slave collectors do you want?: [10] 10
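The DN and nickname pairs entered above end up as entries in the condor_mapfile. The sketch below shows how one such entry could be built by hand, assuming the Condor unified map file syntax of a GSI line with a regular expression for the DN followed by the mapped name. The mapfile_line helper is hypothetical, and the lines the installer actually writes may differ in detail, so compare with the generated condor_mapfile before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print a condor_mapfile-style GSI entry for a DN and a
# nickname. Every character of the DN other than letters, digits and spaces is
# backslash-escaped so that the DN is matched literally as an anchored regex.
mapfile_line() {
    dn=$1
    nick=$2
    escaped=$(printf '%s' "$dn" | sed 's/[^A-Za-z0-9 ]/\\&/g')
    printf 'GSI "^%s$" %s\n' "$escaped" "$nick"
}

# Example use, with a DN and nickname from the session above:
#   mapfile_line '/DC=org/DC=doegrids/OU=Services/CN=schedd1.my.org' submit
```

Anchoring the expression with ^ and $ keeps a DN from matching as a prefix of a longer one.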
6. To Start/Stop Pool Collector
Set up the environment (the paths below assume the install location used in the example session, /home/collector/glidecondor):
source /home/collector/glidecondor/condor.sh
To start Condor, run:
/home/collector/glidecondor/sbin/condor_master
You should see three processes running as user condor: condor_master, condor_collector and condor_negotiator.
The log files can be found in
/home/collector/glidecondor/condor_local/log
To stop Condor, run:
/home/collector/glidecondor/sbin/condor_off -master
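After starting Condor, the presence of the three daemons named above can be checked from the process list. This is a minimal sketch: missing_daemons is a hypothetical helper that reads command lines on standard input and prints the expected daemons it does not find. It matches by substring, so any command line that merely mentions a daemon name also counts.

```shell
#!/bin/sh
# Hypothetical helper: read process command lines from stdin and print each
# expected Condor daemon whose name does not appear in any of them.
missing_daemons() {
    running=$(cat)
    for d in condor_master condor_collector condor_negotiator; do
        printf '%s\n' "$running" | grep -q -- "$d" || printf '%s\n' "$d"
    done
}

# Example use (full command lines are used because short process names
# may be truncated by ps):
#   ps -e -o args= | missing_daemons
```

An empty output means all three daemons were found; the log files are the place to look when one is reported missing.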