Installation of the WMS Collector and collocated glidein Factory

1. Description

The glidein Factory node will be the Condor Central Manager for the WMS, i.e. it will run the Condor Collector and Negotiator daemons, but it will also act as a Condor Submit node for the glidein factory, running Condor schedds used for Grid submission.
On top of that, this node also hosts the glidein factory daemons.

2. Hardware requirements

This machine needs one or two CPUs (one for the Condor daemons, and one for the glidein factory daemons) and a moderate amount of memory (512MB should be enough).
It must be on the public internet, with at least one port open to the world; all worker nodes will load data from this node through HTTP.
Disk space is needed only for binaries, config files, log files and Web monitoring data (10GB should be enough).
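
As a quick sanity check, the figures above can be compared against a candidate machine. The following is a minimal Python sketch and not part of glideinWMS: the function name check_factory_host and its parameters are hypothetical, the default thresholds are simply the numbers quoted in this section, and memory is not checked because the Python standard library has no portable call for it.

```python
import os
import shutil


def check_factory_host(path="/", min_cpus=1, min_free_gb=10):
    """Return a list of warnings; an empty list means both checks passed.

    Hypothetical helper: the thresholds are taken from the text above,
    not from any glideinWMS tool.
    """
    warnings = []

    # os.cpu_count() may return None if the count cannot be determined.
    cpus = os.cpu_count() or 0
    if cpus < min_cpus:
        warnings.append("CPU count %d is below the minimum of %d" % (cpus, min_cpus))

    # Free space on the filesystem holding `path`, converted to GB.
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    if free_gb < min_free_gb:
        warnings.append(
            "free disk %.1f GB on %s is below the minimum of %d GB"
            % (free_gb, path, min_free_gb)
        )
    return warnings
```

Calling check_factory_host("/home") on the target node and printing the returned list gives a rough indication only; it says nothing about network reachability or open ports.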

3. Needed software

A reasonably recent Linux OS (SL4 used at press time).
The OSG client software.
An HTTP server, like Apache or TUX.
A PostgreSQL server.
The Condor distribution.
The RRDTool package.
The glideinWMS software.

4. Base tools installation instructions

You will need to install all tools except the glideinWMS software as root. The whole process is managed by an install script, described below.
The script will try to automate the installation as much as possible, but you may need to manually download a few tarballs and RPMs for the script to use. The install script has pointers to the relevant URLs where you can download them, so be prepared to have a second terminal to perform the downloads.

Now move into

glideinWMS/install

and execute

./glideinWMS_install

You will be presented with this screen:

What do you want to install?
(May select several options at one, using a , separated list)
[1] glideinWMS Collector
[2] Glidein Factory
[3] GCB
[4] pool Collector
[5] Schedd node
[6] Condor for VO Frontend
[7] VO Frontend
[8] Components

Select 1.

Now follow the instructions and install all the software components.

Most of the questions should be fairly straightforward (see an example installation snapshot in factory_root_install_snapshot.txt).
The part that is not completely automatic is the configuration of the GSI security. You will need to specify the DNs of the glidein pool collector, the VO frontend and of all the submit machines. See Configuring GSI security in Condor for more details.

4.1 Example config files

If you would like a reference to compare your installation against, you can find the complete Condor config files in
example-config/glide-factory/mymachine/condor_config
and
example-config/glide-factory/mymachine/condor_config.local.

5. Glidein Factory installation guide

Before installing the glidein factory, create a valid x509 proxy in ~/.globus/x509_service_proxy. You will also need to keep it valid for the life of the factory. At any point in time, the remaining validity of this proxy must be at least as long as the longest job expected to be run by the glideinWMS (and never less than 12 hours).
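
The validity rule above amounts to a simple comparison. The following Python sketch illustrates it; the names MIN_PROXY_LIFETIME, required_proxy_lifetime and proxy_is_sufficient are hypothetical and not part of glideinWMS, and the caller is assumed to know both the remaining proxy lifetime and the longest expected job length.

```python
from datetime import timedelta

# Floor stated in the text: the proxy must never have less than 12 hours left.
MIN_PROXY_LIFETIME = timedelta(hours=12)


def required_proxy_lifetime(longest_job):
    """Remaining validity the proxy must have at any point in time."""
    return max(longest_job, MIN_PROXY_LIFETIME)


def proxy_is_sufficient(remaining, longest_job):
    """True if the remaining proxy lifetime satisfies the rule."""
    return remaining >= required_proxy_lifetime(longest_job)
```

For instance, with jobs of up to 48 hours, a proxy with 20 hours left would not be sufficient and should be renewed. The remaining lifetime could be obtained, for example, from the number of seconds reported by a tool such as grid-proxy-info -timeleft and wrapped in timedelta(seconds=...).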

How you keep this proxy valid (via MyProxy, kx509, voms-proxy-init from a local certificate, scp from other nodes, or other methods) is beyond the scope of this document.

The glidein factory itself should be installed as a non-privileged user, for example the user gfactory.

Move into

glideinWMS/install

and execute

./glideinWMS_install

You will be presented with this screen:

What do you want to install?
(May select several options at one, using a , separated list)
[1] glideinWMS Collector
[2] Glidein Factory
[3] GCB
[4] pool Collector
[5] Schedd node
[6] Condor for VO Frontend
[7] VO Frontend
[8] Components

Select 2.

Now follow the instructions and install all the software components.

Most of the questions should be fairly straightforward (see an example installation snapshot in factory_gf_install_snapshot.txt).
The only part that is not completely automatic is the listing of GCB nodes.

At this point you can start the factory with

cd /home/gfactory/glideinsubmit/glidein_v1_0
./factory_startup start

where the directory is the one written out by the installation script.

5.1 Manual configuration

The glidein factory can also be configured manually.
The complete guide can be found in the glideinWMS documentation; a working example is available in example-config/glide-factory/config_v1.xml.

Once a configuration file has been created, you can create the glidein factory by executing

cd glideinWMS/creation
./create_glidein config_v1.xml

The startup procedure is the same as described above.

6. Glidein Factory monitoring

There are several ways to monitor the entry points of the glidein factory:

6.1 Glidein factory entry Web monitoring

You can either monitor the factory as a whole, or just a single entry point.

The factory monitoring is located at a URL like the one below:

http://node1.my.company.org/glidefactory/monitor/glidein_v1_0/

Moreover, each entry point has its own history on the Web.

Assuming you have a SanDiego entry, it can be monitored at

http://node1.my.company.org/glidefactory/monitor/glidein_v1_0/entry_SanDiego/
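
The two URL shapes shown above follow one pattern, which can be sketched in Python as follows. The function factory_monitor_url and its parameters are hypothetical; the glidefactory/monitor prefix is copied from the example URLs and may differ depending on how the Web server was configured during installation.

```python
def factory_monitor_url(host, glidein_name, entry=None,
                        web_base="glidefactory/monitor"):
    """Build the monitoring URL for a factory, or for one of its entry points.

    Assumption: the layout matches the example URLs in this section.
    """
    url = "http://%s/%s/glidein_%s/" % (host, web_base, glidein_name)
    if entry is not None:
        # Each entry point has its own subdirectory named entry_<name>.
        url += "entry_%s/" % entry
    return url
```

Calling factory_monitor_url("node1.my.company.org", "v1_0") gives the factory-wide page, and passing entry="SanDiego" gives the page for that entry point.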

6.2 Glidein factory monitoring via WMS tools

You can get the equivalent of the Web page snapshot by using

cd glideinWMS/tools/
python wmsXMLView.py

6.3 Glidein factory entry log files

The glidein factory writes two log files per entry point: factory_info.YYYYMMDD.log and factory_err.YYYYMMDD.log.

Assuming you have a SanDiego entry, the log files are in

/home/gfactory/glidein_submit/glidein_v1_0/entry_SanDiego/log

All errors are reported in the factory_err.YYYYMMDD.log file, while factory_info.YYYYMMDD.log contains entries about what the factory is doing.
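
Since the log file names embed the date, it can be handy to compute the paths for a given day. The sketch below does this in Python; entry_log_files is a hypothetical helper, and the directory layout (entry_<name>/log under the factory submit directory) is assumed from the example path above.

```python
import os
from datetime import date


def entry_log_files(submit_dir, entry, day):
    """Return the info and err log paths of one entry point for one day.

    Assumption: logs live in <submit_dir>/entry_<entry>/log and are named
    factory_info.YYYYMMDD.log and factory_err.YYYYMMDD.log.
    """
    log_dir = os.path.join(submit_dir, "entry_%s" % entry, "log")
    stamp = day.strftime("%Y%m%d")
    return {
        kind: os.path.join(log_dir, "factory_%s.%s.log" % (kind, stamp))
        for kind in ("info", "err")
    }
```

For example, entry_log_files(submit_dir, "SanDiego", date.today())["err"] gives the path of today's error log for the SanDiego entry, where submit_dir is the factory directory written out by the installation script.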

6.4 Glidein output

Each glidein creates two files on exit: job.ID.out and job.ID.err.

Assuming you have a SanDiego entry, these output files are in

/home/gfactory/glidein_submit/glidein_v1_0/entry_SanDiego/log

Problems are usually reasonably easy to spot.
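
When many glideins have run, it can help to group the output files by glidein ID before reading them. The following Python sketch does that; the helper names are hypothetical, and the only assumption about file names is the job.ID.out / job.ID.err pattern stated above (the ID itself is treated as an opaque string).

```python
import os
import re
from collections import defaultdict

# Matches job.<ID>.out and job.<ID>.err; the ID may itself contain dots.
_JOB_FILE = re.compile(r"^job\.(?P<id>.+)\.(?P<kind>out|err)$")


def group_glidein_outputs(log_dir):
    """Map each glidein ID to a dict with its 'out' and/or 'err' file path."""
    groups = defaultdict(dict)
    for name in sorted(os.listdir(log_dir)):
        match = _JOB_FILE.match(name)
        if match:
            path = os.path.join(log_dir, name)
            groups[match.group("id")][match.group("kind")] = path
    return dict(groups)


def incomplete_glideins(log_dir):
    """IDs for which only one of the two expected files is present."""
    groups = group_glidein_outputs(log_dir)
    return sorted(i for i, files in groups.items() if len(files) < 2)
```

Other files in the same directory, such as the factory log files, are ignored by the pattern. An ID reported by incomplete_glideins is not necessarily a failure; it only indicates that one of the two files was not found at the time of the scan.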

6.5 Glidein factory ClassAds in the WMS Collector

The glidein factory also advertises summary information in the WMS collector.

Use condor_status:

condor_status -any

and look for glidefactory and glidefactoryclient ads.
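
Picking the relevant ads out of a long listing can be automated. The Python sketch below filters lines of text for the two ad types named above; factory_ad_lines is a hypothetical helper, and it makes no assumption about the column layout of condor_status beyond the type name appearing as a whitespace-separated token on the line.

```python
# Ad types named in the text above.
FACTORY_AD_TYPES = ("glidefactory", "glidefactoryclient")


def factory_ad_lines(status_lines):
    """Keep only the lines that mention one of the factory ad types.

    The match is case-insensitive and on whole tokens, so a line is kept
    only if one of its whitespace-separated fields equals an ad type.
    """
    wanted = set(FACTORY_AD_TYPES)
    kept = []
    for line in status_lines:
        tokens = {token.lower() for token in line.split()}
        if wanted & tokens:
            kept.append(line.rstrip("\n"))
    return kept
```

One way to feed it is to capture the command output with subprocess.run(["condor_status", "-any"], capture_output=True, text=True) and pass stdout.splitlines() to the function.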

