Glidein configuration variables

Description

This document describes what configuration variables are used by the glideins. Most administrators never need to touch most of them, but a sophisticated Glidein Factory administrator may need to tweak some of them to implement the desired policies (for example: require encryption over the wire) or to address the needs of a particular site (for example: max allowed wallclock time).

Configuration variable location

The glideinWMS ships with a set of pre-defined configuration variables, stored in two files known as the condor vars files:

glideinWMS/creation/web_base/condor_vars.lst
glideinWMS/creation/web_base/condor_vars.lst.entry

The two files are equivalent; they were split for historical reasons, and the second one is meant to contain site-specific configuration variables.
These files should never be modified; they only hold the defaults shipped with the software.

A glideinWMS administrator can change the values of the predefined variables (with some exceptions, see below), and define new ones using the Glidein Factory configuration file.

Condor vars files

The condor vars files contain the glideinWMS pre-defined configuration variables, and should never be modified.
A glideinWMS administrator should nevertheless be able to read them.

Each of them is an ASCII file, with one entry per row.
Lines starting with # are comments and are ignored.

Each non-comment line must have 7 columns. Each column has a specific meaning:

  1. Attribute name

  2. Attribute type

  3. Default value, use - if no default

  4. Condor name, i.e. the name under which this attribute should be known in the configuration used by the Condor daemons; use + to reuse the attribute name

  5. Is a value required for this attribute?
    Must be Y or N. If Y and the attribute is not defined, the glidein will fail.

  6. Will condor_startd publish this attribute to the collector?
    Must be Y or N.

  7. Will the attribute be exported to the user job environment?
    Use - to not export it, + to export it under the attribute name, or @ to export it under the Condor name.

A short extract is shown here; the semantics of the variables are defined in the following sections.

# VarName               Type    Default         CondorName                      Req.    Export  UserJobEnvName
#                       S=Quote - = No Default  + = VarName                             Condor  - = Do not export
#                                                                                               + = Use VarName
#                                                                                               @ = Use CondorName
#################################################################################################################
X509_USER_PROXY         C       -               GSI_DAEMON_PROXY                Y       N       -
USE_MATCH_AUTH          C       -     SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION  N       N       -
GLIDEIN_Factory         S       -               +                               Y       Y       @
GLIDEIN_Name            S       -               +                               Y       Y       @
GLIDEIN_Collector       C       -               HEAD_NODE                       Y       N       -
GLIDEIN_Expose_Grid_Env C       False     JOB_INHERITS_STARTER_ENVIRONMENT      N       Y       +
TMP_DIR                 S       -               GLIDEIN_Tmp_Dir                 Y       Y       @
CONDORG_CLUSTER         I       -               GLIDEIN_ClusterId               Y       Y       @
CONDORG_SUBCLUSTER      I       -               GLIDEIN_ProcId                  Y       Y       @
CONDORG_SCHEDD          S       -               GLIDEIN_Schedd                  Y       Y       @
SEC_DEFAULT_ENCRYPTION  C       OPTIONAL        +                               N       N       -
SEC_DEFAULT_INTEGRITY   C       REQUIRED        +                               N       N       -
MAX_MASTER_LOG          I       1000000         +                               N       N       -
MAX_STARTD_LOG          I       10000000        +                               N       N       -
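The row format described above can be read with a few lines of Python. This is only a sketch for illustration, not the parser used by glideinWMS. The names `CondorVar` and `parse_condor_vars_line` are hypothetical, and the sketch assumes that columns are separated by whitespace and that no value contains embedded spaces. It also assumes that any other text in the last column is an explicit environment variable name.

```python
from typing import NamedTuple, Optional


class CondorVar(NamedTuple):
    """One row of a condor vars file (hypothetical helper type)."""
    name: str
    type: str
    default: Optional[str]       # None when the file says "-"
    condor_name: str             # "+" already resolved to the attribute name
    required: bool
    publish: bool
    job_env_name: Optional[str]  # None when the attribute is not exported


def parse_condor_vars_line(line: str) -> Optional[CondorVar]:
    """Parse one row; return None for comments and blank lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    cols = stripped.split()
    if len(cols) != 7:
        raise ValueError(f"expected 7 columns, got {len(cols)}: {line!r}")
    name, vtype, default, condor_name, req, publish, env = cols
    for flag in (req, publish):
        if flag not in ("Y", "N"):
            raise ValueError(f"flag must be Y or N, got {flag!r}")
    # Column 4: "+" means "use the attribute name".
    resolved = name if condor_name == "+" else condor_name
    # Column 7: "-" = do not export, "+" = attribute name, "@" = Condor name.
    if env == "-":
        job_env = None
    elif env == "+":
        job_env = name
    elif env == "@":
        job_env = resolved
    else:
        job_env = env  # assumption: an explicit name is given
    return CondorVar(
        name=name,
        type=vtype,
        default=None if default == "-" else default,
        condor_name=resolved,
        required=(req == "Y"),
        publish=(publish == "Y"),
        job_env_name=job_env,
    )


row = parse_condor_vars_line(
    "TMP_DIR  S  -  GLIDEIN_Tmp_Dir  Y  Y  @"
)
print(row)
```

Resolving the + and @ shortcuts at parse time keeps later code from having to know about them.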

Variables used by the glideins

This section defines all the variables that the glideins explicitly use. Please be aware that, apart from the variables mentioned below, many other variables will be used by the Condor daemons, since glideins are Condor based; see the Condor manual for more details.

The variables can be divided in two categories:

Admin modifiable variables

This section presents all the variables that can be directly changed by a Glidein Factory administrator.

First, the variables that can be changed by using the

<attr name="name" value="val" type="type" .../>

elements as explained in the Glidein Factory configuration section.
Most of these variables also have their default values defined in the condor vars files, which are used if the Glidein Factory administrator does not override them.

Please also note that some of these variables may also be provided by the VO factory clients.
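As an illustration of the attr element mentioned above, the following Python sketch builds one such element with the standard library. The helper name `make_attr` is hypothetical. Only the three attributes shown above (name, value, type) are set. The type string "int" in the usage line is an assumption; the Glidein Factory configuration section lists the accepted type values and any further attributes.

```python
import xml.etree.ElementTree as ET


def make_attr(name: str, value: str, type_: str) -> str:
    """Return an <attr .../> element as text (hypothetical helper)."""
    elem = ET.Element("attr", {"name": name, "value": value, "type": type_})
    return ET.tostring(elem, encoding="unicode")


# Example: lower the idle timeout from its default of 1200 seconds.
# The type string "int" is an assumption, not taken from this page.
print(make_attr("GLIDEIN_Max_Idle", "600", "int"))
```

Building the element through an XML library, rather than by string concatenation, makes sure that special characters in values are escaped correctly.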

Name

Type

Default value

Description

GLIDEIN_Site

String

Entry name

Logical name of the Grid site where the glidein is running.

This information is published both in the startd ClassAd and in the user job environment.

GLIDEIN_Start

Exp
(Bool)

1

Condor START expressions to use in the glideins. Both GLIDEIN_Start and GLIDEIN_Entry_Start must evaluate to true for a user job to start.

Note: Having two variables makes it easier to set global and entry-specific requirements separately.

GLIDEIN_Entry_Start

Exp
(Bool)

1

GLIDEIN_Rank

Exp
(Int)

1

Used in calculating the Condor RANK.
GLIDEIN_Rank and GLIDEIN_Entry_Rank are summed together, and the user job with the largest rank will run first.

GLIDEIN_Entry_Rank

Exp
(Int)

1

GLIDEIN_Expose_Grid_Env

Exp
(Bool)

False

If False, the user job environment will contain only glidein factory provided variables.

If True, the user job environment will also contain the environment variables defined at glidein startup.

See JOB_INHERITS_STARTER_ENVIRONMENT documentation for more details.

GLIDEIN_Max_Idle

Int

1200
(20 mins)

Max amount of time a condor_startd will wait to be matched before giving up and terminating.

GLIDEIN_Retire_Time

Int

21600
(6 hours)

How long the condor_startd should run before entering the Retiring state (after which it does not accept any new jobs).
A random spread is used to improve efficiency, so the actual value used by Condor will be anywhere between GLIDEIN_Retire_Time - GLIDEIN_Retire_Time_Spread and GLIDEIN_Retire_Time.

GLIDEIN_Retire_Time_Spread

Int

7200
(2 hours)

GLIDEIN_Job_Max_Time

Int

360k
(100 hours)

Max allowed time for the job to end.
This is used by the condor_startd; once it enters the Retiring state it will wait at most this amount of time before killing the job.

GLEXEC_BIN

String

None

If set, Condor will launch all user jobs via glexec, thus running the job under the appropriate local account. This is important both for glideinWMS security and for accounting purposes.
A special value of "OSG" can be used to automatically find the locally installed gLExec on OSG worker nodes.

This variable is renamed to GLEXEC in the condor config.

GLEXEC_JOB

Bool

False

If set to False, the condor_starter is run sharing the same UID as the user job. This has security implications.

If running Condor 7.1.3 or later, it is recommended to turn this on and have the condor_starter be protected from the user jobs.

GLIDEIN_Use_PGroups

Bool

False

Should process group monitoring be enabled?

This is a Condor optimization parameter. Unfortunately, it negatively interferes with the batch systems used by the Grid sites, so it should not be turned on unless you have a very good reason to do so.

UPDATE_COLLECTOR_WITH_TCP

Bool

False

If True, forces the glidein to use TCP updates.
The collector must be configured in the same way for this to work.

Also see the Condor documentation for implications and side effects.

WANT_UDP_COMMAND_SOCKET

Bool

False

If True, enable the startd UDP command socket (Condor default).

Using the UDP command socket is a Condor optimization that makes working over firewalls and NATs very difficult. It is thus recommended you leave it disabled in the glideins.

Please note that if you leave it disabled, you must configure the schedd with
SCHEDD_SEND_VACATE_VIA_TCP = True
and the negotiator with
NEGOTIATOR_INFORM_STARTD = False
to have a functional system.

STARTD_SENDS_ALIVES

Bool

True

If set to False, the schedd will be sending keepalives to the startd.

Setting this to True instructs the startd to send keepalives to the schedd. This improves the glidein behavior when running behind a firewall or a NAT.

Please note that the schedd must be configured in the same way for this to work.

SEC_DEFAULT_INTEGRITY

Exp

REQUIRED

Security related settings. Please note that the glideins always require GSI authentication.

For more details see the configuration page or the Condor manual.

SEC_DEFAULT_ENCRYPTION

Exp

OPTIONAL

USE_MATCH_AUTH

Bool

False

Another security setting.

If set to True, the schedd and the startd will use a low overhead protocol. See the configuration page or the Condor manual.

MAX_MASTER_LOG

Int

1M

The maximum size to which the logs are allowed to grow.

Setting them too low will make debugging difficult.
Setting them too high may fill up the disk in anomalous situations, both on the worker nodes and on the glidein factory.

MAX_STARTD_LOG

Int

10M

MAX_STARTER_LOG

Int

10M

GCB_LIST

List

-

Configure GCB.

GCB is needed to run glideins on worker nodes behind a firewall or a NAT.

For more information, see the configuration page.

GCB_ORDER

String

GCB_MIN_FREE

Int

GCB_REMAP_ROUTE

String

MASTER_GCB_RECONNECT_TIMEOUT

Int

1200
(20 mins)

Specifies how long a glidein should wait before giving up on a GCB, if network connectivity is lost.

USE_CCB

Bool

False

If set to True, it will enable CCB (available since Condor v7.3.0).
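A few of the rules in the table above can be sketched in Python. This is a simplified illustration, not the way Condor evaluates its expressions. The function names are hypothetical, and the booleans and integers stand in for already evaluated expressions. The uniform random draw is an assumption, since the table only states that the value falls somewhere in the given range.

```python
import random


def effective_start(glidein_start: bool, entry_start: bool) -> bool:
    """Both GLIDEIN_Start and GLIDEIN_Entry_Start must hold."""
    return glidein_start and entry_start


def effective_rank(glidein_rank: int, entry_rank: int) -> int:
    """GLIDEIN_Rank and GLIDEIN_Entry_Rank are summed."""
    return glidein_rank + entry_rank


def pick_retire_time(retire_time: int = 21600, spread: int = 7200) -> int:
    """Pick a value between retire_time - spread and retire_time.

    The defaults are the table defaults (6 hours and 2 hours, in seconds).
    A uniform draw is assumed here.
    """
    return retire_time - random.randint(0, spread)


print(effective_start(True, True))
print(effective_rank(1, 1))
print(pick_retire_time())
```

With the defaults, the retire time falls between 14400 and 21600 seconds, so glideins started at the same moment do not all retire together.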



The second set of variables comes from values the Glidein Factory administrator defined to make the factory work. They cannot be changed by an administrator in any other way.

Name

Type

Source

Description

GLIDEIN_Factory

String

<glidein factory_name="value">

Logical name of the Glidein Factory machine (like “osg1”).

GLIDEIN_Name

String

<glidein glidein_name="value">

Identification name of the Glidein Factory instance (like “v1_0”).

GLIDEIN_Entry_Name

String

...<entries><entry name="value">

Identification name of the entry point (like “ucsd5”).

GLIDEIN_GridType

String

...<entries><entry gridtype="value">

Type of Grid resource (like “gt2”).

GLIDEIN_Gatekeeper

String

...<entries><entry gatekeeper="value">

URI of the Grid gatekeeper (like “osg1.ucsd.edu/jobmanager-pbs”).

GLIDEIN_GlobusRSL

String

...<entries><entry rsl="value">

Optional RSL string (like "(condor_submit=('+ProdSlot' 'TRUE'))").

PROXY_URL

String

...<entries><entry proxy_url="value">

Optional URL of the site Web proxy.

A special value “OSG” can be used to automatically discover the local Web proxy on OSG worker nodes.

This variable is exported as GLIDEIN_Proxy_URL to the user job environment.

DEBUG_MODE

String

...<entries><entry verbosity="value">

This setting can be either:

  • “std” - Default mode, where all interesting debug information is reported back to the Glidein Factory and the glidein will wait 20 minutes on a worker node that failed validation to minimize the black hole effect.

  • “nodebug” - Disable most diagnostic messages. This can be useful for very stable setups. The glidein still waits 20 minutes on a worker node that failed validation to minimize the black hole effect.

  • “fast” - All debugging is enabled and the glidein waits only 2 minutes on a worker node that failed validation. This mode is useful when debugging a misbehaving Grid site.
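The waiting times of the three modes listed above can be summarized as a small lookup. This is a sketch only: the names below are hypothetical and the mode names are written in lower case, while the actual wait is implemented in the glidein startup scripts.

```python
# Minutes a glidein waits on a worker node that failed validation,
# per DEBUG_MODE value, as listed above.
VALIDATION_FAILURE_WAIT_MINUTES = {
    "std": 20,
    "nodebug": 20,
    "fast": 2,
}


def validation_failure_wait_seconds(verbosity: str = "std") -> int:
    """Return the wait time in seconds for the given mode."""
    try:
        return VALIDATION_FAILURE_WAIT_MINUTES[verbosity] * 60
    except KeyError:
        raise ValueError(f"unknown verbosity: {verbosity!r}") from None


print(validation_failure_wait_seconds("fast"))
```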



Finally, the third set of values comes from the VO Frontend clients. While a client can set any number of variables, the ones described below are the most often used.

Name

Type

Description

GLIDEIN_Client

String

Identification name of the VO frontend request (like “ucsd5@v1_0@osg1@cms4”).

GLIDEIN_Collector

List

List of Collector URIs used by the VO Condor pool (like “cc.cms.edu:9620,cc.cms.edu:9621”).

One of the URIs in the list will be selected and used as HEAD_NODE in the condor_config.
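The selection of one collector out of the GLIDEIN_Collector list could look like the sketch below. The function name `choose_head_node` is hypothetical. The comma separator is taken from the example above, and the random pick is an assumption, since this page does not say how the URI is selected.

```python
import random


def choose_head_node(glidein_collector: str) -> str:
    """Pick one collector URI from a comma-separated list."""
    uris = [u.strip() for u in glidein_collector.split(",") if u.strip()]
    if not uris:
        raise ValueError("GLIDEIN_Collector is empty")
    # Assumption: any entry is equally acceptable, so pick at random.
    return random.choice(uris)


print(choose_head_node("cc.cms.edu:9620,cc.cms.edu:9621"))
```

A random pick spreads the glideins over the listed collectors without any coordination between them.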



Dynamically generated variables

The following variables are dynamically generated and/or modified by glideinWMS processes. The glideinWMS administrators cannot directly change them.

The first set of variables comes from the Glidein Factory.

Name

Type

Description

GLIDEIN_Signature

String

GLIDEIN_Signature and GLIDEIN_Entry_Signature contain the SHA1 signatures of the signature files.

These signatures are used as a base to ensure the integrity of all the data downloaded in the glidein startup scripts, but they also provide a fingerprint of the configuration used by the glidein.

These variables are published both in the glidein ClassAd and in the user job environment.

GLIDEIN_Entry_Signature

CONDORG_SCHEDD

String

The schedd used by the Glidein Factory to submit the glidein.

This variable is exported as GLIDEIN_Schedd both in the glidein ClassAd and to the user job environment.

CONDORG_CLUSTER

Int

The cluster and process id assigned by the Glidein Factory schedd to this glidein.

These variables are exported as GLIDEIN_ClusterId and GLIDEIN_ProcId both in the glidein ClassAd and to the user job environment.

CONDORG_SUBCLUSTER

Int
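Since the three CONDORG variables above are exported to the user job environment under their GLIDEIN names, a user job can read them to record which glidein it ran in. The sketch below assumes nothing beyond the exported names given in the table. The function name and the "cluster.proc@schedd" label format are hypothetical and chosen only for display.

```python
import os
from typing import Mapping, Optional


def glidein_job_label(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Build a display label from the exported glidein variables.

    Returns None if any of the variables is missing, for example when
    the job is not running inside a glidein.
    """
    env = os.environ if environ is None else environ
    schedd = env.get("GLIDEIN_Schedd")
    cluster = env.get("GLIDEIN_ClusterId")
    proc = env.get("GLIDEIN_ProcId")
    if schedd is None or cluster is None or proc is None:
        return None
    # Hypothetical label format, used here only for logging.
    return f"{cluster}.{proc}@{schedd}"


print(glidein_job_label())
```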



The second set contains the location of files and/or directories downloaded or created by the glidein. Most of them are located under the working directory specified by

<entry work_dir="value">


Name

Description

TMP_DIR

Path to the directory that admin-provided scripts and user jobs can use for storing temporary data.

This variable is exported as GLIDEIN_Tmp_Dir both to the glidein ClassAd and to the user job environment.

CONDOR_VARS_FILE

File path to the condor vars files.

Admin-provided scripts may want to add entries to these files.

CONDOR_VARS_ENTRY_FILE

ADD_CONFIG_LINE_SOURCE

File path to the script containing the add_config_line and add_condor_vars_line functions.

X509_USER_PROXY

File path to the glidein proxy file.

X509_CONDORMAP

File path to the Condor mapfile used by the glidein.

X509_CERT_DIR

Path to the directory containing the trusted CAs' public keys and CRLs.

CONDOR_DIR

Directory where the glidein Condor binary distribution has been installed.

WRAPPER_LIST

File path to the list of wrapper scripts used by the glidein.



The last set contains various variables generated by the glidein startup scripts.

Name

Type

Description

X509_GRIDMAP_DNS

String

List of DNs trusted by the glidein.

X509_EXPIRE

time_t

Time at which the proxy is expected to expire.

GLEXEC_STARTER

Bool

If gLExec is used and this is set to True, the condor_starter will be run sharing the same UID as the user job.

ALTERNATIVE_SHELL

String

If gLExec is used, this variable points to a trusted copy of a shell.

GLEXEC_USER_DIR

String

If gLExec is used, this variable points to the working directory under which all user jobs will be started.
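As an example of using one of these values, the remaining proxy lifetime can be derived from X509_EXPIRE. The sketch assumes that the time_t value is expressed in seconds since the Unix epoch, which is the usual meaning of that type. The function name `proxy_seconds_left` is hypothetical.

```python
import time
from typing import Optional


def proxy_seconds_left(x509_expire: int, now: Optional[float] = None) -> int:
    """Return the seconds left before the proxy expires, never negative."""
    current = time.time() if now is None else now
    return max(0, int(x509_expire - current))


# Example with a fixed "now" so the result does not depend on the clock.
print(proxy_seconds_left(1000, now=400))
```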



Repository

CVSROOT

cvsuser@cdcvs.fnal.gov:/cvs/cd

Package(s)

glideinWMS/creation

Author(s)

Igor Sfiligoi (Fermilab Computing Division)