GlideinWMS The Glidein-based Workflow Management System

WMS Factory

Custom Scripts

Description

This document describes how to write custom scripts to run in a glidein. Glidein Factory administrators may want to write them to implement features specific to their clients. Two examples are worker node validation, and discovery and setup of VO-specific software.
Note: A "script" can also be a compiled binary.

Script inclusion

A script is a file that was listed in the Glidein Factory configuration file as being executable:

executable="True"

By default, the listed files are not executable, so an administrator needs to explicitly mark the executable ones.

Script API

A script is provided with exactly 2 arguments:

  1. The name of the glidein configuration file
  2. An entry id; this can be either main or the name of the entry point

All other input comes from the glidein configuration file that is used as a dashboard between different scripts.

If the script provides any output to be used by other scripts, it should write it to the glidein configuration file. If the values need to be published by the condor_startd or be visible to the user jobs, the condor vars file should also be modified.

The script must return with exit code 0 if successful; a non-zero return value will stop the execution of the glidein with a validation error.
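
As a rough illustration of this API, here is a minimal skeleton. The function name run_check and the checks inside it are hypothetical placeholders, not part of glideinWMS; only the two-argument convention and the exit code rule come from the description above.

```shell
#!/bin/bash
# Hypothetical skeleton of a custom script; the checks are placeholders.

# run_check stands in for the body of a real custom script.
# $1 = glidein configuration file name
# $2 = entry id ("main" or the name of the entry point)
run_check() {
    if [ "$#" -ne 2 ]; then
        echo "expected exactly 2 arguments, got $#" >&2
        return 1
    fi
    local glidein_config="$1"
    local entry_id="$2"

    if [ ! -r "$glidein_config" ]; then
        echo "cannot read glidein configuration file: $glidein_config" >&2
        return 1
    fi

    echo "running for entry id: $entry_id"
    return 0
}

# Demonstration with a throwaway file. A real script would instead end with:
#   run_check "$@"; exit $?
demo_config=$(mktemp)
rc_ok=0
run_check "$demo_config" main || rc_ok=$?
rc_bad=0
run_check "$demo_config.missing" main 2>/dev/null || rc_bad=$?
rm -f "$demo_config"
echo "rc_ok=$rc_ok rc_bad=$rc_bad"
```

Keeping the logic in a function that returns a status makes it easy to guarantee that the final exit code reflects the outcome of the check.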

The glidein configuration file

The glidein configuration file acts as a dashboard between different scripts.

The glidein configuration file is a simple ASCII file, with one value per line; the first column represents the attribute name, while the rest is the attribute value.
If the value does not contain any spaces, the easiest way to extract a value in bash is:

attr_val=`grep "^$attr_name " $glidein_config | awk '{print $2}'`
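
If the value may contain spaces, printing only the second column truncates it. One possible workaround is sketched below; get_config_value is a hypothetical helper, not something shipped with glideinWMS, and it assumes the attribute name is followed by exactly one space, as in the grep pattern above.

```shell
# get_config_value is a hypothetical helper, not part of glideinWMS.
# It prints everything after the attribute name, so spaces are preserved.
# If the attribute appears more than once, the last occurrence is printed.
# $1 = attribute name, $2 = glidein configuration file
get_config_value() {
    awk -v name="$1" '
        index($0, name " ") == 1 { val = substr($0, length(name) + 2); found = 1 }
        END { if (found) print val }
    ' "$2"
}

# Demonstration on a throwaway file that mimics the one-attribute-per-line format
demo_config=$(mktemp)
printf '%s\n' 'GLIDEIN_Name v1_0' 'MY_ATTR hello world' > "$demo_config"
name_val=$(get_config_value GLIDEIN_Name "$demo_config")
my_val=$(get_config_value MY_ATTR "$demo_config")
missing_val=$(get_config_value NOT_THERE "$demo_config")
rm -f "$demo_config"
echo "[$name_val] [$my_val] [$missing_val]"
```

An attribute that is not present yields an empty string, which the caller can test with -z.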

Several attributes are added by the default glidein scripts, the most interesting being:

  • ADD_CONFIG_LINE_SOURCE – Script that can be used to add new attributes to the glidein configuration file (see below).
  • GLIDEIN_Name – Name of the glidein branch
  • GLIDEIN_Entry_Name – name of the glidein entry point
  • TMP_DIR – The path to the temporary dir
  • PROXY_URL – The URL of the Web proxy

All attributes of the glidein factory (both the common and the entry specific) are also loaded into this file.

To write into the glidein configuration file, the best approach in bash is to use the add_config_line support script. Just source the provided script and use it. Here is an example:

# get the glidein configuration file name
# must use glidein_config, it is used as global variable
glidein_config=$1
# import add_config_line function
add_config_line_source=`grep '^ADD_CONFIG_LINE_SOURCE ' $glidein_config | awk '{print $2}'`
source $add_config_line_source
# add an attribute
add_config_line myattribute myvalue
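
To try out the write-then-read pattern outside a glidein, a stand-in for add_config_line can be used. The function below is NOT the real implementation; it is an assumption about its behavior (replace any existing line for the attribute, then append the new one), defined only so that the sketch is runnable on its own.

```shell
# Stand-in for the real add_config_line, defined here ONLY so the sketch runs
# outside a glidein. It is an assumption about the behavior, not the real code:
# it drops any existing line for the attribute and appends the new one.
# Like the real function, it relies on the global variable glidein_config.
add_config_line() {
    grep -v "^$1 " "$glidein_config" > "$glidein_config.tmp" || true
    echo "$@" >> "$glidein_config.tmp"
    mv "$glidein_config.tmp" "$glidein_config"
}

# Throwaway configuration file for the demonstration
glidein_config=$(mktemp)
echo "TMP_DIR /tmp/glide_demo" > "$glidein_config"

# Write the attribute twice, then read it back
add_config_line myattribute myvalue
add_config_line myattribute newvalue
attr_val=$(grep "^myattribute " "$glidein_config" | awk '{print $2}')
line_count=$(grep -c "^myattribute " "$glidein_config")
rm -f "$glidein_config"
echo "myattribute=$attr_val (lines: $line_count)"
```

In a real custom script, source the file named by ADD_CONFIG_LINE_SOURCE instead of defining the function yourself.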

Condor vars file

The glideinWMS uses a so-called condor vars file to decide which attributes are inserted into the condor configuration file, which are published by the glidein condor_startd to the collector, and which are put into the job environment.

The condor vars file can be found from the glidein configuration file as

CONDOR_VARS_FILE

It is an ASCII file, with one entry per row. Each non-comment line must have 7 columns. Each column has a specific meaning:

  1. Attribute name (will be extracted from the glidein configuration file)
  2. Attribute type
    • I – integer
    • S – quoted string
    • C – unquoted string (i.e. Condor keyword or expression)
  3. Default value, use - if no default
  4. Condor name, i.e. the name under which this attribute should be known in the condor configuration
  5. Is a value required for this attribute?
    Must be Y or N. If Y and the attribute is not defined, the glidein will fail.
  6. Will condor_startd publish this attribute to the collector?
    Must be Y or N.
  7. Will the attribute be exported to the user job environment?
    • - - Do not export
    • + - Export using the original attribute name
    • @ - Export using the Condor name
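
The column rules above can be checked mechanically before a line is added to the file. The following is a minimal sketch; check_condor_vars_line is a hypothetical helper, not part of glideinWMS, and it assumes that no column contains whitespace.

```shell
# check_condor_vars_line is a hypothetical helper, not part of glideinWMS.
# It returns 0 if the line has 7 columns with allowed values, 1 otherwise.
# Assumption: no column (including the default value) contains whitespace.
check_condor_vars_line() {
    echo "$1" | awk '
        NF != 7          { exit 1 }   # exactly 7 columns
        $2 !~ /^[ISC]$/  { exit 1 }   # type: I, S or C
        $5 !~ /^[YN]$/   { exit 1 }   # required: Y or N
        $6 !~ /^[YN]$/   { exit 1 }   # published by condor_startd: Y or N
        $7 !~ /^[-+@]$/  { exit 1 }   # job environment export: -, + or @
    '
}

good='TMP_DIR S - GLIDEIN_Tmp_Dir Y Y @'
bad='TMP_DIR S - GLIDEIN_Tmp_Dir Y maybe @'
good_rc=0
check_condor_vars_line "$good" || good_rc=$?
bad_rc=0
check_condor_vars_line "$bad" || bad_rc=$?
echo "good_rc=$good_rc bad_rc=$bad_rc"
```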

The glideinWMS defines several attributes in the default condor vars files:

glideinWMS/creation/web_base/condor_vars.lst
glideinWMS/creation/web_base/condor_vars.lst.entry

Below is a short extract. For all the options, look at the dedicated configuration variables page.

# VarName               Type    Default         CondorName                      Req.    Export  UserJobEnvName
#                       S=Quote - = No Default  + = VarName                             Condor  - = Do not export
#                                                                                               + = Use VarName
#                                                                                               @ = Use CondorName
#################################################################################################################
X509_USER_PROXY         C       -               GSI_DAEMON_PROXY                Y       N       -
USE_MATCH_AUTH          C       -     SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION  N       N       -
GLIDEIN_Factory         S       -               +                               Y       Y       @
GLIDEIN_Name            S       -               +                               Y       Y       @
GLIDEIN_Collector       C       -               HEAD_NODE                       Y       N       -
GLIDEIN_Expose_Grid_Env C       False     JOB_INHERITS_STARTER_ENVIRONMENT      N       Y       +
TMP_DIR                 S       -               GLIDEIN_Tmp_Dir                 Y       Y       @
CONDORG_CLUSTER         I       -               GLIDEIN_ClusterId               Y       Y       @
CONDORG_SUBCLUSTER      I       -               GLIDEIN_ProcId                  Y       Y       @
CONDORG_SCHEDD          S       -               GLIDEIN_Schedd                  Y       Y       @
SEC_DEFAULT_ENCRYPTION  C       OPTIONAL        +                               N       N       -
SEC_DEFAULT_INTEGRITY   C       REQUIRED        +                               N       N       -
MAX_MASTER_LOG          I       1000000         +                               N       N       -
MAX_STARTD_LOG          I       10000000        +                               N       N       -
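
To get a quick overview of such a file, the columns can be filtered with awk. The sketch below lists the attributes flagged for publication to the collector, under the name they have in the Condor configuration (a + in the fourth column means the original attribute name is used). The demo file is a throwaway copy of a few lines from the extract above, not the real file.

```shell
# Throwaway file holding a few lines copied from the extract above
demo_vars=$(mktemp)
cat > "$demo_vars" <<'EOF'
# VarName               Type    Default         CondorName                      Req.    Export  UserJobEnvName
X509_USER_PROXY         C       -               GSI_DAEMON_PROXY                Y       N       -
GLIDEIN_Name            S       -               +                               Y       Y       @
TMP_DIR                 S       -               GLIDEIN_Tmp_Dir                 Y       Y       @
MAX_MASTER_LOG          I       1000000         +                               N       N       -
EOF

# Skip comment lines, keep rows whose 6th column is Y, and resolve the Condor name
published=$(awk '
    !/^#/ && NF == 7 && $6 == "Y" {
        name = ($4 == "+") ? $1 : $4
        out = (out == "") ? name : out " " name
    }
    END { print out }
' "$demo_vars")
rm -f "$demo_vars"
echo "published: $published"
```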

If you need to add anything to a condor vars file, the best approach in bash is to use the add_condor_vars_line support script. Just source the provided script and use it. Here is an example:

# get the condor vars file name
# must use condor_vars_file, it is used as global variable
condor_vars_file=`grep -i "^CONDOR_VARS_FILE " $glidein_config | awk '{print $2}'`
# import add_condor_vars_line function
add_config_line_source=`grep '^ADD_CONFIG_LINE_SOURCE ' $glidein_config | awk '{print $2}'`
source $add_config_line_source
# add an attribute
add_condor_vars_line myattribute type def condor_name req publish jobid

Reporting script exit status

Since v2_6_2, the glideinWMS factory can receive and interpret a detailed exit status report, if provided by the validation script.

The script should write the exit status report in the following file:

otrb_output.xml

The factory provides a helper script to properly generate such a file. A detailed description of the format can be found in the dedicated description page.

To use the helper script, first discover its location with:

# find error reporting helper script
error_gen=`grep '^ERROR_GEN_PATH ' $glidein_config | awk '{print $2}'`

If the validation script succeeded, report the success by using:

# Everything worked out fine
"$error_gen" -ok <script name> [<key> <value>]*

You can specify any number of (key,value) pairs, representing any metrics you verified during your validation run.

If the validation script instead failed, report the failure by using:

# Uh oh, we hit an error
"$error_gen" -error <script name> <error type> "<detailed description>" [<key> <value>]*

The script should use one of the standard error types.
It should also provide a human-readable detailed description. It is perfectly fine if the description extends over multiple lines; just make sure you pass it to the helper script as a single, properly quoted argument.
You can also specify any number of (key,value) pairs, representing any metrics that failed during the test. Providing at least one metric is recommended, but not strictly necessary.

Note: The reported status MUST match the script exit code. E.g. if you claim the script succeeded, you must also exit with a 0 exit code.
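
One way to keep the report and the exit code in sync is to route both through a single function. The sketch below is only an illustration: finish is a hypothetical helper, and error_gen points to a stub shell function so that the example runs outside a glidein. In a real script, error_gen holds the path read from ERROR_GEN_PATH.

```shell
# Stub standing in for the real helper script (an assumption for this demo):
# it only records whether it was called with -ok or -error.
calls=""
error_gen_stub() {
    calls="${calls:+$calls }$1"
}
error_gen=error_gen_stub

# finish is a hypothetical helper, not part of glideinWMS. It writes the
# report through the helper script and returns the matching exit code.
# Usage: finish ok <script name> [<key> <value>]*
#        finish error <script name> <error type> "<description>" [<key> <value>]*
finish() {
    local status="$1"
    shift
    if [ "$status" = "ok" ]; then
        "$error_gen" -ok "$@"
        return 0
    fi
    "$error_gen" -error "$@"
    return 1
}

# A multi-line description is passed as one quoted argument
description="Crypto library not found.
Checked /usr/lib only."

ok_rc=0
finish ok "libtest.sh" "file" "/usr/lib/libcrypto.so.0.9.8" || ok_rc=$?
err_rc=0
finish error "libtest.sh" "WN_Resource" "$description" "file" "/usr/lib/libcrypto.so.0.9.8" || err_rc=$?
echo "ok_rc=$ok_rc err_rc=$err_rc calls=$calls"
```

A real script would typically end each branch with something like: finish ok "libtest.sh"; exit $?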

Loading order

Scripts are loaded and executed one at a time. There are six distinct stages involved:

  1. Global attributes are loaded and global system scripts executed.

  2. The user provided global files are loaded and user scripts are executed (i.e. all the ones that have the default after_entry="False")

  3. The entry specific attributes are loaded and entry specific system scripts executed.

  4. The user provided entry specific files are loaded and entry specific user scripts are executed.

  5. The after_entry user provided global files are loaded and after_entry user scripts are executed (i.e. all the ones that have set after_entry="True")

  6. Final global system scripts are executed and the Condor daemons are launched.

The Glidein Factory configuration allows an administrator to specify the files/scripts mentioned in points 2, 4 and 5.
The files/scripts are loaded/executed in the order in which they are specified in the configuration file.

Examples

The documentation above should provide enough information to write scripts that customize the glideins to your needs. Below are a few examples you can use as templates.

Test that a certain library exists

#!/bin/sh

glidein_config="$1"

# find error reporting helper script
error_gen=`grep '^ERROR_GEN_PATH ' $glidein_config | awk '{print $2}'`

# -z would only test for an empty string; check that the file actually exists
if [ ! -f "/usr/lib/libcrypto.so.0.9.8" ]; then
  "$error_gen" -error "libtest.sh" "WN_Resource" "Crypto library not found." "file" "/usr/lib/libcrypto.so.0.9.8"
  exit 1
fi
echo "Crypto library found"
"$error_gen" -ok  "libtest.sh" "file" "/usr/lib/libcrypto.so.0.9.8"
exit 0

Find, test and advertise a software distribution

#!/bin/bash

glidein_config="$1"

###############
# Get the data

# find error reporting helper script
error_gen=`grep '^ERROR_GEN_PATH ' $glidein_config | awk '{print $2}'`

if [ -f "$VO_SW_DIR/setup.sh" ]; then
   source "$VO_SW_DIR/setup.sh"
else
  "$error_gen" -error "swfind.sh" "WN_Resource" "Could not find $VO_SW_DIR/setup.sh" \
              "file" "$VO_SW_DIR/setup.sh" "base_dir_attr" "VO_SW_DIR"
   exit 1
fi

tmpname=$PWD/installed_software_tmp_$$.tmp
# software_list is expected to be provided by the setup script sourced above
software_list > "$tmpname"


###########################################################
# import add_config_line and add_condor_vars_line functions

add_config_line_source=`grep '^ADD_CONFIG_LINE_SOURCE ' $glidein_config | awk '{print $2}'`
source $add_config_line_source

condor_vars_file=`grep -i "^CONDOR_VARS_FILE " $glidein_config | awk '{print $2}'`


##################
# Format the data

sw_list=`cat $tmpname | awk '{if (length(a)!=0) {a=a "," $0} else {a=$0}}END{print a}'`

if [ -z "$sw_list" ]; then
  ERRSTR="No SW found.
But the setup script was present at $VO_SW_DIR/setup.sh."
  "$error_gen" -error "swfind.sh" "WN_Resource" "$ERRSTR" \
               "source_file" "$VO_SW_DIR/setup.sh"

  exit 1
fi

#################
# Export the data

add_config_line GLIDEIN_SW_LIST "$sw_list"
add_condor_vars_line GLIDEIN_SW_LIST "S" "-" "+" "Y" "Y" "+"

"$error_gen" -ok  "swfind.sh" "sw_list" "$sw_list"
exit 0

Change an existing value based on conditions found

#!/bin/bash

glidein_config=$1
entry_dir=$2

# find error reporting helper script
error_gen=`grep '^ERROR_GEN_PATH ' $glidein_config | awk '{print $2}'`

# import add_config_line function, will use glidein_config
add_config_line_source=`grep '^ADD_CONFIG_LINE_SOURCE ' $glidein_config | awk '{print $2}'`
source $add_config_line_source

vo_scalability=`grep '^VO_SCALABILITY ' $glidein_config | awk '{print $2}'`

if [ -z "$vo_scalability" ]; then
  # set a reasonable default
  vo_scalability=5000
fi

tot_mem=`grep MemTotal /proc/meminfo |awk '{print $2}'`
if [ "$tot_mem" -lt 500000 ]; then
  if [ "$entry_dir" == "main" ]; then
    # all glideins need to scale down if low on memory
    let vo_scalability=vo_scalability/2
  elif [ "$entry_dir" == "florida23" ]; then
    # but florida23 can use a little more
    let vo_scalability=vo_scalability*5/4
  fi

  # write it back
  add_config_line VO_SCALABILITY $vo_scalability
  "$error_gen" -ok  "memset.sh" "vo_scalability" "$vo_scalability"
  exit 0
fi 
"$error_gen" -ok  "memset.sh"
exit 0