GlideinWMS The Glidein-based Workflow Management System

Glidein Frontend

Configuration

Example Configuration

Below is an example frontend configuration XML file.
<frontend advertise_delay="5" frontend_name="vofrontend-v2_5" loop_delay="60" advertise_with_tcp="True" advertise_with_multiple="True" restart_attempts="3" restart_interval="1800">
<downtimes />
<match match_expr="True">
<factory query_expr="True">
<match_attrs />
<collectors>
<collector DN="/DC=org/DC=doegrids/OU=Services/CN=factory-server.fnal.gov" comment="" factory_identity="factoryuser@factory-server.fnal.gov" my_identity="frontenduser@frontend-server.fnal.gov" node="factory-server.fnal.gov:8618" />
</collectors>
</factory>
<job comment="" query_expr="(JobUniverse==5)&amp;&amp;(GLIDEIN_Is_Monitor =!= TRUE)&amp;&amp;(JOB_Is_Monitor =!= TRUE)">
<match_attrs />
<schedds>
<schedd DN="/DC=org/DC=doegrids/OU=Services/CN=userpool.fnal.gov" fullname="userpool.fnal.gov" />
<schedd DN="/DC=org/DC=doegrids/OU=Services/CN=userpool.fnal.gov" fullname="schedd_jobs1@userpool.fnal.gov" />
<schedd DN="/DC=org/DC=doegrids/OU=Services/CN=userpool.fnal.gov" fullname="schedd_jobs2@userpool.fnal.gov" />
</schedds>
</job>
</match>

<monitor base_dir="/var/www/html/vofrontend/monitor" flot_dir="/opt/javascriptrrd-0.6.1/flot" javascriptRRD_dir="/opt/javascriptrrd-0.6.1/src/lib" jquery_dir="/opt/javascriptrrd-0.6.1/flot" />
<monitor_footer display_txt="Legal Disclaimer" href_link="/site/disclaimer.html" />

<security classad_proxy="/etc/grid-security/vocert.pem" proxy_DN="/DC=org/DC=doegrids/OU=Services/CN=frontend-server.fnal.gov" proxy_selection_plugin="ProxyAll" security_name="frontenduser" sym_key="aes_256_cbc">
<proxies>
<proxy absfname="/tmp/x509up_u" security_class="frontend" />
</proxies>
</security>
<stage base_dir="/var/www/html/vofrontend/stage" use_symlink="True" web_base_url="http://frontend-server.fnal.gov:9000/vofrontend/stage" />
<work base_dir="/opt/vofrontend" base_log_dir="/opt/vofrontend/logs" />
<attrs>
<attr name="GLIDECLIENT_Rank" glidein_publish="False" job_publish="False" parameter="True" type="string" value="1" />
<attr name="GLIDECLIENT_Start" glidein_publish="False" job_publish="False" parameter="True" type="string" value="True" />
<attr name="GLIDEIN_Expose_Grid_Env" glidein_publish="True" job_publish="True" parameter="False" type="string" value="True" />
<attr name="GLIDEIN_Glexec_Use" glidein_publish="True" job_publish="True" parameter="False" type="string" value="OPTIONAL" />
<attr name="USE_MATCH_AUTH" glidein_publish="False" job_publish="False" parameter="True" type="string" value="True" />
</attrs>
<groups>
<group name="main" enabled="True">
<config>
<idle_glideins_per_entry max="100" reserve="5" />
<idle_vms_per_entry curb="5" max="100" />
<running_glideins_per_entry max="10000" relative_to_queue="1.15" />
</config>
<downtimes />
<match match_expr="True">
<factory query_expr="True">
<match_attrs />
<collectors />
</factory>
<job query_expr="True">
<match_attrs />
<schedds />
</job>
</match>
<security>
<proxies />
</security>
<attrs />
<files />
</group>
</groups>
<collectors>
<collector DN="/DC=org/DC=doegrids/OU=Services/CN=usercollector.fnal.gov" node="usercollector.fnal.gov" secondary="False" />
</collectors>
<files />
</frontend>

The Glidein Frontend Configuration

The Glidein Frontend configuration involves creating the configuration directory and files and then creating the daemons. As in the Glidein Factory setup, a configuration tool converts an XML file into a configuration tree.

For the installer to create the Glidein Frontend instance from the configuration directory and grid mapfile, the following objects can be defined:

Other attributes can be specified as well. They are used by the VO frontend matchmaking and job matchmaking. The format is similar to the attributes in the Factory config file. The table below describes the attributes of each <attr ... /> tag inside <attrs> in more detail.

Attribute Name     Attribute Description
name               Name of the attribute.
value              Value of the attribute.
parameter          Set to True if the attribute should be passed as a parameter. If set to False, the attribute will be put in the staging area to be accessed by the glidein startup scripts. Always set this to True unless you know what you are doing.
glidein_publish    If set to True, the attribute will be available in the condor_startd's classad. Used only if parameter is True.
job_publish        If set to True, the attribute will be available in the user job's environment. Used only if parameter is True.
comment            You can specify a description of the attribute here.
type               Type of the attribute. Supported types are 'int', 'string' and 'expr'. Type expr is equivalent to a condor constant/expression in condor_vars.lst.

An example attribute would be:

<attrs><attr name="GLIDEIN_Collector" value="mymachine.mydomain" publish="True" parameter="True" const="True" glidein_publish="True" comment="Just a test attribute"/></attrs>
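
As a rough illustration of how the flags in the table interact, the following Python sketch parses an attr element with the standard library and reports where its value would end up. The helper names as_bool and describe_attr are hypothetical and not part of GlideinWMS, and the default values used for missing flags are assumptions. The sketch only restates the rules given in the table.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<attrs>
  <attr name="GLIDEIN_Collector" value="mymachine.mydomain" parameter="True"
        glidein_publish="True" job_publish="False" type="string"
        comment="Just a test attribute"/>
</attrs>
"""


def as_bool(text):
    """Interpret the configuration's "True"/"False" strings, ignoring stray spaces."""
    return text.strip().lower() == "true"


def describe_attr(attr):
    """Summarize one <attr> element according to the table above (hypothetical helper)."""
    # Assumed defaults: parameter=True, publish flags=False when the flag is absent.
    is_param = as_bool(attr.get("parameter", "True"))
    return {
        "name": attr.get("name"),
        "value": attr.get("value"),
        "delivery": "parameter" if is_param else "staging area",
        # Per the table, the publish flags are used only if parameter is True.
        "in_startd_classad": is_param and as_bool(attr.get("glidein_publish", "False")),
        "in_job_environment": is_param and as_bool(attr.get("job_publish", "False")),
    }


summary = [describe_attr(a) for a in ET.fromstring(SNIPPET).iter("attr")]
for item in summary:
    print(item)
```

Stripping whitespace in as_bool means a value such as "False " (with a trailing space) is still read as False in this sketch. Whether the real configuration tool is equally tolerant is not stated here.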

The following group parameters are used to configure multiple frontends. If only one group is specified, they apply to all frontends. The objects specified are used for creating and monitoring glideins. Groups are used to group users with similar requirements, such as proxies, criteria for matching job requirements with sites, and configuration of glideins.

Adding Custom Code/Scripts to Glidein Frontend Glideins

You can add custom scripts to glideins created for this Glidein Frontend by adding scripts and files to the configuration in the files section:
<frontend>
 [<groups><group>]
 <files>
  <file absfname="script name" executable="True" comment="comment"/>

The script will be copied to the Web-accessible area and added to the glidein's file_list, and when a glidein starts, the glidein startup script will pull it and execute it. If any parameters are needed, they can be specified using <attr />.

Files will be in the "main" sub directory for factory files and the "client" sub directory for frontend files.

For more detailed information, see the page dedicated to writing custom scripts.

You can also create wrapper scripts or tarballs of files; see the factory entry page for the syntax (use groups/group tags instead of the factory's entry tag).

Match and Match Attributes

Several sections in the configuration allow a match expression. Each of these sections allows an expression to be evaluated to determine where glideins and jobs should be matched.
For example, the frontend can define a whitelist expression to control where the glideins are submitted. You can also give a Condor expression to specify where jobs can run or which glidein sites can run jobs.
Note that for some tags (like factory query_expr), you can specify expressions both in the default global section and in individual group sections. Take special care when doing this to make sure the combined result is what you intend, as the expressions are typically "AND"-ed together.

Each match expression is a python expression that will be evaluated. Matches can be scoped to either global scope (<frontend><match>) or to a group specific scope.

Each python expression will typically be a series of boolean tests, surrounded by parentheses and connected by the boolean operators "and", "or", and "not". You can use several dictionaries in these match expressions. The "job" dictionary contains the classad of the job being matched, and the "glidein" dictionary contains information about the factory (entry point) classad. While an extensive list of everything you can use in these expressions is out of scope, some examples are below:

  • (job.has_key("ImageSize")): Returns true if the job classad has the attribute "ImageSize".
  • (job["NumJobStarts"]>5): Returns true if the job classad attribute "NumJobStarts" is greater than 5.
  • (glidein["attrs"].has_key("GLIDEIN_Retire_Time")): Returns true if the factory entry classad has the attribute "GLIDEIN_Retire_Time".
  • (glidein["attrs"]["GLIDEIN_Retire_Time"]>21600): Returns true if the factory entry classad's "GLIDEIN_Retire_Time" is greater than 21600.
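
The four tests above can be tried outside the frontend with plain dictionaries. The sketch below is only an illustration: the sample values are made up, and it uses the `in` operator because `has_key` exists only on Python 2 dictionaries. Which Python version a given frontend installation runs is not stated here.

```python
# Sample dictionaries standing in for a job classad and a factory entry classad.
# The values are invented for illustration.
job = {"ImageSize": 2048, "NumJobStarts": 7}
glidein = {"attrs": {"GLIDEIN_Retire_Time": 43200}}

# Python 3 spellings of the four example tests ("in" replaces has_key).
checks = {
    "job has ImageSize": "ImageSize" in job,
    "job started more than 5 times": job["NumJobStarts"] > 5,
    "entry has GLIDEIN_Retire_Time": "GLIDEIN_Retire_Time" in glidein["attrs"],
    "entry retire time above 21600": glidein["attrs"]["GLIDEIN_Retire_Time"] > 21600,
}

# Tests are joined with and/or/not. Guarding a lookup with a membership test
# avoids a KeyError when the attribute is missing from the classad.
match = (
    ("NumJobStarts" in job)
    and (job["NumJobStarts"] > 5)
    and not (glidein["attrs"].get("GLIDEIN_Retire_Time", 0) <= 21600)
)

print(checks)
print(match)
```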

Each attribute used in a match expression should be declared in a subsequent match_attrs section. This makes classad variables available to the match expression. Attributes can be made available from the:

  1. Factory classad: (<match><factory><match_attr>)
  2. Job classad: (<match><job><match_attr>)

Each match_attr tag must contain a name. This is the name of the attribute in the appropriate classad.
It must also contain a type which can be one of the following:

  1. string: A constant string of letters, numbers, or characters.
  2. int: An integer: a positive or negative number, or zero.
  3. real: A real number that could have decimal places.
  4. bool: It can be "True" or "False".
  5. Expr: A ClassAd expression
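
To make the type list concrete, here is a minimal sketch of how a declared type could map a classad value, given as text, to a Python value. The function convert_match_attr and the attribute names LoadAvg and WantsGlexec are hypothetical. How the frontend performs this conversion internally is not described in this document, so treat the mapping as an assumption.

```python
def convert_match_attr(raw, attr_type):
    """Convert a classad value given as text to a Python value (hypothetical helper)."""
    if attr_type == "string":
        return str(raw)
    if attr_type == "int":
        return int(raw)
    if attr_type == "real":
        return float(raw)
    if attr_type == "bool":
        return str(raw).strip().lower() == "true"
    if attr_type == "Expr":
        # Assumption: a ClassAd expression is kept as unevaluated text here.
        return str(raw)
    raise ValueError("unsupported match_attr type: %r" % attr_type)


converted = {
    "GLIDEIN_Site": convert_match_attr("FNAL", "string"),
    "NumJobStarts": convert_match_attr("7", "int"),
    "LoadAvg": convert_match_attr("0.25", "real"),
    "WantsGlexec": convert_match_attr("True", "bool"),
}
print(converted)
```

Declaring the right type matters for comparisons in a match expression: a test such as job["NumJobStarts"]>5 only behaves numerically if the value is an integer rather than the string "7".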

Example

<match match_expr='glidein["attrs"]["GLIDEIN_Site"] in job["DESIRED_Sites"].split(",")'>
<factory query_expr="(GLIDEIN_Site=!=UNDEFINED)">
<match_attrs> <match_attr name="GLIDEIN_Site" type="string"/> </match_attrs>
<collectors> </collectors>
</factory>
<job query_expr="(DESIRED_Sites=!=UNDEFINED)">
<match_attrs> <match_attr name="DESIRED_Sites" type="string"/> </match_attrs>
<schedds> </schedds>
</job>
</match>
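
The match_expr from this example can be exercised on its own. The sketch below evaluates it with Python's eval against made-up job and entry dictionaries. The entry names and the matches helper are hypothetical, and this is not necessarily how the frontend evaluates expressions internally.

```python
MATCH_EXPR = 'glidein["attrs"]["GLIDEIN_Site"] in job["DESIRED_Sites"].split(",")'


def matches(job, glidein, expr=MATCH_EXPR):
    """Evaluate a match expression with only `job` and `glidein` in scope."""
    # Builtins are emptied so the expression can only see the two dictionaries.
    return bool(eval(expr, {"__builtins__": {}}, {"job": job, "glidein": glidein}))


# Invented sample data: one job and two factory entries.
job = {"DESIRED_Sites": "FNAL,UCSD,CERN"}
entries = [
    {"name": "entry_fnal", "attrs": {"GLIDEIN_Site": "FNAL"}},
    {"name": "entry_unl", "attrs": {"GLIDEIN_Site": "UNL"}},
]

matched = [entry["name"] for entry in entries if matches(job, entry)]
print(matched)
```

Splitting DESIRED_Sites on commas compares whole site names, so a job asking for "FNAL_T1" does not accidentally match an entry whose site is "FNAL", which a plain substring test would allow.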

Using Multiple Proxies

Why would you want to use a pool of pilot proxies instead of a single one?

If your VO maps to a single group account at the remote grid sites, you wouldn't. A pool of pilot proxies (try saying that 5 times fast) does not gain you anything. If your VO maps to a pool of accounts at remote grid sites, you should consider using a pool of proxies equivalent to the number of users you have. Why?

Consider the following scenario: Alice, Bob, and Charlie are all in the FUNGUS experiment and form a VO. They are using glideinWMS. Alice sends 1000 jobs to FNAL via their glideinWMS using a single pilot proxy. The pilots map to the userid fungus01 at FNAL, and in accordance with the batch system's fairshare policies, the job priority for user fungus01 is decreased significantly.

Bob comes along and submits 1000 jobs via glideinWMS, while Charlie submits 1000 jobs under his own proxy and not using glideinWMS. The glideinWMS pilots launch for Bob, and map to fungus01. Charlie launches his own jobs that get mapped to fungus73. Relative to fungus73, fungus01's priority is terrible, and Bob's jobs sit around waiting for Charlie -- even though Bob didn't occupy the FNAL resources, Alice did!

The solution: have a pool of pilot proxies. We then spread the fairshare penalty amongst fungus01, fungus02, and fungus03, and Bob now can compete on a more equal footing with Charlie.

Using multiple proxies

Proxies can be specified in the <security><proxies><proxy> tags. Multiple proxy tags can be entered, one for each proxy file. These can be found in the security section at the top of the XML, in which case the proxies are shared by all security groups. They can also be found within <group> tags, in which case they are used only by that security group.

One example follows:

<security>
<proxies>
<proxy absfname="/home/frontend/.globus/x509_pilot05_cms_prio.proxy" security_class="cmsprio"/>
<proxy absfname="/home/frontend/.globus/x509_pilot06_cms_prio.proxy" security_class="cmsprio"/>
<proxy absfname="/home/frontend/.globus/x509_pilot07_cms_prio.proxy" security_class="cmsprio"/>
<proxy absfname="/home/frontend/.globus/x509_pilot08_cms_prio.proxy" security_class="cmsprio"/>
<proxy absfname="/home/frontend/.globus/x509_pilot09_cms_prio.proxy" security_class="cmsprio"/>
</proxies>
</security>
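
A quick way to check a proxy pool before starting the frontend is to parse the security section and group the proxy files by security class. The sketch below does this with the standard library on a shortened copy of the example. The function name proxies_by_class is hypothetical and not part of GlideinWMS.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SECURITY_XML = """
<security>
  <proxies>
    <proxy absfname="/home/frontend/.globus/x509_pilot05_cms_prio.proxy" security_class="cmsprio"/>
    <proxy absfname="/home/frontend/.globus/x509_pilot06_cms_prio.proxy" security_class="cmsprio"/>
  </proxies>
</security>
"""


def proxies_by_class(xml_text):
    """Map each security_class to the list of proxy files declared for it."""
    by_class = defaultdict(list)
    for proxy in ET.fromstring(xml_text).findall("./proxies/proxy"):
        by_class[proxy.get("security_class")].append(proxy.get("absfname"))
    return dict(by_class)


pool = proxies_by_class(SECURITY_XML)
for security_class, files in pool.items():
    print(security_class, len(files), "proxies")
```

If the section is not well-formed, for example because a closing tag is missing its slash, ET.fromstring raises xml.etree.ElementTree.ParseError, which makes this a cheap sanity check on hand-edited files.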

Starting a Glidein Frontend Daemon

Once you have the desired configuration file, move to the VO frontend directory and run the command:

./frontend_startup start

All the activity messages will go into

group_*/log/frontend_info.<date>.log

while the warnings go into

group_*/log/frontend_err.<date>.log

The frontend logs are deleted after a week.
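
The log locations above follow a fixed pattern, so they can be collected with a glob. The sketch below assumes the group_* directories sit directly under the VO frontend directory. The function name frontend_logs is hypothetical, and the directory passed in the usage example is taken from the example configuration only as a placeholder.

```python
import glob
import os


def frontend_logs(frontend_dir, kind="info"):
    """Return files matching group_*/log/frontend_<kind>.<date>.log, sorted by path."""
    pattern = os.path.join(frontend_dir, "group_*", "log", "frontend_%s.*.log" % kind)
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Placeholder directory; replace with your own VO frontend directory.
    for path in frontend_logs("/opt/vofrontend", kind="err"):
        print(path)
```

Passing kind="info" lists the activity logs and kind="err" lists the warning logs, across all groups.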