Verifying all GlideinWMS services are communicating.
One means of verifying that all the GlideinWMS services are communicating correctly is to perform the following on the User Pool (Collector). One exception is the communication between the Frontend and the User Submit/Schedd services; that is one area of failure this check does not cover.
> source CONDOR_LOCATION/condor.sh
> condor_status -any -pool NODE:PORT (of the User Pool)
MyType         TargetType   Name
Scheduler      None         cms-xen21.fnal.gov
DaemonMaster   None         cms-xen21.fnal.gov
Negotiator     None         cms-xen21.fnal.gov
Collector      None         frontend_service@cms-xen21.fnal
glideresource  None         ress_ITB_INSTALL_TEST@same_node
Scheduler      None         schedd_jobs2@cms-xen21.fnal.gov
1. The DaemonMaster, Negotiator and Collector types indicate the User Pool services are running.
2. The number of Scheduler types should equal the number of schedds you specified for the Submit service.
3. The glideresource indicates the VO Frontend is talking to the WMS Pool/Factory and the User Pool. NOTE: You do require at least one entry point in the Factory for this to show.
General Issues
This section contains tips and troubleshooting information relevant to all phases of a job's execution. Also see the user tutorials with example job submissions for VO Frontends.
Authentication Issues
Many GlideinWMS issues are caused by authentication. Make sure that your proxy and
certificate are correct. Each process needs a proxy/cert that is owned by that user.
Also, make sure that this cert has authorization to run a job by running a command such as
(all on one line - NOTE: You need to have globus-gram-client-tools installed):
X509_USER_PROXY=/tmp/x509up_u<UID> globus-job-run -a -r <gatekeeper in Factory config>
Note that /tmp/x509up_u<UID> is the typical location for x509/kerberos proxy certificates, but use the proper location if your proxy or service certificate is stored elsewhere.
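As a quick sanity check of proxy ownership, permissions and validity (a sketch; the /tmp/x509up_u<UID> path is just the typical default mentioned above, and voms-proxy-info requires the VOMS client tools to be installed):
# the proxy should be owned by the service user and have mode 600
ls -l /tmp/x509up_u$(id -u)
# check the remaining lifetime and VO attributes
voms-proxy-info -all -file /tmp/x509up_u$(id -u)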
Wrong condor.sh sourced
Always source the correct condor.sh before running any commands.
Many problems are caused by using the wrong path/environment,
(for instance, sourcing the User Pool condor.sh then running WMS Pool Collector commands).
Run "which condor_q" to see if your path is correct.
Note: If you are using an OSG Client installed via tarball and source its setup.sh
(e.g. for voms-proxy-init),
this may change your path/environment, and you may need to source condor.sh again.
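A quick way to confirm that the right environment is loaded (a sketch; the paths are placeholders for your install):
source /path/to/userpool/condor.sh     # the condor.sh of the pool you intend to query
which condor_q                         # should resolve inside that same install tree
condor_config_val COLLECTOR_HOST       # shows which collector this environment points at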
Failed to talk to factory_pool (WMS Pool Collector)
This is a failure to communicate with the WMS Pool (Collector) and Factory services.
Symptoms: Many
Useful files:
In the GLIDEINWMS_VOFRONTEND_HOME/log/*/group_main/frontend.*.info.log:
[2011-09-21T13:54:04-05:00 3859] WARNING: Failed to talk to factory_pool cms-xen21.fnal.gov:9618. See debug log for more details.
In the GLIDEINWMS_VOFRONTEND_HOME/log/*/group_main/frontend.*.debug.log:
[2011-09-21T13:54:04-05:00 3859] Failed to talk to factory_pool cms-xen21.fnal.gov:9618:
Error running '/usr/local2/glideins/git-same-condor-v2plus.ini/condor-frontend/bin/condor_status
... (more traceback)
code 1:Error: communication error
CEDAR:6001:Failed to connect to <131.225.206.78:9618> Error: Couldn't contact the condor_collector on cms-xen21.fnal.gov (<131.225.206.78:9618>).
Debugging Steps:
- Verify the WMS Pool is running.
- Verify the IP/NODE is correct for the WMS Pool.
If it is incorrect, the frontend.xml should be corrected and a Frontend reconfig executed.
<frontend .... >
  <match .... >
    <factory .... >
      <collectors>
        <collector node="cms-xen21.fnal.gov:9618"/>   <!-- This one. This is the WMS Pool -->
      </collectors>
    </factory>
  </match>
  <collectors>
    <collector ... node="cms-xen21.fnal.gov:9640"/>   <!-- Not this one. This is the User Pool -->
  </collectors>
</frontend>
- If you have access to the WMS Pool, check the ALLOW/DENY
configuration in its condor_config.
- Another reason for failure is a GSI authentication error (aka permission
denied) occurring on the WMS Collector.
If you have access to the HTCondor log files for that service, check the MasterLog and CollectorLog for authentication errors. The VOFrontend's proxy (proxy_DN) must be in the CONDOR_LOCATION/certs/condor_mapfile of the WMS Collector to allow classads to be published.
<frontend .... >
  <security classad_proxy="VOFRONTEND PROXY" proxy_DN="VOFRONTEND PROXY ISSUER"... />
</frontend>
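To reproduce the failure by hand, you can run the same kind of query the Frontend runs against the WMS Collector (a sketch; substitute your WMS Pool host:port and the path to the Frontend's proxy):
# run as the frontend user
X509_USER_PROXY=/path/to/frontend_proxy condor_status -any -pool cms-xen21.fnal.gov:9618
If this hangs or reports a CEDAR/communication error, the problem is connectivity or the collector itself; if it reports an authorization failure, revisit the condor_mapfile and ALLOW/DENY settings above.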
Failure to talk to collector (User Pool)
This is a failure to communicate with the User Pool Collector service. This does not affect the ability to submit and run jobs.
Symptoms: Many
Useful files:
In the GLIDEINWMS_VOFRONTEND_HOME/log/*/group_main/frontend.*.info.log:
[2011-09-20T13:51:25-05:00 4619] WARNING: Failed to talk to collector. See debug log for more details.
In the GLIDEINWMS_VOFRONTEND_HOME/log/*/group_main/frontend.*.debug.log:
[2011-09-20T13:51:25-05:00 2994] WARNING: Exception in jobs. See debug log for more details.
[2011-09-20T13:51:25-05:00 4619] Failed to talk to collector:
Error running '/usr/local2/glideins/git-same-condor-v2plus.ini/condor-frontend/bin/condor_status
... (more traceback)
code 1:Error: communication error
CEDAR:6001:Failed to connect to <131.225.206.78:9640>
Debugging Steps:
- Verify the User Pool is running.
- Verify the IP/NODE is correct for the User Pool.
If it is incorrect, the frontend.xml should be corrected and a Frontend reconfig executed.
<frontend .... >
  <match .... >
    <factory .... >
      <collectors>
        <collector ... node="cms-xen21.fnal.gov:9618"/>   <!-- Not this one. This is the WMS Pool -->
      </collectors>
    </factory>
  </match>
  <collectors>
    <collector node="cms-xen21.fnal.gov:9640" .. />   <!-- This one. This is the User Pool -->
  </collectors>
</frontend>
- If you have access to the User Pool, check the ALLOW/DENY configuration in its condor_config.
- Another reason for failure is a GSI authentication error (aka permission
denied) occurring on the User Pool. If you have access to the
HTCondor log files for that service, check the MasterLog and CollectorLog for
authentication errors.
The VOFrontend's proxy (proxy_DN) must be in the
CONDOR_LOCATION/certs/condor_mapfile of both collectors to allow classads to be
published.
<frontend .... >
<security classad_proxy="VOFRONTEND PROXY" proxy_DN="VOFRONTEND PROXY ISSUER"... />
</frontend>
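The condor_mapfile entries referred to above follow the standard HTCondor map file syntax; a sketch (the DN and the mapped name are placeholders and must match your Frontend proxy's DN and the identity the pool expects):
GSI "^\/DC=org\/DC=example\/OU=Services\/CN=vofrontend\.example\.org$" vofrontend
GSI (.*) anonymous
FS (.*) \1
The first line maps the Frontend proxy DN to a named identity; the catch-all lines must come after it.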
Problems submitting your job
Symptoms: Error submitting user job
Useful files: GLIDEINWMS_USERSCHEDD_HOME/condor_local/logs/SchedLog
Debugging Steps:
If you encounter errors submitting your job using condor_submit, the error messages printed on the screen will be useful in identifying potential problems. Occasionally, you can find additional information in the condor schedd logs.
Always make sure that you have sourced the condor.sh and that the path and environment are correct.
source $GLIDEINWMS_USERSCHEDD_HOME/condor.sh
Depending on the actual condor scheduler, you can find the scheduler logfile, SchedLog, in one of the subdirectories of the directory listed by “condor_config_val local_dir”.
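For example (a sketch; SCHEDD_LOG is the standard HTCondor configuration variable holding the full SchedLog path):
source $GLIDEINWMS_USERSCHEDD_HOME/condor.sh
condor_config_val local_dir       # parent directory of the log areas
condor_config_val SCHEDD_LOG      # full path of the SchedLog for this schedd
tail -50 "$(condor_config_val SCHEDD_LOG)"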
If you are installing all services on one machine (not recommended, but sometimes useful for testing), make sure that the user collector and wms collector are on two different ports (such as 9618 and 8618). You can run "ps -ef" to see if the processes are started (there should be multiple condor_master, condor_schedd and condor_procd processes on each machine). Make sure they are running as the proper users (the user schedd should probably be run as root).
Also refer to the Collector install for verification steps.
User Jobs Stay Idle
Symptoms: User job stays idle and there are no glideins submitted that correspond to your job.
This step involves the interaction of the VO Frontend, the WMS Factory and the Glideins. Hence, there are two separate facilities to check to see why no glideins are being created. See the Factory Troubleshooting page if none of the suggestions below help.
Frontend unable to map your job to any entry point
Symptoms: User job stays idle and there is no information in the Frontend logs about glideins required to run your job.
Useful files: GLIDEINWMS_VOFRONTEND_HOME/log/*
GLIDEINWMS_VOFRONTEND_HOME/group_<GROUP_NAME>/log/*
Debugging Steps:
Check if the VO Frontend is running. If not, start it.
Glidein Frontend processes periodically query the user schedd for user jobs. Once you have submitted the job, the VO Frontend should notice it during its next querying cycle. Once the Frontend identifies potential entry points that can run your job, it reflects this information in the glideclient classad in the WMS Pool collector for the corresponding entry point. You can find this information by running “condor_status -any -pool <wms collector fqdn>”.
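To narrow the query down to just the Frontend requests (a sketch, consistent with the classad attributes used later in this document):
condor_status -any -pool <wms collector fqdn> -constraint 'MyType=="glideclient"' -af ClientName ReqName
Each line is one Frontend group / entry point pair the Frontend is currently requesting glideins for; if your job's target entry never shows up here, the Frontend did not match it.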
Check for error messages in logs located in GLIDEINWMS_VOFRONTEND_HOME/log. Assuming that you have named the Frontend's main group “main”, check the log files in GLIDEINWMS_VOFRONTEND_HOME/group_main/log.
[2009-12-07T15:16:25-05:00 12398] For ress_GRATIA_TEST_31@v1_0@mySites-cmssrv97@cmssrv97.fnal.gov Idle 19 (effective 19 old 19) Running 0 (max 10000)
You should notice something like the above in the logs corresponding to your job. If the Frontend does not identify any entry that can run your job, then either the desired entry is not configured in the glidein Factory or the requirements you have expressed in your jobs are not correct.
[2009-12-07T15:16:25-05:00 12398] Glideins for ress_GRATIA_TEST_31@v1_0@mySites-cmssrv97@cmssrv97.fnal.gov Total 0 Idle 0 Running 0
[2009-12-07T15:16:25-05:00 12398] Advertize ress_GRATIA_TEST_31@v1_0@mySites-cmssrv97@cmssrv97.fnal.gov Request idle 11 max_run 22
Also, check the security classad to make sure the proxy/cert for the Frontend is correct. It should be chmod 600 and owned by the Frontend user.
If using VOMS, try to query the information to verify:
X509_USER_PROXY=<vofrontend_proxy_location> voms-proxy-info
The symptoms of this issue are a break in communication between the VO Frontend and the Factory. In this case, the problem may also lie with the Factory. See the Factory Troubleshooting guide for more details.
Found an untrusted Factory
Symptoms: You will receive an error similar to:
info log:
[2010-09-29T09:07:24-05:00 26824] WARNING: Found an untrusted Factory ress_ITB_GRATIA_TEST_2@v2_4_3@factory_service at cms-xen21.fnal.gov; ignoring.
debug log:
[2010-09-29T09:07:24-05:00 26824] Found an untrusted Factory ress_ITB_GRATIA_TEST_2@v2_4_3@factory_service at cms-xen21.fnal.gov; identity mismatch ' weigand@cms-xen21.fnal.gov'!='factory@cms-xen21.fnal.gov '
Debugging Steps:
Verify the Frontend config:
<frontend ... >
  <collector ... factory_identity="..."/>
</frontend>
This error also occurs when the Frontend config's security element security_name attribute does not match the Factory config's frontend element name attribute.
You can find the authenticated identity by:
condor_status -any -pool <WMSCollector_node:port> -long | grep -i AuthenticatedIdentity | sort -u
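For reference, these are the two configuration stanzas to compare (a sketch; the layout of the Factory's glideinWMS.xml shown here is an assumption based on a typical installation, and all values are placeholders):
Frontend frontend.xml:
<security security_name="FRONTEND_NAME" classad_proxy="VOFRONTEND PROXY" proxy_DN="VOFRONTEND PROXY DN" ... />
Factory glideinWMS.xml:
<security ... >
  <frontends>
    <frontend name="FRONTEND_NAME" identity="vofrontend_service@WMS_COLLECTOR_HOST"/>
  </frontends>
</security>
The identity attribute must match the AuthenticatedIdentity reported by the command above.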
Frontend Web server down or unreachable
Symptoms: Glideins do not report. Everything seems OK and no special error is visible in the Frontend; errors would be visible in the Glidein logs in the Factory.
Debugging Steps:
It is worth quickly testing the Web server. It serves both the monitoring pages and the stage area used by the Glideins. Check that the following pages are available (the staging area should be reachable from everywhere Glideins run, so check the pages also from outside the firewall; see the example after this list):
- monitoring pages: http://FRONTEND_HOST_NAME/vofrontend/monitor/
- staging area (most files have a hash in the file name): http://FRONTEND_HOST_NAME/vofrontend/stage/nodes.blacklist
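A minimal check from the command line (a sketch; replace FRONTEND_HOST_NAME, and run it also from a worker node or another host outside the firewall):
curl -sSf http://FRONTEND_HOST_NAME/vofrontend/monitor/ -o /dev/null && echo "monitor OK"
curl -sSf http://FRONTEND_HOST_NAME/vofrontend/stage/nodes.blacklist -o /dev/null && echo "stage OK"
A non-zero exit code or an HTTP error means glideins will fail to download their input files from the stage area.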
Administrative tools
GlideinWMS comes with a set of administrative tools that you can use when troubleshooting problems.
All tools are located in a common area:
glideinwms/frontend/tools
There are two tools that can be used to talk to the glideins:
- fetch_glidein_log - This tool will fetch the log from a glidein. It is a wrapper around condor_fetchlog.
- glidein_off - This tool will turn off one or more glideins. It is a wrapper around condor_off.
There is also enter_frontend_env,
which simply sets the environment so that it matches the one used by the Frontend daemons. You can then use all HTCondor commands directly, against both the glideins and the Factories.
All tools require access to the Frontend group's work area in order to work.
- The top level work area can be set either with the -d option, or through the $FE_WORK_DIR environment variable.
- The group name can be set either with the -g option, or through the $FE_GROUP_NAME environment variable.
An example command would be:
# Turn off all the glideins that are currently not used
export FE_WORK_DIR=~/frontstage/frontend_UCSD_v1_0
cd ~/glideinwms/frontend/tools
./glidein_off -g main --constraint 'State=!="Claimed"'
You can furthermore use:
- remove_requested_glideins -
This tool will request the removal of glideins requested by the Frontend.
This comes in handy if you find out that you had a major configuration problem in the previous Frontend configuration. By default, a reconfig will only request glideins of the new type, but will not remove any old ones.
There are two types of removals you most likely want to request (through the -t option), as sketched after this list:
- idle - Only remove glideins that have not started yet (the default).
- all - All glideins, including those currently running.
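Putting together the options described above, a typical invocation would look like this (a sketch; the paths follow the glidein_off example earlier):
# ask for removal of the idle glideins requested by group "main"
export FE_WORK_DIR=~/frontstage/frontend_UCSD_v1_0
cd ~/glideinwms/frontend/tools
./remove_requested_glideins -g main -t idle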
Glidein Tools
This section describes tools available to manually operate glideins.
manual_glidein_startup
This tool queries the Factory and Frontend classads in the WMS collector and generates a glidein_startup.sh command along with its arguments. This is useful, from a VO or site perspective, for debugging purposes: it lets you manually launch a glidein and have the HTCondor Startd report to the VO collector. Although all the required information is derived from the Factory and Frontend classads, the user still needs to set the X509_USER_PROXY environment variable. glidein_startup.sh uses this proxy for the condor daemons to talk to the VO collector.
[prompt]$ manual_glidein_startup --help
usage: manual_glidein_startup [-h] [--wms-collector WMS_COLLECTOR]
                              [--req-name REQ_NAME] [--client-name CLIENT_NAME]
                              [--glidein-startup GLIDEIN_STARTUP]
                              [--override-args OVERRIDE_ARGS]
                              [--cmd-out-file CMD_OUT_FILE] [--debug]

Generate glidein_startup command

Example:
manual_glidein_startup --wms-collector=fermicloud145.fnal.gov:8618 --client-name=Frontend-master-v1_0.main --req-name=TEST_SITE_2@v1_0@GlideinFactory-master --cmd-out-file=/tmp/glidein_startup_wrapper --override-args="-proxy http://httpproxy.mydomain -v fast"

optional arguments:
  -h, --help            show this help message and exit
  --wms-collector WMS_COLLECTOR
                        COLLECTOR_HOST for WMS Collector(s) in CSV format
                        (default: gfactory-2.opensciencegrid.org,gfactory-itb-1.opensciencegrid.org)
  --req-name REQ_NAME   Factory entry info: ReqName in the glideclient classad
  --client-name CLIENT_NAME
                        Frontend group info: ClientName in the glideinclient classad
  --glidein-startup GLIDEIN_STARTUP
                        Full path to glidein_startup.sh to use
  --override-args OVERRIDE_ARGS
                        Override args to glidein_startup.sh
  --cmd-out-file CMD_OUT_FILE
                        File where glidein_startup.sh command is created
  --debug               Enable debug logging
Step-by-step example of how to use manual_glidein_startup on an lxplus machine at CERN
Step 1: Create a test directory and get the necessary scripts.
[mmascher@lxplus794 ~]$ mkdir gwmstest
[mmascher@lxplus794 ~]$ cd gwmstest/
[mmascher@lxplus794 ~]$ wget https://raw.githubusercontent.com/glideinWMS/glideinwms/master/tools/manual_glidein_startup
--2019-09-12 15:08:32-- https://raw.githubusercontent.com/glideinWMS/glideinwms/master/tools/manual_glidein_startup
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.112.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.112.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10522 (10K) [text/plain]
Saving to: ‘manual_glidein_startup’
100%[====================================================================================================================================================
2019-09-12 15:08:32 (85,7 MB/s) - ‘manual_glidein_startup’ saved [10522/10522]
[mmascher@lxplus794 gwmstest]$ wget https://raw.githubusercontent.com/glideinWMS/glideinwms/master/creation/web_base/glidein_startup.sh
--2019-09-12 15:09:46-- https://raw.githubusercontent.com/glideinWMS/glideinwms/master/creation/web_base/glidein_startup.sh
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.112.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.112.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 69546 (68K) [text/plain]
Saving to: ‘glidein_startup.sh’
100%[====================================================================================================================================================
2019-09-12 15:09:46 (3,81 MB/s) - ‘glidein_startup.sh’ saved [69546/69546]
[mmascher@lxplus794 gwmstest]$ chmod +x manual_glidein_startup
[mmascher@lxplus794 gwmstest]$ chmod +x glidein_startup.sh
Step 2: Query the WMS collector for the available ClientName/ReqName pairs in the glideclient classads.
[mmascher@lxplus794 gwmstest]$ condor_status -pool gfactory-2.opensciencegrid.org -any -const 'MyType=="glideclient"' -af ClientName ReqName | tail -n 5
chtc2.main OSG_US_IIT_iitce2_rhel6@gfactory_instance@OSG
chtc2.main OSG_US_MWT2_mwt2_condce@gfactory_instance@OSG
chtc2.main OSG_US_MWT2_mwt2_condce_mcore@gfactory_instance@OSG
chtc2.main OSG_US_UConn_gluskap_op@gfactory_instance@OSG
chtc2.main OSG_US_WSU_GRID_ce2@gfactory_instance@OSG
Step 3: Generate the wrapper script for glidein_startup.sh. The manual_glidein_startup tool will query the factory collector and figure out the parameters needed by the glidein_startup.sh script. Then, it will write out a file that invokes glidein_startup.sh correctly (the wrapper is glidein_startup_wrapper in this case). Here I picked the last request name/client name pair retrieved by condor_status:
./manual_glidein_startup --wms-collector=gfactory-2.opensciencegrid.org --client-name=chtc2.main --req-name=OSG_US_WSU_GRID_ce2@gfactory_instance@OSG --cmd-out-file=glidein_startup_wrapper --glidein-startup=./glidein_startup.sh
This is the wrapper generated for our example:
[mmascher@lxplus794 gwmstest]$ cat glidein_startup_wrapper
#!/bin/sh
# glidein_startup.sh command
./glidein_startup.sh -param_GLIDECLIENT_Rank 1 -param_USE_MATCH_AUTH True -sign 9de23b396248df4285410adb2f4c9d977514e91c -clientgroup main -name gfactory_instance -param_UPDATE_COLLECTOR_WITH_TCP True -param_GLIDEIN_Client chtc2.main -param_CONDOR_OS auto -clientdescript description.j9ba1c.cfg -clientsigntype sha1 -param_GLIDEIN_Report_Failed NEVER -clientsign e3c5adc75b3559831deec7970ab8674761f636b6 -param_GLIDEIN_Job_Max_Time 34800 -clientweb http://glidein2.chtc.wisc.edu/vofrontend/stage -param_GLIDEIN_Glexec_Use NEVER -factory OSG -param_GLIDECLIENT_ReqNode gfactory.minus,2.dot,opensciencegrid.dot,org -proxy OSG -schedd UNAVAILABLE -clientsigngroup af9109d73f695d596ce9ffef8440867021652284 -v std -dir TMPDIR -entry OSG_US_WSU_GRID_ce2 -param_GLIDEIN_Monitoring_Enabled False -param_OSG_SINGULARITY_EL7_PERCENT 100 -slotslayout fixed -param_CONDOR_ARCH default -param_STARTD_JOB_ATTRS .dollar,.open,STARTD_JOB_ATTRS.close,.comma,x509userproxysubject.comma,x509UserProxyFQAN.comma, x509UserProxyVOName.comma,x509UserProxyEmail.comma,x509UserProxyExpiration -signentry 67e1af17dc0bd4cd5dba5823e99278b50d2af0eb -clientdescriptgroup description.j9ba1c.cfg -param_GLIDEIN_Collector glidein2.dot,chtc.dot,wisc.dot,edu.colon,9620.minus,9640 -descript description.j9b23C.cfg -descriptentry description.j9b23C.cfg -signtype sha1 -clientname chtc2 -clientwebgroup http://glidein2.chtc.wisc.edu/vofrontend/stage/group_main -cluster 0 -submitcredid UNAVAILABLE -param_CONDOR_VERSION 8.dot,8.dot,x -web http://gfactory-2.opensciencegrid.org/factory/stage -subcluster 0 -param_MIN_DISK_GBS 1
Notice that some of these parameters change every time a factory (or frontend) operator reconfigures the GlideinWMS factory (or frontend). For example, the description file names and their signatures refer to files downloaded by the pilots from the frontend and the factory via HTTP. Their names change at every reconfig because their content might be different at every reconfig, and squid servers are used to download them. If you plan to use the wrapper for a long time on a production cluster, you should run the manual_glidein_startup script periodically to regenerate a wrapper with the correct parameters.
Step 4: Execute the pilot!
[mmascher@lxplus794 gwmstest]$ export X509_USER_PROXY=/tmp/x509up_u8440
[mmascher@lxplus794 gwmstest]$ ./glidein_startup_wrapper
gio 12 set 2019, 16.05.53, CEST
OSG_SQUID_LOCATION undefined, not using any Squid URL
Starting glidein_startup.sh at gio 12 set 2019, 16.05.53, CEST (1568297153)
script_checksum = '8c9a6cab9b22fe4dc93548aac0528874 ./glidein_startup.sh'
debug_mode = 'std'
condorg_cluster = '0'
condorg_subcluster= '0'
condorg_schedd = 'UNAVAILABLE'
glidein_credential_id = 'UNAVAILABLE'
glidein_factory = 'OSG'
glidein_name = 'gfactory_instance'
glidein_entry = 'OSG_US_WSU_GRID_ce2'
client_name = 'chtc2'
client_group = 'main'
multi_glidein/restart = ''/''
work_dir = 'TMPDIR'
web_dir = 'http://gfactory-2.opensciencegrid.org/factory/stage'
sign_type = 'sha1'
proxy_url = 'None'
descript_fname = 'description.j9b23C.cfg'
descript_entry_fname = 'description.j9b23C.cfg'
sign_id = '9de23b396248df4285410adb2f4c9d977514e91c'
sign_entry_id = '67e1af17dc0bd4cd5dba5823e99278b50d2af0eb'
client_web_dir = 'http://glidein2.chtc.wisc.edu/vofrontend/stage'
client_descript_fname = 'description.j9ba1c.cfg'
client_sign_type = 'sha1'
client_sign_id = 'e3c5adc75b3559831deec7970ab8674761f636b6'
client_web_group_dir = 'http://glidein2.chtc.wisc.edu/vofrontend/stage/group_main'
client_descript_group_fname = 'description.j9ba1c.cfg'
client_sign_group_id = 'af9109d73f695d596ce9ffef8440867021652284'
Running on lxplus794.cern.ch
System: Linux lxplus794.cern.ch 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Release: CentOS Linux release 7.6.1810 (Core)
As: uid=8440(mmascher) gid=1399(zh) groups=1399(zh),1096330822 context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
PID: 17645
------- Initial environment ---------------
LC_PAPER=it_IT.UTF-8
MANPATH=/cvmfs/cms.cern.ch/share/man::/opt/puppetlabs/puppet/share/man
XDG_SESSION_ID=c27950
LC_ADDRESS=it_IT.UTF-8
VIRTUALENVWRAPPER_SCRIPT=/usr/bin/virtualenvwrapper.sh
GUESTFISH_INIT=\e[1;34m
HOSTNAME=lxplus794.cern.ch
LC_MONETARY=it_IT.UTF-8
SELINUX_ROLE_REQUESTED=
TERM=xterm-256color
SHELL=/bin/bash
HISTSIZE=1000
TMPDIR=/tmp/mmascher
SSH_CLIENT=80.117.57.151 55835 22
CONDA_SHLVL=0
[.....]
MODULESHOME=/usr/share/Modules
LESSOPEN=||/usr/bin/lesspipe.sh %s
XDG_RUNTIME_DIR=/run/user/8440
QT_PLUGIN_PATH=/usr/lib64/kde4/plugins:/usr/lib/kde4/plugins
GIT_AUTHOR_EMAIL=marco.mascheroni@cern.ch
LC_TIME=it_IT.UTF-8
GUESTFISH_RESTORE=\e[0m
LC_NAME=it_IT.UTF-8
BASH_FUNC_module()=() { eval `/usr/bin/modulecmd bash $*` }
_=/usr/bin/env
--- ============= ---
Unsetting X509_USER_PROXY
=== Condor starting gio 12 set 2019, 16.51.20, CEST (1568299880) ===
=== Condor started in background, now waiting on process 30732 ===
The wrapper will execute the glidein_startup.sh script, which in turn will run validation scripts and then start condor (as you can see in the last lines of the log above). If everything is successful you will see a process tree similar to the one below and, if interested, you can immediately check the condor job logs (you can find the execution directory of the pilot either in the wrapper stdout or in the ps output).
[mmascher@lxplus794 ~]$ ps auxf
USER       PID %CPU %MEM    VSZ   RSS TTY    STAT START  TIME COMMAND
mmascher 28651  0.0  0.0 190528  3208 ?      S    13:31  0:00 sshd: mmascher@pts/61
mmascher 28652  0.0  0.0 128392  4996 pts/61 Ss   13:31  0:00  \_ -bash
mmascher 10614  0.0  0.0 113180  1448 pts/61 S    16:50  0:00      \_ /bin/sh ./glidein_startup_wrapper
mmascher 10615  0.0  0.0 113708  2132 pts/61 S    16:50  0:00          \_ /bin/bash ./glidein_startup.sh -param_GLIDECLIENT_Rank 1 -param_USE_MATCH_AUTH True -sign 9de23b396248df4285410ad
mmascher 27815  0.0  0.0 113416  1728 pts/61 S    16:51  0:00              \_ /bin/bash /tmp/mmascher/glide_j8tqfe/main/condor_startup.sh glidein_config
mmascher 30732  0.0  0.0 130832  8716 ?      Ss   16:51  0:00                  \_ /tmp/mmascher/glide_j8tqfe/main/condor/sbin/condor_master -f -pidfile /tmp/mmascher/glide_j8tqfe/condor_m
mmascher 31451  0.1  0.0  24920  2660 ?      S    16:51  0:04                      \_ condor_procd -A /tmp/mmascher/glide_j8tqfe/log/procd_address -L /tmp/mmascher/glide_j8tqfe/log/ProcLo
mmascher 31470  0.0  0.0 109664  9228 ?      Ss   16:51  0:01                      \_ condor_startd -f
mmascher  1995  0.0  0.0 110628  8920 ?      Ss   17:20  0:00                          \_ condor_starter -f submit-1.chtc.wisc.edu
mmascher  2009  0.0  0.0 110628  3088 ?      S    17:20  0:00                          \_ condor_starter -f submit-1.chtc.wisc.edu
[mmascher@lxplus794 ~]$ ls /tmp/mmascher/glide_j8tqfe/log/
InstanceLock  MasterLog  procd_address  procd_address.watchdog  ProcLog  ProcLog.old  StartdHistoryLog  StartdLog  StarterLog  transfer_history  XferStatsLog
Final remark: if you wish to change some of the non-constant factory or frontend parameters, you may use the --override-args option of manual_glidein_startup:
--override-args="-dir TMPDIR -param_GLIDEIN_REQUIRED_OS rhel6"