glideinwms.frontend package¶
Subpackages¶
Submodules¶
glideinwms.frontend.checkFrontend module¶
Check if a glideinFrontend is running
- param $1 = work_dir:
- param $2 = run mode:
optional, defaults to “run”
- Exit code:
0 - Running
1 - Not running anything
2 - Not running my types, but another type is indeed running
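A caller can turn the exit codes listed above into readable messages. The following is a minimal sketch; `describe_exit_code` and `EXIT_CODE_MEANING` are hypothetical names used for illustration and are not part of glideinwms.

```python
# Hypothetical helper (not part of glideinwms): maps the documented
# checkFrontend exit codes to their meaning.
EXIT_CODE_MEANING = {
    0: "Running",
    1: "Not running anything",
    2: "Not running my types, but another type is indeed running",
}


def describe_exit_code(code):
    """Return the documented meaning of a checkFrontend exit code."""
    return EXIT_CODE_MEANING.get(code, f"Unknown exit code {code}")
```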
glideinwms.frontend.glideinFrontend module¶
This is the main of the glideinFrontend
- param $1 = work_dir:
- class glideinwms.frontend.glideinFrontend.FailureCounter(my_name, max_lifetime)[source]¶
Bases:
object
- glideinwms.frontend.glideinFrontend.clear_diskcache_dir(work_dir)[source]¶
Clear the cache by removing the directory used for the cachedir, and recreate it.
- glideinwms.frontend.glideinFrontend.set_frontend_htcondor_env(work_dir, frontendDescript, element=None)[source]¶
- glideinwms.frontend.glideinFrontend.shouldHibernate(frontendDescript, work_dir, ha, mode, groups)[source]¶
Check if the frontend is running in HA mode. If run in master mode, never hibernate. If run in slave mode, hibernate if the master is active.
- Returns:
True if we should hibernate, else False
- Return type:
bool
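The decision rule described for shouldHibernate can be sketched as follows. This is a simplified stand-in, not the real function: it takes the mode and a precomputed `master_is_active` flag (a hypothetical parameter) instead of querying anything, and treating any other mode like master is an assumption.

```python
def should_hibernate_sketch(ha_mode, master_is_active):
    """Simplified version of the HA hibernation rule.

    ha_mode: "master" or "slave".
    master_is_active: hypothetical flag; the real function has to
    discover this itself.
    """
    if ha_mode == "master":
        return False  # a master never hibernates
    if ha_mode == "slave":
        return bool(master_is_active)  # a slave sleeps while the master is up
    return False  # assumption: unknown modes behave like master
```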
- glideinwms.frontend.glideinFrontend.spawn(sleep_time, advertize_rate, work_dir, frontendDescript, groups, max_parallel_workers, restart_interval, restart_attempts)[source]¶
- glideinwms.frontend.glideinFrontend.spawn_cleanup(work_dir, frontendDescript, groups, frontend_name, ha_mode)[source]¶
- glideinwms.frontend.glideinFrontend.spawn_iteration(work_dir, frontendDescript, groups, max_active, failure_dict, max_failures, action)[source]¶
glideinwms.frontend.glideinFrontendConfig module¶
Frontend config related classes
- class glideinwms.frontend.glideinFrontendConfig.AttrsDescript(base_dir, group_name)[source]¶
Bases:
JoinConfigFile
Global and group attributes in a Frontend
One per group/element. Content comes from the <attrs> sections in the global and group configuration. Files: attrs.cfg in the main directory and group subdirectory. cWDictFile.ReprDictFile defined in cvWDictFile.get_common_dicts()
- class glideinwms.frontend.glideinFrontendConfig.BaseSignatureDescript(config_dir, signature_fname, signature_type, validate=None)[source]¶
Bases:
ConfigFile
- class glideinwms.frontend.glideinFrontendConfig.ConfigFile(config_dir, config_file, convert_function=<built-in function repr>, validate=None)[source]¶
Bases:
object
Load a file or URL composed of NAME VAL lines and create the data dictionary
self.data[NAME]=VAL
- Also define:
self.config_file=”name of file”
- If validate is defined, also define a variable with the file hash:
self.hash_value
- load(fname, convert_function, validate=None)[source]¶
Load the config file/URI. The file/URI is a series of NAME VALUE lines or comment lines (starting with #). The hash algorithm and value are used to validate the file content. The convert_function is used to convert the value of each line.
- Parameters:
fname (str|bytes) – URL or file path
convert_function – function converting the line value
validate (None|tuple) – if defined, must be (hash_algo,value)
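The file format and validation described for load() can be illustrated with a small standalone parser. This is a sketch under assumptions, not the ConfigFile implementation: `load_name_val_lines` is a hypothetical function that works on text already in memory, uses hashlib for the hash check, and raises OSError on a mismatch (the error type is an assumption).

```python
import hashlib


def load_name_val_lines(text, convert_function=repr, validate=None):
    """Parse NAME VAL lines into a dict, optionally checking a content hash.

    validate, if given, is a (hash_algo, expected_hex_digest) tuple.
    """
    if validate is not None:
        hash_algo, expected = validate
        actual = hashlib.new(hash_algo, text.encode("utf-8")).hexdigest()
        if actual != expected:
            raise OSError(f"hash mismatch: expected {expected}, got {actual}")
    data = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # NAME, then the rest of the line as VAL
        name = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        data[name] = convert_function(value)
    return data
```

With the default convert_function (repr), a line such as `A 1` is stored as the string `'1'` including the quotes; passing `str` keeps the raw value.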
- open(fname)[source]¶
Open the config file/URI. Used in self.load()
- Parameters:
fname (str|bytes) – URL or file path
Returns:
- class glideinwms.frontend.glideinFrontendConfig.ElementDescript(base_dir, group_name)[source]¶
Bases:
GroupConfigFile
Description of a Frontend group
One per group/element. Content comes from the group configuration. File name: group.descript (in the group subdirectory - group_GROUPNAME). cWDictFile.StrDictFile defined in cvWDictFile.get_group_dicts()
- class glideinwms.frontend.glideinFrontendConfig.ElementMergedDescript(base_dir, group_name)[source]¶
Bases:
object
Selective merge of global and group configuration
Not everything is merged; the old element in the global configuration can still be accessed.
- class glideinwms.frontend.glideinFrontendConfig.ExtStageFiles(base_URL, descript_fname, validate_algo, signature_hash)[source]¶
Bases:
StageFiles
- class glideinwms.frontend.glideinFrontendConfig.FrontendDescript(config_dir)[source]¶
Bases:
ConfigFile
Description of the Frontend
Only one. Content comes from the global configuration. File name: frontend.descript. cWDictFile.StrDictFile defined in cvWDictFile.get_main_dicts()
- class glideinwms.frontend.glideinFrontendConfig.GroupConfigFile(base_dir, group_name, config_file, convert_function=<built-in function repr>, validate=None)[source]¶
Bases:
ConfigFile
Config file from the group subdirectory
- class glideinwms.frontend.glideinFrontendConfig.GroupSignatureDescript(base_dir, group_name)[source]¶
Bases:
object
- class glideinwms.frontend.glideinFrontendConfig.HistoryFile(base_dir, group_name, load_on_init=True, default_factory=None)[source]¶
Bases:
object
- class glideinwms.frontend.glideinFrontendConfig.JoinConfigFile(base_dir, group_name, config_file, convert_function=<built-in function repr>, main_validate=None, group_validate=None)[source]¶
Bases:
ConfigFile
Joint main and group configuration
- class glideinwms.frontend.glideinFrontendConfig.MergeStageFiles(base_URL, validate_algo, main_descript_fname, main_signature_hash, group_name, group_descript_fname, group_signature_hash)[source]¶
Bases:
object
- class glideinwms.frontend.glideinFrontendConfig.ParamsDescript(base_dir, group_name)[source]¶
Bases:
JoinConfigFile
Global and group parameters in a Frontend
One per group/element. Content has parameter=”True” in the <attrs> sections in the global and group configuration. Files: params.cfg in the main directory and group subdirectory. cvWDictFile.ParamsDictFile defined in cvWDictFile.get_common_dicts()
- class glideinwms.frontend.glideinFrontendConfig.SignatureDescript(config_dir)[source]¶
Bases:
ConfigFile
glideinwms.frontend.glideinFrontendDowntimeLib module¶
glideinwms.frontend.glideinFrontendElement module¶
This is the main of the glideinFrontend
- param $1 = parent PID:
- param $2 = work dir:
- param $3 = group_name:
- param $4 = operation type:
- type $4 = operation type:
optional, defaults to “run”
- glideinwms.frontend.glideinFrontendElement.expand_DD(qstr, attr_dict)[source]¶
Expand each $$(attribute) reference in a string
- Parameters:
qstr (str) – string to be expanded
attr_dict (dict) – attributes to use in the expansion
- Returns:
expanded string
- Return type:
str
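The $$(attribute) expansion can be pictured with a short regular-expression substitution. This is a simplified sketch, not the library's expand_DD: `expand_dd_sketch` is a hypothetical name, values are inserted with plain str(), and leaving unknown attributes untouched is an assumption.

```python
import re

# Matches $$(NAME) where NAME is made of letters, digits and underscores.
_DD_PATTERN = re.compile(r"\$\$\((\w+)\)")


def expand_dd_sketch(qstr, attr_dict):
    """Replace each $$(NAME) with str(attr_dict[NAME]).

    Assumption: references to names missing from attr_dict are left as-is.
    """
    def _substitute(match):
        name = match.group(1)
        if name in attr_dict:
            return str(attr_dict[name])
        return match.group(0)

    return _DD_PATTERN.sub(_substitute, qstr)
```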
- class glideinwms.frontend.glideinFrontendElement.glideinFrontendElement(parent_pid, work_dir, group_name, action)[source]¶
Bases:
object
Processing the Frontend group activity
Spawned by glideinFrontend. Aware of the available Entries in the Factory and of the job requests from schedds. Sends requests to the Factory: either to submit new glideins, or to remove them.
- build_resource_classad(this_stats_arr, request_name, glidein_el, glidein_in_downtime, factory_pool_node, my_identity, limits_triggered)[source]¶
- check_removal_type_config(glideid)[source]¶
Decides what kind of excess glideins to remove depending on the configuration requests (glideins_remove) “ALL”, “IDLE”, “WAIT”, “NO” (default) or “DISABLE” (disable also automatic removal)
If removal_requests_tracking or active removal are enabled, this may result in Glidein removals depending on the parameters in the configuration and the current number of Glideins and requests
- Parameters:
glideid (str) – ID of the glidein request
- Returns:
remove excess string from configuration, one of: “DISABLE”, “ALL”, “IDLE”, “WAIT”, or “NO”
- Return type:
str
- choose_remove_excess_type(count_jobs, count_status, glideid)[source]¶
Decides what kind of excess glideins to remove, handling both explicit requests and the automatic trigger: one of “ALL”, “IDLE”, “WAIT”, or “NO”
If it is a request from the client (command line), then execute that. Otherwise, calculate the result of the automatic removal mechanism: increasingly remove WAIT, IDLE and ALL depending on how long (measured in Frontend cycles) there have been no requests.
- Parameters:
count_jobs (dict) – dict with job stats
count_status (dict) – dict with glidein stats
glideid (str) – ID of the glidein request
- Returns:
remove excess string from automatic mechanism, one of: “ALL”, “IDLE”, “WAIT”, or “NO”
- Return type:
str
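The escalation of the automatic mechanism (WAIT, then IDLE, then ALL as more Frontend cycles pass without requests) can be sketched as below. The function name and the three threshold values are hypothetical; the real thresholds come from the Frontend configuration and are not given here.

```python
def automatic_removal_sketch(cycles_without_requests,
                             wait_after=2, idle_after=4, all_after=6):
    """Return the removal type for a given number of idle Frontend cycles.

    The thresholds are illustrative defaults, not glideinwms values.
    """
    if cycles_without_requests >= all_after:
        return "ALL"
    if cycles_without_requests >= idle_after:
        return "IDLE"
    if cycles_without_requests >= wait_after:
        return "WAIT"
    return "NO"
```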
- compute_glidein_max_run(prop_jobs, real, idle_glideins)[source]¶
Compute max number of running glideins for this entry
- Parameters:
prop_jobs (dict) – Proportional idle multicore jobs for this entry
real (int) – Number of jobs running at given glideid
idle_glideins (int) – Number of idle startds at this entry
- compute_glidein_min_idle(count_status, total_glideins, total_idle_glideins, fe_total_glideins, fe_total_idle_glideins, global_total_glideins, global_total_idle_glideins, effective_idle, effective_oldidle, limits_triggered)[source]¶
Compute min idle glideins to request for this entry
The result takes into account all the relevant limits and curbs. The limits and curbs triggered are identified so that they can be advertised in the glideresource classad.
- Parameters:
count_status – dictionary with counters for glideins in the different state (from condor_q)
total_glideins – total number of glideins for the Entry
total_idle_glideins – number of idle glideins for the Entry
fe_total_glideins – total number of glideins for this Frontend at the Entry
fe_total_idle_glideins – number of idle glideins for this Frontend at the Entry
global_total_glideins – total number of glideins for all Entries
global_total_idle_glideins – number of idle glideins for all Entries
effective_idle
effective_oldidle
limits_triggered – dictionary used to return the limits triggered
- Returns:
- decide_removal_type(count_jobs, count_status, glideid)[source]¶
Pick the max removal type (unless disable is requested):
- if it was requested explicitly, send that one
- otherwise check automatic triggers and configured removal and send the max of the 2
If configured removal is selected, take into account also the margin and the tracking. This handles all the Glidein removals triggered by the Frontend. It does not affect automatic mechanisms in the Factory, like Glidein timeouts.
- Parameters:
count_jobs (dict) – dict with job stats
count_status (dict) – dict with glidein stats
glideid (str) – ID of the glidein request
- Returns:
remove excess string to send to the Factory, one of: “DISABLE”, “ALL”, “IDLE”, “WAIT”, or “NO”
- Return type:
str
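Picking the "max" removal type can be sketched with an explicit ordering. The ordering NO < WAIT < IDLE < ALL, the precedence given to “DISABLE”, and the function and parameter names are all assumptions made for illustration; margin and tracking are left out.

```python
# Assumed ordering, from least to most aggressive removal.
_REMOVAL_ORDER = ["NO", "WAIT", "IDLE", "ALL"]


def decide_removal_sketch(explicit_request, automatic, configured):
    """Combine the three sources of removal type into one string.

    explicit_request: removal type asked for explicitly, or None.
    automatic: result of the automatic trigger.
    configured: removal type from the configuration (may be "DISABLE").
    """
    if configured == "DISABLE":
        return "DISABLE"  # assumption: disabling wins over everything else
    if explicit_request is not None:
        return explicit_request
    return max(automatic, configured, key=_REMOVAL_ORDER.index)
```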
- do_match()[source]¶
Do the actual matching.
This forks subprocess_count… methods as children to do the work in parallel:
- self.subprocess_count_glidein
- self.subprocess_count_real
- self.subprocess_count_dt
The results are stored in:
- self.count_status_multi, self.count_status_multi_per_cred
- self.count_real_jobs, self.count_real_glideins
- self.condorq_dict_types
- Returns:
- generate_credential(elementDescript, glidein_el, group_name, trust_domain)[source]¶
Generates a credential with a credential generator plugin provided for the trust domain.
- Parameters:
elementDescript (ElementMergedDescript) – element descript
glidein_el (dict) – glidein element
group_name (string) – group name
trust_domain (string) – trust domain for the element
- Returns:
Credential or None if not generated
- Return type:
string, None
- get_condor_q(schedd_name)[source]¶
Retrieve the jobs a schedd is requesting
- Parameters:
schedd_name (str) – the schedd name
- Returns:
a dictionary with all the jobs
- Return type:
dict
- get_scitoken(elementDescript, trust_domain)[source]¶
Look for a local SciToken specified for the trust domain.
- Parameters:
elementDescript (ElementMergedDescript) – element descript
trust_domain (string) – trust domain for the element
- Returns:
SciToken or None if not found
- Return type:
string, None
- identify_bad_schedds()[source]¶
Identify the list of schedds that should not be considered when requesting glideins for idle jobs. A schedd is excluded if it meets any of these criteria:
- Running jobs (TotalRunningJobs + TotalSchedulerJobsRunning) is greater than 95% of max number of jobs (MaxJobsRunning)
- Transfer queue (TransferQueueNumUploading) is greater than 95% of max allowed transfers (TransferQueueMaxUploading)
- CurbMatchmaking in schedd classad is true
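The three criteria can be expressed as a predicate over a schedd classad. This sketch assumes the classad is available as a plain dict with numeric and boolean values; `is_bad_schedd` is a hypothetical helper, and treating missing or zero maxima as "no limit" is an assumption.

```python
def is_bad_schedd(ad, threshold=0.95):
    """Return True if a schedd classad (plain dict) meets any exclusion criterion."""
    running = ad.get("TotalRunningJobs", 0) + ad.get("TotalSchedulerJobsRunning", 0)
    max_running = ad.get("MaxJobsRunning")
    if max_running and running > threshold * max_running:
        return True  # too close to the running-jobs limit

    uploading = ad.get("TransferQueueNumUploading", 0)
    max_uploading = ad.get("TransferQueueMaxUploading")
    if max_uploading and uploading > threshold * max_uploading:
        return True  # transfer queue nearly full

    return bool(ad.get("CurbMatchmaking", False))
```

A list of schedds to skip could then be built with a comprehension such as `[name for name, ad in ads.items() if is_bad_schedd(ad)]`, where `ads` is a hypothetical dict of classads keyed by schedd name.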
- identify_limits_triggered(count_status, total_glideins, total_idle_glideins, fe_total_glideins, fe_total_idle_glideins, global_total_glideins, global_total_idle_glideins, limits_triggered)[source]¶
- refresh_entry_token(glidein_el)[source]¶
Create or update a condor token for an entry point
- Parameters:
glidein_el – a glidein element data structure
- Returns:
JWT-encoded condor token on success, None on failure
- subprocess_count_dt(dt)[source]¶
Count the matches (glideins matching entries) using glideinFrontendLib.countMatch. Will make calculations in parallel, using multiple processes.
- Parameters:
dt – index within the data dictionary
- Returns:
Tuple of 5 elements: count, prop, hereonly, prop_mc, total
- glideinwms.frontend.glideinFrontendElement.log_and_sum_factory_line(factory, is_down, factory_stat_arr, old_factory_stat_arr=None)[source]¶
Logs factory_stat_arr (a tuple composed of 17 numbers) and, if old_factory_stat_arr is not None, returns the element-wise sum of factory_stat_arr and old_factory_stat_arr
- Parameters:
factory – Entry name (or string to write for totals)
is_down – True if the Entry is down
factory_stat_arr – Frontend stats for this line
old_factory_stat_arr – Accumulator for the line stats. If None the stats are just logged
- Returns:
new list with old_factory_stat_arr+factory_stat_arr. None if old_factory_stat_arr is None
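The accumulation part of this function amounts to an element-wise sum. A minimal sketch, with the logging left out; `sum_stat_arrays` is a hypothetical name and the short arrays in the usage note stand in for the 17-number tuples.

```python
def sum_stat_arrays(factory_stat_arr, old_factory_stat_arr=None):
    """Return old + new element by element, or None when there is no accumulator."""
    if old_factory_stat_arr is None:
        return None
    return [old + new for old, new in zip(old_factory_stat_arr, factory_stat_arr)]
```

For example, accumulating `(1, 2, 3)` onto `[10, 20, 30]` gives `[11, 22, 33]`, and passing no accumulator gives None.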
glideinwms.frontend.glideinFrontendInterface module¶
This module implements the functions needed to advertise and get resources from the Collector
- class glideinwms.frontend.glideinFrontendInterface.AdvertizeParams(request_name, glidein_name, min_nr_glideins, max_run_glideins, idle_lifetime=0, glidein_params={}, glidein_monitors={}, glidein_monitors_per_cred={}, glidein_params_to_encrypt=None, security_name=None, remove_excess_str=None, remove_excess_margin=0)[source]¶
Bases:
object
- class glideinwms.frontend.glideinFrontendInterface.Credential(proxy_id, proxy_fname, elementDescript)[source]¶
Bases:
object
- getId(recreate=False)[source]¶
Generate the Credential id if we do not have one already. Since the Id is dependent on the credential content for proxies, recreate it if asked to do so.
- getString(cred_file=None)[source]¶
Based on the type of credentials read appropriate files and return the credentials to advertise as a string. The output should be encrypted by the caller as required.
- renew()[source]¶
Renews the credential if time_left() < update_frequency. Only works if type is grid_proxy or creation_script is provided.
- class glideinwms.frontend.glideinFrontendInterface.FactoryKeys4Advertize(classad_identity, factory_pub_key_id, factory_pub_key, glidein_symKey=None)[source]¶
Bases:
object
- class glideinwms.frontend.glideinFrontendInterface.FrontendDescript(my_name, frontend_name, group_name, web_url, main_descript, group_descript, signtype, main_sign, group_sign, x509_proxies_plugin=None, ha_mode='master')[source]¶
Bases:
object
- class glideinwms.frontend.glideinFrontendInterface.FrontendMonitorClassad(frontend_ref)[source]¶
Bases:
Classad
This class describes the frontend monitor classad. Frontend advertises the monitor classad to the user pool as an UPDATE_AD_GENERIC type classad
- setFrontendDetails(frontend_name, groups, ha_mode)[source]¶
Add the detailed description of the frontend.
- Parameters:
frontend_name (string) – A representation of the frontend MatchExpr
group_name (string) – Representation of the job query_expr
- class glideinwms.frontend.glideinFrontendInterface.FrontendMonitorClassadAdvertiser(pool=None, multi_support=False)[source]¶
Bases:
ClassadAdvertiser
Class to handle the advertisement of frontend monitor classads to the user pool
- class glideinwms.frontend.glideinFrontendInterface.Key4AdvertizeBuilder[source]¶
Bases:
object
Class for creating FactoryKeys4Advertize objects. Will reuse the symkey as much as possible.
- clear(created_after=None, accessed_after=None)[source]¶
Clear the cache
- Parameters:
created_after – if not None, only clear entries older than this
accessed_after – if not None, only clear entries not accessed recently
- get_key_obj(classad_identity, factory_pub_key_id, factory_pub_key, glidein_symKey=None)[source]¶
Get a key object
- Parameters:
classad_identity
factory_pub_key_id
factory_pub_key
glidein_symKey – will use one, if provided, but better to leave it blank and let the Builder create one; whoever can decrypt the pub key can anyhow get the symkey
Returns:
- class glideinwms.frontend.glideinFrontendInterface.MultiAdvertizeWork(descript_obj)[source]¶
Bases:
object
- add(factory_pool, request_name, glidein_name, min_nr_glideins, max_run_glideins, idle_lifetime=0, glidein_params={}, glidein_monitors={}, glidein_monitors_per_cred={}, key_obj=None, glidein_params_to_encrypt=None, security_name=None, remove_excess_str=None, remove_excess_margin=0, trust_domain='Any', auth_method='Any', ha_mode='master')[source]¶
- createAdvertizeWorkFile(factory_pool, params_obj, key_obj=None, file_id_cache=None)[source]¶
Create the advertize file. Expects the object variables adname, unique_id and x509_proxies_data to be set.
- createGlobalAdvertizeWorkFile(factory_pool)[source]¶
Create the advertize file for globals with credentials. Expects the object variables adname and x509_proxies_data to be set.
- do_advertize(file_id_cache=None, adname=None, create_files_only=False, reset_unique_id=True)[source]¶
Do the advertizing of the requests. Returns a dictionary of files that still need to be advertised. The key is the factory pool, while the element is a list of file names. Expects that the credentials have already been loaded.
- do_advertize_batch(filename_dict, remove_files=True)[source]¶
Advertize the classad files in the dictionary provided. The keys are the factory names, while the elements are lists of files. Safe to run in parallel, guaranteed to not modify the self object state.
- do_advertize_batch_one(factory_pool, filename_arr, remove_files=True)[source]¶
Advertize to a factory the classad files provided. Safe to run in parallel, guaranteed to not modify the self object state.
- do_advertize_one(factory_pool, file_id_cache=None, adname=None, create_files_only=False, reset_unique_id=True)[source]¶
Do the advertizing of requests for one factory. Returns the list of files that still need to be advertised. Expects that the credentials have already been loaded.
- do_global_advertize(adname=None, create_files_only=False, reset_unique_id=True)[source]¶
Advertize globals with credentials. Returns a dictionary of files that still need to be advertised. The key is the factory pool, while the element is a list of file names. Expects that the credentials have already been loaded.
- do_global_advertize_one(factory_pool, adname=None, create_files_only=False, reset_unique_id=True)[source]¶
Advertize globals with credentials to one factory. Returns the list of files that still need to be advertised. Expects that the credentials have already been loaded.
- initialize_advertize_batch(adname_prefix='gfi_ad_batch')[source]¶
Initialize the variables that are used for batch advertisement. Returns the adname to pass to do*advertize methods (will have to set reset_unique_id=False there, too)
- renew_and_load_credentials()[source]¶
Get the list of proxies, invoke the renew scripts if any, and read the credentials in memory. Modifies the self.x509_proxies_data variable.
- exception glideinwms.frontend.glideinFrontendInterface.NoCredentialException[source]¶
Bases:
Exception
- class glideinwms.frontend.glideinFrontendInterface.ResourceClassad(factory_ref, frontend_ref)[source]¶
Bases:
Classad
This class describes the resource classad. Frontend advertises the resource classad to the user pool as an UPDATE_AD_GENERIC type classad
- setCurbsAndLimits(limits_triggered)[source]¶
Set descriptive messages about which limits and curbs have been triggered in deciding the number of glideins to request
- Parameters:
limits_triggered (dictionary) – limits and curbs that have been triggered
- setEntryInfo(info)[source]¶
Set the useful entry specific info for the resource in the classad
- Parameters:
info (dict) – Useful info from the glidefactory classad
- setEntryMonitorInfo(info)[source]¶
Set the useful entry specific monitoring info for the resource in the classad. Monitoring info comes from the glidefactory classad (e.g. CompletedJobs)
- Parameters:
info (dict) – Useful monitoring info from the glidefactory classad
- setFrontendDetails(frontend_name, group_name, ha_mode)[source]¶
Add the detailed description of the frontend.
- Parameters:
frontend_name (string) – A representation of the frontend MatchExpr
group_name (string) – Representation of the job query_expr
- setGlideClientConfigLimits(info)[source]¶
Set the GlideClientConfig* for the resource in the classad
- Parameters:
info (dict) – Useful config information
- setGlideClientMonitorInfo(monitorInfo)[source]¶
Set the GlideClientMonitor* for the resource in the classad
- Parameters:
monitorInfo (list) – GlideClientMonitor information
- setGlideFactoryMonitorInfo(info)[source]¶
Set the GlideinFactoryMonitor* for the resource in the classad
- Parameters:
info (dict) – Useful information from the glidefactoryclient classad
- setInDownTime(downtime)[source]¶
Set the downtime flag for the resource in the classad
- Parameters:
downtime (bool) – True if the entry is in downtime
- setMatchExprs(match_expr, job_query_expr, factory_query_expr, start_expr)[source]¶
Sets the matching expressions for the resource classad. Thus, it would be possible to find out why a job is not matching.
- Parameters:
match_expr (string) – A representation of the frontend MatchExpr
job_query_expr (string) – Representation of the job query_expr
factory_query_expr (string) – Representation of the factory query_expr
start_expr (string) – Representation of the match start expr (on the glidein)
- class glideinwms.frontend.glideinFrontendInterface.ResourceClassadAdvertiser(pool=None, multi_support=False)[source]¶
Bases:
ClassadAdvertiser
Class to handle the advertisement of resource classads to the user pool
- glideinwms.frontend.glideinFrontendInterface.advertizeWorkFromFile(factory_pool, fname, remove_file=True, is_multi=False)[source]¶
- glideinwms.frontend.glideinFrontendInterface.deadvertizeAllGlobals(factory_pool, my_name, ha_mode='master')[source]¶
Removes all globals classads for the client in the factory.
- glideinwms.frontend.glideinFrontendInterface.deadvertizeAllWork(factory_pool, my_name, ha_mode='master')[source]¶
Removes all work requests for the client in the factory.
- glideinwms.frontend.glideinFrontendInterface.exe_condor_advertise(fname, command, pool, is_multi=False)[source]¶
- glideinwms.frontend.glideinFrontendInterface.findGlideinClientMonitoring(factory_pool, factory_identity, my_name, additional_constraint=None)[source]¶
- glideinwms.frontend.glideinFrontendInterface.findGlideins(factory_pool, factory_identity, signtype, additional_constraint=None)[source]¶
- glideinwms.frontend.glideinFrontendInterface.findGlobals(pool_name, auth_identity, classad_type, additional_constraint=None)[source]¶
Query the given pool to find the globals classad. Can be used to query glidefactoryglobal and glidefrontendglobal classads.
- glideinwms.frontend.glideinFrontendInterface.findMasterFrontendClassads(pool_name, frontend_name)[source]¶
Query the given pool to find master frontend classads
glideinwms.frontend.glideinFrontendLib module¶
This module implements the functions needed to keep the required number of idle glideins, plus other miscellaneous functions
- glideinwms.frontend.glideinFrontendLib.appendRealRunning(condorq_dict, status_dict)[source]¶
Adds provenance information from condor_status to the condor_q dictionary. The name of static or pslots is the value of RemoteHost. NOTE: HTC 8.5 may change RemoteHost to be the DynamicSlot name
- Parameters:
condorq_dict – adding ‘RunningOn’ to each job
status_dict – running jobs from condor_status
- Returns:
- glideinwms.frontend.glideinFrontendLib.countCondorStatus(status_dict)[source]¶
Return the number of items (slots) in the dictionary. Use the output of getCondorStatus.
- Parameters:
status_dict (dict) – output of getCondorStatus
- Returns:
number of slots in the dictionary
- Return type:
int
- glideinwms.frontend.glideinFrontendLib.countCoresCondorStatus(status_dict, state='TotalCores')[source]¶
Return the number of cores in the dictionary based on the requested state. Use the output of getCondorStatus.
- Parameters:
status_dict (dict) – output of getCondorStatus
state (str) – status to count (TotalCores, IdleCores, RunningCores)
- Returns:
number of cores counted
- Return type:
int
- glideinwms.frontend.glideinFrontendLib.countGlideinsCondorStatus(status_dict)[source]¶
Return the number of Glideins in the dictionary
A Glidein is an execution of the glidein_startup.sh script:
- it may be different from the job submitted by the factory (for multinode jobs - future)
- it is different from a slot (or schedd or vm)
It defines GLIDEIN_MASTER_NAME, which is the part after ‘@’ in the slot name. Sets from different collectors are assumed disjoint.
- Parameters:
status_dict (dict) – output of getCondorStatus
- Returns:
number of glideins in the dictionary
- Return type:
int
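Counting Glideins rather than slots means counting distinct master names, that is, the part after the first ‘@’ in each slot name. The sketch below assumes the status data is a plain nested dict (collector name, then slot name, then classad); the real status_dict holds condorStatus objects, and `count_glideins_sketch` and the sample slot names are hypothetical.

```python
def count_glideins_sketch(status_by_collector):
    """Count distinct glidein master names, summed over collectors.

    Sets from different collectors are assumed disjoint, so the
    per-collector counts are simply added.
    """
    total = 0
    for slots in status_by_collector.values():
        # Everything after the first '@' identifies the glidein.
        masters = {slot_name.split("@", 1)[-1] for slot_name in slots}
        total += len(masters)
    return total
```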
- glideinwms.frontend.glideinFrontendLib.countIdleCoresCondorStatus(status_dict)[source]¶
Counts the Idle cores in the status dictionary. The status is redundant in part, but necessary to correctly handle partitionable slots, which are 1 glidein but may have some running cores and some idle cores.
- Parameters:
status_dict (dict) – a dictionary with the Machines to count
- Returns:
number of cores for Idle slots in the Machine classads
- Return type:
int
- glideinwms.frontend.glideinFrontendLib.countMatch(match_obj, condorq_dict, glidein_dict, attr_dict, ignore_down_entries, condorq_match_list=None, match_policies=[], group_name=None)[source]¶
Get the number of jobs that match each glidein
- Parameters:
match_obj – output of compile(match string, ‘<string>’, ‘eval’)
condorq_dict (dictionary: sched_name->CondorQ object) – output of getIdleCondorQ
glidein_dict (dictionary: glidein_name->dictionary of params and attrs) – output of interface.findGlideins
attr_dict – dictionary of constant attributes
condorq_match_list – list of job attributes from the XML file
- Returns:
tuple of 4 elements, where the first 3 are dictionaries keyed by glidein name whose values are the number of matching jobs:
First element: Straight match
Second element: The entry proportion based on unique subsets
Third element: Elements that can only run on this site
Fourth element: The entry proportion of glideins to be requested based on unique subsets after considering multicore jobs, GLIDEIN_CPUS/GLIDEIN_ESTIMATED_CPUS (cores in glideins) and GLIDEIN_NODES (number of nodes in multinode submissions)
A special ‘glidein name’ of (None, None, None) is used for jobs that don’t match any ‘real glidein name’ in all 4 elements above
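The "straight match" count can be illustrated with a compiled Python expression evaluated once per job and glidein pair. This is a much reduced sketch, not countMatch itself: it assumes the expression is compiled with the built-in compile() in ‘eval’ mode and refers to variables named `job` and `glidein`, it uses plain lists and dicts instead of CondorQ objects, and it computes only the first element of the returned tuple.

```python
def count_straight_matches(match_expr, jobs, glideins):
    """Count, per glidein name, how many jobs satisfy the match expression.

    Jobs that match no glidein are counted under (None, None, None).
    """
    match_obj = compile(match_expr, "<string>", "eval")
    unmatched_key = (None, None, None)
    counts = {name: 0 for name in glideins}
    counts[unmatched_key] = 0
    for job in jobs:
        matched = False
        for name, glidein in glideins.items():
            # Assumed variable names visible to the expression: job, glidein.
            if eval(match_obj, {}, {"job": job, "glidein": glidein}):
                counts[name] += 1
                matched = True
        if not matched:
            counts[unmatched_key] += 1
    return counts
```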
- glideinwms.frontend.glideinFrontendLib.countRealRunning(match_obj, condorq_dict, glidein_dict, attr_dict, condorq_match_list=None, match_policies=[])[source]¶
Counts all the running jobs on an entry
- Parameters:
match_obj – selection for the jobs
condorq_dict – result of condor_q, keyed by schedd name
glidein_dict – glideins, keyed by entry (glidename)
attr_dict – entry attributes, NOT USED
condorq_match_list – match attributes used for clustering
match_policies
- Returns:
Tuple with the job counts (used for stats) and glidein counts (used for glidein_max_run). Both are dictionaries keyed by glidename (entry)
- glideinwms.frontend.glideinFrontendLib.countRunningCondorStatus(status_dict)[source]¶
Return the number of running slots in the dictionary. Use the output of getCondorStatus for running slots. The counting loop skips partitionable slots.
- Parameters:
status_dict (dict) – output of getCondorStatus for running slots
- Returns:
number of slots in the dictionary (dynamic + static)
- Return type:
int
- glideinwms.frontend.glideinFrontendLib.countRunningCoresCondorStatus(status_dict)[source]¶
Counts the running cores in the status dictionary. The status is redundant in part, but necessary to correctly handle partitionable slots, which are 1 glidein but may have some running cores and some idle cores.
- Parameters:
status_dict (dict) – a dictionary with the Machines to count
- Returns:
number of cores for Running slots in the Machine classads
- Return type:
int
- glideinwms.frontend.glideinFrontendLib.countTotalCoresCondorStatus(status_dict)[source]¶
Return the number of cores in the dictionary. Use the output of getCondorStatus.
Counts the cores in the status dictionary. The status is redundant in part, but necessary to correctly handle partitionable slots, which are 1 glidein but may have some running cores and some idle cores.
- Parameters:
status_dict (dict) – output of getCondorStatus, dictionary with the Machines to count
- Returns:
number of cores in the Machine classads
- Return type:
int
- glideinwms.frontend.glideinFrontendLib.getClientCondorStatus(status_dict, frontend_name, group_name, request_name)[source]¶
Return a dictionary of collectors containing all slots for a request (idle, running, …). Each element is a condorStatus.
Use the output of getCondorStatus.
- Parameters:
status_dict (dict) – output of getCondorStatus
frontend_name (str) – frontend name
group_name (str) – group name
request_name (str) – request name
- Returns:
dictionary of collectors containing all slots for a request
- Return type:
dict
- glideinwms.frontend.glideinFrontendLib.getClientCondorStatusCredIdOnly(status_dict, cred_id)[source]¶
Return a dictionary of collectors containing slots of a specific credential. Each element is a condorStatus.
Input should be the output of getClientCondorStatus or equivalent.
- Parameters:
status_dict (dict) – output of getCondorStatus()
cred_id (str) – credential ID
- Returns:
dictionary of collectors containing slots of a specific credential
- Return type:
dict
- glideinwms.frontend.glideinFrontendLib.getClientCondorStatusPerCredId(status_dict, frontend_name, group_name, request_name, cred_id)[source]¶
Return a dictionary of collectors containing slots at a client split for a specific credential. Each element is a condorStatus.
Use the output of getCondorStatus.
- Parameters:
status_dict (dict) – output of getCondorStatus
frontend_name (str) – frontend name
group_name (str) – group name
request_name (str) – request name
cred_id (str) – credential ID
- Returns:
dictionary of collectors containing slots of a specific request and credential
- Return type:
dict
- glideinwms.frontend.glideinFrontendLib.getCondorQ(schedd_names, constraint=None, format_list=None, want_format_completion=True, job_status_filter=(1, 2))[source]¶
Return a dictionary of schedds containing interesting jobs. Each element is a condorQ.
If not all the jobs of the schedd have to be considered, specify the appropriate constraint.
- Parameters:
schedd_names
constraint (str) – constraint string or None
format_list
want_format_completion (bool)
job_status_filter
Returns:
- glideinwms.frontend.glideinFrontendLib.getCondorQConstrained(schedd_names, type_constraint, constraint=None, format_list=None)[source]¶
- glideinwms.frontend.glideinFrontendLib.getCondorStatus(collector_names, constraint=None, format_list=None, want_format_completion=True, want_glideins_only=True)[source]¶
Return a dictionary of collectors containing interesting classads. Each element is a condorStatus.
- Parameters:
collector_names
constraint
format_list
want_format_completion
want_glideins_only
- glideinwms.frontend.glideinFrontendLib.getCondorStatusConstrained(collector_names, type_constraint, constraint=None, format_list=None, subsystem_name=None)[source]¶
- glideinwms.frontend.glideinFrontendLib.getCondorStatusNonDynamic(status_dict)[source]¶
Return a dictionary of collectors containing static and partitionable slots, excluding any dynamic slots.
Each element is a condorStatus. Use the output of getCondorStatus.
- glideinwms.frontend.glideinFrontendLib.getCondorStatusSchedds(collector_names, constraint=None, format_list=None, want_format_completion=True)[source]¶
Return a dictionary of collectors containing interesting classads. Each element is a condorStatus.
Return the schedd classads.
- Parameters:
collector_names
constraint (str, None)
format_list (list, None)
want_format_completion (bool) – add default elements to the format_list if True (default)
Returns:
- glideinwms.frontend.glideinFrontendLib.getFactoryEntryList(status_dict)[source]¶
Given startd classads, return the list of all the factory entries. Each element in the list is (req_name, node_name).
- Parameters:
status_dict (dict) – a dictionary with the Machines to count from condorStatus
- Returns:
list of tuples with all the factory entries (req_name, node_name)
- Return type:
list
- glideinwms.frontend.glideinFrontendLib.getGlideinCpusNum(glidein, estimate_cpus=True)[source]¶
Given the glidein data structure, get the GLIDEIN_CPUS and GLIDEIN_ESTIMATED_CPUS configured. If estimate_cpus is False, translate keywords to their numerical equivalents (auto/slot -> -1, node -> 0); otherwise estimate the CPUs. If GLIDEIN_CPUS is not configured, assume it to be 1. If it is set to auto/slot/-1 or node/0, use GLIDEIN_ESTIMATED_CPUS if provided; otherwise assume it to be 1. In the future there should be better guesses.
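The rules described for getGlideinCpusNum can be illustrated with a minimal sketch. It is not the library's implementation: the function name `glidein_cpus_sketch` is hypothetical, and it assumes the glidein attributes are available as a plain dict rather than the actual glidein data structure.

```python
def glidein_cpus_sketch(attrs, estimate_cpus=True):
    """Illustrate the GLIDEIN_CPUS rules on a plain dict of attributes.

    `attrs` is a hypothetical stand-in for the glidein data structure.
    """
    # GLIDEIN_CPUS not configured: assume 1
    cpus = str(attrs.get("GLIDEIN_CPUS", 1)).lower()
    if cpus in ("auto", "slot", "-1"):
        keyword_value = -1
    elif cpus in ("node", "0"):
        keyword_value = 0
    else:
        # A plain number is used as is
        return int(cpus)
    if not estimate_cpus:
        # Only translate the keyword to its numerical equivalent
        return keyword_value
    # Estimate: use GLIDEIN_ESTIMATED_CPUS if provided, otherwise assume 1
    return int(attrs.get("GLIDEIN_ESTIMATED_CPUS", 1))
```

For example, `glidein_cpus_sketch({"GLIDEIN_CPUS": "auto"}, estimate_cpus=False)` yields the keyword translation, while the default call falls back to the estimate.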
- glideinwms.frontend.glideinFrontendLib.getGlideinNodesNum(glidein, estimate_nodes=True)[source]¶
Given the glidein data structure, get the GLIDEIN_NODES configured. If estimate_nodes is False, translate keywords to their numerical equivalents (and raise ValueError if there is no valid keyword); otherwise estimate the nodes. If GLIDEIN_NODES is not configured, assume it to be 1. Currently no keyword is allowed; estimate_nodes is there for future expansion.
- glideinwms.frontend.glideinFrontendLib.getHACheckInterval(frontend_data)[source]¶
Given the frontendDescript, return the HA check interval.
- glideinwms.frontend.glideinFrontendLib.getHAMode(frontend_data)[source]¶
Given the frontendDescript, return whether this frontend is to be run in ‘master’ or ‘slave’ mode.
- glideinwms.frontend.glideinFrontendLib.getIdleCondorStatus(status_dict, min_memory=2500)[source]¶
Return a dictionary of collectors containing idle (unclaimed) vms. Each element is a condorStatus.
Exclude partitionable slots with no free memory/cpus. The minimum memory required by CMS is 2500 MB. If the node has GPUs, at least one should be available (requested by CMS).
Slots matching any of these conditions are included:
1. (el.get('PartitionableSlot') != True): static slots, irrespective of the free cpu/mem
2. (el.get('TotalSlots') == 1): p-slots not yet partitioned
3. ((el.get('Cpus', 0) > 0 and el.get('Memory', 2501) > min_memory) and (el.get('TotalGpus', 0) == 0 or el.get('Gpus', 0) > 0)): p-slots that have enough idle resources
- Parameters:
status_dict (dict) – all condor status jobs as returned by getCondorStatus
min_memory (int) – minimum memory in MB for partitionable slots (default=2500)
- Returns:
condorStatus with Idle jobs
- Return type:
dict
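The three slot conditions listed for getIdleCondorStatus can be read as a single predicate. The following is a minimal sketch of that reading, assuming each slot classad is a plain dict; the function name `keeps_slot` is hypothetical, and any additional state or activity checks the library may apply are not shown.

```python
def keeps_slot(el, min_memory=2500):
    """Return True if the slot classad `el` (a dict) passes the listed conditions."""
    # 1. Static slots are kept irrespective of the free cpu/mem
    if el.get("PartitionableSlot") != True:
        return True
    # 2. p-slots not yet partitioned
    if el.get("TotalSlots") == 1:
        return True
    # 3. p-slots that have enough idle resources
    has_cpu_mem = el.get("Cpus", 0) > 0 and el.get("Memory", 2501) > min_memory
    gpu_ok = el.get("TotalGpus", 0) == 0 or el.get("Gpus", 0) > 0
    return has_cpu_mem and gpu_ok
```

Note that a missing Memory attribute defaults to 2501, so with the default min_memory a p-slot without that attribute is not excluded on memory grounds.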
- glideinwms.frontend.glideinFrontendLib.getRunningCondorStatus(status_dict)[source]¶
Return a dictionary of collectors containing running (claimed) slots. Each element is a condorStatus.
- Parameters:
status_dict – output of getCondorStatus
- Returns:
dictionary of collectors containing running(claimed) slots
- glideinwms.frontend.glideinFrontendLib.getRunningCoresCondorStatus(status_dict)[source]¶
Return a dictionary of collectors containing running (claimed) cores. Each element is a condorStatus.
Use the output of getCondorStatus
- Parameters:
status_dict (dict) – output of getCondorStatus()
- Returns:
dictionary of collectors containing running(claimed) cores
- Return type:
dict
- glideinwms.frontend.glideinFrontendLib.getRunningJobsCondorStatus(status_dict)[source]¶
Return a dictionary of collectors containing running (claimed) slots. This includes fixed slots and dynamic slots (no partitionable slots). Each one is matched with a single job, which gives the number of running jobs. Each element is a condorStatus.
- Parameters:
status_dict – output of getCondorStatus
- Returns:
dictionary of collectors containing running(claimed) slots
- glideinwms.frontend.glideinFrontendLib.getRunningPSlotCondorStatus(status_dict)[source]¶
Return a dictionary of collectors containing running (claimed) partitionable slots. Each element is a condorStatus.
- Parameters:
status_dict – output of getCondorStatus
- Returns:
collectors containing running(claimed) partitionable slots
glideinwms.frontend.glideinFrontendMonitorAggregator module¶
This module implements the functions needed to aggregate the monitoring of the frontend
- class glideinwms.frontend.glideinFrontendMonitorAggregator.MonitorAggregatorConfig[source]¶
Bases:
object
- glideinwms.frontend.glideinFrontendMonitorAggregator.verifyRRD(fix_rrd=False, backup=False)[source]¶
Go through all known monitoring RRDs and verify that they match the existing schema (which could differ if an upgrade happened). If fix_rrd is True, also attempt to add any missing attributes.
- Parameters:
fix_rrd (bool) – if True, will attempt to add missing attrs
backup (bool) – if True, backup the old RRD before fixing
- Returns:
True if all OK, False if there is a problem w/ RRD files
- Return type:
bool
glideinwms.frontend.glideinFrontendMonitoring module¶
- class glideinwms.frontend.glideinFrontendMonitoring.factoryStats[source]¶
Bases:
object
- logClientMonitor(client_name, client_monitor, client_internals)[source]¶
client_monitor is a dictionary of monitoring info. client_internals is a dictionary of internals.
- At the moment, it looks only for
‘Idle’ ‘Running’ ‘GlideinsIdle’ ‘GlideinsRunning’ ‘GlideinsTotal’ ‘LastHeardFrom’
- class glideinwms.frontend.glideinFrontendMonitoring.groupStats[source]¶
Bases:
object
- glideinwms.frontend.glideinFrontendMonitoring.write_frontend_descript_xml(frontendDescript, monitor_dir)[source]¶
Writes out the frontend descript.xml file in the monitor web area.
- Parameters:
frontendDescript (FrontendDescript) – contains the data in the frontend.descript file in the frontend instance dir
monitor_dir (str) – file path to the monitor dir in the frontend instance dir
glideinwms.frontend.glideinFrontendPidLib module¶
- class glideinwms.frontend.glideinFrontendPidLib.ElementPidSupport(startup_dir, group_name)[source]¶
Bases:
PidWParentSupport
- class glideinwms.frontend.glideinFrontendPidLib.FrontendPidSupport(startup_dir)[source]¶
Bases:
PidSupport
- glideinwms.frontend.glideinFrontendPidLib.get_element_pid(startup_dir, group_name)[source]¶
Raise an exception if not running
- Parameters:
startup_dir
group_name
Returns:
- Raises:
RuntimeError – if the Group element process is not running or has no parent
glideinwms.frontend.glideinFrontendPlugins module¶
This module implements plugins for the VO frontend
- class glideinwms.frontend.glideinFrontendPlugins.ProxyAll(config_dir, proxy_list)[source]¶
Bases:
object
This plugin returns all the proxies
This can be a very useful default policy
- get_credentials(params_obj=None, credential_type=None, trust_domain=None)[source]¶
get the credentials, given the condor_q and condor_status data
- Parameters:
params_obj – optional parameters to be used in job splitting
credential_type (str) – optional credential type to match with a supported auth_method
trust_domain (str) – optional trust domain
- Returns:
list of credentials
- Return type:
list
- get_required_classad_attributes()[source]¶
what glidein attributes are used by this plugin
- Returns:
used glidein attributes, none
- Return type:
list
- class glideinwms.frontend.glideinFrontendPlugins.ProxyFirst(config_dir, proxy_list)[source]¶
Bases:
object
This plugin always returns the first proxy. Useful when there is only one proxy or for testing.
- class glideinwms.frontend.glideinFrontendPlugins.ProxyProjectName(config_dir, proxy_list)[source]¶
Bases:
object
Given a ‘normal’ credential, create sub-credentials based on the ProjectName attribute of jobs
- class glideinwms.frontend.glideinFrontendPlugins.ProxyUserCardinality(config_dir, proxy_list)[source]¶
Bases:
object
This plugin uses the first N proxies where N is the number of users currently in the system
This is useful if the first proxies are higher priority than the later ones. Also good for testing.
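As an illustration of the "first N proxies" policy described for ProxyUserCardinality, here is a minimal sketch. The function name `first_n_proxies` and the idea of passing the job owners as a plain list are assumptions made for the example, not the plugin's interface.

```python
def first_n_proxies(proxy_list, users):
    """Return the first N proxies, where N is the number of distinct users."""
    n_users = len(set(users))
    # Slicing never goes past the end, so extra users simply get all proxies
    return proxy_list[:n_users]
```

With three proxies and jobs from two distinct users, only the first two proxies would be selected, which is why ordering the list by priority matters.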
- class glideinwms.frontend.glideinFrontendPlugins.ProxyUserMapWRecycling(config_dir, proxy_list)[source]¶
Bases:
object
This plugin implements a user-based mapping policy with the possibility of recycling accounts:
* when a user first enters the system, it gets mapped to a pilot proxy that was not used for the longest time
* for existing users, just use the existing mapping
* if an old user comes back, it may be mapped to the old account, if not yet recycled, else it is treated as a new user
- class glideinwms.frontend.glideinFrontendPlugins.ProxyUserRR(config_dir, proxy_list)[source]¶
Bases:
object
This plugin implements a user-based round-robin policy. The same proxies are used as long as the users don’t change (we keep a disk-based memory for this purpose). Once any user leaves, the most used credential is rotated to the back of the list. If more users enter, they will reach farther down the list to access less used credentials.
- glideinwms.frontend.glideinFrontendPlugins.createCredentialList(elementDescript)[source]¶
Creates a list of Credentials for a proxy plugin
- glideinwms.frontend.glideinFrontendPlugins.fair_assign(cred_list, params_obj)[source]¶
Assigns requests to each credential in cred_list. max run will remain constant between iterations; req idle will be shuffled each iteration.
Note that shuffling will tend towards rounding up ReqIdle over the long run but, since this is partially a throttling mechanism, it is okay to slow this down a little bit with shuffling.
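The idea of keeping max run constant while shuffling req idle can be sketched as follows. This is a simplified illustration only: the names `fair_split` and `assign_sketch` and their parameters are hypothetical, the credentials are reduced to a count, and the sketch does not reproduce the library's rounding behaviour.

```python
import random


def fair_split(total, parts):
    """Split `total` into `parts` near-equal integer shares (deterministic order)."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def assign_sketch(n_creds, total_idle, total_max_run, rng=random):
    """Return one (req_idle, req_max_run) pair per credential."""
    # max run: same assignment on every iteration
    max_run = fair_split(total_max_run, n_creds)
    # req idle: same shares, but which credential gets which share is shuffled
    idle = fair_split(total_idle, n_creds)
    rng.shuffle(idle)
    return list(zip(idle, max_run))
```

Passing a seeded `random.Random` instance as `rng` makes the shuffle reproducible, which can be convenient when testing.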
glideinwms.frontend.gwms_renew_proxies module¶
Automatic renewal of proxies necessary for a glideinWMS frontend
- exception glideinwms.frontend.gwms_renew_proxies.ConfigError[source]¶
Bases:
BaseException
Catch-all class for errors in proxies.ini or system VO configuration
- class glideinwms.frontend.gwms_renew_proxies.Proxy(cert, key, output, lifetime, uid=0, gid=0, rfc='true', pathlength='20', bits='2048')[source]¶
Bases:
object
Class for holding information related to the proxy
- _voms_proxy_info(*opts)[source]¶
Run voms-proxy-info. Returns stdout, stderr, and return code of voms-proxy-info
- actimeleft()[source]¶
Safely return the remaining lifetime of the proxy’s VOMS AC, in seconds (returns 0 if unexpected stdout)
- timeleft()[source]¶
Safely return the remaining lifetime of the proxy, in seconds (returns 0 if unexpected stdout)
- classmethod timeleft_from_file(filename)[source]¶
Safely return the remaining lifetime of the proxy in the arbitrary file, in seconds (returns 0 if unexpected stdout)
- class glideinwms.frontend.gwms_renew_proxies.VO(vo, fqan)[source]¶
Bases:
object
Class for holding information related to VOMS attributes
- glideinwms.frontend.gwms_renew_proxies._run_command(command)[source]¶
Runs the specified command, specified as a list. Returns stdout, stderr and return code
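A helper with the contract described for _run_command (command given as a list, returning stdout, stderr and return code) could look like the sketch below. The name `run_command_sketch` is hypothetical, and decoding the output as text is an assumption; the library's helper may behave differently.

```python
import subprocess
import sys


def run_command_sketch(command):
    """Run `command` (a list of arguments) and return (stdout, stderr, returncode)."""
    # capture_output/text are assumptions of this sketch (Python 3.7+)
    proc = subprocess.run(command, capture_output=True, text=True)
    return proc.stdout, proc.stderr, proc.returncode


# Usage: run the current Python interpreter as a harmless test command
out, err, rc = run_command_sketch([sys.executable, "-c", "print('ok')"])
```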
- glideinwms.frontend.gwms_renew_proxies._safe_int(string_var)[source]¶
Convert a string to an integer. If the string cannot be cast, return 0.
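The behaviour described for _safe_int amounts to a guarded conversion. A minimal sketch, under the hypothetical name `safe_int_sketch` and with the assumption that non-string input such as None is also mapped to 0:

```python
def safe_int_sketch(string_var):
    """Convert a string to an integer, returning 0 if the conversion fails."""
    try:
        return int(string_var)
    except (ValueError, TypeError):
        # ValueError: not a number; TypeError: e.g. None (assumption of this sketch)
        return 0
```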
- glideinwms.frontend.gwms_renew_proxies.parse_vomses(vomses_contents)[source]¶
Parse the contents of a vomses file with the following format per line:
“<VO ALIAS>” “<VOMS ADMIN HOSTNAME>” “<VOMS ADMIN PORT>” “<VOMS CERT DN>” “<VO NAME>”
And return two mappings:
Case-insensitive VO names to their canonical versions
VO certificate DN to URI, i.e. HOSTNAME:PORT
- Parameters:
vomses_contents (str) – vomses file content
- Returns:
lower case VO names to correct case, DN to “host:port”
- Return type:
dict, dict
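The two mappings returned by parse_vomses can be illustrated with a minimal sketch. It is not the library's parser: the name `parse_vomses_sketch` is hypothetical, quoted fields are split with `shlex`, the fifth field is assumed to be the VO name used as the key, and lines that do not have exactly five fields are skipped.

```python
import shlex


def parse_vomses_sketch(vomses_contents):
    """Return (lower-case VO name -> VO name, cert DN -> "host:port")."""
    vo_names = {}
    vo_uri_map = {}
    for line in vomses_contents.splitlines():
        fields = shlex.split(line)  # handles the double-quoted fields
        if len(fields) != 5:
            continue  # blank or malformed line (assumption: silently skipped)
        _alias, host, port, cert_dn, vo_name = fields
        vo_names[vo_name.lower()] = vo_name
        vo_uri_map[cert_dn] = f"{host}:{port}"
    return vo_names, vo_uri_map


# Usage with a made-up entry (host, port and DN are illustrative only)
sample = '"osg" "voms.example.org" "15027" "/DC=org/CN=voms.example.org" "OSG"\n'
names, uris = parse_vomses_sketch(sample)
```

The lower-cased keys allow a VO name to be looked up regardless of the case in which it was written.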