scheduler monitoring
Scheduler monitoring can help to find out why certain jobs are not
dispatched. However, providing this information for all jobs at all
times can be resource consuming and is usually not needed. To
disable scheduler monitoring, set 'schedd_job_info' to 'false' in
the scheduler configuration sched_conf(5).
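As a minimal sketch, the relevant line of the scheduler configuration
would look as follows after the change. It assumes the configuration
is edited with "qconf -msconf"; all other entries stay untouched.

```
# scheduler configuration, see sched_conf(5)
schedd_job_info                   false
```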
finished jobs
In case of array jobs the finished job list in qmaster can become
quite big. Switching it off saves memory and speeds up qstat
commands, because qstat also fetches the finished jobs list. Set
'finished_jobs' to '0' in the global configuration sge_conf(5).
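A sketch of the corresponding entry in the global configuration,
assuming it is edited with "qconf -mconf":

```
# global cluster configuration, see sge_conf(5)
finished_jobs                0
```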
job verification
Forcing a validation at job submission time can be a valuable tool
to prevent non-dispatchable jobs from remaining in pending state.
Especially in heterogeneous environments with a variety of different
execution nodes and consumable resources, and with every user having
their own job profile, handling non-dispatchable jobs can be time
consuming. In homogeneous environments with only a couple of
different jobs, a general and expensive job validation can usually
be omitted. Job verification is disabled by adding the qsub(1)
option "-w n" to the cluster-wide default requests (see
sge_request(5)).
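For illustration, the cluster-wide default request file could carry
the option as shown below. The file location
<sge_root>/<cell>/common/sge_request is taken from sge_request(5);
verify it against your installation.

```
# cluster-wide default requests, see sge_request(5)
# skip job verification at submission time
-w n
```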
load thresholds and suspend thresholds
Load thresholds are needed if you deliberately oversubscribe your
machines and need a mechanism to limit the oversubscription. Suspend
thresholds are also used in connection with oversubscription. The
other case in which load thresholds are needed is when the execution
node is still open for interactive load which is not under the
control of Grid Engine and you want to prevent the node from being
overloaded. If a compute farm is simple enough that each CPU of a
compute node is represented by only one queue slot and no
interactive load is expected at these nodes, then 'load_thresholds'
can be omitted. To disable both thresholds set 'load_thresholds' to
'none' and 'suspend_thresholds' to 'none' (see queue_conf(5)).
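A sketch of the two entries in a queue configuration, assuming the
queue is edited with "qconf -mq <queue_name>", where <queue_name> is
a placeholder for one of your queues:

```
# queue configuration, see queue_conf(5)
load_thresholds       none
suspend_thresholds    none
```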
load adjustments
Load adjustments are used to virtually increase the measured load
after a job has been dispatched. This mechanism is helpful on
oversubscribed machines to align with load thresholds. Load
adjustments should be switched off if they are not needed, because
they impose additional work on the scheduler for sorting hosts and
verifying load thresholds. To disable load adjustments set
'job_load_adjustments' to 'none' and 'load_adjustment_decay_time'
to '0' in the scheduler configuration sched_conf(5).
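The two scheduler configuration entries would then read as sketched
below, again assuming the configuration is edited with
"qconf -msconf". The values are the ones named in the text above.

```
# scheduler configuration, see sched_conf(5)
job_load_adjustments              none
load_adjustment_decay_time        0
```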
scheduling-on-demand
By default Grid Engine starts scheduling runs at a fixed schedule
interval (see schedule_interval in sched_conf(5)). The advantage of
fixed intervals is that they limit the CPU time consumed by
qmaster/scheduler. The disadvantage is that they throttle the
scheduler artificially, which limits throughput. In many compute
farms there are machines specifically dedicated to qmaster/scheduler,
and in such setups there is no reason to throttle the scheduler.
Scheduling-on-demand can be configured using the FLUSH_SUBMIT_SEC and
FLUSH_FINISH_SEC settings in the schedd_params section of the global
cluster configuration sge_conf(5). If scheduling-on-demand is
activated, the throughput of a compute farm is limited only by the
power of the machine hosting qmaster/scheduler.