Chef::Expander::ClusterSupervisor

ClusterSupervisor

Manages a cluster of chef-expander processes. Usually this class will
be instantiated from the chef-expander-cluster executable.

ClusterSupervisor works by forking the desired number of processes, then
running VNodeSupervisor.start_cluster_worker within the forked process.
ClusterSupervisor keeps track of the process ids of its children, and will
periodically attempt to reap them in a non-blocking call. If they are
reaped, ClusterSupervisor knows they died and need to be respawned.
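
The fork-and-reap cycle described above can be sketched in isolation. This is a minimal illustration, not the library's code: the helper names spawn_children and reap_dead are hypothetical, and the block passed to spawn_children stands in for the call to VNodeSupervisor.start_cluster_worker.

```ruby
# Hypothetical helper: fork `count` children, each running the given block.
# Returns a Hash of pid => {:index => n}, similar in shape to @workers.
def spawn_children(count)
  workers = {}
  count.times do |i|
    pid = fork do
      yield(i + 1)
      exit!(0) # leave the child without running the parent's at_exit hooks
    end
    workers[pid] = {:index => i + 1}
  end
  workers
end

# Hypothetical helper: non-blocking reap of the given workers.
# Returns the subset that has exited, with the exit status added.
def reap_dead(workers)
  dead = {}
  workers.each do |pid, params|
    # With WNOHANG, waitpid2 returns nil while the child is still running
    # and [pid, Process::Status] once it has exited.
    result = Process.waitpid2(pid, Process::WNOHANG)
    dead[pid] = params.merge(:status => result[1].exitstatus) if result
  end
  dead
end
```

A pid that has already been reaped must be dropped from the table before the next pass, because calling waitpid2 on it again raises Errno::ECHILD. maintain_workers handles this by deleting the dead pid from @workers before it respawns the worker.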

The child processes are responsible for checking on the master process and
dying if the master has died (VNodeSupervisor does this when started
with start_cluster_worker).
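
One common way for a child to notice that its master has died is to compare Process.ppid with the parent pid it recorded at fork time, since an orphaned process is reparented. The sketch below assumes that approach; it is not taken from VNodeSupervisor, whose actual check is not shown on this page, and the names master_alive? and run_worker are hypothetical.

```ruby
# Hypothetical check: true while the process that forked us is still our parent.
def master_alive?(original_ppid)
  Process.ppid == original_ppid
end

# Hypothetical worker loop: run one unit of work per interval and return
# as soon as the master is gone, so the caller can exit.
def run_worker(original_ppid, interval = 1)
  while master_alive?(original_ppid)
    yield
    sleep interval
  end
end

# Usage inside a supervisor (illustrative only):
#   master_pid = Process.pid
#   fork { run_worker(master_pid) { do_work } }
```

Polling keeps the child simple, at the cost of the worker outliving the master by up to one interval.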

TODO:

* This implementation currently assumes there is only one cluster, so it
  will claim all of the vnodes. It may be advantageous to allow multiple
  clusters.
* There is no heartbeat implementation at this time, so a zombified child
  process will not be automatically killed. Detecting and killing such a
  process is left to the meatcloud (human operators) for now.

Public Class Methods

new()
# File lib/chef/expander/cluster_supervisor.rb, line 54
def initialize
  @workers = {}
  @running = true
  @kill    = :TERM
end

Public Instance Methods

maintain_workers()
# File lib/chef/expander/cluster_supervisor.rb, line 102
def maintain_workers
  while @running
    sleep 1
    workers_to_replace = {}
    @workers.each do |process_id, worker_params|
      if result = Process.waitpid2(process_id, Process::WNOHANG)
        log.error { "worker #{worker_params[:index]} (PID: #{process_id}) died with status #{result[1].exitstatus || '(no status)'}"}
        workers_to_replace[process_id] = worker_params
      end
    end
    workers_to_replace.each do |dead_pid, worker_params|
      @workers.delete(dead_pid)
      start_worker(worker_params[:index])
    end
  end

  @workers.each do |pid, worker_params|
    log.info { "Stopping worker #{worker_params[:index]} (PID: #{pid})"}
    Process.kill(@kill, pid)
  end
  @workers.each do |pid, worker_params|
    Process.waitpid2(pid)
  end

end
start()
# File lib/chef/expander/cluster_supervisor.rb, line 60
def start
  trap(:INT)  { stop(:INT) }
  trap(:TERM) { stop(:TERM)}
  Expander.init_config(ARGV)

  log.info("Chef Expander #{Expander.version} starting cluster with #{Expander.config.node_count} nodes")
  configure_process
  start_workers
  maintain_workers
  release_locks
rescue Configuration::InvalidConfiguration => e
  log.fatal {"Configuration Error: " + e.message}
  exit(2)
rescue Exception => e
  raise if SystemExit === e

  log.fatal {e}
  exit(1)
end
start_worker(index)
# File lib/chef/expander/cluster_supervisor.rb, line 86
def start_worker(index)
  log.info { "Starting cluster worker #{index}" }
  worker_params = {:index => index}
  child_pid = fork do
    Expander.config.index = index
    VNodeSupervisor.start_cluster_worker
  end
  @workers[child_pid] = worker_params
end
start_workers()
# File lib/chef/expander/cluster_supervisor.rb, line 80
def start_workers
  Expander.config.node_count.times do |i|
    start_worker(i + 1)
  end
end
stop(signal)
# File lib/chef/expander/cluster_supervisor.rb, line 96
def stop(signal)
  log.info { "Stopping cluster on signal (#{signal})" }
  @running = false
  @kill    = signal
end
