Update Lifecycle
There are several stages that all jobs (and their associated processes) on each VM go through during a deployment process.
When start is issued
- Persistent disks are mounted on the VM, if configured and not yet mounted
- All jobs and their dependent packages are downloaded and placed onto a machine
- pre-start scripts run for all jobs on the VM in parallel
  - (waits for all pre-start scripts to finish)
  - if bpm is used, bpm's pre-start will run first and has a timeout of 30 seconds
  - a job's pre-start does not time out
- monit start is called for each process in no particular order
  - each job can specify zero or more processes
  - times out based on canary_watch_time/update_watch_time settings
- post-start scripts run for all jobs on the VM in parallel
  - (waits for all post-start scripts to finish)
  - does not time out
- post-deploy scripts run for all jobs on all VMs in parallel
  - (waits for all post-deploy scripts to finish)
  - does not time out
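Because post-start and post-deploy scripts do not time out, a script that waits on some condition can stall a deployment indefinitely unless it bounds its own wait. The following is a minimal sketch of one way to do that; wait_until is a hypothetical helper name, not part of BOSH, and the socket path in the usage comment is purely illustrative.

```shell
#!/usr/bin/env bash

# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND about once per second until it
# succeeds or TIMEOUT_SECONDS have elapsed.
# Returns 0 when COMMAND succeeded, 1 on timeout.
wait_until() {
  local timeout deadline
  timeout="$1"
  shift
  deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
  done
  return 0
}

# Example use inside a post-start script (illustrative path):
#   wait_until 60 test -S /var/vcap/sys/run/myjob/myjob.sock || exit 1
```

Exiting non-zero on timeout lets the deployment fail with a clear message instead of hanging.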
Note
Scripts should not rely on the order in which they are run. The Agent may decide to run them serially or in parallel.
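To make the "run in parallel, then wait for all to finish" behavior concrete, here is a small shell model of it. This is only an illustration of the semantics described above, not the Agent's actual implementation; run_all_parallel is a hypothetical name.

```shell
#!/usr/bin/env bash

# run_all_parallel SCRIPT...
# Hypothetical model: starts every script in the background, then waits
# for each one. Completion order is not defined, which is why scripts
# must not depend on one another's side effects.
# Returns 0 only if every script exited 0.
run_all_parallel() {
  local pids="" pid script failed=0
  for script in "$@"; do
    "$script" &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

A script that needs a directory or file should therefore create it itself rather than assume another job's script has already done so.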
When processes are running
- Monit will automatically restart processes that failed their associated checks
  - A common pattern used is a PID check: when no process ID can be found in any .pid file, or the process ID is not alive anymore, then the process is restarted.
  - A usual pitfall arises when the process ID is not properly written in the .pid file, in which case Monit loses its handle on the actual process state and things start diverging. Using bpm is an effective solution to avoid falling into that trap.
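The PID check described above can be approximated in shell as follows. This is a sketch of the idea only, not how Monit itself is implemented; pid_is_alive is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# pid_is_alive PIDFILE
# Hypothetical helper: returns 0 if PIDFILE is readable, contains only a
# numeric PID, and a process with that PID currently exists.
# Returns non-zero otherwise.
# Note: kill -0 also fails when the caller is not permitted to signal
# the process, so run this as a user that is allowed to signal the job.
pid_is_alive() {
  local pidfile pid
  pidfile="$1"
  [ -r "$pidfile" ] || return 1
  pid="$(tr -d '[:space:]' < "$pidfile")"
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
  esac
  kill -0 "$pid" 2>/dev/null
}
```

An empty, stale, or malformed .pid file makes the check fail even while the real process keeps running, which is the divergence the pitfall above refers to.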
When stop is issued (or before update and subsequent start happens)
- monit unmonitor is called for each process
- pre-stop scripts run for all jobs on the VM in parallel
  - (waits for all pre-stop scripts to finish)
  - does not time out
  - requires BOSH v269+ and minimum Xenial stemcell v315.x
- drain scripts run for all jobs on the VM in parallel
  - (waits for all drain scripts to finish)
  - does not time out
- monit stop is called for each process
  - times out after 5 minutes as of bosh v258+ on 3302+ stemcells
  - if bpm is used, it will send a SIGTERM, wait for 15 seconds for the process to stop gracefully, and if necessary send a SIGQUIT, wait for 2 seconds, and finally send a SIGKILL if anything still lives
- post-stop scripts run for all jobs on the VM in parallel
  - (waits for all post-stop scripts to finish)
  - does not time out
  - requires bosh v265+
- Persistent disks are unmounted on the VM, if configured
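For jobs that do not use bpm, a stop program can imitate the SIGTERM, SIGQUIT, SIGKILL escalation described above. The sketch below mirrors that sequence and its 15 second and 2 second waits. It is not bpm's actual code, and stop_with_escalation and wait_for_exit are hypothetical helper names.

```shell
#!/usr/bin/env bash

# wait_for_exit PID SECONDS
# Polls about once per second; returns 0 as soon as PID no longer
# exists, or 1 if it is still present after SECONDS.
wait_for_exit() {
  local pid remaining
  pid="$1"
  remaining="$2"
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$remaining" -le 0 ]; then
      return 1
    fi
    sleep 1
    remaining=$(( remaining - 1 ))
  done
  return 0
}

# stop_with_escalation PID [TERM_WAIT] [QUIT_WAIT]
# Sends SIGTERM and waits TERM_WAIT seconds (default 15), then SIGQUIT
# and waits QUIT_WAIT seconds (default 2), then SIGKILL as a last resort.
stop_with_escalation() {
  local pid term_wait quit_wait
  pid="$1"
  term_wait="${2:-15}"
  quit_wait="${3:-2}"
  kill -s TERM "$pid" 2>/dev/null || return 0   # process already gone
  wait_for_exit "$pid" "$term_wait" && return 0
  kill -s QUIT "$pid" 2>/dev/null || return 0
  wait_for_exit "$pid" "$quit_wait" && return 0
  kill -s KILL "$pid" 2>/dev/null || true
  return 0
}
```

Keeping the total wait well under the 5 minute monit stop timeout leaves room for the process to be killed before that timeout is reached.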
Non-Bosh VM Operations
Any deployed VM may be rebooted due to infrastructure disruptions or other operations. In general, the deployment lifecycle hooks are not executed. Only local monitoring is invoked to restart jobs.
- The VM reboot occurs, and VM is successfully booted. OS processes and services start.
- monit starts running
- monit begins starting processes registered. The job's start program is executed as per the monitrc file.
Note
pre-start, post-start, post-deploy are not executed, since the bosh lifecycle is not invoked. It is recommended that a job's monitrc start program perform all operations required to start a job without depending on pre-start executing.
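One way to follow this recommendation is to have the start program create everything it needs before launching the process. The sketch below assumes a shell wrapper invoked by the monitrc start program; start_job, its arguments, and the paths in the usage comment are hypothetical and for illustration only.

```shell
#!/usr/bin/env bash

# start_job RUN_DIR LOG_DIR NAME COMMAND [ARGS...]
# Hypothetical start wrapper: prepares its own directories instead of
# relying on pre-start, launches COMMAND in the background, and records
# the PID of the launched process in RUN_DIR/NAME.pid.
start_job() {
  local run_dir log_dir name
  run_dir="$1"
  log_dir="$2"
  name="$3"
  shift 3
  # Idempotent setup: safe whether or not pre-start has ever run.
  mkdir -p "$run_dir" "$log_dir"
  "$@" >> "$log_dir/$name.stdout.log" 2>> "$log_dir/$name.stderr.log" &
  # Write the PID of the process that was actually started.
  echo "$!" > "$run_dir/$name.pid"
}

# Example (illustrative paths and command):
#   start_job /var/vcap/sys/run/myjob /var/vcap/sys/log/myjob myjob \
#     /var/vcap/packages/myjob/bin/server
```

Because the setup is idempotent, the same wrapper behaves identically during a normal deployment and after an unplanned reboot.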