Stack upgrades & ProcessSync: What happens if apps use a stack the platform no longer supports?
braa edited this page May 26, 2020 · 9 revisions
If I upgrade CC to a version that drops support for a stack, but some of my running apps still use that stack...
- Diego will be very careful not to cause unexpected app downtime.
- Those newly invalid, old-stack apps will continue to run and be routable, but CC will no longer be able to send updates for them to Diego.
- The system will recognize this and refuse to delete any compute resources until it can confirm that they aren't the old-stack apps that it can no longer sync.
- They continue to exist in CCDB
- They continue to exist in the BBS as Diego DesiredLRPs
- They continue to run on Diego Cells as Diego ActualLRPs (?)
- They continue to be routable (?)
- They can no longer be updated or created in Diego
- Updates and creates will result in the error `no compiler defined for requested stack`
- Any change to the process' `updated_at` will make Diego's DesiredLRP out-of-date
- The ProcessSync loop will attempt to update all out-of-date DesiredLRPs
- Updates and creates will result in the error
- Because the domain is unfresh:
- They can be deleted in the CF API, but Diego will not stop running their ActualLRPs (?)
- It continues to run
- In parallel, it continues to sync as many CC processes as possible to Diego as DesiredLRPs
- Any app with an unsupported stack will error on update if Diego's DesiredLRP is out-of-date.
- Update errors will prevent freshness from being bumped
- All errors encountered should be logged by the clock
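
The sync behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not Cloud Controller's actual code: `Process`, `push_to_diego`, `sync_processes`, the `SUPPORTED_STACKS` set, and the integer timestamps are all hypothetical stand-ins. It assumes a DesiredLRP counts as out-of-date whenever its recorded timestamp differs from the process' `updated_at`.

```python
from dataclasses import dataclass

# Hypothetical: the platform has dropped cflinuxfs2 and only supports cflinuxfs3.
SUPPORTED_STACKS = {"cflinuxfs3"}


class UnsupportedStackError(Exception):
    pass


@dataclass
class Process:
    guid: str
    stack: str
    updated_at: int  # hypothetical integer timestamp


def push_to_diego(process, desired_lrps):
    """Stand-in for creating or updating a DesiredLRP in Diego."""
    if process.stack not in SUPPORTED_STACKS:
        raise UnsupportedStackError("no compiler defined for requested stack")
    desired_lrps[process.guid] = process.updated_at


def sync_processes(processes, desired_lrps):
    """Sync every out-of-date process and report whether freshness may be bumped.

    desired_lrps maps a process guid to the updated_at it was last synced with.
    """
    errors = []
    for process in processes:
        if desired_lrps.get(process.guid) == process.updated_at:
            continue  # DesiredLRP is already up to date, nothing to send
        try:
            push_to_diego(process, desired_lrps)
        except UnsupportedStackError as err:
            # Record the failure and keep syncing the remaining processes.
            errors.append((process.guid, str(err)))
    # Any error during the sync means freshness is not bumped.
    bump_freshness = not errors
    return bump_freshness, errors
```

In this sketch an old-stack process only causes an error once its `updated_at` no longer matches its DesiredLRP. Until then the loop skips it and freshness can still be bumped.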
- see the BBS documentation for domain freshness
- tldr
- No destructive action will be taken against LRPs in that domain
- Processes with unsupported stacks will continue to run (unless Diego has dropped them during evacuation?)
- Processes that have been deleted in CC but exist in Diego will continue to run
- Creates and updates of processes will continue to work fine
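
As a rough illustration of what "no destructive action" means for an unfresh domain, the sketch below picks which ActualLRPs may be stopped. It is a simplified model in Python, not the BBS implementation: the function `lrps_safe_to_stop`, its arguments, and the domain names in the usage example are assumptions made for illustration only.

```python
def lrps_safe_to_stop(actual_lrps, desired_guids, fresh_domains):
    """Return guids of ActualLRPs that have no DesiredLRP and may be stopped.

    actual_lrps:   dict mapping process guid -> domain name
    desired_guids: set of guids that still have a DesiredLRP
    fresh_domains: set of domain names currently considered fresh
    """
    return sorted(
        guid
        for guid, domain in actual_lrps.items()
        # An LRP without a DesiredLRP is only stopped if its domain is fresh.
        if guid not in desired_guids and domain in fresh_domains
    )
```

For example, with `actual_lrps = {"deleted-app": "cf-apps", "live-app": "cf-apps"}` and `desired_guids = {"live-app"}`, the function returns an empty list while `"cf-apps"` is missing from `fresh_domains`, so `deleted-app` keeps running. Once the domain is fresh again it returns `["deleted-app"]`.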
- tldr
- They can be created, updated, scaled, etc
- Because the domain is unfresh:
- They can be deleted in the CF API
- BUT Diego will not stop running their ActualLRPs
- They cannot be deleted in the CF API
- Because the domain is unfresh:
- Diego will not stop running their ActualLRPs
- Is this the best we can do to handle this class of failure?
- Should we tolerate unknown stack errors for bumping freshness?
- What does Diego do if you're evacuating the last `cflinuxfs2` cells?
- Do the apps stop running?
- Does the deployment error?
- If the apps stop running and the mitigation here is to STOP them in CCDB, would it be better to bump freshness if the only errors during sync are about unknown stacks?
- October 2018: #156029607 We made uncaught errors on the clock log and `exit 1`.
- November 2018: #162064721 We made most errors log, but continue to sync and refuse to bump freshness.
- November 2018: #161800100 We verified this behavior applies to apps with absent stacks.
- December 2018: A KB Article was written about recovering from this issue
- May 2020: Pivotal Slack We started seeing a rash of this in escalations, with log lines where `cc.diego.sync.processes` logged `sync-failed` and `error-updating-lrp-state`