[Docs] upgrade/chain halt recovery #837
Items to add:
@okdas Let's add the details here as well: https://x.com/olshansky/status/1846211059989778741
@Olshansk can I get a review on this PR, please? I think I want to investigate/add information about the new Cosmovisor feature mentioned in
The CI will now also run the e2e tests on devnet, which increases the time it takes to complete all CI checks. You may need to run GCP workloads (requires changing the namespace to 837)
# Use the direct download link for the latest release
LATEST_RELEASE_URL="https://github.com/pokt-network/poktroll/releases/latest/download/poktroll_linux_${ARCH}.tar.gz"
# Get the version genesis started from
POKTROLLD_VERSION=$(curl -s https://raw.githubusercontent.com/pokt-network/pocket-network-genesis/master/poktrolld/testnet-validated.init-version)
Is this always going to point to GitHub?
#PUC or add a TODO to answer the question.
I rephrased this part, but I don't fully understand what you're asking.
If you think that there's a better way, please let me know. I thought we could use the `app_version` from `genesis.json` (looks like this: `"app_version": "0.0.9"`) but decided I'd rather store this information in a separate file.
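One way to tie the two values in the snippet together is to download the release that matches the genesis version rather than the latest one. The sketch below is illustrative only: `normalize_arch` and `release_url` are hypothetical helper names, and both the `v` tag prefix and the `amd64`/`arm64` asset suffixes are assumptions that should be checked against the actual release page.

```bash
#!/usr/bin/env bash
# Sketch only: the "v" tag prefix and the arch suffixes are assumptions.

# Map `uname -m` output to the arch suffix assumed in the release asset name.
normalize_arch() {
  case "$1" in
    x86_64 | amd64) echo "amd64" ;;
    aarch64 | arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the download URL for a pinned release instead of "latest".
# Accepts the version with or without a leading "v".
release_url() {
  local version="${1#v}"
  local arch="$2"
  echo "https://github.com/pokt-network/poktroll/releases/download/v${version}/poktroll_linux_${arch}.tar.gz"
}

# Usage (requires network access):
#   POKTROLLD_VERSION=$(curl -s https://raw.githubusercontent.com/pokt-network/pocket-network-genesis/master/poktrolld/testnet-validated.init-version)
#   ARCH=$(normalize_arch "$(uname -m)")
#   curl -L "$(release_url "$POKTROLLD_VERSION" "$ARCH")" -o poktroll.tar.gz
```

Keeping the version in a separate file, as the reply above prefers, means the script never has to parse `genesis.json`.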
Read more about [upgrade contingency plans](../../protocol/upgrades/contigency_plans.md).

### Manual binary replacement (preferred)
Please update this section w/ links to the binaries for easier access.
There are no binaries to point to. Rephrased.
Instead, we need **social consensus** to manually replace the binary and get the chain moving.

Currently this involves synching the network from genesis breaking a way to sync the network from genesis without human interaction, but there are some plans to make the process less painful in the future.
I don't fully understand what you're trying to say with this sentence. #PUC
docusaurus/docs/develop/developer_guide/recovery_from_chain_halt.md
In such a case, we need to:

- Roll back validators to the backup (a snapshot is taken by `cosmovisor` automatically prior to upgrade, if `UNSAFE_SKIP_BACKUP` is set to `false`).
- Skip the upgrade handler and store migrations with `--unsafe-skip-upgrade=$upgradeHeightNumber`.
Please show full command (or link to script).
I don't know what this means
Just like with `--halt-height=`, there's no command that will work for all use cases. I'll add an example, but it's not very usable.
#PUC
- Bullet points
- Consider making it a separate section below
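As a rough illustration of the "full command" requested in this thread, the sketch below assembles a start command that skips the upgrade at a given height. It is not a command that fits every setup: `build_skip_upgrade_cmd` is a hypothetical helper, and the binary name, home directory and height are placeholders. Note also that, to my knowledge, the Cosmos SDK `start` command spells this flag `--unsafe-skip-upgrades` (plural), while the diff above uses the singular form; please confirm with `poktrolld start --help` before relying on either.

```bash
#!/usr/bin/env bash
# Sketch only: binary, home directory and height are placeholders.
# Assumption: the flag is `--unsafe-skip-upgrades` (plural), as in the Cosmos SDK.

# Print the start command that skips the upgrade handler at the given height.
build_skip_upgrade_cmd() {
  local binary="$1"
  local home_dir="$2"
  local upgrade_height="$3"

  # Reject empty or non-numeric heights before building the command.
  case "$upgrade_height" in
    '' | *[!0-9]*)
      echo "upgrade height must be a non-negative integer, got: '${upgrade_height}'" >&2
      return 1
      ;;
  esac

  echo "${binary} start --home=${home_dir} --unsafe-skip-upgrades=${upgrade_height}"
}

# Usage: print the command, review it, then run it by hand.
#   build_skip_upgrade_cmd poktrolld "$HOME/.poktroll" 12345
```

Printing the command instead of executing it keeps the operator in the loop, which seems appropriate for a recovery step that depends on social consensus.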
- The old binary should be compiled to work before the upgrade.
- The new binary should contain the upgrade logic to be executed immediately after the node is started using the new binary.
9. Wait until the height is reached and the old node dies due to the error: `ERR UPGRADE "v0.0.9-2" NEEDED at height`, which is expected.
I did a little bit of partial cleanup, but can you do another look through this and make it so copy-pastable that any idiot can follow?
For example, instead of "from the new version", just add the appropriate `cd ../...` in there.
How do I know what the new version's path is on the developer's system? It might be the primary `poktroll` repo on the developer machine, it might not be. ¯\_(ツ)_/¯
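To make step 9 easier to follow, the expected halt can be detected in the node's log output rather than by eye. The sketch below extracts the upgrade name from a line shaped like the error quoted in the diff. `upgrade_needed_name` is a hypothetical helper, and the log file name in the usage comment is a placeholder.

```bash
#!/usr/bin/env bash
# Sketch only: matches the message shape quoted in step 9 of the diff.

# Read log lines on stdin and print the upgrade name for every
# `UPGRADE "<name>" NEEDED at height` line; print nothing otherwise.
upgrade_needed_name() {
  sed -n -E 's/.*UPGRADE "([^"]+)" NEEDED at height.*/\1/p'
}

# Usage (log file name is a placeholder):
#   poktrolld start 2>&1 | tee node.log
#   upgrade_needed_name < node.log
```

If this prints the upgrade name you expect, the old node stopped for the intended reason and the new binary can be started.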
- The old binary should be compiled to work before the upgrade.
- The new binary should contain the upgrade logic to be executed immediately after the node is started using the new binary.
9. Wait until the height is reached and the old node dies due to the error: `ERR UPGRADE "v0.0.9-2" NEEDED at height`, which is expected.
How do I observe the height?
It's all on the screen when you run the command
#PUC? Screenshot? Video?
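Besides reading the node's console output, the current height can also be polled from the node's RPC `/status` endpoint. The sketch below parses `latest_block_height` out of that JSON without requiring `jq`. `latest_height` is a hypothetical helper, and `localhost:26657` is the usual default RPC address, assumed here rather than taken from the PR.

```bash
#!/usr/bin/env bash
# Sketch only: assumes the RPC /status response contains
# "latest_block_height":"<number>" (heights are returned as strings).

# Read /status JSON on stdin and print the latest block height.
latest_height() {
  grep -o '"latest_block_height" *: *"[0-9]*"' | grep -o '[0-9][0-9]*'
}

# Usage (requires a running node; address is an assumption):
#   curl -s http://localhost:26657/status | latest_height
#   # or poll every few seconds:
#   while true; do curl -s http://localhost:26657/status | latest_height; sleep 5; done
```

A height that stops increasing across several polls is a simple signal that the chain has halted.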
@okdas Friendly reminder on this when you're back. Making it easier to inspect/find/identify chain halts on testnet would be great as well! |
Summary
Performed the first upgrade on the Alpha TestNet. This PR adds documentation changes to help prevent similar issues in the future.
Issue
N/A
Type of change
Select one or more from the following:
- `consensus-breaking` label if so. See [Infra] Automatically add the `consensus-breaking` label #791 for details
Testing
- `make docusaurus_start`; only needed if you make doc changes
- `make go_develop_and_test`
- `make test_e2e`
- `devnet-test-e2e` label to the PR.
Sanity Checklist