Securely upgradeable work contracts are important for protecting stakers' tokens. The current design for dividing work contracts into operator contract and service contract components permits secure upgrades of operator contracts without interfering with service functionality. To gain the theoretical benefits from this scheme, it is necessary to specify how contract upgrades should be authorized, and by whom.
In RFC 9: Upgrading contracts by separate components, the division between operator and service contracts is established. Apart from the existing consensus that stakers should authorize each operator contract individually, there is no overarching design for secure contract upgrades yet.
To achieve secure upgrades, this RFC proposes a scheme where the upgrade process is centered on operator contracts, which tie all components of the network together.
Operator contracts are subject to approval globally on a registry, in their respective service contracts, and individually from each staker in the staking contract. This provides multilayered protection against a variety of different risks.
Other contracts such as service contracts and staking contracts are hard-coded in operator contracts, so attackers are prevented from inserting arbitrary contracts of any type without needing to have separate approval processes for different contract types.
Different keys are used in different parts of the upgrade process. The roles of global Registry Keeper and Panic Button, service contract-specific Operator Contract Upgraders, and staker-specific Authorizers, are used to divide the responsibilities and to provide resiliency against compromise of individual keys. A Governance scheme can be used to rekey the roles to recover from non-catastrophic compromise.
The multilayered approach permits upgrading staking contracts, or even the token itself, without disrupting user experience on the service contract.
The main goals are to provide a smooth upgrade path for future functionality, limiting the impact of single-key compromise, and protecting stakers' funds from ever being subject to code they haven’t individually authorized.
Governance is the final arbiter of authority in the Keep Network. While exact details of the implementation of Governance are out of scope for this RFC, the role of Governance in the upgrade process is to enable recovery from key compromise by rekeying other roles.
Governance has the authority to change the addresses of the Registry Keeper, Panic Button, and the service contracts' Operator Contract Upgraders.
The rekeying process should prioritize security over convenience to the greatest feasible extent, being unsuitable for ordinary upgrades but permitting recovery from partial system compromise under extraordinary circumstances.
The Registry Keeper maintains the global registry of approved operator contracts. Each operator contract must be approved by the Registry Keeper before it can be authorized by a staker or used by a service contract.
The Registry Keeper can be rekeyed by Governance.
The Panic Button can disable malicious or malfunctioning contracts that have been previously approved by the Registry Keeper. When a contract is disabled by the Panic Button, its status on the registry changes to reflect this, and it becomes ineligible to penalize stakers. Contracts disabled by the Panic Button can not be reactivated.
The Panic Button can be rekeyed by Governance.
Each service contract has a Operator Contract Upgrader whose purpose is to manage operator contracts for that specific service contract. The Operator Contract Upgrader can add new operator contracts to the service contract’s operator contract list, and deprecate old ones.
The Operator Contract Upgraders can be rekeyed by Governance.
Each staker has an Authorizer whose purpose is to authorize operator contracts the staker wishes to operate on. Authorized operator contracts are permitted to impact the staker’s staked tokens via punishments or rewards.
The Authorizer cannot be rekeyed except by undelegating and redelegating.
A global registry is used to keep track of all operator contracts, ensuring that only Keep Org-approved operator contracts are used, and enabling the panic button to disable operator contracts in an emergency.
The registry contains all operator contracts of the Keep network.
Each contract’s status may be NULL
, APPROVED
or DISABLED
.
A status of NULL
is the default
and means that the operator contract has not been approved by the Registry Keeper.
When the Registry Keeper approves a operator contract,
its status switches to APPROVED
in the registry.
Approved operator contracts can be authorized by stakers to impact their stakes,
and service contracts may utilize them.
The Panic Button can be used
to set the status of an APPROVED
contract to DISABLED
.
Operator Contracts disabled with the Panic Button cannot be re-enabled.
Operator contracts serve other contracts, and are responsible for all operations involving staked tokens.
The requirement for stakers' funds to never be subject to code they haven’t approved individually has certain implications on operator contracts: the contracts must be immutable; and any external sources of truth, e.g. token contract and staking contract/s, must be hard-coded.
In the current design, operator contracts authorized to punish the staker can never be deauthorized. Even if authorization was revokable, the stakers should not be expected to react promptly to malicious changes to operator contracts.
An operator contract can either provide services to any contract that makes a valid request and pays the correct fee, or it can be owned is owned by a specific contract and only serve its owner.
- Recognized staking contracts
-
Each operator contract specifies one or more staking contracts it recognizes. Every operator contract that handles stakes must recognize at least one staking contract. Recognized staking contracts are hard-coded and unchangeable.
Service contracts provide services without directly interacting with operators, but using a set of operator contracts in some way. Service Contracts don’t need to be aware of tokens or staking in any way; these functions are entirely intermediated by the operator contracts. A service contract only needs a list of which operator contracts it uses. To permit system upgrades, the list of used operator contracts can be updated with proper authorization.
- Used operator contracts
-
A mutable list of operator contracts used by the service contract.
Each service contract has a list of zero or more operator contracts it may use.
A service contract is deployed with zero operator contracts, rendering the service contract inactive until at least one operator contract is activated.
Each service contract has a Operator Contract Upgrader
who can add used operator contracts.
To add a used operator contract,
the operator contract must have been APPROVED
on the registry.
If a operator contract has been DISABLED
by the Panic Button,
it is ineligible for work selection.
This must be checked when the service contract selects an operator contract.
Staking contracts hold staked tokens and enforce staking rules. They must permit authorized operator contracts to slash the stakes of misbehaving operators, but stakers must be protected from code they haven’t authorized individually.
For this purpose, each staking contract maintains a list of operator contracts that have been authorized by each staker’s Authorizer. The list of operator contracts could also be maintained globally, removing the need for entry duplication when stakers on different staking contracts have the same Authorizer and operate on the same operator contract. However, maintaining the authorizations locally may be cheaper than cross-contract calls, and the scenario where gas would be saved is likely to be rare.
(If fully backed operation is used, it may not be necessary to have separate authorizations as stakes are explicitly allocated for each operator contract.)
Staking contracts are also aware of the token contract by necessity.
The authorized operator contracts are a mapping
of (address authorizer, address operator_contract) → status
.
The status of a contract may be either NULL
or AUTHORIZED
.
A status of NULL
is the default
and means the operator contract is not authorized.
A status of AUTHORIZED
means that the operator contract
may affect the stakes of those stakers
who have assigned that authorizer
as their Authorizer.
To authorize a operator contract on a staking contract, the following conditions must apply:
-
the operator contract has been
APPROVED
on the registry -
the operator contract recognizes the staking contract
Once a operator contract has been authorized,
authorization cannot be withdrawn by the staker.
However, a operator contract that has been DISABLED
by the Panic Button
may not punish stakers.
-
Deploy the new operator contract
-
Approve the operator contract on the registry
-
Wait for stakers to authorize the operator contract
-
Activate the operator contract on the relevant service contract/s
-
Deploy the new service contract
-
Deploy a new operator contract serving the new service contract
-
Approve the operator contract on the registry
-
Wait for stakers to authorize the operator contract
-
Activate the operator contract on the service contract
-
Deploy the new staking contract
-
Deploy new operator contracts recognizing the new staking contract
-
Approve the operator contracts on the registry
-
Wait for stakers to migrate to the new staking contract
-
Wait for stakers to authorize the new operator contracts
-
Activate the operator contracts on the service contracts
The upgrade process makes it possible to even hard-fork the token without disrupting service contract user experience:
-
Deploy the new token contract
-
Deploy a migration contract that lets holders convert old tokens to new tokens
-
Deploy a new staking contract for the new tokens
-
Deploy new operator contracts recognizing the new token and staking contract
-
Approve the operator contracts on the registry
-
Wait for stakers to convert their tokens, stake on the new contract and authorize the new operator contracts
-
Activate the operator contracts on the service contracts
A compromised Registry Keeper can approve arbitrary operator contracts. Because using those operator contracts for a service contract requires the service contract’s Operator Contract Upgrader as well, the impact is limited to stakers being able to instantly unstake by authorizing a malicious operator contract which slashes their stakes and sends the tokens to an address controlled by the staker.
A compromised Panic Button can disable all operator contracts and halt all network services. Recovery is impossible until Governance has rekeyed the Panic Button.
This is inevitable due to the functionality of the Panic Button, but the impact could be mitigated by setting a cap on how many times the Panic Button can be invoked within a particular timeframe. However, such a cap would be overwhelmed by a mass approval of malicious contracts by the other roles.
A compromised Operator Contract Upgrader can activate arbitrary operator contracts within the strict constraints of the upgrade process. Without compromise of the Registry Keeper to approve new malicious operator contracts, it is unlikely that a compromised Operator Contract Upgrader alone would have significant impact on the network.
If only the Authorizer of some staker is compromised, the attacker can authorize operator contracts that have been approved by the Registry Keeper, and that recognize the contract that staker stakes on.
This has a very limited negative impact unless the Registry Keeper has approved a faulty or malicious operator contract.
If a malicious operator contract can get globally approved, the impacted service contract can be completely subverted by deprecating all other operator contracts and returning malicious values. While already existing operations should finish normally, the service contract can be rendered effectively useless for new requests.
Service contracts could have upgradeable components for performing various sub-tasks. These components could be upgraded with a process similar to that of operator contracts except without staker involvement.
Keep factories are operator contracts that create keeps for customer applications.
Like all operator contracts, each Keep factory recognizes one or more staking contracts for the purpose of determining operators' eligibility to join keeps.
Each keep factory implements one or more keep interfaces. The factory records its interfaces and the addresses of the corresponding keep vendors.
Keeps are operator contracts created by keep factories. When a contract requests a keep from a factory, the factory creates a new contract owned by the customer contract, the keep, and hands it off to the customer contract.
Keeps aren’t individually authorized to slash stakers. Instead, they have to use the authorization of their creator factory.
Once created, a keep cannot be upgraded in any way, except by closing the keep and opening another one.
Keep vendors are service contracts which perform version management of keep factories. Keep vendors provide customers a single unified interface to request up-to-date keeps.
The upgrade process of the Keep Network is designed to eliminate the security threat posed by unilateral smart contract upgrades. However, the consent-centered upgrade process is inherently more complex to accommodate than a simple switchover to a new version. Stakers will authorize a new contract and operators will upgrade their client software on their own schedule, so the initial capacity of a new keep version will be seriously limited.
Instead of updating the factory address when a type of keep is upgraded, and explicitly accommodating for the friction in the migration, a customer application can go through the vendor of the corresponding keep type to receive a recent version of the keep. For most applications, the convenience of having the version migration managed automatically by the keep vendor is likely to be more significant than the slight security impact.
Some threats may be mitigated by allowing or requiring routine rekeying of the upgrade roles using the upgrade roles' own keys instead of relying on governance. This has not been investigated yet. Alternatively, each role could have a backup key in cold storage, usable as the first-line rekeying option.
The governance process for recovery from key compromise is left open. Involving a significant fraction of stakers (e.g. 33-50%) has the attractive property that an adversary capable of subverting the governance process would necessarily be powerful enough to subvert the honest majority assumption in individual Keeps. This means that rekeying is robust against attacks unless the network as a whole is compromised.
It is not immediately clear whether service contracts should completely block operator contracts disabled with the panic button, or only deprecate them without regard for the normal limitations.
Rate-limiting the Panic Button can help prevent total DoS if the panic button is ever compromised, but also permits flooding the system with malicious operator contracts unless the Registry Keeper is similarly rate-limited.