Add a document describing key revocation #47
Conversation
This document may eventually be part of the key management requirements. It describes a few common mechanisms for key revocation. Signed-off-by: Marina Moore <[email protected]>
key-revocation.md
Outdated
|
||
One of the goals of Notary v2 is to build in solutions for key revocation that are easy to use and ensure that users will always use non-compromised keys. This document discusses some potential mechanisms for key revocation. | ||
|
||
In existing systems, there are three main approaches to key revocation: automatic revocation through key expiration, key revocation lists, and distribution of trusted keys. I discuss some of the benefits and pitfalls of each of these techniques, and how some of these techniques are combined to provide a wholistic approach to key revocation in TUF. |
In existing systems, there are three main approaches to key revocation: automatic revocation through key expiration, key revocation lists, and distribution of trusted keys. I discuss some of the benefits and pitfalls of each of these techniques, and how some of these techniques are combined to provide a wholistic approach to key revocation in TUF. | |
In existing systems, there are three main approaches to key revocation: automatic revocation through key expiration, key revocation lists, and distribution of trusted keys. I discuss some of the benefits and pitfalls of each of these techniques, and how some of these techniques are combined to provide a holistic approach to key revocation in TUF. |
key-revocation.md
Outdated
|
||
## Distribution of trusted keys | ||
|
||
Instead of distributing untrusted keys, this method distributes a list of currently trusted keys. If a key needs to be revoked, it is removed from the list of trusted keys. This technique as the added benefit of ensuring that users have access to the new trusted key as soon as they learn of a revocation. |
Instead of distributing untrusted keys, this method distributes a list of currently trusted keys. If a key needs to be revoked, it is removed from the list of trusted keys. This technique as the added benefit of ensuring that users have access to the new trusted key as soon as they learn of a revocation. | |
Instead of distributing untrusted keys, this method distributes a list of currently trusted keys. If a key needs to be revoked, it is removed from the list of trusted keys. This technique has the added benefit of ensuring that users have access to the new trusted key as soon as they learn of a revocation. |
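As a minimal sketch of the trusted-key-list approach described in this hunk: revoking a key and distributing its replacement happen in the same update. The `TrustedKeyList` class, its fields, and the key IDs are hypothetical names for illustration, not part of any Notary v2 design, and signing of the list itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class TrustedKeyList:
    """Hypothetical allow list: a key is trusted only while it appears here."""

    version: int = 0
    key_ids: set = field(default_factory=set)

    def rotate(self, revoked: str, replacement: str) -> None:
        # Removing the old key is the revocation; adding the new key in the
        # same update means clients learn both facts at once.
        self.key_ids.discard(revoked)
        self.key_ids.add(replacement)
        self.version += 1

    def is_trusted(self, key_id: str) -> bool:
        return key_id in self.key_ids


trusted = TrustedKeyList(version=1, key_ids={"key-a", "key-b"})
trusted.rotate(revoked="key-a", replacement="key-c")
```

A client that fetches version 2 of this list stops trusting `key-a` and starts trusting `key-c` in one step.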
Each of these options has pros and cons. Thinking through them:
Key Expiration: this has the advantage of being automatically enforced, even for disconnected environments. I'd question whether this means everything the key has previously signed would need to be re-signed. I'm fairly certain the answer is yes (otherwise an attacker could sign a malicious image for 10 years after they breach a key that only has 2 hours left until the certificate on the key expires). That would result in lots of re-signing of old images for every key rotation. Perhaps that could be made easier by having a single signature for a list of images (digests), rather than a separate signature for each individual image.
Revocation lists: while it's convenient that this has an immediate effect when the revocation is published, I see multiple downsides. In disconnected environments, that query may fail, or it may be sent to a mirror server that the client in that disconnected environment is told to trust. A stale mirror in the disconnected environment could be used to send malicious images, though in those cases the client is intentionally choosing to trust a mirror that shouldn't have been trusted. The more concerning scenario to me is devices with access to the public internet using the upstream revocation list. What do we do when access to that revocation list goes down? Do we fail insecure and potentially allow a vulnerability, or fail secure and cause an outage? Last year's Apple scenario showed we can have the worst of both, where the revocation server could be extremely slow before eventually timing out on the response.
For the TUF scenario, I think we want to explore what it would look like for the root key to eventually expire (with a relatively long lifetime). And with short lifetimes on the timestamp signing, what does that look like for mirrors and popular registries that want to push as much as possible out to CDNs?
And it's bigger than the scope of this document, but we also need to explore what key distribution looks like with v2. If we are avoiding TOFU by having clients explicitly trust a root key for the organization, how is that root key first deployed, how does it get rotated, and if we do in-band rotation, can we trust a chain of rotated root keys that leads back to one or more now-expired keys?
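The fail-insecure versus fail-secure question raised above can be sketched as an explicit client policy. This is only an illustration under stated assumptions: `key_is_acceptable`, `fetch_revoked`, the cache tuple, and the `fail_closed` flag are hypothetical names, and the revocation list is assumed to have been authenticated elsewhere.

```python
import time
from typing import Callable, Optional, Set, Tuple


def key_is_acceptable(
    key_id: str,
    fetch_revoked: Callable[[], Set[str]],
    cached: Optional[Tuple[float, Set[str]]],  # (fetched_at, revoked key IDs)
    max_cache_age: float,
    fail_closed: bool,
    now: Optional[float] = None,
) -> bool:
    """Check a key against a revocation list that may be unreachable."""
    now = time.time() if now is None else now
    try:
        revoked = fetch_revoked()
    except OSError:  # includes TimeoutError and ConnectionError
        if cached is not None and now - cached[0] <= max_cache_age:
            # A sufficiently fresh cached copy (e.g. from a mirror) is used.
            revoked = cached[1]
        else:
            # No usable list: fail secure (reject) or fail insecure (accept).
            return not fail_closed
    return key_id not in revoked


def unreachable() -> Set[str]:
    # Stand-in for a revocation server that is down.
    raise TimeoutError("revocation server unreachable")


outage_result = key_is_acceptable("key-a", unreachable, None, 3600.0, fail_closed=True, now=1000.0)
permissive_result = key_is_acceptable("key-a", unreachable, None, 3600.0, fail_closed=False, now=1000.0)
```

With `fail_closed=True` an unreachable server causes the key to be rejected (an outage); with `fail_closed=False` the key is accepted even though it may have been revoked.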
key-revocation.md
Outdated
|
||
However, the user must be able to ensure that the key revocation list is accurate and up to date. If an attacker is able to replay an old revocation list, the user may continue to trust compromised keys. Therefore the distribution of the key revocation list must allow the user to verify authenticity and timeliness. | ||
|
||
Also, for security reasons, keys cannot be removed from a key revocation list, so the list will grow larger and larger over time and may eventually have a noticeable bandwidth impact. |
To mitigate the risk of an ever-growing revocation list, the two approaches can be combined: give revocable keys a longish expiration time, then remove a revoked key from the list once it eventually expires.
I could also see a use case for a middle ground, where there are several types of keys used by an organization: one to identify "this content was produced by this organization" and another that claims "we believe this content is currently secure". That would allow Docker to use a longer-lifetime key on something like a 4-year-old ubuntu image that we know has vulnerabilities, but that we still want to know Docker signed. And since that image isn't changing, it could go through a more controlled offline signing process with a longer-lived key. Meanwhile, the ubuntu image that was just updated yesterday, which we think is secure, may get signed with a key that expires in a week.
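The combination suggested above (revocable keys that also expire) can be sketched as a pruning step on the revocation list. This assumes clients strictly reject expired keys, so a revoked key no longer needs to be listed once it has expired; `prune_revocation_list`, the `grace` parameter, and the key IDs are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def prune_revocation_list(
    revoked: Dict[str, datetime],  # key ID -> that key's expiration time
    now: datetime,
    grace: timedelta = timedelta(0),
) -> Dict[str, datetime]:
    """Drop revoked keys that have expired (plus an optional safety margin)."""
    return {
        key_id: expires
        for key_id, expires in revoked.items()
        if now < expires + grace
    }


now = datetime(2021, 1, 1, tzinfo=timezone.utc)
revoked = {
    "old-key": datetime(2020, 6, 1, tzinfo=timezone.utc),
    "recent-key": datetime(2021, 6, 1, tzinfo=timezone.utc),
}
pruned = prune_revocation_list(revoked, now)
```

Here `old-key` is dropped because expiration already prevents its use, while `recent-key` stays listed until it expires.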
I haven't been looking at the problem as broadly as all of you - on purpose - but I'm strongly in favor of expiration approaches rather than revocation ones. Transparency logs + timestamps are better than revocation IMO - and can also be made compatible with most air-gapped scenarios. So, overall strong +1 to this entire document @mnm678 :) |
Signed-off-by: Marina Moore <[email protected]>
Add an initial list of pros and cons for each technique and add a few clarifications Signed-off-by: Marina Moore <[email protected]>
This LGTM! |
I started writing up some thoughts and concerns about the scaling problem of maintaining all the public and private content we expect to see, both currently and in the coming years. My hope is we can establish a baseline of scale, and then the various approaches may start to become more obvious.
To be honest those are still tiny numbers compared to other public-key crypto key management systems in the wild today. Could you be more specific on what you think will be a scaling issue here? |
@mnm678, I can see this doc being used in one of two ways:
1. If you can strip this down to general background on the scenarios, we can merge it into the repo as a good overview for the reader, regardless of implementation. As we develop the key management specs of Notary v2, we can reference this, and update it to reflect how Notary v2 solves these problems.
2. Or, we can transfer this as a discussion, capturing an opinionated view on TUF.

The difference allows us to merge content on which we haven't closed on a direction (1), vs. opinionated content on a specific solution we haven't committed to yet (2).
I think there's a different take on the challenge that TUF provides, but agree that doesn't need to be specifically called out as requiring TUF in this document. Instead I'd keep the last section but rephrase it to not specifically name TUF, but instead describe some of the qualities of an intermediate solution, one that compresses the short expiration certificates into a single signature on a collection of artifacts, which is separate from the individual signatures within that collection. |
Signed-off-by: Marina Moore <[email protected]>
Thanks @SteveLasker and @sudo-bmitch. I updated the description of TUF's approach to key revocation to more generally describe combining explicit and implicit revocation. |
Thanks @mnm678, |
I guess I wasn't clear. I re-worded the final option to explain how implicit and explicit key revocation can be used in general so that it can be discussed in the context of other approaches to key revocation. The technique of combining implicit and explicit key revocation was introduced in this paper by @JustinCappos and others and refined in this paper by @trishankatdatadog and others. It has only been used widely in TUF and related projects, so I think a mention of TUF is necessary to understand the technique and know where to look for more information. More fundamentally, when talking about key revocation the specific implementation is important, because like other security systems, it is only as secure as its weakest link. That's the benefit of using existing, well tested security mechanisms instead of attempting to build them from the ground up. |
Steve, I'm hard-pressed to see how you could discuss a comprehensive background without discussing specific implementations. That's like citing papers without naming their authors.
Is there a good reason to close this PR and open a discussion instead, other than citing specific implementations? |
For the last option, while TUF may be the only solution we know of that takes this approach, we've done such a good job keeping the other sections abstract and not listing the various implementations of each technique that naming TUF in the last section comes across as a sales pitch. Here's my own attempt to reword this:

## Combining explicit and implicit revocation

By using a hierarchical combination of keys, a trusted root key can delegate signing to various keys that expire. Additionally, artifacts may be signed by more than one key, allowing automated tooling to provide short-lived signatures that verify the signer and artifact have not been revoked. Clients then verify that the necessary collection of signatures is found on the artifact.

This method allows signers to have relatively long-lived keys, to simplify their workflow and avoid needing to re-sign the artifacts themselves, while enabling timely revocation of the signer key or a single artifact signature.

For efficiency, a meta-artifact can be created and maintained, containing references to a collection of currently signed artifacts. The short-lived signature can then be created for this single artifact, rather than for every artifact individually.

Pros:
Cons:
|
Thanks @sudo-bmitch. I updated the pr. |
key-revocation.md
Outdated
|
||
## Key Expiration | ||
|
||
Adding an expiration time to every key allows keys to automatically be revoked after a certain period of time. The expiration time is usually included with the key so that it is easy for users to find. This technique does not require any action from the key holder, and ensures that users will have to refresh their trusted keys before those keys expire. |
It might help to clarify that the key expiry (metadata) needs to be signed by an issuing key that the client trusts.
key-revocation.md
Outdated
|
||
Cons: | ||
* Keys can't be revoked before expiration | ||
* Artifacts must be re-signed after expiration |
Timestamping is an option that allows use of signed artifacts after the key expires.
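A rough sketch of how timestamping changes the check: the verifier compares the time of signing, rather than the current time, against the key's validity window. This assumes `signed_at` comes from a countersignature by a timestamp authority the client trusts (not from the signer's own claim, which an attacker holding the key could backdate). The function name and parameters are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def accept_timestamped_signature(
    signed_at: datetime,       # assumed to come from a trusted timestamp
    key_not_before: datetime,
    key_not_after: datetime,
    revoked_at: Optional[datetime] = None,
) -> bool:
    """Accept a signature made while the key was valid, even if it has since expired."""
    if not (key_not_before <= signed_at <= key_not_after):
        return False
    # Signatures made at or after a known revocation time are rejected.
    if revoked_at is not None and signed_at >= revoked_at:
        return False
    return True


not_before = datetime(2020, 1, 1, tzinfo=timezone.utc)
not_after = datetime(2021, 1, 1, tzinfo=timezone.utc)
signed_in_window = datetime(2020, 6, 1, tzinfo=timezone.utc)
signed_after_expiry = datetime(2021, 3, 1, tzinfo=timezone.utc)
```

Because the current time never enters the check, an artifact signed in June 2020 remains acceptable after the key expires in January 2021, without re-signing.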
* Artifacts must be re-signed after expiration | ||
|
||
|
||
## Key revocation lists (Deny lists) |
This section and the next are similar to Allowlist and Denylist in Repository; the challenge is synchronizing deny lists in multi-registry scenarios. Also, this doc doesn't cover artifact-level revocation or specify where allow/deny lists will be stored.
I added a note about synchronization. I purposefully left out artifact-level revocation in this document for now, but we can combine those discussions in a later draft.
threatmodel.md
Outdated
@@ -10,3 +10,10 @@ It is assumed that an attacker may perform one or more the following actions: | |||
|
|||
While it is not always possible to protect against all scenarios, the system should to the extent possible mitigate and/or reduce the damage caused by a successful attack, detect the occurrence of an attack and notify appropriate parties, yet remain usable for parties operating the system. Furthermore, the system should recover from successful attacks in a way that presents low operational overhead and risk to users. | |||
|
|||
Attacker Goals: |
Should these be reworded to use less generic terminology and target artifact registries and consumers? Also, is this intended to be an initial version? I think the final threat model and analysis would be more detailed.
It looks like these are duplicated in #35, and they are a bit out of scope for this pr, so I'll remove them here and try to address your comment there
Thanks for the review @gokarnm! I updated the document and responded inline to a couple of comments. |
@sudo-bmitch @gokarnm I updated the intro as discussed in the meeting. Can I get a review/approval to merge? |
Just noticed the DCO validation error. Not sure if that will block the ability to merge. Still LGTM and hope we can merge and iterate forward. Thanks for driving this @mnm678 ! |
@mnm678 LGTM! |
Can we add a con to the meta-artifact section addressing the multi-registry challenge of moving individual artifacts?
|
||
However, the user must be able to ensure that the key revocation list is accurate and up to date. If an attacker is able to replay an old revocation list, or show different versions to different registries, the user may continue to trust compromised keys. Therefore the distribution of the key revocation list must allow the user to verify authenticity and timeliness. | ||
|
||
Also, for security reasons, keys cannot be removed from a key revocation list, so the list will grow larger and larger over time and may eventually have a noticeable bandwidth impact, although this can be mitigated by combining key revocation lists with keys that expire. |
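The timeliness requirement in this hunk can be sketched as a client-side check on each new revocation list: reject lists that have passed their own expiry and lists older than one already seen. This assumes the list's signature has already been verified and that the list carries a version number and an expiry; the `RevocationList` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class RevocationList:
    version: int                # increases with every published update
    expires: datetime           # the list itself is only valid until this time
    revoked: FrozenSet[str]     # revoked key IDs


def accept_update(
    current: Optional[RevocationList],
    candidate: RevocationList,
    now: datetime,
) -> bool:
    """Reject stale or rolled-back revocation lists (signature checked elsewhere)."""
    if candidate.expires <= now:
        return False  # stale: a replayed old list eventually fails this check
    if current is not None and candidate.version < current.version:
        return False  # rollback: older than a list this client already trusts
    return True


now = datetime(2021, 1, 1, tzinfo=timezone.utc)
current = RevocationList(5, datetime(2021, 2, 1, tzinfo=timezone.utc), frozenset({"key-a"}))
replayed = RevocationList(4, datetime(2021, 3, 1, tzinfo=timezone.utc), frozenset())
expired = RevocationList(6, datetime(2020, 12, 1, tzinfo=timezone.utc), frozenset({"key-a"}))
fresh = RevocationList(6, datetime(2021, 2, 1, tzinfo=timezone.utc), frozenset({"key-a", "key-b"}))
```

A shorter list lifetime narrows the window in which a replayed list is accepted, at the cost of more frequent re-signing and fetching.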
Is this true, that once in a list, it's never removed? Or, can keys that are known to have expired be removed at a later date? Perhaps 50% longer than the life of the key or something. It does seem like a non-scalable solution that needs mitigation.
This could be combined with strongly enforced key expiration to allow the keys to eventually be deleted.
key-revocation.md
Outdated
|
||
This method allows signers to have relatively long lived keys, to simplify their workflow and avoid needing to resign the artifacts themselves, while enabling timely revoking of the signing key or a single artifact signature. | ||
|
||
For efficiency, a meta-artifact can be created and maintained, containing references to a collection of currently signed artifacts. And the short lived signature can be created for this single artifact, rather than every artifact individually. |
How would this interact with individual artifacts moving within and across registries?
It depends on the implementation, but the meta-artifact would have to be updated as the collection changes.
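To make the meta-artifact idea in this thread concrete, here is a minimal sketch: one short-lived signature covers a list of artifact digests, so only the meta-artifact is re-signed on each refresh. HMAC from the Python standard library is used purely as a stand-in for a real asymmetric signature, and every name here (`sign_collection`, `artifact_is_current`, the JSON layout) is hypothetical rather than part of Notary v2 or TUF.

```python
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def sign_collection(digests, key: bytes, now: datetime, lifetime: timedelta) -> dict:
    """Create a meta-artifact: a short-lived signature over a list of digests."""
    body = json.dumps(
        {"artifacts": sorted(digests), "expires": (now + lifetime).isoformat()},
        sort_keys=True,
    )
    # Stand-in for a real signature by the automated short-lived signing key.
    sig = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def artifact_is_current(artifact: bytes, meta: dict, key: bytes, now: datetime) -> bool:
    expected = hmac.new(key, meta["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, meta["sig"]):
        return False  # meta-artifact was tampered with or signed by another key
    body = json.loads(meta["body"])
    if datetime.fromisoformat(body["expires"]) <= now:
        return False  # short-lived signature has lapsed and was not refreshed
    return digest(artifact) in body["artifacts"]


key = b"demo-key"
t0 = datetime(2021, 1, 1, tzinfo=timezone.utc)
meta = sign_collection([digest(b"image-a"), digest(b"image-b")], key, t0, timedelta(days=7))
```

Revoking a single artifact means publishing a new meta-artifact without its digest. As noted in the reply above, the collection also has to be updated whenever artifacts are added, removed, or copied to a registry that maintains its own meta-artifact.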
Cons: | ||
* Requires maintenance of an automated system to refresh short lived signatures | ||
* A root key compromise requires updating all signers, clients, and signatures on the artifacts | ||
* Updating short lived signatures on a large number of artifacts may encounter scaling challenges and loses some of the caching efficiencies of content addressable storage in registries |
Can we add a con here noting that the meta-artifact collection needs to be parseable somehow, to support individual artifact movement within and across registries?
The meta-artifact is an optional efficiency feature, so I added the con to the paragraph describing the feature.
@mnm678, can you also solve the DCO issue? |
Update the combined key revocation option to remove references to TUF and more generically describe the way it allows for both explicit and implicit key revocation. Thanks to @sudo-bmitch for wording suggestions. Signed-off-by: Marina Moore <[email protected]>
Signed-off-by: Marina Moore <[email protected]>
This looks great to me, we can definitely keep iterating after merge. |
+1 on merge |
LGTM
This document lays out some of the options for key revocation that have been discussed. These options might eventually fit better as part of the key management document, but are posted separately for the sake of discussion.