-
Could you clarify what you mean by 'latest'? Is it just the datetime, or is it 'latest released'? This question also affects the component level (though it is unclear whether trustify is going down to that level), e.g. latest vs. latest released. If the latter, then 'latest released' is also bounded by product, e.g. latest released in a given product.
-
My idea of "sorted by SBOM date" was to use the "date created" which SBOMs have, independent of the upload time.
-
What about the situation where an SBOM contains two versions of the same component (a common example is libcurl), both being used (and presumably with the same datetime)?
-
Preface
The trigger for this was: #303 … Things have changed since then, but I think we still don't have a clear strategy for this.
I created two PRs (#451 and #452). Both add tests that re-ingest SBOMs. Both accept the status quo as "ok". This discussion is there to find out what we want. My proposal is to merge those PRs anyway, as creating them uncovered additional issues which get fixed by them.
While this discussion focuses on SBOMs, I think it might be valid for advisories too. But we should check.
The tests
They are all under integration_tests::sbom::reingest. I'll explain them in more detail in the next sub-sections. In general, the idea is to upload two versions of "the same" SBOM and see what happens. They also all take actual data that is out there, not artificially altered (except for one case).
quarkus
There are two versions of the same SBOM, released at different points in history. Their structure changed massively. Neither the name nor the document namespace is the same; only the describing PURL is.
This results in two different SBOMs, which can be located using the same PURL.
I think that behavior is actually ok, as the document namespace and the name did change. They don't really have much in common.
The background of those files is that at some point the tool generating the SBOMs was changed, creating fundamentally different SBOMs.
nhc
This is a variation of the quarkus test. However, both SBOMs have been created with the same (or a similar) tool version, producing the same structure. The name is the same, as is the document namespace. The latter is actually a violation of the spec. It's a known issue that will not be fixed in the foreseeable future.
The result again is two different SBOMs, as the digest of the SBOMs is different.
nhc_same
This is a variation of the nhc test. Re-ingesting the same version of the SBOM twice.
This will result in a single SBOM, as the digest matches.
nhc_same_content
This is a variation of the nhc_same test, with the exact same content and structure, but re-serialized without "pretty print".
This results in two different SBOMs, as the digest is different.
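To illustrate why re-serialization alone changes the outcome, here is a minimal sketch. It assumes the digest is a hash over the raw uploaded bytes (SHA-256 is used here as an assumption), and the document is a made-up, heavily trimmed SPDX-like example, not one of the test files.

```python
import hashlib
import json

# Hypothetical, trimmed SPDX-like document (not taken from the test data).
doc = {
    "spdxVersion": "SPDX-2.3",
    "name": "example",
    "documentNamespace": "https://example.com/sbom/1",
}

# Same content, two serializations: pretty-printed and compact.
pretty = json.dumps(doc, indent=2).encode("utf-8")
compact = json.dumps(doc, separators=(",", ":")).encode("utf-8")

# Digest over the raw bytes (SHA-256 is an assumption for this sketch).
pretty_digest = hashlib.sha256(pretty).hexdigest()
compact_digest = hashlib.sha256(compact).hexdigest()

same_content = json.loads(pretty) == json.loads(compact)
same_digest = pretty_digest == compact_digest
print(same_content, same_digest)  # True False
```

The parsed content is identical, yet a byte-level digest treats the two files as unrelated documents.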
syft_rerun
This uses the syft tool to generate an SBOM from the same container twice. Ingesting the two versions results again in two different SBOMs. The name of the SBOM is the same, however the document namespace is not. All according to the spec.
As the digest is different, we get two different SBOMs.
The inconsistencies
The outcome feels rather inconsistent to me. The specs (both SPDX and CycloneDX) say that the "document namespace" ("serialNumber" in CycloneDX) uniquely identifies an SBOM. However, we uniquely identify an SBOM by its digest.
As a result, altering a single byte triggers a new SBOM instance. And this is not about "someone" altering a single byte; the change could also come from the vendor, properly hashed and signed.
This also leads to us accepting multiple versions of the (claimed) same SBOM, with different content.
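The gap between the two identification strategies can be tabulated from the test descriptions above. This is only a sketch: the flags are transcribed from the prose of this post, not read from the test code, and `sbom_count` is a hypothetical helper.

```python
# (test name, same document namespace?, same digest?) as described in this post
cases = [
    ("quarkus", False, False),
    ("nhc", True, False),
    ("nhc_same", True, True),
    ("nhc_same_content", True, False),
    ("syft_rerun", False, False),
]


def sbom_count(same_key: bool) -> int:
    """Number of stored SBOMs after two uploads, given whether the key matched."""
    return 1 if same_key else 2


by_digest = {name: sbom_count(same_digest) for name, _, same_digest in cases}
by_namespace = {name: sbom_count(same_ns) for name, same_ns, _ in cases}

# Cases where identifying by digest and by document namespace disagree.
disagreements = [n for n in by_digest if by_digest[n] != by_namespace[n]]
print(disagreements)  # ['nhc', 'nhc_same_content']
```

In both of these cases the document claims to be the same SBOM, but the digest says otherwise.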
Proposal
First of all, I think we should not design for spec non-compliance. If SBOMs are wrong (according to the spec), then we can reject them, or claim that things might explode. But we should not try to fix content.
Second, if the unique ID of a document (document namespace or serial number) says the document is the same, then we should accept that as a fact (like we accept all the other content of that file). This should either lead to a "duplicate ID" error, or replace the existing document. Maybe we even allow both and let some admin decide how to deal with this situation.
Third, I think we need a way to get the "most recent" SBOM by some non-unique identifier. That could be the SBOM name (plus version? need to find something), and we could simply offer an endpoint which returns a list of SBOMs matching that name. Maybe we already have that. And maybe we add a convenience endpoint which only returns one (or none) sorted by SBOM date.
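The second and third points could look roughly like the following in-memory sketch. All names here (`SbomStore`, `OnDuplicate`, `DuplicateIdError`, `latest_by_name`) are hypothetical and not part of trustify; the sketch only shows the policy choice for duplicate IDs and a lookup by name sorted by the SBOM's own creation date.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, List, Optional


class OnDuplicate(Enum):
    REJECT = "reject"    # raise a "duplicate ID" error
    REPLACE = "replace"  # overwrite the stored document


class DuplicateIdError(Exception):
    pass


@dataclass
class Sbom:
    document_id: str   # SPDX document namespace or CycloneDX serialNumber
    name: str
    created: datetime  # "date created" from the SBOM itself, not the upload time


class SbomStore:
    def __init__(self, on_duplicate: OnDuplicate) -> None:
        self.on_duplicate = on_duplicate  # could be an admin-controlled setting
        self.by_id: Dict[str, Sbom] = {}

    def ingest(self, sbom: Sbom) -> None:
        # Trust the document's own ID: the same ID means the same document.
        if sbom.document_id in self.by_id and self.on_duplicate is OnDuplicate.REJECT:
            raise DuplicateIdError(sbom.document_id)
        self.by_id[sbom.document_id] = sbom

    def find_by_name(self, name: str) -> List[Sbom]:
        # All SBOMs with that (non-unique) name, most recent first.
        matches = [s for s in self.by_id.values() if s.name == name]
        return sorted(matches, key=lambda s: s.created, reverse=True)

    def latest_by_name(self, name: str) -> Optional[Sbom]:
        # Convenience lookup: one SBOM or none.
        matches = self.find_by_name(name)
        return matches[0] if matches else None


store = SbomStore(OnDuplicate.REPLACE)
store.ingest(Sbom("urn:example:a", "demo", datetime(2024, 1, 1, tzinfo=timezone.utc)))
store.ingest(Sbom("urn:example:b", "demo", datetime(2024, 6, 1, tzinfo=timezone.utc)))
latest = store.latest_by_name("demo")
print(latest.document_id)  # urn:example:b
```

Whether the name alone is a good enough lookup key (or whether a version needs to be part of it) is left open here, as in the proposal.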