🪄 Schema version 6 wishlist #108

wagoodman · 2023-05-24T13:14:16Z

As these are implemented, please edit this field to include the PR that implements each item within the wishlist below:

Focuses:

Look at space savings for the uncompressed DB while keeping matching speed a priority.


willmurphyscode · 2023-08-07T19:42:09Z

This wouldn't save space, but currently, when RHEL and Mariner feeds report a package as "not affected", we just drop the record in vunnel. It would be helpful if this were instead expressed in the database. (Might overlap with the disputed requirement mentioned above.)
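One way this could look is a status column on the package record, so "not affected" is stored rather than dropped. This is only a rough sketch: the `package_status` table, its columns, the namespace string, and the CVE IDs are all placeholders, not anything from vunnel or the real schema.

```python
import sqlite3

# Hypothetical table: keeps explicit "not-affected" rows instead of dropping them.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE package_status (
        vuln_id      TEXT NOT NULL,
        namespace    TEXT NOT NULL,
        package_name TEXT NOT NULL,
        status       TEXT NOT NULL CHECK (status IN ('affected', 'not-affected'))
    )
    """
)
# Placeholder data, not real feed records.
conn.executemany(
    "INSERT INTO package_status VALUES (?, ?, ?, ?)",
    [
        ("CVE-0000-0001", "example:distro:example:1", "pkg-a", "affected"),
        ("CVE-0000-0001", "example:distro:example:1", "pkg-b", "not-affected"),
    ],
)


def is_explicitly_not_affected(vuln_id, namespace, package_name):
    """True only when the feed explicitly reported the package as not affected."""
    row = conn.execute(
        "SELECT 1 FROM package_status "
        "WHERE vuln_id = ? AND namespace = ? AND package_name = ? "
        "AND status = 'not-affected'",
        (vuln_id, namespace, package_name),
    ).fetchone()
    return row is not None
```

With this shape, a missing row means "no data", which is distinguishable from an explicit "not affected" statement by the feed.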

westonsteimel · 2023-08-18T20:57:28Z

v6 should have some way of looking up the correct namespace from more than just the version. For instance, we already maintain mappings of codenames to versions for Ubuntu and Debian in vunnel, and it would be useful to retain that information in the db itself so that grype can still find the correct vuln namespace even when one of the pieces of information is missing (as in anchore/grype#1446).

Keeping it in the db would mean it could be updated automatically as new namespaces are added, and grype could use it for lookups immediately. Maintaining a static mapping in grype, by contrast, means we'd need to remember to maintain that mapping in multiple places (vunnel and grype), and users would need to upgrade to the latest grype to get newer namespaces.
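To make the idea concrete, here is a minimal sketch of a lookup table that resolves a namespace from either a version or a codename. The `distro_namespace` table and the `resolve_namespace` helper are hypothetical names for illustration, not part of any existing schema.

```python
import sqlite3

# Hypothetical table: maps a distro plus version/codename to a vuln namespace.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE distro_namespace (
        distro    TEXT NOT NULL,
        version   TEXT NOT NULL,
        codename  TEXT,
        namespace TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO distro_namespace VALUES (?, ?, ?, ?)",
    [
        ("debian", "12", "bookworm", "debian:distro:debian:12"),
        ("debian", "11", "bullseye", "debian:distro:debian:11"),
    ],
)


def resolve_namespace(distro, version=None, codename=None):
    """Return the namespace matching whichever identifier is available."""
    # A None parameter binds as NULL, so that half of the OR simply never matches.
    row = conn.execute(
        "SELECT namespace FROM distro_namespace "
        "WHERE distro = ? AND (version = ? OR codename = ?)",
        (distro, version, codename),
    ).fetchone()
    return row[0] if row else None
```

For example, `resolve_namespace("debian", codename="bookworm")` finds the Debian 12 namespace even though no version was supplied.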

westonsteimel · 2023-09-13T09:02:35Z

The ability to know when a particular record was added or modified within the grype database came up in a community discussion. Although we currently always build the database from scratch, I think it may make sense to include something like added and modified timestamp columns on each record in the db, so that if/when we add diffing/partial updates like in #143, we'd already have the necessary columns in place and wouldn't necessarily need another schema bump at that time.
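As a rough illustration of how such columns might support diffing later, the sketch below stores ISO 8601 UTC strings and selects records modified after a cutoff. The column names `added_at` and `modified_at`, the namespace, and the CVE IDs are placeholders chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE vulnerability (
        id          TEXT NOT NULL,
        namespace   TEXT NOT NULL,
        added_at    TEXT NOT NULL,  -- ISO 8601 UTC, hypothetical column
        modified_at TEXT NOT NULL,  -- ISO 8601 UTC, hypothetical column
        PRIMARY KEY (id, namespace)
    )
    """
)
# Placeholder records, not real vulnerability data.
conn.executemany(
    "INSERT INTO vulnerability VALUES (?, ?, ?, ?)",
    [
        ("CVE-0000-0001", "example:ns", "2023-01-01T00:00:00Z", "2023-01-01T00:00:00Z"),
        ("CVE-0000-0002", "example:ns", "2023-01-01T00:00:00Z", "2023-09-01T00:00:00Z"),
    ],
)


def modified_since(timestamp):
    """Return ids of records changed after the given ISO 8601 UTC timestamp."""
    # Fixed-format ISO 8601 strings sort chronologically, so text comparison works.
    rows = conn.execute(
        "SELECT id FROM vulnerability WHERE modified_at > ? ORDER BY id",
        (timestamp,),
    )
    return [r[0] for r in rows]
```

A partial-update mechanism could use a query like `modified_since` to pick only the rows that changed since the client's last sync.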

willmurphyscode · 2023-09-18T15:17:02Z

Updated the top comment to point to anchore/grype#1498 as a specific issue for "Capture dates where available".

willmurphyscode · 2023-11-06T20:04:51Z

We need the ability to represent a CVE that affects different packages, but has different severity ratings for each package. As a concrete example, https://security-tracker.debian.org/tracker/CVE-2023-44487 lists a table with multiple severities for different packages. Here's a snippet of the relevant table on that page, in case it changes or moves:

| Package | Type | Release | Fixed Version | Urgency | Origin | Debian Bugs |
|---|---|---|---|---|---|---|
| h2o | source | buster | 2.2.5+dfsg2-2+deb10u2 | | DLA-3638-1 | |
| h2o | source | (unstable) | 2.2.5+dfsg2-8 | | | 1054232 |
| haproxy | source | (unstable) | 1.8.13-1 | | | |
| jetty9 | source | buster | 9.4.50-4+deb10u1 | | DLA-3641-1 | |
| jetty9 | source | bullseye | 9.4.50-4+deb11u1 | | DSA-5540-1 | |
| jetty9 | source | bookworm | 9.4.50-4+deb12u2 | | DSA-5540-1 | |
| jetty9 | source | (unstable) | 9.4.53-1 | | | |
| netty | source | (unstable) | (unfixed) | | | 1054234 |
| nghttp2 | source | buster | 1.36.0-2+deb10u2 | | DLA-3621-1 | |
| nghttp2 | source | (unstable) | 1.57.0-1 | | | 1053769 |
| nginx | source | (unstable) | 1.24.0-2 | unimportant | | 1053770 |
| tomcat10 | source | bookworm | 10.1.6-1+deb12u1 | | DSA-5521-1 | |

Note the "urgency" is marked as "unimportant" on the row for nginx. This rating translates in Vunnel to "negligible" in grype-db.

Currently, in the database, this CVE is represented like this:

```
-- VULNERABILITY TABLE
sqlite> select id, package_name from vulnerability where id="CVE-2023-44487" and namespace="debian:distro:debian:12";
id              package_name
--------------  -------------
CVE-2023-44487  h2o
CVE-2023-44487  haproxy
CVE-2023-44487  jetty9
CVE-2023-44487  netty
CVE-2023-44487  nghttp2
CVE-2023-44487  nginx
CVE-2023-44487  tomcat10
CVE-2023-44487  tomcat9
CVE-2023-44487  trafficserver

-- VULNERABILITY_METADATA table
sqlite> select id, severity from vulnerability_metadata where id="CVE-2023-44487" and namespace="debian:distro:debian:12";
id              severity
--------------  ----------
CVE-2023-44487  Negligible
```

As you can see, the database has no good way of writing down, "this CVE is more severe if matched against tomcat than against nginx," but Debian's data is clearly trying to tell us that.

I believe this would be fixed by having a proper foreign key from vulnerability to vulnerability_metadata, rather than just relying on ID+Namespace to match. We could also move the severity column to the vulnerability table, but that would probably result in a lot of duplicate values.
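A minimal sketch of the foreign-key approach, assuming a surrogate key on `vulnerability_metadata` (the `pk` and `metadata_pk` column names are made up here). The "High" severity for tomcat10 is a placeholder for illustration; the Debian table above leaves that urgency blank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE vulnerability_metadata (
        pk        INTEGER PRIMARY KEY,   -- hypothetical surrogate key
        id        TEXT NOT NULL,
        namespace TEXT NOT NULL,
        severity  TEXT NOT NULL
    );
    CREATE TABLE vulnerability (
        id           TEXT NOT NULL,
        namespace    TEXT NOT NULL,
        package_name TEXT NOT NULL,
        metadata_pk  INTEGER NOT NULL REFERENCES vulnerability_metadata (pk)
    );
    """
)

NS = "debian:distro:debian:12"
CVE = "CVE-2023-44487"

# Two metadata rows for the same id+namespace, which id+namespace matching cannot express.
conn.executemany(
    "INSERT INTO vulnerability_metadata VALUES (?, ?, ?, ?)",
    [(1, CVE, NS, "Negligible"), (2, CVE, NS, "High")],  # "High" is a placeholder
)
conn.executemany(
    "INSERT INTO vulnerability VALUES (?, ?, ?, ?)",
    [(CVE, NS, "nginx", 1), (CVE, NS, "tomcat10", 2)],
)


def severity_for(package_name):
    """Look up severity through the foreign key rather than id+namespace."""
    row = conn.execute(
        "SELECT m.severity FROM vulnerability v "
        "JOIN vulnerability_metadata m ON m.pk = v.metadata_pk "
        "WHERE v.id = ? AND v.namespace = ? AND v.package_name = ?",
        (CVE, NS, package_name),
    ).fetchone()
    return row[0] if row else None
```

Packages that share a severity can point at the same metadata row, which avoids the duplication that moving the severity column onto the vulnerability table would cause.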

westonsteimel · 2023-12-20T13:19:33Z

We need to better capture the relationships among identifiers across ecosystems. Currently, we have the related_vulnerabilities column, but that lives on the package vuln record and requires knowing the grype-db vuln namespace that an id corresponds to, which is difficult when that namespace is dynamic, as with GitHub, where a GHSA could cover several package ecosystems. We don't really care about the namespace, only that a particular CVE corresponds to a GHSA or vice versa.
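One possible shape for this is a namespace-free alias table that can be queried in either direction. The `vulnerability_alias` table and the identifiers below are placeholders for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE vulnerability_alias (
        id    TEXT NOT NULL,
        alias TEXT NOT NULL,
        PRIMARY KEY (id, alias)
    )
    """
)
# Placeholder identifiers, not a real GHSA/CVE pairing.
conn.execute(
    "INSERT INTO vulnerability_alias VALUES (?, ?)",
    ("GHSA-xxxx-xxxx-xxxx", "CVE-0000-0001"),
)


def related_ids(identifier):
    """Return identifiers linked to the given one, regardless of direction."""
    rows = conn.execute(
        "SELECT alias FROM vulnerability_alias WHERE id = ? "
        "UNION "
        "SELECT id FROM vulnerability_alias WHERE alias = ?",
        (identifier, identifier),
    )
    return sorted(r[0] for r in rows)
```

Because no namespace column is involved, the lookup works the same whether the caller starts from the CVE or from the GHSA.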

westonsteimel · 2024-01-20T07:53:55Z

Tables per provider/ecosystem pair, with a schema specific to the ecosystem.
So we'd have a lookup table keyed on provider name and ecosystem name that returns the name of the table to load the relevant data from. The ecosystem name would likely be based on the package URL spec, though extended in the case of generic purls so we could have specific schemas tailored to classes of binary packages like java, python interpreter, etc.
Should affected and not-affected records live in separate tables or in the same table?
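A rough sketch of the lookup-table idea, assuming a `table_lookup` table and a single ecosystem-specific table named `github_pypi`. Every table name, column, and record here is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical lookup: (provider, ecosystem) -> name of the data table.
    CREATE TABLE table_lookup (
        provider   TEXT NOT NULL,
        ecosystem  TEXT NOT NULL,
        table_name TEXT NOT NULL,
        PRIMARY KEY (provider, ecosystem)
    );
    -- Hypothetical ecosystem-specific table with its own columns.
    CREATE TABLE github_pypi (
        vuln_id            TEXT NOT NULL,
        package_name       TEXT NOT NULL,
        version_constraint TEXT NOT NULL
    );
    INSERT INTO table_lookup VALUES ('github', 'pypi', 'github_pypi');
    INSERT INTO github_pypi VALUES ('GHSA-xxxx-xxxx-xxxx', 'example-pkg', '<1.2.3');
    """
)


def load_records(provider, ecosystem):
    """Resolve the table for a provider/ecosystem pair, then read its rows."""
    row = conn.execute(
        "SELECT table_name FROM table_lookup WHERE provider = ? AND ecosystem = ?",
        (provider, ecosystem),
    ).fetchone()
    if row is None:
        return []
    # Table names cannot be bound as parameters; this one comes from our own
    # lookup table, not from user input, and is quoted as an identifier.
    return conn.execute(f'SELECT * FROM "{row[0]}"').fetchall()
```

The trade-off is that each ecosystem table can carry exactly the columns it needs, at the cost of one extra lookup before every read.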


westonsteimel · 2024-08-17T06:44:18Z

@wagoodman, I was just wondering, in light of all of the CDN issues that have been cropping up, whether it might make sense to partition v6 databases per provider and then make grype smarter about what it downloads based on what it needs. We know the sles data currently far outnumbers the rest, but if someone never scans sles containers, there would be no need to ever fetch that subset of the data in their CI pipelines running grype.

Anyways, just a very rough thought and apologies if this has already been discussed elsewhere. I'm "on vacation" so haven't yet seen all of the discussion that may have occurred on Friday.

willmurphyscode · 2024-08-17T09:32:47Z

@westonsteimel we were talking about that yesterday. There might be some big performance gains by splitting the DB by provider.

The main drawback I see is that, right now, grype does 2 relatively slow things at the same time: it generates the SBOM, and it downloads the database. But I don't think it can know what database pieces to download before the SBOM is generated. Maybe we could do something like check the distro early, and then download the database for the distro while making the rest of the SBOM.

We've also talked about trying to make incremental updates to the database available, but I think partitioning by provider would be much simpler to implement.
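The "check the distro early" idea could be sketched roughly like this: detect the distro first, then fetch that provider's partition while the rest of the SBOM is generated. All three helper functions are stand-ins invented for the example and do not correspond to grype or syft APIs.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_distro(image):
    # Stand-in for a cheap, early distro check (e.g. reading os-release).
    return image["distro"]


def download_partition(provider):
    # Stand-in for fetching a per-provider database file.
    return f"{provider}.db"


def generate_sbom(image):
    # Stand-in for the slower package cataloging step.
    return sorted(image["packages"])


def scan_inputs(image):
    """Run the partition download and SBOM generation concurrently."""
    provider = detect_distro(image)  # must finish before we know what to fetch
    with ThreadPoolExecutor(max_workers=2) as pool:
        db_future = pool.submit(download_partition, provider)
        sbom_future = pool.submit(generate_sbom, image)
        return db_future.result(), sbom_future.result()
```

This keeps the two slow steps overlapping, as they do today, while only the distro check runs up front. Language-ecosystem partitions would still have to wait until the SBOM reveals which ones are needed.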

wagoodman · 2024-09-25T18:10:56Z

I'm going to close this since this wishlist has been converted into tangible issues for v6.

wagoodman pinned this issue May 24, 2023

willmurphyscode mentioned this issue Sep 18, 2023

add createdAt or updatedAt for vulnerabilities scheme anchore/grype#1498

Open

joshbressers mentioned this issue Nov 16, 2023

Add capability to add/remove/change vulnerability data between upstream sources and grype-db anchore/grype#1607

Open

westonsteimel mentioned this issue Dec 7, 2023

schemas version 2 wishlist anchore/vunnel#266

Open

3 tasks

willmurphyscode mentioned this issue Jan 9, 2024

hack: hard-code severity for debian CVE-2023-44487 anchore/vunnel#448

Merged

wagoodman added the "planning" label (high level epic that should be broken into smaller tasks) Feb 16, 2024

wagoodman added this to OSS Feb 16, 2024

wagoodman moved this to Backlog in OSS Feb 16, 2024

willmurphyscode pushed a commit that referenced this issue Mar 27, 2024

add more helpful wording around quality gate evaluations (#108)

73c56bb

Signed-off-by: Alex Goodman <[email protected]>

asomya unpinned this issue May 6, 2024

wagoodman self-assigned this Jun 7, 2024

tomerse-sg mentioned this issue Jul 2, 2024

Using the information from cisa in grype anchore/grype#1511

Open

willmurphyscode mentioned this issue Aug 5, 2024

Convenient support for db downloads from artifactory. anchore/grype#2004

Open

wagoodman mentioned this issue Sep 17, 2024

Add DB v6 schema anchore/grype#2128

Closed

7 tasks

wagoodman closed this as not planned Sep 25, 2024

github-project-automation bot moved this from Backlog to Done in OSS Sep 25, 2024
