docs: Update docs for structured metadata blooms #14555

Open: rfratto wants to merge 6 commits into main


rfratto
Member

@rfratto rfratto commented Oct 21, 2024

What this PR does / why we need it:

  1. Remove remaining references to the removed Bloom Compactor component.
  2. Update Query Acceleration with Blooms topic to describe structured metadata blooms.

Documentation for how to configure the new bloom components has been temporarily removed as we are still actively making changes to the architecture.

Which issue(s) this PR fixes:
Closes #14414

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

The bloom compactor component has been removed in favor of the bloom
planner/builder/gateway, none of which use hash rings.
Replace references to bloom compactor with the new bloom planner and
bloom builder components.
@rfratto rfratto requested a review from a team as a code owner October 21, 2024 13:18
@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Oct 21, 2024
Contributor

@salvacorts salvacorts left a comment

👏 LGTM

Update the Query Acceleration with Blooms topic to reference structured
metadata blooms over the removed line blooms.

Documentation on how to configure bloom components has been temporarily
removed while the architecture is still under active changes.
@rfratto rfratto force-pushed the docs-structured-metadata-blooms branch from 6c94ad6 to 0117f2b Compare October 21, 2024 13:24
The reason is that bloom filters also come with a relatively high cost for both building
and querying the bloom filters that only pays off at large scale deployments.
{{< /admonition >}}
## Adding data to blooms
Contributor

Don't we still need the whole section about how to enable Blooms?
This section was updated in #13965 (August 27) and #13997 (September 2). Is it already out of date?

Member Author

I re-read the live content and it does seem correct to me; maybe I misunderstood what the team was saying we should do to the docs.

cc @chaudum Should I keep the content for the bloom components? Or was there a reason we wanted to remove it in the short term?

Contributor

Unless Blooms are enabled by default now, we should keep the content about how to enable them.

Member Author

@JStickler I brought back the old content, with some changes to make it accurate (some of it was outdated after all).

I also divided the document into using and operating sections, since the most important content for us at the moment is how to guide users to use query acceleration in Grafana Cloud.

@rfratto rfratto force-pushed the docs-structured-metadata-blooms branch from a8acc16 to 4561c6f Compare October 22, 2024 14:10
Contributor

@JStickler JStickler left a comment

[docs team]

docs/sources/operations/query-acceleration-blooms.md (outdated)
statistical confidence that the string might be present.
The underlying blooms are built by the [Bloom Builder](#bloom-planner-and-builder) component
and served by the new [Bloom Gateway](#bloom-gateway) component.
### Query blooms
Contributor

I would put the query section AFTER the section about how to enable blooms.

Member Author

Hm, can we discuss options here?

It's likely that many more of the people visiting this page won't be Loki operators but will need to know the rules for using blooms, like a cloud user or someone on a team that uses Loki but does not operate it themselves. I'm concerned that moving information on how to take advantage of blooms after all the operator content would hurt discoverability.

Do you think that's a risk? Would there be another way to organize the content?

Contributor

@JStickler JStickler Oct 22, 2024

Well, you can't really query blooms if they're not enabled yet....
Also, I don't think this topic has so many headings that the query section would be hidden in the secondary TOC (on the right) unless you scroll.

Member Author

Right, but the audience using blooms and the audience managing blooms are sometimes distinct. How can we educate people on how to write queries that take advantage of the enabled blooms without forcing them to scan through information about how to enable them?

Or is this problem not worth solving?

(For example: if I were a Grafana Cloud user, only the information about sending structured metadata and how to write queries to use blooms would be relevant to me.)

Contributor

Maybe we need to split the information out about querying blooms and put it in a topic under Query instead?

Labels
size/L type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Successfully merging this pull request may close these issues.

Outdated Query Acceleration with Blooms Doc
4 participants