-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relocate additional metrics v2 content #1327
base: main
Are you sure you want to change the base?
Conversation
source/operations/monitoring/collect-minio-metrics-using-prometheus.rst
Outdated
Show resolved
Hide resolved
|
||
- Set the ``targets`` array with a hostname that resolves to the MinIO deployment. | ||
For example, the following command scrapes all version 3 audit metrics for the MinIO cluster: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of this "for v3" stuff is clumsy, but I'm concerned users will forget which they are using and follow incorrect instructions. Ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.
static_configs: | ||
- targets: [minio.example.net] | ||
|
||
To scrape multiple types of metrics, run :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` for each type and add the ``job_name`` section to the ``scrape_configs`` in your Prometheus configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is how it works? I found examples in the prom docs that have multiple job_name
sections.
Also, do we call parts of yaml files "sections?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they are nodes officially, but that doesn't seem very widely known or user friendly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, nodes is a conflict with other terms for our product.
So, section
works well enough, I think?
global: | ||
scrape_interval: 60s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this global:
so if when somebody copies this into their own config there's less a chance they cause a problem with a too-small interval.
If needed, edit the generated configuration for your environment. | ||
Common changes include: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of stuff here for an unordered list. But tables suck. Ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.
|
||
minio_node_drive_latency_us{job-"minio-job"}[5m] | ||
Use a unique value for each job to ensure isolation of the deployment metrics from any others collected by that Prometheus service. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is accurate?
.. code-block:: shell | ||
:class: copyable | ||
|
||
* - ``minio_node_drive_used_bytes`` | ||
- Total storage used on a drive. | ||
minio_system_drive_used_bytes{job-"minio-job"}[5m] | ||
minio_system_drive_used_inodes{job-"minio-job"}[5m] | ||
|
||
* - ``minio_node_drive_errors_timeout`` | ||
- Total number of drive timeout errors since server start. | ||
minio_cluster_usage_buckets_total_bytes{job-"minio-job"}[5m] | ||
minio_cluster_usage_buckets_objects_count{job-"minio-job"}[5m] | ||
|
||
* - ``minio_node_drive_errors_availability`` | ||
- Total number of drive I/O errors, permission denied and timeouts since server start. | ||
minio_api_requests_total{job-"minio-job"}[5m] | ||
minio_api_requests_errors_total{job-"minio-job"}[5m] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced the original v2 metrics with something that looked likely from v3. Are there better examples?
@@ -349,7 +170,7 @@ You can modify or otherwise use these examples as guidance in building your own | |||
- name: minio-alerts | |||
rules: | |||
- alert: NodesOffline | |||
expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0 | |||
expr: avg_over_time(minio_cluster_health_nodes_offline_count{job="minio-job"}[5m]) > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here with what appeared to be a rational v2-v3 swap
@@ -358,7 +179,7 @@ You can modify or otherwise use these examples as guidance in building your own | |||
description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes" | |||
|
|||
- alert: DisksOffline | |||
expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0 | |||
expr: avg_over_time(minio_system_drive_offline_count{job="minio-job"}[5m]) > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here with what appeared to be a rational v2-v3 swap
To enable historical data visualization in MinIO Console, set the following environment variables on each node in the MinIO deployment: | ||
|
||
- Set :envvar:`MINIO_PROMETHEUS_URL` to the URL of the Prometheus service | ||
- Set :envvar:`MINIO_PROMETHEUS_JOB_ID` to the unique job ID assigned to the collected metrics | ||
|
||
MinIO Grafana Dashboard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove reference to Grafana entirely in the v3 text. Moved to the v2 page.
@@ -67,6 +70,310 @@ The following sections describe the deprecated endpoints and metrics. | |||
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer. | |||
|
|||
|
|||
Configure Prometheus to Collect and Alert using MinIO Metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not bothering to make the headings sentence case on this deprecated page
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No actually I'll go back and do that. Because Reasons
[--api_version v3] \ | ||
[TYPE --bucket <bucket name> --api_version v3] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is a reasonable way to represent optional parameters that themselves have required parameters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔
I think I'd list them as separate, optional flags. Then in the description of them mention they are required if using TYPE
. And in the TYPE
description, add that it requires those two flags.
source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst
Outdated
Show resolved
Hide resolved
@@ -90,18 +129,97 @@ Global Flags | |||
:end-before: end-minio-mc-globals | |||
|
|||
|
|||
Example | |||
------- | |||
Examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to add examples that would clarify the complexity of invoking two incompatible APIs with the same command. Suggestions welcome.
@@ -145,7 +145,6 @@ server cannot support may result in undesired behavior. | |||
Setting or modifying the default server-side encryption settings does *not* | |||
automatically encrypt or decrypt the existing bucket contents. If the bucket | |||
contents *must* have consistent encryption, use the | |||
:mc:`mc mv` mc with the :mc-cmd:`~mc mv --encrypt` or | |||
:mc-cmd:`~mc mv --encrypt-key` arguments to manually modify the | |||
:mc:`mc mv` mc with the :mc-cmd:`~mc mv --enc-c` argument to manually modify the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drive-by build warning fix
source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestions for your consideration.
We should also verify whether v2 really is deprecated. If it isn't, we need to back off that language throughout.
source/operations/monitoring/collect-minio-metrics-using-prometheus.rst
Outdated
Show resolved
Hide resolved
|
||
- Set the ``targets`` array with a hostname that resolves to the MinIO deployment. | ||
For example, the following command scrapes all version 3 audit metrics for the MinIO cluster: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.
source/operations/monitoring/collect-minio-metrics-using-prometheus.rst
Outdated
Show resolved
Hide resolved
|
||
minio_node_drive_total{job-"minio-job"}[5m] | ||
- Set the ``scheme`` to http for MinIO deployments not using TLS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is the value you're setting, seems like http
should also be monotyped.
Generate a v3 cluster metrics config | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment: | |
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster type metrics for a MinIO deployment: |
Generate a v3 bucket replication metrics config | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``: | |
The following example generates a scrape configuration for v3 replication type metrics of bucket ``mybucket``: |
I was confused by cluster metrics earlier. I don't think the type
is as needed here, but if you agree to put it with the cluster metrics, then have to be consistent.
source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst
Outdated
Show resolved
Hide resolved
Use :mc-cmd:`mc admin prometheus generate` to generate a scrape configuration that collects bucket metrics for a MinIO deployment: | ||
|
||
Generate a default metrics v2 config | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So perhaps consider having headings like:
Examples
----------
v3 Examples
~~~~~~~~~~
v3 example 1
++++++++++
v2 examples
~~~~~~~~~~
etc.
If needed, edit the generated configuration for your environment. | ||
Common changes include: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.
More rearranging to feature metrics v3 information:
mc admin prometheus generate
.mc admin prometheus metrics
.Update: v2 isn't actually deprecated.
Note: refs to the new
mc admin prometheus
parameters don't work yet and cause build warnings. Likely require a fixup PR after merge, because intersphinx.Staged:
Metrics and alerts
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-and-alerts.html
Prometheus
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/collect-minio-metrics-using-prometheus.html
Deprecated v2
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-v2-deprecated.html#minio-metrics-v2
Deprecated (?) Grafana dashboards
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/grafana.html
mc admin prometheus generate
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-generate.html
mc admin prometheus metrics
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-metrics.html