From b244a77d681f204cbead92d5512ac5ac94e7dae8 Mon Sep 17 00:00:00 2001
From: Amilcar Aponte
Date: Thu, 16 Jan 2025 16:46:42 +0000
Subject: [PATCH] docs: Update installation docs (#1226)

# Description

Minor changes applied to update the installation documentation.

## Checklist

- [X] I have read the [contributing documentation](https://retina.sh/docs/contributing).
- [X] I signed and signed-off the commits (`git commit -S -s ...`). See [this documentation](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification) on signing commits.
- [X] I have correctly attributed the author(s) of the code.
- [X] I have tested the changes locally.
- [X] I have followed the project's style guidelines.
- [X] I have updated the documentation, if necessary.
- [ ] I have added tests, if applicable.

---

Please refer to the [CONTRIBUTING.md](../CONTRIBUTING.md) file for more information on how to contribute to this project.
---
 docs/02-Installation/01-Setup.md      |  4 +-
 docs/02-Installation/03-Config.md     | 59 +++++++++++++++++++++------
 docs/02-Installation/04-prometheus.md |  9 +++-
 3 files changed, 57 insertions(+), 15 deletions(-)

diff --git a/docs/02-Installation/01-Setup.md b/docs/02-Installation/01-Setup.md
index 34caeb1eca..5b436d21ad 100644
--- a/docs/02-Installation/01-Setup.md
+++ b/docs/02-Installation/01-Setup.md
@@ -6,7 +6,9 @@ Note: you can also run captures with just the [CLI](./02-CLI.md).
 
 ## Installation
 
-Requires Helm version >= v3.8.0.
+### Requirements
+
+- Helm version >= v3.8.0.
 
 ### Basic Mode
 
diff --git a/docs/02-Installation/03-Config.md b/docs/02-Installation/03-Config.md
index 23c893fd3c..799d25ad1b 100644
--- a/docs/02-Installation/03-Config.md
+++ b/docs/02-Installation/03-Config.md
@@ -2,25 +2,60 @@
 
 ## Overview
 
-To customize metrics and other options, modify the `retina-config` ConfigMap. Default settings for each component are specified in *deploy/legacy/manifests/controller/helm/retina/values.yaml*.
+### Default Configuration
 
-## Agent Config
+Default settings for each component are specified in the [values file](../../deploy/legacy/manifests/controller/helm/retina/values.yaml).
+
+### Deployed Configuration
+
+The configuration of an active Retina deployment can be seen in the `retina-config` and `retina-operator-config` ConfigMaps.
+
+```shell
+kubectl get configmap retina-config -n kube-system -o yaml
+kubectl get configmap retina-operator-config -n kube-system -o yaml
+```
+
+### Updating Configuration
+
+If Retina was installed via Helm, configuration updates should be made via `helm upgrade`, passing the specific attribute name and value as part of the command.
+
+The example below enables gathering of advanced pod-level metrics.
+
+```shell
+VERSION=$( curl -sL https://api.github.com/repos/microsoft/retina/releases/latest | jq -r .name)
+helm upgrade --install retina oci://ghcr.io/microsoft/retina/charts/retina \
+    --version $VERSION \
+    --namespace kube-system \
+    --set image.tag=$VERSION \
+    --set operator.tag=$VERSION \
+    --set logLevel=info \
+    --set enabledPlugin_linux="\[dropreason\,packetforward\,linuxutil\,dns\]" \
+    --set enablePodLevel=true
+```
+
+## General Configuration
+
+These settings apply to both the Agent and the Operator.
 
 * `enableTelemetry`: Enables telemetry for the agent for managed AKS clusters. Requires `buildinfo.ApplicationInsightsID` to be set if enabled.
-* `enablePodLevel`: Enables gathering of advanced pod-level metrics, attaching pods' metadata to Retina's metrics.
 * `remoteContext`: Enables Retina to watch Pods on the cluster.
-* `enableAnnotations`: Enables gathering of metrics for annotated resources. Resources can be annotated with `retina.sh=observe`. Requires the operator and `enableRetinaEndpoint` to be enabled.
-* `enabledPlugin`: List of enabled plugins.
+
+## Agent Configuration
+
+* `logLevel`: Defines the level of logs to store.
+* `enabledPlugin_linux`: List of enabled plugins.
 * `metricsInterval`: Interval for gathering metrics (in seconds). (@deprecated, use `metricsIntervalDuration` instead)
 * `metricsIntervalDuration`: Interval for gathering metrics (in `time.Duration`).
+* `enablePodLevel`: Enables gathering of advanced pod-level metrics, attaching pods' metadata to Retina's metrics.
+* `enableConntrackMetrics`: Enables conntrack metrics for packets and bytes forwarded/received.
+* `enableAnnotations`: Enables gathering of metrics for annotated resources. Resources can be annotated with `retina.sh=observe`. Requires the operator and `operator.enableRetinaEndpoint` to be enabled.
 * `bypassLookupIPOfInterest`: If true, plugins like `packetparser` and `dropreason` will bypass IP lookup, generating an event for each packet regardless. `enableAnnotations` will not work if this is true.
 * `dataAggregationLevel`: Defines the level of data aggregation for Retina. See [Data Aggregation](../05-Concepts/data-aggregation.md) for more details.
 
-## Operator Config
+## Operator Configuration
 
-* `installCRDs`: Allows the operator to manage the installation of Retina-related CRDs.
-* `enableTelemetry`: Enables telemetry for the operator in managed AKS clusters. Requires `buildinfo.ApplicationInsightsID` to be set if enabled.
-* `captureDebug`: Toggles debug mode for captures. If true, the operator uses the image from the test container registry for the capture workload. Refer to *pkg/capture/utils/capture_image.go* for details on how the debug capture image version is selected.
-* `captureJobNumLimit`: Sets the maximum number of jobs that can be created for each Capture.
-* `enableRetinaEndpoint`: Allows the operator to monitor and update the cache with Pod metadata.
-* `enableManagedStorageAccount`: Enables the use of a managed storage account for storing artifacts.
+* `operator.installCRDs`: Allows the operator to manage the installation of Retina-related CRDs.
+* `operator.enableRetinaEndpoint`: Allows the operator to monitor and update the cache with Pod metadata.
+* `capture.captureDebug`: Toggles debug mode for captures. If true, the operator uses the image from the test container registry for the capture workload. Refer to [Capture Image file](../../pkg/capture/utils/capture_image.go) for details on how the debug capture image version is selected.
+* `capture.captureJobNumLimit`: Sets the maximum number of jobs that can be created for each Capture.
+* `capture.enableManagedStorageAccount`: Enables the use of a managed storage account for storing artifacts.
diff --git a/docs/02-Installation/04-prometheus.md b/docs/02-Installation/04-prometheus.md
index 781c610289..bebabb90b6 100644
--- a/docs/02-Installation/04-prometheus.md
+++ b/docs/02-Installation/04-prometheus.md
@@ -6,6 +6,7 @@ Prometheus is an open-source system monitoring and alerting toolkit originally b
 
 1. Create a Kubernetes cluster.
 2. Install Retina DaemonSet (see [Quick Installation](./01-Setup.md)).
+3. Clone the [Retina repository](https://github.com/microsoft/retina) or download the [Prometheus values file](../../deploy/legacy/prometheus/values.yaml).
 
 ## Install Prometheus via Helm
 
@@ -19,13 +20,17 @@ Prometheus is an open-source system monitoring and alerting toolkit originally b
 1. Install the Prometheus chart
 
     ```shell
-    helm install prometheus -n kube-system -f deploy/legacy/prometheus/values.yaml prometheus-community/kube-prometheus-stack
+    # The value of VALUE_FILE_PATH is relative to the repo root folder. Update this according to the location of your file.
+    VALUE_FILE_PATH=deploy/legacy/prometheus/values.yaml
+    helm install prometheus -n kube-system -f $VALUE_FILE_PATH prometheus-community/kube-prometheus-stack
     ```
 
     Or if you already have the chart installed, upgrade how you see fit, providing the new job name as an additional scrape config, ex:
 
     ```shell
-    helm upgrade prometheus -n kube-system -f deploy/legacy/prometheus/values.yaml prometheus-community/kube-prometheus-stack
+    # The value of VALUE_FILE_PATH is relative to the repo root folder. Update this according to the location of your file.
+    VALUE_FILE_PATH=deploy/legacy/prometheus/values.yaml
+    helm upgrade prometheus -n kube-system -f $VALUE_FILE_PATH prometheus-community/kube-prometheus-stack
     ```
 
 > Note: Grafana and kube-state metrics may schedule on Windows nodes, the current chart doesn't have node affinity for those components. Some manual intervention may be required.
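
Review note on the `helm upgrade` example added to `03-Config.md`: the `enabledPlugin_linux` value is passed with its brackets and commas backslash-escaped. The sketch below shows one way to build that value from a plain list of plugin names. `helm_list_value` is a hypothetical helper, not part of Retina or Helm, and it only mirrors the escaping shown in that example.

```shell
# Hypothetical helper (an assumption for illustration, not a Retina or Helm command).
# Joins its arguments with escaped commas and wraps them in escaped brackets,
# matching the literal used for enabledPlugin_linux in the example.
helm_list_value() {
  _joined=""
  for _plugin in "$@"; do
    if [ -z "$_joined" ]; then
      _joined="$_plugin"
    else
      # Inside double quotes, \\ yields a single backslash before the comma.
      _joined="$_joined\\,$_plugin"
    fi
  done
  # %s prints the argument literally, so the backslashes are preserved.
  printf '%s\n' "\\[$_joined\\]"
}
```

Assuming the helper is defined in the current shell, the flag from the example could then be written as `--set enabledPlugin_linux="$(helm_list_value dropreason packetforward linuxutil dns)"`.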