Add k8s scripts for monitoring and alerting manager #925

Open
wants to merge 9 commits into base: develop

Conversation

kmehant
Collaborator

@kmehant kmehant commented Jul 8, 2020

Issue #924

Todo

  • Prometheus monitoring
    • service
    • deployment
    • alert rules
    • recording rules
    • basic config (prometheus.yml)
    • PVC for metrics
  • metric exporters
    • kube state metrics
      • service account
      • service
      • cluster role
      • cluster role binding
      • deployment
    • node exporter
      • daemon set
  • Prometheus Alertmanager: the design can be refined further by classifying and grouping alerts based on labels
    • Gmail receiver
  • Grafana Dashboards (Visualizer)
    • deployment
    • service
    • data source: Prometheus
  • Ingress to access services from outside
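
The Alertmanager item in the list above mentions grouping alerts by label and a Gmail receiver. The following is a minimal sketch of what such a configuration might look like. It is not taken from this PR: the receiver name, the addresses, the timing values, and the choice of grouping labels are all assumptions made for illustration.

```yaml
# Hypothetical alertmanager.yml; every value below is a placeholder.
route:
  receiver: gmail-notifications          # assumed receiver name
  # Alerts sharing these label values are batched into one notification.
  group_by: ['alertname', 'namespace']   # assumed grouping labels
  group_wait: 30s        # wait before sending the first notification for a new group
  group_interval: 5m     # wait before notifying about new alerts added to a group
  repeat_interval: 4h    # wait before re-sending a still-firing notification
receivers:
  - name: gmail-notifications
    email_configs:
      - to: 'oncall@example.com'         # placeholder recipient
        from: 'alerts@example.com'       # placeholder sender
        smarthost: 'smtp.gmail.com:587'
        auth_username: 'alerts@example.com'
        auth_identity: 'alerts@example.com'
        auth_password: '<app-password>'  # placeholder; keep real credentials in a Secret
```

Grouping on `alertname` and `namespace` is only one option; grouping on a severity label would be another way to classify alerts.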

@kmehant kmehant added the GSoC 2020 Tasks for GSoC 2020 label Jul 8, 2020
@kmehant kmehant changed the title Add k8s scripts for monitoring alerting manager [WIP] Add k8s scripts for monitoring alerting manager Jul 8, 2020
@TravisBuddy

Travis tests have failed

Hey @kmehant,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

1st Build

View build log

docker build -f ui/Dockerfile.dev -t scoreucsc/bassa-ui:dev ui >/dev/null
The command '/bin/sh -c apt-get update &&     npm install &&     npm install --global bower gulp-cli &&     bower --allow-root install' returned a non-zero code: 1
TravisBuddy Request Identifier: 507e5690-c0fc-11ea-9c30-6301f8ea01dd

@TravisBuddy

Hey @kmehant,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: 52f24750-c0fd-11ea-9c30-6301f8ea01dd

@TravisBuddy

Hey @kmehant,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: be16e890-c433-11ea-9d9d-6dff9152c98a

@TravisBuddy

Travis tests have failed

Hey @kmehant,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

1st Build

View build log

docker build -f ui/Dockerfile.dev -t scoreucsc/bassa-ui:dev ui >/dev/null
The command '/bin/sh -c apt-get update &&     npm install &&     npm install --global bower gulp-cli &&     bower --allow-root install' returned a non-zero code: 1
TravisBuddy Request Identifier: b7956960-c6a4-11ea-aa06-e17841301e13

@kmehant kmehant changed the title [WIP] Add k8s scripts for monitoring alerting manager [WIP] Add k8s scripts for monitoring and alerting manager Jul 15, 2020
Signed-off-by: K mehant <[email protected]>
@TravisBuddy

Hey @kmehant,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: 84c10e10-c6bb-11ea-aa06-e17841301e13

@kmehant kmehant changed the title [WIP] Add k8s scripts for monitoring and alerting manager Add k8s scripts for monitoring and alerting manager Jul 17, 2020
@kmehant kmehant requested a review from JaDogg July 17, 2020 18:16
@TravisBuddy

Hey @kmehant,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: e039aef0-c85a-11ea-8ad2-cd0863c1f5b3

Signed-off-by: K mehant <[email protected]>
@TravisBuddy

Hey @kmehant,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: 002da3d0-c863-11ea-8ad2-cd0863c1f5b3

JaDogg
JaDogg previously approved these changes Jul 17, 2020
- alert: Container restarted
  annotations:
    summary: Container named {{$labels.container}} in {{$labels.pod}} in {{$labels.namespace}} was restarted
  expr: |
Collaborator

Is it possible to add some documentation about these expressions?

Collaborator Author

Yeah sure 😁

Collaborator Author

@JaDogg Added the comment and wiki as well :)
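
For readers following this thread, a documented alert rule of the kind discussed above could be sketched as follows. The expression shown in the review snippet is cut off, so the `expr` below is an assumption and not the PR's actual rule: it relies on the `kube_pod_container_status_restarts_total` counter exposed by kube-state-metrics, and the group name, time window, and severity label are hypothetical.

```yaml
groups:
  - name: container-alerts               # hypothetical group name
    rules:
      - alert: Container restarted
        # increase() estimates how much the restart counter grew over the
        # last 5 minutes; a value above 0 means the container restarted
        # at least once in that window. The 5m window is an assumption.
        expr: |
          increase(kube_pod_container_status_restarts_total[5m]) > 0
        labels:
          severity: warning              # assumed label, useful for routing
        annotations:
          summary: Container named {{$labels.container}} in {{$labels.pod}} in {{$labels.namespace}} was restarted
```

The `container`, `pod`, and `namespace` labels used in the summary are carried by that kube-state-metrics series, which is why the annotation template can reference them.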

@TravisBuddy

Hey @kmehant,
Your changes look good to me!

View build log

TravisBuddy Request Identifier: e6de6110-d16f-11ea-bb2f-e1a2ee6530fc

@@ -6,6 +6,7 @@ metadata:
  name: prometheus-rules
  namespace: bassa-monitoring
data:
  # you can modify alert rules as you wish, useful explanation is added here https://github.com/scorelab/Bassa/wiki/Prometheus-for-monitoring-and-alerting-K8s-cluster
Collaborator

nice
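
As context for the snippet above, a rules ConfigMap of this shape might be laid out as in the sketch below. Only the ConfigMap name and namespace come from the diff; the data key, the group name, and the recording rule are hypothetical and are included only to show where rules and their explanatory comments would sit.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: bassa-monitoring
data:
  # Hypothetical key; it becomes a file name when the ConfigMap is mounted as a volume.
  example.rules.yml: |
    groups:
      - name: example-recording-rules    # hypothetical group name
        rules:
          # Precompute per-namespace restart counts so dashboards and
          # alerts can query one cheap series (illustrative rule only).
          - record: namespace:container_restarts:increase5m
            expr: sum by (namespace) (increase(kube_pod_container_status_restarts_total[5m]))
```

For Prometheus to load such a file, its configuration would also need a `rule_files` entry pointing at the mount path, which is assumed here and not shown in this thread.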

Collaborator

@vivonk vivonk left a comment

LGTM

Labels
GSoC 2020 Tasks for GSoC 2020
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add monitoring and alerting manager to the existing Bassa kubernetes cluster
4 participants