Adding self-signed-certificates and removing the relation breaks the charm #441

Closed
dparv opened this issue Jun 13, 2024 · 5 comments
Labels
bug Something isn't working

Comments

dparv commented Jun 13, 2024

Bug Description

istio-pilot/0*                error        idle   10.244.2.10                 hook failed: "certificates-relation-broken"

and can't access the Kubeflow dashboard

To Reproduce

juju deploy self-signed-certificates --channel edge
juju relate istio-pilot:certificates self-signed-certificates:certificates
and
juju remove-relation istio-pilot:certificates self-signed-certificates:certificates

Environment

juju 3.4.3
istio-pilot 1.17/stable 965
self-signed-certificates latest/edge 145

Relevant Log Output

unit-istio-pilot-0: 13:52:26 WARNING unit.istio-pilot/0.juju-log certificates:56: 'app' expected but not received.
unit-istio-pilot-0: 13:52:26 WARNING unit.istio-pilot/0.juju-log certificates:56: 'app_name' expected in snapshot but not found.
unit-istio-pilot-0: 13:52:26 INFO unit.istio-pilot/0.juju-log certificates:56: Creating CSR for 57.152.89.25 with DNS ['istio-pilot-0.istio-pilot-endpoints.kubeflow.svc.cluster.local'] and IPs []
unit-istio-pilot-0: 13:52:26 ERROR unit.istio-pilot/0.juju-log certificates:56: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 1203, in <module>
    main(Operator)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 544, in main
    manager.run()
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 520, in run
    self._emit()
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 509, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 143, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1582, in _on_relation_broken
    self.on.all_certificates_invalidated.emit()
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 420, in _on_all_certificates_invalidated
    self._generate_csr(overwrite=True, clear_cert=True)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 272, in _generate_csr
    self.certificates.request_certificate_creation(certificate_signing_request=csr)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1421, in request_certificate_creation
    raise RuntimeError(
RuntimeError: Relation certificates does not exist - The certificate request can't be completed


Additional Context

No response
dparv added the bug label Jun 13, 2024

Thank you for reporting your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5876.

This message was autogenerated

Contributor

DnPlas commented Jun 17, 2024

Reported issue

I was able to reproduce the issue.

My model:

Model      Controller  Cloud/Region        Version  SLA          Timestamp
istio-441  uk8s-343    microk8s/localhost  3.4.3    unsupported  18:54:00Z

App                       Version  Status   Scale  Charm                     Channel       Rev  Address         Exposed  Message
istio-ingressgateway               active       1  istio-gateway             1.17/stable  1000  10.152.183.212  no
istio-pilot                        waiting      1  istio-pilot               1.17/stable   965  10.152.183.210  no       installing agent
self-signed-certificates           active       1  self-signed-certificates  latest/edge   147  10.152.183.76   no

Unit                         Workload  Agent  Address      Ports  Message
istio-ingressgateway/0*      active    idle   10.1.60.140
istio-pilot/0*               error     idle   10.1.60.137         hook failed: "certificates-relation-broken" for self-signed-certificates:certificates
self-signed-certificates/0*  active    idle   10.1.60.138

Integration provider                   Requirer                          Interface          Type     Message
istio-pilot:istio-pilot                istio-ingressgateway:istio-pilot  k8s-service        regular
istio-pilot:peers                      istio-pilot:peers                 istio_pilot_peers  peer
self-signed-certificates:certificates  istio-pilot:certificates          tls-certificates   regular

juju debug-log output:

unit-istio-pilot-0: 18:53:15 INFO unit.istio-pilot/0.juju-log certificates:1: Creating CSR for 10.64.140.43 with DNS ['istio-pilot-0.istio-pilot-endpoints.istio-441.svc.cluster.local'] and IPs []
unit-istio-pilot-0: 18:53:15 ERROR unit.istio-pilot/0.juju-log certificates:1: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 1203, in <module>
    main(Operator)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 544, in main
    manager.run()
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 520, in run
    self._emit()
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 509, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 143, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1582, in _on_relation_broken
    self.on.all_certificates_invalidated.emit()
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 420, in _on_all_certificates_invalidated
    self._generate_csr(overwrite=True, clear_cert=True)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 272, in _generate_csr
    self.certificates.request_certificate_creation(certificate_signing_request=csr)
  File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1421, in request_certificate_creation
    raise RuntimeError(
RuntimeError: Relation certificates does not exist - The certificate request can't be completed

Potential cause

The failure is triggered from the cert_handler library. At first glance, it looks like on a relation_broken event the tls_certificates library (which the cert_handler library uses under the hood) emits an all_certificates_invalidated event. The cert_handler library then calls _on_all_certificates_invalidated, which generates a CSR and requests a certificate for it. Since the relation no longer exists at that point, the request fails with the RuntimeError shown above.

I have pinged the maintainers of the cert_handler library and will come back with an update.
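
The chain of events described above can be modelled with a small toy example. This is only a sketch in plain Python: `ToyRequirer` and `ToyCertHandler` are hypothetical stand-ins, not the real tls_certificates or cert_handler classes, and the toy treats the relation as gone as soon as relation-broken fires.

```python
from typing import Callable, List, Optional


class ToyRequirer:
    """Hypothetical stand-in for the tls_certificates requirer."""

    def __init__(self) -> None:
        self.relation: Optional[dict] = None  # None models "relation does not exist"
        self._handlers: List[Callable[[], None]] = []

    def observe_all_certificates_invalidated(self, handler: Callable[[], None]) -> None:
        self._handlers.append(handler)

    def request_certificate_creation(self, csr: bytes) -> None:
        if self.relation is None:
            raise RuntimeError(
                "Relation certificates does not exist - "
                "The certificate request can't be completed"
            )
        self.relation["csr"] = csr

    def on_relation_broken(self) -> None:
        # Assumption of this toy model: the relation is already gone here.
        self.relation = None
        for handler in self._handlers:
            handler()


class ToyCertHandler:
    """Hypothetical stand-in for cert_handler: always re-requests a certificate."""

    def __init__(self, requirer: ToyRequirer) -> None:
        self.requirer = requirer
        requirer.observe_all_certificates_invalidated(self._on_all_certificates_invalidated)

    def _on_all_certificates_invalidated(self) -> None:
        self._generate_csr()

    def _generate_csr(self) -> None:
        self.requirer.request_certificate_creation(b"dummy-csr")


requirer = ToyRequirer()
handler = ToyCertHandler(requirer)
requirer.relation = {}      # models "juju relate"
handler._generate_csr()     # succeeds while the relation exists
try:
    requirer.on_relation_broken()  # models "juju remove-relation"
except RuntimeError as err:
    print(f"hook would fail: {err}")
```

The handler never checks whether there is still a relation to send the CSR over, which is the step that fails in the traceback.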

State of TLS certificates integration

Just as a quick check, I did the following to ensure the TLS certificates were in fact passed and rendered correctly in the Gateway and Secret objects:

  1. Deploy istio-operators 1.17/stable
  2. Deploy self-signed-certificates latest/edge
  3. Add relations
  4. Check the Gateway object and the Secret it references

My model:

Model      Controller  Cloud/Region        Version  SLA          Timestamp
istio-441  uk8s-343    microk8s/localhost  3.4.3    unsupported  18:49:30Z

App                       Version  Status  Scale  Charm                     Channel       Rev  Address         Exposed  Message
istio-ingressgateway               active      1  istio-gateway             1.17/stable  1000  10.152.183.212  no
istio-pilot                        active      1  istio-pilot               1.17/stable   965  10.152.183.210  no
self-signed-certificates           active      1  self-signed-certificates  latest/edge   147  10.152.183.76   no

Unit                         Workload  Agent  Address      Ports  Message
istio-ingressgateway/0*      active    idle   10.1.60.140
istio-pilot/0*               active    idle   10.1.60.137
self-signed-certificates/0*  active    idle   10.1.60.138

Integration provider                   Requirer                          Interface          Type     Message
istio-pilot:istio-pilot                istio-ingressgateway:istio-pilot  k8s-service        regular
istio-pilot:peers                      istio-pilot:peers                 istio_pilot_peers  peer
self-signed-certificates:certificates  istio-pilot:certificates          tls-certificates   regular

The Gateway and Secret objects:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  creationTimestamp: "2024-06-17T18:44:40Z"
  generation: 2
  labels:
    app.juju.is/created-by: istio-pilot
    app.kubernetes.io/instance: istio-pilot-istio-441
    kubernetes-resource-handler-scope: gateway
  name: istio-gateway
  namespace: istio-441
  resourceVersion: "1420"
  uid: f56edf34-5dd7-4d5a-b50c-1e6b7f977e89
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: https
      number: 8443
      protocol: HTTPS
    tls: # <--- it is configured for TLS
      credentialName: istio-gateway-gateway-secret # <--- it references this secret
      mode: SIMPLE
---
apiVersion: v1
data:
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURyakNDQXBhZ0F3SUJBZ0lVWlV3L0x0aWs2MjNic3FSWWNyRlBjMUJwYjhNd0RRWUpLb1pJaHZjTkFRRUwKQlFBd09URUxNQWtHQTFVRUJoTUNWVk14S2pBb0JnTlZCQU1NSVhObGJHWXRjMmxuYm1Wa0xXTmxjblJwWm1sagpZWFJsY3kxdmNHVnlZWFJ2Y2pBZUZ3MHlOREEyTVRjeE9EUTFNRFJhRncweU5UQTJNVGN4T0RRMU1EUmFNRWN4CkZqQVVCZ05WQkFNTURXbHpkR2x2TFhCcGJHOTBMVEF4TFRBckJnTlZCQzBNSkRGaU4yRXlaRGhsTFdVNU4yVXQKTkRBd09TMDRPVGN5TFRRMU1HRXdZbVUxT1RnME1EQ0NBU0l3RFFZSktvWklodmNOQVFFQkJRQURnZ0VQQURDQwpBUW9DZ2dFQkFNOU1yS1VkZXRJOGJMeFo0Mi9VY2FXaGtKVEpzT0IwRVRxTzlENUxNSUdtZXI1d3ZLc1dmc2Q4CmMxOHV2bUtnc2pCM2tZVVV0bDNIa0xxdHlwU1ZXNkZyOUVPaWI2TGVadFFSTmFYZm11RFN1UjBqMk9jRTJzem8KdDVwRDM3MFJOTVB2eG9BT0szN3U3dkM2VjRaL2ZudnFPaWlaVDZjaU5UQjJSWmpzYTVoWjdSUHZSOW5WaXRhLwpoODZhQmkxdThaNDFpUlhTZkxlTUxDNFdYcEhwL2x2a0JRVVNwWUIyRGs0VDF0Mm90cjNhbjEzbGdMYWtmdk5XCmJYWmpzRWxYWVFCeEhHQmYzN0oraUhjOU9YM25ybnVVY3o2SGgzbG9WekpROGIwTktvMTlDbFZWbTdtbThWVjIKMnIvcVR0VXQ2dDViWE12QUQwRUFtQWhHNFEvL3dXOENBd0VBQWFPQm56Q0JuREFoQmdOVkhTTUVHakFZZ0JZRQpGSVo1ZkVqeUowNEM5U1IrbW80WWRvRWlnRkFzTUIwR0ExVWREZ1FXQkJUL0RFZmxvbTdUNGF6VUpmNXl6L09GCkpNS05KVEFNQmdOVkhSTUJBZjhFQWpBQU1Fb0dBMVVkRVFSRE1FR0NQMmx6ZEdsdkxYQnBiRzkwTFRBdWFYTjAKYVc4dGNHbHNiM1F0Wlc1a2NHOXBiblJ6TG1semRHbHZMVFEwTVM1emRtTXVZMngxYzNSbGNpNXNiMk5oYkRBTgpCZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFkdzc5UWhJN0pVcUV3MzRwZysrRkJDSitKVUU3OFpZVURrVHNIQWVZClZEUlpWcUwyaENnL0poU2k3RHFrYUcwRjh6UkdadGxzcUFCdEdEYmhPZC9WM3BiOUtYVTRUSHl6UWhPYmlEWHkKYXRwQ2REUnEwUDVUeGpBT2l6YnJIZHlyOXc2c0FFd1VEcldKclQ2NjFOVjFNazE3YUluTVZZdFlNMExsS0h5YgpkTGZ4NmZjcGVCeXJXVjQ2cjZLTVlKQWoyd2lORjhlSXdpK0NMd2tiUGwwR1FHd3lVK3NSV1EwVmtuWk5ESVlyCjhzT0wrbUV5VVNBNmJ3RmF3dFVxUHJPRDI5RXJ5VlF0RkVtYit6cWVuN2VUNXBLN2FRRDE2NDR2TEdEajJEYzkKNkVIM1pRZ2VjUHBFcHJWTW04NTdEWC9XTTdDcDQxUThURDF5SUVGREZCSThSdz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0=
  tls.key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcFFJQkFBS0NBUUVBejB5c3BSMTYwanhzdkZuamI5UnhwYUdRbE1tdzRIUVJPbzcwUGtzd2dhWjZ2bkM4CnF4Wit4M3h6WHk2K1lxQ3lNSGVSaFJTMlhjZVF1cTNLbEpWYm9XdjBRNkp2b3Q1bTFCRTFwZCthNE5LNUhTUFkKNXdUYXpPaTNta1BmdlJFMHcrL0dnQTRyZnU3dThMcFhobjkrZStvNktKbFBweUkxTUhaRm1PeHJtRm50RSs5SAoyZFdLMXIrSHpwb0dMVzd4bmpXSkZkSjh0NHdzTGhaZWtlbitXK1FGQlJLbGdIWU9UaFBXM2FpMnZkcWZYZVdBCnRxUis4MVp0ZG1Pd1NWZGhBSEVjWUYvZnNuNklkejA1ZmVldWU1UnpQb2VIZVdoWE1sRHh2UTBxalgwS1ZWV2IKdWFieFZYYmF2K3BPMVMzcTNsdGN5OEFQUVFDWUNFYmhELy9CYndJREFRQUJBb0lCQVFDZ0RXc2U4T3ZyZG92ZAp3T2xCWnAxNGJJM2Mwdnlsei9lZFp0SmRabUJGT2V4N0xULytPSmdhSFpSV1lSak52WlRXcHZyTDdYb0FYaHo0CmhVWnNBZ1dGVkh4NzIrYWxzV0ZqU3daSTA2UVpBWm03VGZvaUpEVnJFQ0x5RUlXbXpLb1l2Z0JjenBQMnBUUUcKMlZqS2w1Vm94eWV3UU82bTlGcHMyR1JUOWZYODRjMS90bUkraTZOOVN1ak5wcDloVzhoMS81cjBaRnZha2VJSgpxaTgyb2IyMFhWQmhOeGg0Z0x1YWw2aGtiTE9WUWVmNWZSTThxOFRVWlVGTjRycEpvbi9XaHNBTnMxRkpaVE5kCngza2JHZWh0TEc0OXNzSzdqVG5KK0ZFeEZSM0pjZjBieHJKV1JEbkJFN0JCMGxzSmR4anVUTFhCVlZ1dGxHazMKVDNENWtQckJBb0dCQVBDelNiWDVuTnl1NjZaVmNLcTB4a3plNVhiaUR4Ri95aW1yaUtybDRHU1czSWx5bnNBdQprVG5rVGtUVDB1YithSzJGUlJiQ3hIM3dCMW9HQjlhUG1RaFBtY1BDMGJ1TlJLN3M2MGhnWVV1Y0o3czNPdE5KCk5ubm1JN0laTVpQTURweHVoekFhNHU2NGFBZ21QNWdkNWpVSjJkNGZtZkJrU1YyYzlwRG9xM29OQW9HQkFOeDUKNGhHck5mVk9BTWJCNVpQdTNrTkdzdkhzaURRalA3TWZOVTB2MjRUYW9KS3d1dXR1L1NKUE5FQ3FGbkx1S2tkWgo2cDZ1akN5UC9LNlBvQUlqSjZjMHJGUkRIbHMxQk1PTU5VQlBmdGNVUW5KSHo0Ty93ZUt2U2FWRGsvU0FXWTFICnpkS2toVEdvaG1yZ2tJcnhvaDRsUTViT0YxSnp1WlR3L3ozYXFUWnJBb0dCQU14ZW5qNXBjenVaTmJKa0p5WjYKR1VrWmxHR05iVmZoVmVodG9idmhOTmFUbFNzSzdDbW5JRjIwTUpTVitpTnhiYld2UzBzWkVqY1AvMTM3Y3RwRgowSnpTNFc3cTBxTlpQakQ4TG9Xa2Q5ZjMvWEFqWThvVUJySVhxc1ZFU09rQndJSW9BcGJnclVBZHlRN3FVdUs0CnVFYmVWMk1YRitDWmRnV0xDWHRlWW9KZEFvR0JBTkpVY0VlODFzLzdKeUIxNzRjdUZObUhnOFRwaXBKNm9oVkcKaTNua1V2NHQ5NHVaaitoMFRJYkRtcXlwMXBxei9KOXU5eldFZlBNeU5iTnVEdzZhN1FSRmFyVkVCcHlxT3E0Mgpmc0tvVSsvcFF1NTA5VkhSeUt4eDNzY0xiZ1dOd0dEWWhGRVVaSUNZTGd1ZHlpYlRGMzY4dS9zTkJ4REFsK1d2Cjl6L1I3eVdiQW9HQUx6Qk5WSTh0U2RkbTYwWEhpV
FJVL012dzdFOHNXbU1Gdlp2VGRXdmZMcVY3N0V2cE1FdEEKdTJzQmhNdUJHbkVvVjV0ZDR6aVVpb2xDQms2c2UxNi8xWlpydUtvaGJIdFZobXlTTG8xalFtSXlicjB4RFBOYgpOY3k0UHNkNTVhVEZNSVd3RWhwbWovMjB0UmlQaGNadlBEWlVleHBIWDRHYi9mRmZ4QTJlTGc4PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
kind: Secret
metadata:
  creationTimestamp: "2024-06-17T18:45:05Z"
  labels:
    app.juju.is/created-by: istio-pilot
    app.kubernetes.io/instance: istio-pilot-istio-441
    kubernetes-resource-handler-scope: gateway
  name: istio-gateway-gateway-secret
  namespace: istio-441
  resourceVersion: "1419"
  uid: 0dfe5ae8-ed60-4f54-8e3f-b3b6abed5e4e
type: kubernetes.io/tls

Based on this, the relation and the reconciler in istio-operators appear to be working as expected while the relation is in place.
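
As a quick sanity check on the Secret contents, the base64 values can be decoded to confirm they hold PEM data. The helper below is a minimal sketch using only the Python standard library; `pem_label` is a hypothetical name introduced here, and it only inspects the PEM header line, it does not validate the certificate or key.

```python
import base64
import re
from typing import Optional

# Matches the first line of a PEM block, e.g. "-----BEGIN CERTIFICATE-----".
PEM_HEADER = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_label(b64_value: str) -> Optional[str]:
    """Decode a base64 Secret value and return its PEM label, if any."""
    decoded = base64.b64decode(b64_value).decode("ascii", errors="replace")
    match = PEM_HEADER.match(decoded)
    return match.group(1) if match else None


# First 36 characters of the tls.crt value shown above.
print(pem_label("LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t"))  # CERTIFICATE
```

Passing the full tls.crt or tls.key value works the same way, since only the first line of the decoded text is matched.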

Contributor

DnPlas commented Jun 17, 2024

I have confirmed with @sed-i that this issue is caused by the cert_handler library not handling relation-broken events correctly. I have tested the fix in canonical/observability-libs#99, and it seems to work for v0 of the library.

To fix the issue @dparv reported, we'll have to:

  1. Wait for canonical/observability-libs#99 ("[cert handler] do not observe rel broken directly") to be merged
  2. Bump the library to bring in all the changes for cert_handler v0

For more recent versions of the istio-operators we'd ideally use the cert_handler v1, which we'll also have to pull once the mentioned PR is merged.
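
For illustration only, one defensive pattern for this class of bug is to skip the certificate request when there is no relation to send it over. The sketch below uses hypothetical toy classes (`FakeCertificates`, `GuardedCertHandler`); it is not the change made in canonical/observability-libs#99, which may take a different approach.

```python
from typing import Optional


class FakeCertificates:
    """Hypothetical stand-in for the tls_certificates requirer object."""

    def __init__(self) -> None:
        self.relation: Optional[dict] = None  # None models a missing relation

    def request_certificate_creation(self, csr: bytes) -> None:
        if self.relation is None:
            raise RuntimeError("Relation certificates does not exist")
        self.relation["csr"] = csr


class GuardedCertHandler:
    """Toy handler that only requests a certificate while the relation exists."""

    def __init__(self, certificates: FakeCertificates) -> None:
        self.certificates = certificates
        self.cert: Optional[str] = "old-cert"

    def on_all_certificates_invalidated(self) -> bool:
        """Drop the stored cert; return True only if a new request was sent."""
        self.cert = None
        if self.certificates.relation is None:
            return False  # nothing to request from; wait for a new relation
        self.certificates.request_certificate_creation(b"dummy-csr")
        return True


handler = GuardedCertHandler(FakeCertificates())
# With no relation, the handler clears the cert and returns without raising.
print(handler.on_all_certificates_invalidated())
```

The stale certificate is still cleared in both branches, so the workload would not keep serving a certificate whose issuer is no longer related.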

Contributor

DnPlas commented Jun 21, 2024

I have submitted multiple PRs to bump the cert_handler library as it was recently updated. Thanks @sed-i!

@dparv we'll soon be publishing a new revision of istio-pilot 1.17/stable that includes the newer library version. We'll keep you posted.

DnPlas added 7 commits that referenced this issue Jun 25, 2024
Contributor

DnPlas commented Jun 25, 2024

The fix has been released to 1.17/stable. Closing this issue, but feel free to re-open or file a new one should you find any other error. Thanks!

@DnPlas DnPlas closed this as completed Jun 25, 2024