Link(s) to Jira
Description of Intent of Change(s)
The what, why and how.
Right now our stage DB does not support replication, which means our outbox is idle.
As a result, the WAL fills up and the replication slot has no opportunity to progress. More info: https://www.morling.dev/blog/insatiable-postgres-replication-slot/
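One way to confirm the WAL build-up is to ask Postgres how much WAL each slot is holding back. The sketch below is illustrative only: the GABI_URL and GABI_TOKEN variables and both function names are assumptions, and it assumes GABI accepts the same request shape as the other GABI calls in this description.

```shell
# Assumed environment variables (not part of this PR): GABI_URL, GABI_TOKEN.

# SQL reporting how much WAL each replication slot is retaining.
# Uses the standard pg_replication_slots view and pg_wal_lsn_diff().
wal_retention_sql() {
  printf '%s' "select slot_name, active, restart_lsn, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as retained_wal from pg_replication_slots;"
}

# Send the query through GABI. The SQL contains no double quotes,
# so it can be embedded in the JSON body without extra escaping.
check_wal_retention() {
  curl -s -H "Authorization: Bearer ${GABI_TOKEN}" "${GABI_URL}/query" \
    -d "{\"query\": \"$(wal_retention_sql)\"}" | jq
}
```

A retained_wal value that keeps growing while the slot's restart_lsn stays put would match the behaviour described above.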
This change will add a heartbeat to the outbox connector and force progress at least once every 5 minutes (300,000 ms).
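To check that the interval took effect, the running connector's config can be read back over the Kafka Connect REST API. This is a sketch under assumptions: it assumes the heartbeat is set through Debezium's heartbeat.interval.ms property (this description does not name the property), it reuses the localhost:8083 endpoint and rbac-debezium connector name from the restart command further down, and the function names are hypothetical.

```shell
# Heartbeat interval from this description: 5 minutes.
HEARTBEAT_MINUTES=5

# Convert the interval to milliseconds, the unit the property expects.
heartbeat_interval_ms() {
  echo $(( HEARTBEAT_MINUTES * 60 * 1000 ))
}

# Read the value the running connector actually uses.
# Run from inside the connect pod, where the REST API listens on localhost:8083.
show_heartbeat_config() {
  curl -s localhost:8083/connectors/rbac-debezium/config \
    | jq -r '."heartbeat.interval.ms"'
}
```

If show_heartbeat_config prints the same number as heartbeat_interval_ms, the connector picked up the new setting.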
Before this fix, the source connector in stage shows the status NotReady.
If we look at the status in the connector's YAML, under connectorStatus -> tasks -> trace, we see:
Unable to obtain valid replication slot. Make sure there are no long-running transactions ...
To fix this I use GABI to list the replication slots, then drop and recreate the debezium slot:
curl -H "Authorization: Bearer <TOKEN>" https://gabi-rbac-stage.apps.crcs02ue1.urby.p1.openshiftapps.com/query -d "{\"query\": \"select * from pg_replication_slots;\"}" | jq
curl -H "Authorization: Bearer <TOKEN>" https://gabi-rbac-stage.apps.crcs02ue1.urby.p1.openshiftapps.com/query -d "{\"query\": \"select pg_drop_replication_slot('debezium');\"}" | jq
curl -H "Authorization: Bearer <TOKEN>" https://gabi-rbac-stage.apps.crcs02ue1.urby.p1.openshiftapps.com/query -d "{\"query\": \"select pg_create_logical_replication_slot('debezium', 'pgoutput');\"}" | jq
Then I run the following in the connect pod in stage (platform-kafka-connect-connect in the platfrom-mq-stage namespace):
curl -X POST localhost:8083/connectors/rbac-debezium/tasks/0/restart
Note: it may take a couple of minutes for the connector to show as ready.
All that being said, this is an edge case that occurs only on idle tables. Once we have replication online, this won't be a problem anymore.
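The manual recovery steps above could be wrapped in a small script so they are easier to repeat. This is only a sketch: GABI_URL, GABI_TOKEN, and all function names are assumptions, while the slot name, output plugin, and connector name are taken from the commands above. Nothing runs until a function is called explicitly.

```shell
# Assumed environment variables (not part of this PR): GABI_URL, GABI_TOKEN.
SLOT_NAME="debezium"

# Build the JSON body for a single SQL statement.
# No escaping is done, so the statement must not contain double quotes.
gabi_payload() {
  printf '{"query": "%s"}' "$1"
}

# Send one SQL statement through GABI and pretty-print the response.
gabi_query() {
  curl -s -H "Authorization: Bearer ${GABI_TOKEN}" "${GABI_URL}/query" \
    -d "$(gabi_payload "$1")" | jq
}

# List the slots, then drop and recreate the Debezium slot.
recreate_slot() {
  gabi_query "select * from pg_replication_slots;"
  gabi_query "select pg_drop_replication_slot('${SLOT_NAME}');"
  gabi_query "select pg_create_logical_replication_slot('${SLOT_NAME}', 'pgoutput');"
}

# Restart the connector task. Run this from inside the connect pod.
restart_task() {
  curl -s -X POST localhost:8083/connectors/rbac-debezium/tasks/0/restart
}
```

Typical use would be to call recreate_slot from a workstation with GABI access, then restart_task from a shell inside the connect pod.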
Local Testing
How can the feature be exercised?
How can the bug be exploited and fix confirmed?
Is any special local setup required?
Checklist
Secure Coding Practices Checklist Link
Secure Coding Practices Checklist