
Metrics endpoint starts timing out at intermittent intervals #1035

Open

ayush-rathore-quartic opened this issue May 23, 2024 · 1 comment
ayush-rathore-quartic commented May 23, 2024

What did you do?

Running postgres_exporter as a container in a Kubernetes pod that also hosts the PostgreSQL server at localhost:5432.

What did you expect to see?

The /metrics endpoint should have returned the Prometheus metrics at all times.

What did you see instead? Under which circumstances?

  1. The /metrics endpoint worked well for about an hour, but after some time the metrics server starts timing out, i.e. there is no response at :9187/metrics (postgres_exporter is running on port 9187 of the pod). The postgres_exporter logs contain nothing about the failure to serve these requests.
  2. The issue often goes away when the postgres server is restarted, but only for some time.
  3. I can connect to the postgres server through psql at the same time.
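To correlate the first timeout with the exporter and postgres logs, a single scrape attempt can be checked in a loop. A minimal stdlib-only sketch (the URL is a placeholder to fill in with the actual pod IP; this is my own diagnostic helper, not part of postgres_exporter):

```python
import socket
import time
import urllib.error
import urllib.request

def scrape_once(url, timeout=5.0):
    """Try one scrape of the metrics endpoint.

    Returns (ok, elapsed_seconds); ok is False on timeout,
    connection failure, or a non-200 response.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            ok = resp.status == 200
    except (urllib.error.URLError, socket.timeout, TimeoutError):
        ok = False
    return ok, time.monotonic() - start
```

Running this every few seconds with a timestamp makes it easy to pin down exactly when the endpoint stops responding.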

More Information
The requests at :9187/ and :9187/probe are being served fine. When I try probing my PostgreSQL server with the following commands:

curl "<POD IP>:9187/probe?target=127.0.0.1:5432&sslmode=disable"
curl "<POD IP>:9187/probe?target=:5432&sslmode=disable"
curl "<POD IP>:9187/probe?target=/var/run/postgresql:5432&sslmode=disable"

Output

# HELP pg_exporter_last_scrape_duration_seconds Duration of the last scrape of metrics from PostgreSQL.
# TYPE pg_exporter_last_scrape_duration_seconds gauge
pg_exporter_last_scrape_duration_seconds{cluster_name="mydb",namespace="default"} 1.002118094
# HELP pg_exporter_last_scrape_error Whether the last scrape of metrics from PostgreSQL resulted in an error (1 for error, 0 for success).
# TYPE pg_exporter_last_scrape_error gauge
pg_exporter_last_scrape_error{cluster_name="mydb",namespace="default"} 1

.... 
....
....
pg_up{cluster_name="mydb",namespace="default"} 0

Logs emitted by postgres_exporter every time the above /probe requests are fired to check reachability of the postgres server:

ts=2024-05-23T20:30:23.947Z caller=probe.go:41 level=info msg="no auth_module specified, using default"
ts=2024-05-23T20:30:23.947Z caller=server.go:74 level=info msg="Established new database connection" fingerprint=localhost:5432
ts=2024-05-23T20:30:23.949Z caller=collector.go:194 level=error target=:5432 msg="collector failed" name=bgwriter duration_seconds=0.001488188 err="pq: SSL is not enabled on the server"
ts=2024-05-23T20:30:23.950Z caller=collector.go:194 level=error target=:5432 msg="collector failed" name=replication_slot duration_seconds=0.002488279 err="pq: SSL is not enabled on the server"
ts=2024-05-23T20:30:23.950Z caller=collector.go:194 level=error target=:5432 msg="collector failed" name=database duration_seconds=0.003197173 err="pq: SSL is not enabled on the server"
ts=2024-05-23T20:30:24.949Z caller=postgres_exporter.go:716 level=error err="Error opening connection to database (postgresql://:5432): pq: SSL is not enabled on the server"
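One note on building these probe requests: a `target` containing `/` or a leading `:` goes into the DSN verbatim, so URL-encoding the query string is safer when scripting them. A small helper sketch (the host/port and the assumption that `/probe` accepts extra DSN parameters like `sslmode` are taken from the curl commands above):

```python
from urllib.parse import urlencode

def probe_url(exporter, target, **dsn_params):
    """Build a /probe request URL with a properly URL-encoded query."""
    query = urlencode({"target": target, **dsn_params})
    return f"http://{exporter}/probe?{query}"

# Probe via the Unix-socket directory rather than TCP localhost:
print(probe_url("127.0.0.1:9187", "/var/run/postgresql:5432",
                sslmode="disable"))
```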

Environment

Linux/Kubernetes

  • System information:

    Linux 5.15.0-1054-azure x86_64

  • postgres_exporter version:

	postgres_exporter, version 0.12.1 (branch: HEAD, revision: 1c063b1b1913db029d449818e9cd1750c2282198)
        build user:       root@a5fc99238ef0
        build date:       20230613-16:18:22
        go version:       go1.20.5
        platform:         linux/amd64
        tags:             netgo static_build
  • postgres_exporter flags:
/usr/local/bin/postgres_exporter --log.level=info

Environment variables (from the pod spec):
    {
      "name": "DATA_SOURCE_NAME",
      "value": "postgresql://postgres@:5432/postgres?host=/var/run/postgresql&sslmode=disable"
    },
    {
      "name": "PG_EXPORTER_EXTEND_QUERY_PATH",
      "value": "/var/opt/postgres-exporter/queries.yaml"
    },
    {
      "name": "PG_EXPORTER_CONSTANT_LABELS",
      "value": "cluster_name=mydb, namespace=default"
    }
  • PostgreSQL version:
sh-4.4$ psql --version
psql (PostgreSQL) 16.2 (OnGres 16.2-build-6.31)
  • Logs:
+ . /templates/shell-utils
++ LOCK_DURATION=60
++ LOCK_SLEEP=5
++ QUEUE_NAME=create_event.pipe
+++ readlink /proc/412/exe
++ SHELL=/usr/bin/bash
++ command -v /usr/bin/bash
+++ basename /usr/bin/bash
++ '[' xbash = x ']'
+++ basename /usr/bin/bash
++ '[' xbash = xbusybox ']'
+++ echo ehxB
+++ grep -q x
+++ echo ' -x'
++ SHELL_XTRACE=' -x'
+ set +x
+ exec /usr/local/bin/postgres_exporter --log.level=info
ts=2024-05-23T08:42:45.986Z caller=main.go:86 level=error msg="Error loading config" err="Error opening config file \"postgres_exporter.yml\": open postgres_exporter.yml: no such file or directory"
ts=2024-05-23T08:42:46.046Z caller=proc.go:250 msg="Excluded databases" databases=[]
ts=2024-05-23T08:42:46.046Z caller=tls_config.go:232 level=info msg="Listening on" address=[::]:9187
ts=2024-05-23T08:42:46.047Z caller=tls_config.go:235 level=info msg="TLS is disabled." http2=false address=[::]:9187
ts=2024-05-23T08:43:07.472Z caller=server.go:74 level=info msg="Established new database connection" fingerprint=/var/run/postgresql:5432
ts=2024-05-23T08:43:07.478Z caller=postgres_exporter.go:647 level=info msg="Semantic version changed" server=/var/run/postgresql:5432 from=0.0.0 to=16.2.0
ts=2024-05-23T20:07:25.624Z caller=probe.go:41 level=info msg="no auth_module specified, using default"
ts=2024-05-23T20:07:25.624Z caller=server.go:74 level=info msg="Established new database connection" fingerprint=localhost:5432


<No logs are emitted even when the requests at /metrics are timing out>
@ayush-rathore-quartic

cc @sysadmind
