Old backups not cleaned up with DEFAULT_CLEANUP_TIME set to 3 and split_db option set. #370

Open
usma0118 opened this issue Sep 21, 2024 · 1 comment

Summary

Old backups not cleaned up with DEFAULT_CLEANUP_TIME set to 3 and split_db option set.

Steps to reproduce

  • Set DEFAULT_CLEANUP_TIME to 4320 (3 days)
  • Set DEFAULT_SPLIT_DB to true

What is the expected correct behavior?

Backups older than DEFAULT_CLEANUP_TIME (4320 minutes, i.e. 3 days) should be removed from the S3 backup location on each run. The container is configured as follows:

          - name: TEMP_PATH
            value: "/data"
          - name: DEFAULT_BACKUP_LOCATION
            value: "S3"
          - name: DEFAULT_S3_PROTOCOL
            value: "http"
          - name: DEFAULT_CLEANUP_TIME
            value: "4320" # 3 days, in minutes
          - name: SCHEDULE
            value: "@daily"
          - name: DEFAULT_TYPE
            value: "Postgresql"
          - name: DEFAULT_USER
            valueFrom:
              secretKeyRef:
                name: &db-secret postgres
                key: username
          - name: DEFAULT_PASS
            valueFrom:
              secretKeyRef:
                name: *db-secret
                key: password
          - name: DEFAULT_NAME
            value: "ALL"
          - name: DEFAULT_NAME_EXCLUDE
            value: "postgres"
          - name: DEFAULT_BACKUP_GLOBALS
            value: "false"
          - name: DEFAULT_SPLIT_DB
            value: "true"
          - name: DEFAULT_HOST
            valueFrom:
              secretKeyRef:
                name: *app
                key: DB_HOST
          - name: DEFAULT_EXTRA_OPTS
            value: "--clean --if-exists"
          - name: CONTAINER_ENABLE_MONITORING
            value: "false"

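For reference, the expected cleanup amounts to deleting any object in the S3 backup path that is older than DEFAULT_CLEANUP_TIME minutes on each run. A minimal sketch of that behaviour (the bucket and prefix are placeholders; it assumes GNU date and the AWS CLI and only illustrates the expected result, it is not the image's actual cleanup routine):

# Delete S3 objects older than DEFAULT_CLEANUP_TIME minutes (illustration only)
cutoff=$(date -d "-${DEFAULT_CLEANUP_TIME} minutes" +%s)
aws s3 ls "s3://my-backup-bucket/backups/" --recursive | while read -r day time size key; do
  [ -z "${key}" ] && continue
  object_ts=$(date -d "${day} ${time}" +%s)
  if [ "${object_ts}" -lt "${cutoff}" ]; then
    aws s3 rm "s3://my-backup-bucket/${key}"
  fi
done
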
Relevant logs and/or screenshots

2024-09-20.16:45:31 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] DB Backup of 'pgsql_[redacted]_postgres-rw.database.svc.cluster.local_20240920-164530.sql.zst' completed successfully
2024-09-20.16:45:31 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Encrypting with GPG Passphrase
2024-09-20.16:45:33 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Generating MD5 sum for 'pgsql_[redacted]_postgres-rw.database.svc.cluster.local_20240920-164530.sql.zst.gpg'
2024-09-20.16:45:34 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup of 'pgsql_[redacted]_postgres-rw.database.svc.cluster.local_20240920-164530.sql.zst.gpg' created with the size of 891111 bytes
2024-09-20.16:45:43 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] DB Backup for '[redacted]' time taken: Hours: 0 Minutes: 00 Seconds: 13
2024-09-20.16:45:43 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] Cleaning up old backups on S3 storage
2024-09-20.16:45:46 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Dumping PostgresSQL globals: with 'pg_dumpall -g' and compressing with 'zstd'
2024-09-20.16:45:46 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Encrypting with GPG Passphrase
2024-09-20.16:45:48 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Generating MD5 sum for 'pgsql_globals_postgres-rw.database.svc.cluster.local_20240920-164546.sql.zst.gpg'
2024-09-20.16:45:49 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup of 'pgsql_globals_postgres-rw.database.svc.cluster.local_20240920-164546.sql.zst.gpg' created with the size of 1233 bytes
2024-09-20.16:45:56 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] DB Backup for 'globals' time taken: Hours: 0 Minutes: 00 Seconds: 10
2024-09-20.16:45:56 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] Cleaning up old backups on S3 storage
2024-09-20.16:45:58 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup 01 routines finish time: 2024-09-20 16:45:58 CEST with exit code 0
2024-09-20.16:45:58 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup 01 routines time taken: Hours: 0 Minutes: 00 Seconds: 46
2024-09-20.16:45:58 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Sleeping for another 86354 seconds. Waking up at 2024-09-21 16:45:12 CEST

Environment

Kubernetes

  • Image version / tag:
  • Host OS:

Possible fixes

usma0118 added the bug label Sep 21, 2024
pimjansen (Contributor) commented Oct 22, 2024

@usma0118 I got something similar:

[DB] Moving backup to external storage with blobxfer
mv: cannot stat '/tmp/backups/01_dbbackup.FrSQds/*.': No such file or directory
mv: preserving times for '/backup/myname_20241022-085939.sql.gz': Operation not permitted
mv: preserving permissions for ‘/backup/myname.sql.gz’: Operation not permitted

and my env vars:

CONTAINER_ENABLE_MONITORING : false
DB_CLEANUP_TIME : 10080
DB_HOST : mysql-xxx.mysql.database.azure.com
DB_NAME : myname01,myname02
DB_PASS : secret(backup-settings)[DB_PASS] 
DB_TYPE : mysql
DB_USER : backup
DEFAULT_BACKUP_BEGIN : 0130
DEFAULT_BACKUP_LOCATION : blobxfer
DEFAULT_BLOBXFER_MODE : file
DEFAULT_BLOBXFER_REMOTE_PATH : my-backup-path
DEFAULT_BLOBXFER_STORAGE_ACCOUNT : myaccount-dev001
DEFAULT_BLOBXFER_STORAGE_ACCOUNT_KEY : secret(backup-settings)[BLOBXFER_STORAGE_ACCOUNT_KEY] 
DEFAULT_CHECKSUM : NONE
DEFAULT_COMPRESSION : GZ
DEFAULT_DEBUG_MODE : false
DEFAULT_EXTRA_OPTS : --complete-insert --no-create-db
DEFAULT_MYSQL_CLIENT : mysql
DEFAULT_SPLIT_DB : true
TIMEZONE : Europe/Amsterdam

As far as I can see there is nothing fancy here, so I have no idea what goes wrong. I end up with broken backups all the time: the backup itself is fine, but the copy goes completely wrong and the remote receives a 0-byte file, while the exit code is still 0.

@tiredofit got an idea what this could be? It seems user related? When I go inside the container and start the process myself, it runs as "root" by default.

UPDATE
What I do notice is this line:

if [ "${backup_job_checksum}" != "none" ] ; then run_as_user mv "${temporary_directory}"/*."${checksum_extension}" "${backup_job_filesystem_path}"/; fi

Since there is no checksum_extension set, it moves *., which of course causes an error, I think. When digging further I noticed the main issue is the permissions and the user: the script runs things as a different user, and the Azure volume in this case is SMB, so it cannot access the generated files. I now set DBBACKUP_USER to root and everything works fine (the checksum step just throws an error and is therefore skipped, which I guess still needs to be addressed).
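
A minimal sketch of a guard for that line, reusing the variable names from the quote above (only an illustration of the idea, not the project's actual fix):

# Skip the checksum move entirely when no checksum is configured or no
# extension is set, so the glob never expands to the bogus "*." pattern.
if [ "${backup_job_checksum}" != "none" ] && [ -n "${checksum_extension}" ]; then
    run_as_user mv "${temporary_directory}"/*."${checksum_extension}" "${backup_job_filesystem_path}"/
fi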
