Old backups not cleaned up with DEFAULT_CLEANUP_TIME set to 3 and split_db option set. #370

Open
usma0118 opened this issue Sep 21, 2024 · 1 comment

Summary

Old backups not cleaned up with DEFAULT_CLEANUP_TIME set to 3 and split_db option set.

Steps to reproduce

  • Set DEFAULT_CLEANUP_TIME to 4320 (3 days)
  • Set DEFAULT_SPLIT_DB to true

What is the expected correct behavior?

Backups older than DEFAULT_CLEANUP_TIME (4320 minutes, i.e. 3 days) should be removed from the S3 backup location on each run. The container is configured as follows:

          - name: TEMP_PATH
            value: "/data"
          - name: DEFAULT_BACKUP_LOCATION
            value: "S3"
          - name: DEFAULT_S3_PROTOCOL
            value: "http"
          - name: DEFAULT_CLEANUP_TIME
            value: "4320" # 3 days, in minutes
          - name: SCHEDULE
            value: "@daily"
          - name: DEFAULT_TYPE
            value: "Postgresql"
          - name: DEFAULT_USER
            valueFrom:
              secretKeyRef:
                name: &db-secret postgres
                key: username
          - name: DEFAULT_PASS
            valueFrom:
              secretKeyRef:
                name: *db-secret
                key: password
          - name: DEFAULT_NAME
            value: "ALL"
          - name: DEFAULT_NAME_EXCLUDE
            value: "postgres"
          - name: DEFAULT_BACKUP_GLOBALS
            value: "false"
          - name: DEFAULT_SPLIT_DB
            value: "true"
          - name: DEFAULT_HOST
            valueFrom:
              secretKeyRef:
                name: *app
                key: DB_HOST
          - name: DEFAULT_EXTRA_OPTS
            value: "--clean --if-exists"
          - name: CONTAINER_ENABLE_MONITORING
            value: "false"

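For reference, the expected cleanup amounts to deleting any object in the S3 backup path that is older than DEFAULT_CLEANUP_TIME minutes on each run. A minimal sketch of that behaviour (the bucket and prefix are placeholders; it assumes GNU date and the AWS CLI and only illustrates the expected result, it is not the image's actual cleanup routine):

# Delete S3 objects older than DEFAULT_CLEANUP_TIME minutes (illustration only)
cutoff=$(date -d "-${DEFAULT_CLEANUP_TIME} minutes" +%s)
aws s3 ls "s3://my-backup-bucket/backups/" --recursive | while read -r day time size key; do
  [ -z "${key}" ] && continue
  object_ts=$(date -d "${day} ${time}" +%s)
  if [ "${object_ts}" -lt "${cutoff}" ]; then
    aws s3 rm "s3://my-backup-bucket/${key}"
  fi
done
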
Relevant logs and/or screenshots

2024-09-20.16:45:31 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] DB Backup of 'pgsql_[redacted]_postgres-rw.database.svc.cluster.local_20240920-164530.sql.zst' completed successfully
2024-09-20.16:45:31 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Encrypting with GPG Passphrase
2024-09-20.16:45:33 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Generating MD5 sum for 'pgsql_[redacted]_postgres-rw.database.svc.cluster.local_20240920-164530.sql.zst.gpg'
2024-09-20.16:45:34 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup of 'pgsql_[redacted]_postgres-rw.database.svc.cluster.local_20240920-164530.sql.zst.gpg' created with the size of 891111 bytes
2024-09-20.16:45:43 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] DB Backup for '[redacted]' time taken: Hours: 0 Minutes: 00 Seconds: 13
2024-09-20.16:45:43 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] Cleaning up old backups on S3 storage
2024-09-20.16:45:46 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Dumping PostgresSQL globals: with 'pg_dumpall -g' and compressing with 'zstd'
2024-09-20.16:45:46 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Encrypting with GPG Passphrase
2024-09-20.16:45:48 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Generating MD5 sum for 'pgsql_globals_postgres-rw.database.svc.cluster.local_20240920-164546.sql.zst.gpg'
2024-09-20.16:45:49 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup of 'pgsql_globals_postgres-rw.database.svc.cluster.local_20240920-164546.sql.zst.gpg' created with the size of 1233 bytes
2024-09-20.16:45:56 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] DB Backup for 'globals' time taken: Hours: 0 Minutes: 00 Seconds: 10
2024-09-20.16:45:56 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] Cleaning up old backups on S3 storage
2024-09-20.16:45:58 [INFO] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup 01 routines finish time: 2024-09-20 16:45:58 CEST with exit code 0
2024-09-20.16:45:58 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Backup 01 routines time taken: Hours: 0 Minutes: 00 Seconds: 46
2024-09-20.16:45:58 [NOTICE] ** [01-postgres-rw.database.svc.cluster.local__ALL] Sleeping for another 86354 seconds. Waking up at 2024-09-21 16:45:12 CEST

Environment

Kubernetes

  • Image version / tag:
  • Host OS:

Possible fixes

usma0118 added the bug label Sep 21, 2024
pimjansen (Contributor) commented Oct 22, 2024

@usma0118 I got something similar:

[DB] Moving backup to external storage with blobxfer
mv: cannot stat '/tmp/backups/01_dbbackup.FrSQds/*.': No such file or directory
mv: preserving times for '/backup/myname_20241022-085939.sql.gz': Operation not permitted
mv: preserving permissions for ‘/backup/myname.sql.gz’: Operation not permitted

and my env vars:

CONTAINER_ENABLE_MONITORING : false
DB_CLEANUP_TIME : 10080
DB_HOST : mysql-xxx.mysql.database.azure.com
DB_NAME : myname01,myname02
DB_PASS : secret(backup-settings)[DB_PASS] 
DB_TYPE : mysql
DB_USER : backup
DEFAULT_BACKUP_BEGIN : 0130
DEFAULT_BACKUP_LOCATION : blobxfer
DEFAULT_BLOBXFER_MODE : file
DEFAULT_BLOBXFER_REMOTE_PATH : my-backup-path
DEFAULT_BLOBXFER_STORAGE_ACCOUNT : myaccount-dev001
DEFAULT_BLOBXFER_STORAGE_ACCOUNT_KEY : secret(backup-settings)[BLOBXFER_STORAGE_ACCOUNT_KEY] 
DEFAULT_CHECKSUM : NONE
DEFAULT_COMPRESSION : GZ
DEFAULT_DEBUG_MODE : false
DEFAULT_EXTRA_OPTS : --complete-insert --no-create-db
DEFAULT_MYSQL_CLIENT : mysql
DEFAULT_SPLIT_DB : true
TIMEZONE : Europe/Amsterdam

As far as I can see there is nothing fancy here, so I have no idea what goes wrong. I end up with broken backups all the time: the backup itself is fine, but the copy goes completely wrong and the remote receives a 0-byte file, while the exit code is still 0.

@tiredofit got an idea what this could be? It seems user related? When I go inside the container and start the process myself, it runs as "root" by default.

UPDATE
What I do notice is this line:

if [ "${backup_job_checksum}" != "none" ] ; then run_as_user mv "${temporary_directory}"/*."${checksum_extension}" "${backup_job_filesystem_path}"/; fi

Since there is no checksum_extension set, it moves *., which of course causes an error, I think. When digging further I noticed the main issue is the permissions and the user: the script runs things as a different user, and the Azure volume in this case is SMB, so it cannot access the generated files. I now set DBBACKUP_USER to root and everything works fine (the checksum step just throws an error and is therefore skipped, which I guess still needs to be addressed).
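
A minimal sketch of a guard for that line, reusing the variable names from the quote above (only an illustration of the idea, not the project's actual fix):

# Skip the checksum move entirely when no checksum is configured or no
# extension is set, so the glob never expands to the bogus "*." pattern.
if [ "${backup_job_checksum}" != "none" ] && [ -n "${checksum_extension}" ]; then
    run_as_user mv "${temporary_directory}"/*."${checksum_extension}" "${backup_job_filesystem_path}"/
fi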
