
"ERROR: Failed to perform backup: error in addproc" around 75% complete #1588

Closed
qubenix opened this issue Jan 6, 2016 · 28 comments
Labels
C: core · P: major (Priority: major. Between "default" and "critical" in severity.) · T: bug (Type: bug report. A problem or defect resulting in unintended behavior in something that exists.)

Comments

@qubenix

qubenix commented Jan 6, 2016

Every time I attempt to do an encrypted backup I end up with an error around 75% complete. The error states:

ERROR: Failed to perform backup: error in addproc

As stated in the title, I am running RC3.1 and I have updated dom0 using current-testing. Let me know what other information you need.

@marmarek marmarek added the R: duplicate (Resolution: Another issue exists that is very similar to or subsumes this one.) label Jan 6, 2016
@marmarek
Member

marmarek commented Jan 6, 2016

Duplicate of #1515

@marmarek marmarek closed this as completed Jan 6, 2016
@Rudd-O

Rudd-O commented Jun 10, 2016

Backup just failed for me right as it was doing the second-to-last VM (I checked with lsof; the root.img of that second-to-last VM was open). Sure enough, an error with addproc.

There are three dangling symlinks in /root, a few in /var/lib/qubes/, and three dangling symlinks in my home directory. None of these should have been the cause of the backup failure, because when I performed a backup of only dom0, it worked fine.

I am so pissed right now. I had left the backup running overnight, it had backed up 400 GB of data, and then it crapped out, leaving me with a corrupted file that cannot be used, and a machine I cannot move, because the very thing that was supposed to protect its data from theft -- you guessed it, the backup -- simply cannot be completed.

Also, today I discovered that the backup doesn't even want to restore the VMs that were in fact successfully backed up! So, in this sense, the backup tool is inferior to even a simple tar cvzf | openssl enc. Not to mention much slower.

But I guess the thing that pisses me off the most is that the backup tool EATS THE EXCEPTION. It shows me no error message whatsoever; it just says "error in addproc". I could fix the issue if I had seen the exception, but as it is, after backing up 400 GB, getting such a cryptic error is only marginally better than crapping out with no error message at all. WTH is addproc? Error in addproc? What error? Well, I can't know, because the code won't tell me, and I do not have another eight hours to reproduce the issue at this point.

I think I will stop using the backup tool altogether, and start using some other method to back up data.

qubes-core-dom0-3.1.16-1.fc20.x86_64

@andrewdavidwong
Member

@Rudd-O: Sorry to hear about your backup troubles. FWIW, I too have noticed that qvm-backup tends to fail on "large" backups. Not sure if this is the same issue, and I don't know what exactly the cause is, but my personal workaround has just been to back up smaller sets of VMs (rather than trying to back up all of my VMs at once).

@marmarek
Member

@Rudd-O do you have a lot of small VMs, or a few big ones? Do you have any VM larger than 100 GB?

@Rudd-O

Rudd-O commented Jun 12, 2016

Yes, one of my VMs is literally 115 GB. I do have dozens of small VMs though.

@marmarek
Member

> Yes, one of my VMs is literally 115 GB. I do have dozens of small VMs though.

Do you use compression?

I'll check that. The backup file format has the archive split into 100 MB files, named with a 3-digit sequential number. In theory, having more than 1000 files (which is 100 GB) should be handled by simply using 4 digits there, but maybe it doesn't work for some reason.
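(For a sense of scale, assuming the chunks are 100 MiB each as the 100MB figure above suggests: a single 115 GiB image already needs well over a thousand chunks, so it would cross the 3-digit boundary on its own. A quick back-of-the-envelope check, with illustrative numbers:)

[user@dom0 ~]$ echo $(( (115 * 1024 * 1024 * 1024 + 104857599) / 104857600 ))
1178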

@marmarek marmarek reopened this Jun 12, 2016
@marmarek marmarek added the T: bug, C: core, and P: major labels and removed the R: duplicate label Jun 12, 2016
@marmarek marmarek added this to the Release 3.1 updates milestone Jun 12, 2016
@Rudd-O

Rudd-O commented Jun 12, 2016

On 06/12/2016 06:13 PM, Marek Marczykowski-Górecki wrote:

> Yes, one of my VMs is literally 115 GB. I do have dozens of small VMs though.
>
> Do you use compression?
>
> I'll check that. The backup file format has the archive split into 100 MB files, named with a 3-digit sequential number. In theory, having more than 1000 files (which is 100 GB) should be handled by simply using 4 digits there, but maybe it doesn't work for some reason.

No compression.

Perhaps the 1000-files limit is something I hit. But I have no way of knowing, and I don't see how that could have happened, since the (unhelpful) "error in addproc" happened at the very end, well after that big VM had been backed up.

Honestly, the backup tool should be replaced by something like Duplicity. They get it right: they do incremental backups, they do encryption, and they support arbitrary targets, so it would be extremely easy to accommodate backing up to another VM, or even to S3 if we wanted to. Ideally, the way I would see that working is (see the sketch after this list):

  1. Take a snapshot of the file system containing the VMs. Mount the snapshot somewhere, read-only.
  2. Execute duplicity, pointing it at the read-only VM data source and the target storage destination (VM or local mountpoint).
  3. Destroy the snapshot.

This would also allow for backup of VMs that are running, so there would be no need to shut down VMs during backup.
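Roughly, a sketch of what I mean, assuming LVM-backed storage for /var/lib/qubes; the volume group, snapshot size, mountpoint and target URL are placeholders, and duplicity's encryption/passphrase handling is left out:

# 1. snapshot the volume holding /var/lib/qubes and mount it read-only
sudo lvcreate -s -L 10G -n qubes-backup-snap /dev/qubes_dom0/root
sudo mkdir -p /mnt/backup-snap
sudo mount -o ro /dev/qubes_dom0/qubes-backup-snap /mnt/backup-snap
# 2. run duplicity against the read-only snapshot
duplicity full /mnt/backup-snap/var/lib/qubes file:///run/media/user/ext/qubes-duplicity
# 3. tear the snapshot down
sudo umount /mnt/backup-snap
sudo lvremove -y /dev/qubes_dom0/qubes-backup-snap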

I highly recommend we research replacing qvm-backup with Duplicity.

Rudd-O
http://rudd-o.com/

@andrewdavidwong
Member

We already have an issue open for considering incremental backup support (#858). Please post your comment there so that we can keep distinct issues and discussions organized.

Note that we also had someone ask about Duplicity here: #971 (comment)

My response was here: #971 (comment)

@StevenLColeman42

I just had this same problem. First my 3.0 fc23 template went south and all VMs refused to run, so I upgraded to 3.2 and restored my 3.0 VMs into the new system. Since I lost a few things during the transition, I wanted to be sure to do a backup of the new 3.2 system before doing much configuration beyond getting up to date on patches.

When I did the first backup it failed with the same error. After searching for a solution I found this thread and checked my link files. I found two corrupted links, which I fixed, and my backups completed just now. What the two links had in common is that they were both VM template icon.png files, and both had an invalid path: one started with vm1//usr/share/... and the other with vm8/vm8//usr/share/..., but the remainder of the path was otherwise correct. Once I pointed these links at the correct paths, the backup worked like a charm. Something obviously corrupted these two links in a similar manner, but exactly when that happened I do not know. I'm still in recovery mode so I cannot spend too much time on this right now, but I thought this commonality between the two broken links might give you a clue.
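(A quick way to list any dangling symlinks under the locations the backup touches is GNU find's -xtype l, which matches symlinks whose target is missing; adjust the paths to your setup:)

[user@dom0 ~]$ find /var/lib/qubes /home/user -xtype l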

@Rudd-O

Rudd-O commented Aug 12, 2016

I got a functioning version of a Duplicity backend for a Qubes VM, which works well with the latest Duplicity 0.7.x present in Fedora 23. Place the file qubesvmbackend.py alongside the other backends in Duplicity's site-packages backends directory, then try duplicity full /source/dir qubesvm://yourbackupvm/path/to/directory/inside/vm and report back.

Edit: deleted the file content from this comment, as I am tracking it in a GitHub repository now. See the comment below this one for more info.

@Rudd-O

Rudd-O commented Aug 12, 2016

Excellent news! I have decided to track the development of that Duplicity component in its own repository: https://github.com/Rudd-O/duplicity-qubes.
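Usage is the same as in the previous comment; roughly (the VM name and paths are placeholders):

duplicity full /var/lib/qubes qubesvm://backupvm/home/user/qubes-backups
duplicity restore qubesvm://backupvm/home/user/qubes-backups /var/lib/qubes.restored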

Enjoy!

@cyrinux

cyrinux commented Sep 4, 2016

Same problem here, with one big qube and other, lighter ones.

@jmitchell

I did a fresh install of the newly released v3.2 and bumped into this without creating new VMs or modifying any of the default VMs. Backing up a couple of VMs at a time works, but when I tried shutting down and backing up all VMs in the same batch, it failed with the same "addproc" error. Failure correlates with the size of the backup.

Are there any recommended diagnostics I should try? Running a large backup takes a while, so I'm not eager to fumble in the dark too much.

@maxbane

maxbane commented Oct 9, 2016

I believe I'm encountering the same issue. Running 3.1, with dom0 fully updated as of 2016-10-09. In preparation for upgrading to 3.2, I've been attempting to backup my AppVMs, but most attempts encounter the "error in addproc" error and abort early. This happens even on quite small backups, e.g., if I select a few small VMs totalling 500MB. It's not fully deterministic, though, as I can sometimes successfully make very small backups, e.g., 120MB.

Obviously this is a bit of an impediment to upgrading to 3.2... I suppose I can manually back up the files I care about within each AppVM.

@marmarek
Member

marmarek commented Oct 9, 2016

Try running the backup from the command line (qvm-backup) with the --debug option.
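For example (the destination VM, the path inside it, and the VM names are placeholders):

[user@dom0 ~]$ qvm-backup --debug -d backup-vm /home/user/qubes-backups personal work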

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@DavidVorick

Getting the same issue.

"-> Backing up files: 55%...tar: david/backups/qubes-2016-10-27-T004735: file changed as we read it
-> Backing up files: 55%...Wait_backup_feedback returned: addproc
ERROR: Failed to perform backup: error in addproc"

@DavidVorick

A repeat job saving to /backups in dom0 yielded the same failure.

@marmarek
Member

Ah, your destination directory is the dom0 user home, right? When you make a full backup, it by default includes that directory, and thus the backup archive itself... Select another destination.
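For example, point the backup at another VM instead of a directory under the dom0 home (the VM name is a placeholder, and the path is inside that VM, not in dom0):

[user@dom0 ~]$ qvm-backup -d backup-vm /home/user/qubes-backups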

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@marmarek
Member

> A repeat job saving to /backups in dom0 yielded the same failure.

And what error did you get then?

@DavidVorick

Sorry, I just mistyped the command. When I backed up to somewhere other than the dom0 user home, I did not run into problems. My mistake.

I think the right way to handle this issue is to show a warning if the user selects a destination inside the dom0 home. Maybe also check that the destination isn't conflicting in any other way, but I'm guessing this would cover 95% of the cases where people run into this error.
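Something along these lines would probably do; this is only a sketch, not the actual qvm-backup code, with $dest standing for the chosen dom0 destination path:

case "$(readlink -f -- "$dest")" in
    /home/*|/root/*|/var/lib/qubes/*)
        echo "Warning: backup destination lies inside a directory that is itself being backed up." >&2 ;;
esac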

@andrewdavidwong andrewdavidwong removed this from the Release 3.1 updates milestone May 31, 2017
@jpouellet
Contributor

I'm 100% sure I triggered this without including the backup destination in the backup. The real problem does not seem to be related to the backup target.

This is pretty annoying to try to trace down, since I need to wait through about an hour of backups to trigger it. :(

@jpouellet
Contributor

It appears that anything which causes tar to exit with a non-zero status will cause this. In my case the root cause was the existence of files with chmod 000 permissions in my dom0 home dir while trying to back up dom0, which caused tar to fail.

Such situations should never be possible in normal use of Qubes: "thou shalt not muck about with dom0" - especially once we have adminvm/guivm separation. I think we should close this as WONTFIX, and instead open bugs against anything in the future which is found to create unreadable files in the locations which Qubes tries to back up (I'm not aware of any - my case and the others in this thread all appear to be user error).
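(The tar part is easy to confirm in isolation: an unreadable file makes tar itself exit non-zero, and that exit status is all the backup code surfaces as "error in addproc". A throwaway demo using a scratch directory rather than anything real:)

[user@dom0 ~]$ mkdir -p /tmp/addproc-demo && touch /tmp/addproc-demo/unreadable && chmod 000 /tmp/addproc-demo/unreadable
[user@dom0 ~]$ tar -cf - -C /tmp addproc-demo > /dev/null; echo "tar exited with $?"
[user@dom0 ~]$ rm -rf /tmp/addproc-demo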

@jpouellet
Contributor

In case anyone cares... (from an R3.2 machine)

Before (with unreadable files in dom0 ~):

[user@dom0 ~]$ qvm-backup --debug -d sys-usb-trusted /run/media/user/ext/backups/dom0 dom0
------------------+--------------+--------------+
               VM |         type |         size |
------------------+--------------+--------------+
             Dom0 |    User home |     71.9 MiB |
------------------+--------------+--------------+
      Total size: |                    72.0 MiB |
------------------+--------------+--------------+
VMs not selected for backup:
[...]

NOTE: VM sys-usb-trusted will be excluded because it is the backup destination.
Do you want to proceed? [y/N] y
Please enter the passphrase that will be used to encrypt and verify the backup: 
Enter again for verification: 
-> Backing up files: 0%...Working in /tmp/backup_CK8ZS3
Creating pipe in: /tmp/backup_CK8ZS3/backup_pipe
Will backup: [{u'path': '/var/lib/qubes/qubes.xml', u'subdir': u'', u'size': 90112}, {u'path': '/home/user', u'subdir': u'dom0-home/', u'size': 75427840}]
Backing up {u'path': '/var/lib/qubes/qubes.xml', u'subdir': u'', u'size': 90112}
Using temporary location: /tmp/backup_CK8ZS3/qubes.xml
tar -Pc --sparse -f /tmp/backup_CK8ZS3/backup_pipe -C /var/lib/qubes --dereference --xform s:^qubes.xml:\0: qubes.xml
Started sending thread
Moving to temporary dir /tmp/backup_CK8ZS3
Sending file backup-header
Removing file backup-header
Sending file backup-header.hmac
Removing file backup-header.hmac
-> Backing up files: 0%...Wait_backup_feedback returned: 
Sending file qubes.xml.000
HMAC proc return code: 0
Writing hmac to /tmp/backup_CK8ZS3/qubes.xml.000.hmac
Finished tar sparse with exit code 0
Backing up {u'path': '/home/user', u'subdir': u'dom0-home/', u'size': 75427840}
Using temporary location: /tmp/backup_CK8ZS3/dom0-home/user
tar -Pc --sparse -f /tmp/backup_CK8ZS3/backup_pipe -C /home --xform s:^user:dom0-home/\0: user
Removing file qubes.xml.000
Sending file qubes.xml.000.hmac
Removing file qubes.xml.000.hmac
-> Backing up files: 75%...tar: user/etc/shadow: Cannot open: Permission denied
-> Backing up files: 86%...tar: user/etc/gshadow: Cannot open: Permission denied
-> Backing up files: 87%...tar: user/etc/gshadow-: Cannot open: Permission denied
-> Backing up files: 87%...tar: user/etc/shadow-: Cannot open: Permission denied
tar: Exiting with failure status due to previous errors
-> Backing up files: 88%...Wait_backup_feedback returned: addproc
 ERROR: Failed to perform backup: error in addproc

After (without unreadable files):

[user@dom0 ~]$ qvm-backup --debug -d sys-usb-trusted /run/media/user/ext/backups/dom0 dom0
------------------+--------------+--------------+
               VM |         type |         size |
------------------+--------------+--------------+
             Dom0 |    User home |     71.9 MiB |
------------------+--------------+--------------+
      Total size: |                    72.0 MiB |
------------------+--------------+--------------+
VMs not selected for backup:
[...]

NOTE: VM sys-usb-trusted will be excluded because it is the backup destination.
Do you want to proceed? [y/N] y
Please enter the passphrase that will be used to encrypt and verify the backup: 
Enter again for verification: 
-> Backing up files: 0%...Working in /tmp/backup_weSJEn
Creating pipe in: /tmp/backup_weSJEn/backup_pipe
Will backup: [{u'path': '/var/lib/qubes/qubes.xml', u'subdir': u'', u'size': 90112}, {u'path': '/home/user', u'subdir': u'dom0-home/', u'size': 75427840}]
Backing up {u'path': '/var/lib/qubes/qubes.xml', u'subdir': u'', u'size': 90112}
Using temporary location: /tmp/backup_weSJEn/qubes.xml
tar -Pc --sparse -f /tmp/backup_weSJEn/backup_pipe -C /var/lib/qubes --dereference --xform s:^qubes.xml:\0: qubes.xml
Started sending thread
Moving to temporary dir /tmp/backup_weSJEn
Sending file backup-header
Removing file backup-header
Sending file backup-header.hmac
Removing file backup-header.hmac
-> Backing up files: 0%...Wait_backup_feedback returned: 
Sending file qubes.xml.000
HMAC proc return code: 0
Writing hmac to /tmp/backup_weSJEn/qubes.xml.000.hmac
Finished tar sparse with exit code 0
Backing up {u'path': '/home/user', u'subdir': u'dom0-home/', u'size': 75427840}
Using temporary location: /tmp/backup_weSJEn/dom0-home/user
tar -Pc --sparse -f /tmp/backup_weSJEn/backup_pipe -C /home --xform s:^user:dom0-home/\0: user
Removing file qubes.xml.000
Sending file qubes.xml.000.hmac
Removing file qubes.xml.000.hmac
-> Backing up files: 88%...Wait_backup_feedback returned: 
Sending file dom0-home/user.000
HMAC proc return code: 0
Writing hmac to /tmp/backup_weSJEn/dom0-home/user.000.hmac
Finished tar sparse with exit code 0
Removing file dom0-home/user.000
Sending file dom0-home/user.000.hmac
Removing file dom0-home/user.000.hmac
Finished sending thread
VMProc1 proc return code: None
Sparse1 proc return code: 0

-> Backup completed.

@jpouellet
Contributor

jpouellet commented Sep 10, 2017

And for those who follow who've somehow ended up with a file causing this issue and don't want to wait through backups in debug mode to figure out which and where, try:

[user@dom0 ~]$ find /home/* /var/lib/qubes '!' -readable '!' -type l

and stop doing stuff in dom0! :P
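And if something does turn up that you actually want in the backup, making it readable again (or moving it out of the backed-up locations) is enough. The path here is a placeholder; prepend sudo if the file isn't owned by your user:

[user@dom0 ~]$ chmod u+r /home/user/path/to/offending-file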

@qubenix
Author

qubenix commented Mar 29, 2018

I'm sorry this has lingered open. I think it should be safe to close now as I haven't had backup errors in nearly a year.

@qubenix qubenix closed this as completed Mar 29, 2018
@andrewdavidwong andrewdavidwong changed the title (previously: Qubes 3.1, dom0 testing repo: backup fails every time) Jun 13, 2018
@andrewdavidwong
Member

On 2018-06-12 03:12, [...] wrote:

> Hi,
>
> when trying to perform a backup of my VMs I get this error after about 75%:
>
> "ERROR: Failed to perform backup: error in addproc"
>
> This is reproducible every time, which means I can no longer back up my VMs.
>
> I'm using R3.2.
>
> I found this error on the bugtracker, but it was on R3.1 and is marked as closed:
>
> #1588
>
> Anyone else observing this and know of a workaround?
>
> thanks!

Reopening.

@andrewdavidwong
Member

On 2018-06-13 02:45, [...] wrote:

> thanks!
>
> I got it to work - backup done without failure.
>
> Here is what I did:
>
> 'qvm-ls' was showing warnings about corrupt files for specific VMs.
>
> Since these were old VMs that I had already deleted in Qubes Manager, I deleted them again with
>
> qvm-remove --just-db
>
> Then I performed a backup. I'm not entirely sure that this was the problem, since I deleted some more VMs, but it sounds reasonable.
>
> If this was indeed the root cause: maybe qvm-backup could check whether there are any warnings in qvm-ls output before proceeding?

CC: @marmarek
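A rough pre-flight check along the lines the reporter suggests might look like this; it is only a sketch and assumes the qvm-ls warnings go to stderr (drop the redirections if they go to stdout on your version):

[user@dom0 ~]$ qvm-ls 2>&1 >/dev/null | grep -i warn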

@andrewdavidwong
Member

This issue is being closed because:

If anyone believes that this issue should be reopened, please let us know in a comment here.
