.TH dd_rescue 1 "2017-06-23" "Kurt Garloff" "Data recovery and protection tool"
.
.SH NAME
dd_rescue \- Data recovery and protection tool
.
.SH SYNOPSIS
.na
.nh
.B dd_rescue
[options] infile outfile
.
.br
.B dd_rescue
[options] [-2/-3/-4/-z/-Z seed/seedfile] outfile
.
.br
.B dd_rescue
[options] [--shred2/--shred3/--shred4/--random/--frandom seed/seedfile] outfile
.
.SH DESCRIPTION
.B dd_rescue
is a tool that copies data from a source (file, block device, pipe, ...) 
to one (or several) output file(s). 
.PP
If input and output files are seekable (block devices or regular files),
.B dd_rescue
copies in large blocks (softbs) to increase performance. When
a read error is encountered,
.B dd_rescue
falls back to reading smaller blocks (hardbs) in order to recover as much
data as possible. If blocks still cannot be read,
.B dd_rescue
by default also skips over them in the output file, so that it does not
overwrite data that might have been copied there successfully in a previous run.
(Option \-A / \-\-alwayswrite changes this.)
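.PP
The fallback from large to small blocks can be pictured with a short sketch.
The following Python code is an illustration only and is not taken from the
dd_rescue sources; the names copy_with_fallback and FlakySource, the
parameters and the use of in-memory file objects are assumptions made for
this example.
.PP
.nf
```python
import io


class FlakySource(io.BytesIO):
    """Hypothetical test double: reads touching a bad range raise OSError."""

    def __init__(self, data, bad_start, bad_end):
        super().__init__(data)
        self.bad_start, self.bad_end = bad_start, bad_end

    def read(self, n=-1):
        start = self.tell()
        if start < self.bad_end and start + n > self.bad_start:
            raise OSError("simulated read error")
        return super().read(n)


def copy_with_fallback(src, dst, size, softbs=65536, hardbs=4096):
    """Copy size bytes at identical offsets; return unreadable ranges.

    Large softbs reads are tried first. A failed chunk is retried in
    hardbs pieces; pieces that still fail are skipped in dst as well,
    so whatever dst already holds there is left untouched.
    """
    bad = []
    pos = 0
    while pos < size:
        chunk = min(softbs, size - pos)
        try:
            src.seek(pos)
            data = src.read(chunk)
        except OSError:
            data = None
        if data is not None:
            dst.seek(pos)
            dst.write(data)
        else:
            # Fall back to small blocks inside the failed chunk.
            for sub in range(pos, pos + chunk, hardbs):
                piece = min(hardbs, pos + chunk - sub)
                try:
                    src.seek(sub)
                    small = src.read(piece)
                except OSError:
                    bad.append((sub, piece))  # skip, do not overwrite
                    continue
                dst.seek(sub)
                dst.write(small)
        pos += chunk
    return bad
```
.fi
.PP
In this sketch only the small pieces that overlap the bad range are lost;
the rest of the failed large chunk is still copied.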
.
.PP
.B dd_rescue
can also copy in reverse direction, which allows a bad spot to be approached
from both directions. As trying to read over a bad spot of significant size
can take very long (and potentially cause further damage), this is an important
optimization when recovering data. The
.B dd_rhelp
tool takes advantage of this and automates data recovery.
.B dd_rescue
does not (by default) truncate the output file.
.PP
.B dd_rescue
by default reports on progress, and optionally also writes into a logfile.
It has a progress bar and gives an estimate for the remaining time.
.B dd_rescue
has a wealth of options that influence its behavior, such as the possibility
to use direct IO for input/output, to use fallocate() to preallocate space
for the output file, using splice copy (in kernel zerocopy) for efficiency,
looking for empty blocks to create sparse files, or using a pseudo random
number generator (PRNG) to quickly overwrite data with random numbers.
.PP
The modes to overwrite partitions or files with pseudo random numbers make
.B dd_rescue
a tool that can be used for secure data deletion and thus not just a
data recovery and backup tool but also a data protection tool.
.PP
You can use "-" as infile or outfile, meaning stdin or stdout. Note that
the file in question is then not seekable, which limits the usefulness of
some of dd_rescue's features.
.
.SH OPTIONS
When parsing numbers, 
.B dd_rescue
assumes bytes. It accepts the following suffixes:
.br 
b -- 512 size units (blocks)
.br 
k -- 1024 size units (binary kilobytes, KiB)
.br 
M -- 1024^2 size units (binary megabytes, MiB)
.br 
G -- 1024^3 size units (binary gigabytes, GiB)
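.PP
As a worked example of the suffix arithmetic, the following Python sketch
converts such a number into bytes. It is an illustration, not the parser used by
.BR dd_rescue ;
the name parse_size is hypothetical and only the four suffixes listed above
are handled.
.PP
.nf
```python
# Multipliers for the documented suffixes.
SUFFIXES = {"b": 512, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as 64k or 1M into a byte count."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)  # no suffix: plain bytes
```
.fi
.PP
With these rules, 64k corresponds to 65536 bytes and 8b to 4096 bytes.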
.PP
The following options may be used to modify the behavior of 
.BR dd_rescue .
.
.SS General options
.TP 8
.BR \-h ", " \-\-help
This option tells
.B dd_rescue
to output a list of options and exit.
.TP 8
.BR \-V ", " \-\-version
Display version number and exit.
.TP 8
.BR \-q ", " \-\-quiet
tells
.B dd_rescue
to be less verbose.
.TP 8
.BR \-v ", " \-\-verbose
makes
.B dd_rescue
more verbose.
.TP 8
.BI \-c\  0/1 \fR,\ \fB\-\-color= 0/1
controls whether
.B dd_rescue
uses colors. By default it does, unless the terminal type from TERM is 
unknown or dumb or ends in -m or -mono.
.TP 8
.BR \-f ", " \-\-force
makes
.B dd_rescue
skip some sanity checks (e.g. automatically setting reverse direction when 
input and output file are the same and ipos < opos).
.TP 8
.BR \-i ", " \-\-interactive
tells
.B dd_rescue
to ask before overwriting existing files.
.
.SS Block sizes
.TP 8
.BI \-b\  softbs \fR,\ \fB\-\-softbs= softbs \fR,\ \fB\-\-bs= softbs
sets the (larger) block size to
.IR softbs  
bytes.
.B dd_rescue
will transfer chunks of that size unless a read error is encountered (or the 
end of the input file or the maximum transfer size has been reached).
The default value for this is 64k for buffered I/O and 1M for direct I/O.
.TP 8
.BI \-B\  hardbs \fR,\ \fB\-\-hardbs= hardbs \fR,\ \fB\-\-block\-size= hardbs
sets the (smaller) fallback block size to
.IR hardbs
bytes. When
.B dd_rescue
encounters read errors, it will fall back to copying data in chunks of 
this size. This value defaults to 4k for buffered I/O and 512 bytes for
direct I/O.
.br
.IR hardbs
should be equal to or smaller than
.IR softbs .
If both block sizes are identical, no fallback mechanism (and thus no
retry) will take place on read errors.
.TP 8
.BI \-y\  syncsize \fR,\ \fB\-\-syncfreq= syncsize
tells
.B dd_rescue
to call fsync() on the output file every 
.IR syncsize
bytes (will be rounded to multiples of 
.IR softbs
sized blocks). It will also update the progress indicator at least as
often. By default,
.IR syncsize
is set to 0, meaning that fsync() is only issued at the end of the
copy operation.
.
.SS Positions and length
.TP 8
.BI \-s\  ipos \fR,\ \fB\-\-ipos= ipos \fR,\ \fB\-\-input\-position= ipos
sets the starting position of the 
.IR infile
to
.IR ipos .
Note that ipos is specified in bytes (but suffixes can be used, see above), 
not in terms of 
.IR softbs
or
.IR hardbs
sized blocks.
The default value for this is 0. When reverse direction copy is used, an
.IR ipos
of 0 is treated specially, meaning the end of file.
.br
Negative positions result in an error message.
.TP 8
.BI \-S\  opos \fR,\ \fB\-\-opos= opos \fR,\ \fB\-\-output\-position= opos
sets the starting position of the
.IR outfile
to
.IR opos .
If not specified,
.IR opos
is set to
.IR ipos ,
so the file offsets in input and output file are the same.
For reverse direction copy, an explicit 
.IR opos 
of 0 will position at the end of the output file.
.TP 8
.BR \-x ", " \-\-extend ", " \-\-append
changes the interpretation of the output position to start at the 
end of the existing output file, making appending to a file convenient.
If the output file does not exist, an error is reported and
.B dd_rescue
aborts.
.TP 8
.BI \-m\  maxxfer \fR,\ \fB\-\-maxxfer= maxxfer \fR,\ \fB\-\-max\-size= maxxfer
specifies the maximum number of bytes (suffixes apply, but it's NOT
counted in blocks) that 
.B dd_rescue 
copies. If EOF is encountered before 
.IR maxxfer
bytes have been transferred, this option will be silently ignored.
.TP 8
.BR \-M ", " \-\-noextend
tells 
.B dd_rescue
to not extend the output file. This option is particularly helpful
when overwriting a file with random data or zeroes for safe data
destruction. If the output file does not exist, an error message
is printed and the program aborts.
.
.SS Error handling
.TP 8
.BI \-e\  maxerr \fR,\ \fB\-\-maxerr= maxerr
tells
.B dd_rescue
to exit after
.IR maxerr
read errors have been encountered. By default, this is set to 0,
resulting in
.B dd_rescue
trying to move on until it hits EOF (or
.IR maxxfer
bytes have been transferred).
.TP 8
.BR \-w ", " \-\-abort_we
makes
.B dd_rescue
abort on any write errors. By default, on reported write errors,
.B dd_rescue
tries to rewrite the blocks with small block size writes, so a small
failure in a larger block will not cause the whole block not to
be written. Note that this may be handled similarly by your Operating
System kernel with buffered writes without the user or dd_rescue noticing;
the write retry logic in dd_rescue is mostly useful for direct I/O
writes where write errors can be reliably detected.
.br
Write error detection with buffered writes is unreliable; the
kernel reports success, and traces of the failing writeback operations
may only appear later in your syslog. dd_rescue does try to notify the
user by calling fsync() and carefully checking the return values of
fsync() and close() calls.
.br
Note that
.B dd_rescue
does exit if writes to the output file result in the Operating
System reporting that no space is left.
.
.SS Sparse files and write avoidance
.TP 8
.BR \-A ", " \-\-alwayswrite
changes the behavior of
.B dd_rescue
to write zeroes to the output file when the input file could not
be read. By default, it just skips over, leaving whatever content
was in the output file at the file position before. The default
behavior may be desired, if e.g. previous copy operations may have
resulted in good data being in place; it may be undesired if the
output file may contain garbage (or sensitive information) that should
rather be overwritten with zeroes.
.TP 8
.BR \-a ", " \-\-sparse
will make 
.B dd_rescue
look for empty blocks (of at least half of 
.IR softbs
size), i.e. blocks filled with zeroes. Rather than writing those
zeroes to the output file, it will then skip forward in the output
file, resulting in a sparse file, saving space in the output file system
(if it supports sparse files). Note that if the output file does already
exist and already has data stored at the location where zeroes are skipped
over, this will result in an incomplete copy in that the output file is
different from the input file at the location where blocks of zeroes 
were skipped over.
.B dd_rescue
tries to detect this and issues a warning, but it does not prevent this
from happening.
.TP 8
.BR \-W ", " \-\-avoidwrite
results in 
.B dd_rescue
reading a block (
.IR softbs
sized) from the output file prior to writing it. If it is already identical
with the data that would be written to it, the writes are actually avoided.
This option may be useful for devices where writes should be avoided
(e.g. because they may impact the remaining lifetime or because they are very
slow compared to reads).
.
.SS Other optimization
.TP 8
.BR \-R ", " \-\-repeat
tells 
.B dd_rescue
to only read one block (
.IR softbs
sized) and then repeatedly write it to the output file.
Note that this results in never hitting EOF on the input file and should be
used with a limit for the transfer size (options -m or -M) or when filling
up an output device completely.
.br
This option is automatically set, if the input file name equals "/dev/zero".
.TP 8
.BR \-u ", " \-\-rmvtrim
instructs
.B dd_rescue
to remove the output file after writing to it has completed and issue
a FITRIM on the file system that contains the output file. This only
makes sense when writing zeros (or random numbers) as opposed to useful
content from another file. (dd_rescue will ask for confirmation if
this is specified with a normal input file and no \-f (\-\-force) is
used.) This option may be used to ensure that all empty
blocks of a file system are filled with zeros (rather than containing 
fragments of deleted files with possibly sensitive information).
.br
The FITRIM ioctl (on Linux) tells the storage to consider the freed
space as unused (like the fstrim tool or the discard option) by
issuing ATA TRIM or SCSI DISCARD/WRITE_SAME commands. This will only
succeed with superuser privileges (but the error can otherwise be safely
ignored). This is useful to ensure full performance of flash
memory / SSDs or to free up space on thinly provisioned storage.
Note that FITRIM can take a while on large file systems, especially if
the file systems are not mounted with the discard option and have not
been trimmed (with e.g. fstrim) for a while. Not all file systems and
not all flash-based storage support this.
.TP 8
.BR \-k ", " \-\-splice
tells
.B dd_rescue
to use the Linux in-kernel zerocopy splice() copy operation rather than
reading blocks into a user space buffer. Note that this operation mode
disables a number of
.B dd_rescue
features that can normally be used, such as falling back to smaller block
sizes, avoiding writes, sparse mode, repeat optimization and reverse direction
copy. A warning is issued to make the user aware.
.TP 8
.BR \-P ", " \-\-fallocate
results in 
.B dd_rescue
calling fallocate() on the output file, telling the file system how much
space to preallocate for the output file. (The size is determined by the
expected last position, as inferred from the input file length and
.IR maxxfer .)
On file systems that support it, this results in them making better
allocation decisions, avoiding fragmentation. (Note that it does not
make sense to use sparse together with fallocate().)
.br
This option is only available if dd_rescue is compiled with fallocate()
support. For optimal support, it should be compiled with the 
libfallocate library.
.TP 8
.BI \-C\  rate \fR,\ \fB\-\-ratecontrol= rate
limits the transfer speed of
.B dd_rescue
to
.IR rate
bytes per second. The usual suffixes are allowed.
Note that this limits the average speed; the current speed may be up to
twice this limit. Default is unlimited. Note that you will have to use
a smaller softbs if you want to go below 32k (kB/s).
.
.SS Misc options
.TP 8
.BR \-r ", " \-\-reverse
tells
.B dd_rescue
to copy in reverse direction, starting at 
.IR ipos
(with special case 0 meaning EOF) and working towards the beginning of
the file. This is especially helpful if the input file has a bad spot
which can be extremely slow to skip over, so approaching it from both
directions saves a lot of time (and may prevent further damage).
.br
Note that 
.B dd_rescue
does automatically switch to reverse direction copy, if input and output
file are identical and the input position is smaller than the output 
position, similar to the intelligence that memmove() uses to prevent
loss of data when overlapping areas are copied. The option \-f / \-\-force
disables this automatic switch.
.TP 8
.BR \-p ", " \-\-preserve
When copying files, this option causes file metadata (timestamps,
ownership, access rights, xattrs) to be copied, similar to the option with the
same name in the cp program.
.br
Note that ACLs and xattrs will only be copied if 
.B dd_rescue
has been compiled with libxattr support and the library can be dynamically
loaded on the system. Also note that failing to copy the attributes with
.B \-p
is not considered a failure and thus won't negatively affect the exit code
of dd_rescue.
.TP 8
.BR \-t ", " \-\-truncate
tells
.B dd_rescue
to open the output file with O_TRUNC, so that the output file
(if it is a regular file) is truncated to 0 bytes before writing
to it, removing all previous content that the file may have contained.
By default,
.B dd_rescue
does not remove previous content. 
.TP 8
.BR \-T ", " \-\-trunclast
tells 
.B dd_rescue
to truncate the output file to the highest copied position after the
copy operation has completed, thus ensuring there's no data beyond the end
of the data that has been copied in this run.
.TP 8
.BR \-d ", " \-\-odir_in
instructs 
.B dd_rescue
to open
.IR infile
with O_DIRECT, bypassing the kernel buffers. While this option has a negative
effect on performance (the kernel does read-ahead for buffered I/O), it causes
errors to be detected more quickly (the kernel won't retry) and allows
for smaller I/O units (hardware sector size, 512 bytes for most hard disks).
.br
O_DIRECT may not be available on all platforms.
.TP 8
.BR \-D ", " \-\-odir_out
tells
.B dd_rescue
to open
.IR outfile
with O_DIRECT, bypassing kernel buffers. This has a significant negative
effect on performance, as the program needs to wait for writes to hit the
disks as opposed to the asynchronous nature of buffered writeback.
On the flip side, the return status from writing is reliable this
way and smaller I/O chunks (hardware sector size, 512 bytes) are possible.
.
.SS Logging
.TP 8
.BI \-l\  logfile \fR,\ \fB\-\-logfile= logfile
Unless in quiet mode, 
.B dd_rescue
continuously prints updates on the status of the copy operation to
stderr. With this option, these updates are also written to the specified
.IR logfile .
The control characters (to move the cursor up to overwrite the existing
status lines) are not written to the logfile.
.TP 8
.BI \-o\  bbfile \fR,\ \fB\-\-bbfile= bbfile
instructs 
.B dd_rescue
to write a list of bad blocks to 
.IR bbfile .
The file will contain a list of numbers (ASCII), one per line, where
the numbers indicate the offset in terms of 
.IR hardbs
sized blocks. The file format is compatible with that of badblocks.
Using dd_rescue on a block device (partition) and setting
.IR hardbs
to the block size of a file system that you want to create, you should
be able to feed the 
.IR bbfile
to mke2fs with the option -l.
.
.SS Multiple output files
.TP 8
.BI \-Y\  ofileX \fR,\ \fB\-\-outfile= ofileX \fR,\ \fB\-\-of= ofileX
If you want to copy data to multiple files simultaneously, you can specify
this option. It can be specified multiple times, so many copies can be made.
Note that these files are secondary output files; they share file position
with the primary output file
.IR outfile .
Errors when writing to a secondary output file are ignored.
.
.SS Data protection by overwriting with random numbers
.TP 8
.BI \-z\  RANDSEED \fR,\ \fB\-\-random= RANDSEED
.PD 0
.TP
.BI \-Z\  RANDSEED \fR,\ \fB\-\-frandom= RANDSEED
.TP
.BI \-2\  RANDSEED \fR,\ \fB\-\-shred2= RANDSEED
.TP
.BI \-3\  RANDSEED \fR,\ \fB\-\-shred3= RANDSEED
.TP 
.BI \-4\  RANDSEED \fR,\ \fB\-\-shred4= RANDSEED
.PD 1
.\".PD 0
.\".IP "\fB\-5\fR \fIRANDSEED\fR,\ \fB\-\-shred5=\fR\fIRANDSEED\fR" 4
When you want to overwrite a file, partition or disk with random data,
using /dev/urandom (on Linux) as input is not a very good idea; the interface
has not been designed to yield a high bandwidth. It's better to use a
user space Pseudo Random Number Generator (PRNG). With option -z / --random,
the C library's PRNG is used. With -Z / --frandom and the -2/-3/-4 / 
--shred2/3/4 options, an RC4 based PRNG is used.
.br
Note that in this mode, there is no
.IR infile
so the first non-option argument is the output file.
.br
The PRNG needs seeding; the C library's PRNG takes a 32-bit integer (4 bytes);
the RC4 based PRNG takes 256 bytes. If 
.IR RANDSEED 
is an integer, the integer
number will be used to seed the C library's PRNG. For the RC4 method, the C
library's PRNG then generates the 256 bytes to seed it. This creates
repeatable PRNG data. The RANDSEED value of 0 is special; it will create
a seedval that's based on the current time and the process' PID and should
be different for multiple runs of
.BR dd_rescue .
.br
If 
.IR RANDSEED
is not an integer, it's assumed to be a file name from which the seed values
can be read. 
.B dd_rescue
will read 4 or 256 bytes from the file to seed the C library's or the RC4
PRNG. For good pseudo random numbers, using /dev/urandom to seed is a good idea.
.br
The modes \-2/\-3/\-4 (or \-\-shred2/\-\-shred3/\-\-shred4) will overwrite the output
file multiple times; after each pass, fsync() will ensure that the data does
indeed hit the file. The last pass for these modes will overwrite the file
with zeroes. The rationale behind doing this is to make it easier to hide
that important data may have been overwritten, to make it easier for intelligent
storage systems (such as SSDs) to recycle the empty blocks and to allow for
better compression of a file system image containing such data.
.br
With \-2 / \-\-shred2, one pass of RC4 generated pseudo random numbers is
written, followed by a pass of zeroes. With \-3 / \-\-shred3, there are two
passes with RC4 PRNG generated random numbers and a zero pass; the second PRNG
pass writes the bit-wise inverse of the numbers from the first pass.
\-4 / \-\-shred4 works like \-3 / \-\-shred3, with an additional pass with
independent random numbers as third pass.
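.IP
The pass order of the three shred modes can be summarized in a small Python
sketch. It only restates the description above and is not derived from the
dd_rescue sources; the names shred_passes and invert are hypothetical, and
reading the inverse pass as a bit-wise complement is an assumption.
.IP
.nf
```python
def shred_passes(level):
    """Return the pass sequence for the -2, -3 and -4 modes as described."""
    if level == 2:
        return ["random A", "zeroes"]
    if level == 3:
        return ["random A", "inverse of A", "zeroes"]
    if level == 4:
        return ["random A", "inverse of A", "random B", "zeroes"]
    raise ValueError("level must be 2, 3 or 4")


def invert(block):
    """Bit-wise complement of a block (assumed meaning of the inverse pass)."""
    return bytes(b ^ 0xFF for b in block)
```
.fi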
.
.SS Plugins
Since version 1.42,
.B dd_rescue
has an interface for plugins. Plugins have the ability to analyze the
copied data or to transform it prior to it being written.
.
.TP 8
.BI \-L\  plugin1[=param1[:param2[:..]]][,plugin2[=..][,..]]
.PD 0
.TP
.BI \-\-plugins= plugin1[=param1[:param2[:..]]][,plugin2[=..][,..]]
.PD 1
loads plugins plugin1 ... and passes the given parameters to them. All plugins should support
at least the help parameter and provide information on their usage.
.br
Plugins may impose limits on dd_rescue. Plugins that look at the data
can't work with splice, as this avoids copying data to user space. Also the
interface currently does not facilitate reverse direction copy.
Some plugins may impose further restrictions w.r.t. alignment of data in
the file or not using sparse detection.
.br
See section 
.B PLUGINS
for an overview of available plugins.
.

.SH PLUGINS
.SS null
The null plugin (ddr_null) does nothing, except if you specify the
.B [no]lnchange
or the
.B [no]change
options in which case the plugin indicates to others that it transforms the
length of the output or the data of the stream. (With the no prefix, it's
reset to the default no-change indication again.) 
This may be helpful for testing or to influence which file the hash plugin 
considers for reading/writing extended attributes from/to
and for plugins to change their behavior with respect to hole detection.
.br
ddr_null also allows you to specify
.B debug
in which case it just reports the blocks that it passes on.
.
.SS hash
When the hash plugin (subsequently referred to as ddr_hash) is loaded, it 
will calculate a cryptographic hash and optionally also an HMAC over the
copied data and print the result at the end of the copy operations.
The hash algorithm can be chosen by specifying
.B alg[o[rithm]]=ALG
where ALG is one of md5, sha1, sha256, sha224, sha512, sha384. (Specify
alg=help to get a list.)
To abbreviate the syntax, the alg= piece can be omitted.
.br
For backwards compatibility, the hash plugin can also be referred to with the
old MD5 name; it then defaults to the md5 algorithm.
.br
The computed value should be identical to calling md5sum/sha256sum/... on 
the target file (unless you only write part of the file),
but saves time by not accessing the (possibly large) file a second time.
The hash plugin handles sparse writes and arbitrary offsets fine.
.PP
.B multipart=CHUNKSIZE
tells ddr_hash to calculate multiple checksums for file chunks of CHUNKSIZE
each and then combine them into a combined checksum by creating a checksum
over the piece checksums. This is how the checksum for S3 multipart objects
is calculated (using the md5 hash); the output there is the combination 
checksum with a dash and the number of parts appended.
.br
Note that this feature is new in 1.99.6 and does not yet cleanly handle
situations where offsets plus block sizes do not align
with the CHUNKSIZE. The implementation for this will be completed later.
Other features like the append/prepend/hmac pieces also don't work well with
multipart checksum calculation.
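.PP
The multipart scheme can be sketched in a few lines of Python. This follows
the S3 convention of hashing the concatenated binary md5 digests of the parts;
that ddr_hash combines the piece checksums in exactly this way is an
assumption, and the name multipart_md5 is hypothetical.
.PP
.nf
```python
import hashlib


def multipart_md5(data, chunksize):
    """Return an S3-style multipart checksum string for data."""
    # One md5 digest per chunk of chunksize bytes (the last may be shorter).
    digests = [hashlib.md5(data[i:i + chunksize]).digest()
               for i in range(0, len(data), chunksize)]
    # Checksum over the piece checksums, then a dash and the part count.
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return "%s-%d" % (combined, len(digests))
```
.fi
.PP
For 10240 bytes of input and a chunk size of 4096, the result ends in -3,
as the data is split into three parts.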
.PP
ddr_hash also supports the parameter
.B append=STRING
which appends the given STRING to the output before computing the cryptographic
hash. Treating the STRING as a shared secret, this can actually be used to protect
against someone not knowing the secret altering the contents (and recomputing the 
hash) without anyone noticing. It's thus a cheap form of cryptographic signature
(but with preshared secrets as opposed to public key cryptography). Use HMAC for a
somewhat better way to sign data with a shared secret.
.br
ddr_hash also supports
.B prepend=STRING
which is likely harder to attack with brute force than an appended string.
Note that ddr_hash always prepends multiples of the hash algorithm's block
size and pads the STRING with 0 to match.
.PP
ddr_hash can be used to compute an HMAC (Hash-based Message Authentication
Code) instead of the plain hash. The HMAC uses a password that is transformed
in two different ways; one variant is prepended to the data for an inner hash
and the other is prepended to that result for an outer hash.
HMAC is believed to protect somewhat
better against extension or collision attacks than a plain hash (with a
plain prepended secret), so it's a better way to authenticate data with a
shared secret. (You can use append/prepend in addition to HMAC, if you
have a need for a scheme with more than one secret.)
.br
When HMAC is enabled with one of the following parameters, both the plain hash
and the HMAC are computed by ddr_hash. Both are output to the console/log,
but the HMAC is used instead of the hash value to be written to a CHECKSUMS
file or to an extended attribute or checked against (see below).
.br
.B hmacpwd=STRING
sets the shared secret (password) for computing the HMAC. Passing the secret on
the command line has the disadvantage that the shell may mistreat some bytes
as special characters and that the command line may be visible to all logged in
users on the system.
.br
.B hmacpwdfd=INT
sets a file descriptor from which the secret (password) for HMAC computation will
be read. Specifying 0 means standard input, in which case ddr_hash even prints
a prompt for you. Other numbers may be useful if dd_rescue is called from
another program that opens a pipe to pass the secret.
.br
.B hmacpwdnm=INNAME
sets a file from which the shared secret (password) is read. Note that all bytes
(up to 2048 of them) are read and used, including trailing white space, 0-bytes
or newlines.
.br
Please note that the ddr_hash plugin at this point does NOT take a lot of care
to prevent the password/pre/appended secret from remaining in memory or leaking
into a swap/page file. (This will be improved once I look into encryption plugins.)
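.PP
The HMAC construction can be written out from its standard definition
(RFC 2104). The Python sketch below is an illustration and not the ddr_hash
code; the function names are hypothetical, and whether ddr_hash treats the
password bytes in exactly the same way has not been verified here.
.PP
.nf
```python
import hashlib
import hmac


def hmac_by_hand(key, data, alg="sha256"):
    """Compute an HMAC hex digest step by step."""
    block = hashlib.new(alg).block_size
    if len(key) > block:
        key = hashlib.new(alg, key).digest()  # long keys are hashed first
    key = key.ljust(block, bytes(1))          # pad with 0-bytes to block size
    ipad = bytes(b ^ 0x36 for b in key)       # first transformation of the key
    opad = bytes(b ^ 0x5C for b in key)       # second transformation of the key
    inner = hashlib.new(alg, ipad + data).digest()
    return hashlib.new(alg, opad + inner).hexdigest()


def check_against_stdlib(key, data, alg="sha256"):
    """Compare the hand-written variant with the hmac module of Python."""
    return hmac_by_hand(key, data, alg) == hmac.new(key, data, alg).hexdigest()
```
.fi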
.PP
ddr_hash accepts the parameter
.BR output ,
which will cause ddr_hash to output
the cryptographic hash to stdout in the same format that md5sum/sha256sum/... use.
You can also specify
.B outfd=INT
to have the plugin write the hash to a different
file descriptor specified by the integer number INT. Note that ddr_hash
always processes data in binary mode and correctly indicates this with
a star (*) in the output generated with output/outfd=.
.br
The checksum can also be written to a file by giving the
.B outnm=OUTNAME
parameter. Then a file with OUTNAME will be created and a md5sum/sha256sum/...
compatible line will be printed to the file. If the file exists and contains
an entry for the file, it will be updated. If the file exists and does not
contain an entry for the file, one will be appended. If OUTNAME is omitted, the
file name CHECKSUMS.alg (or HMACS.alg if HMAC is enabled) will be used (alg 
is replaced by the chosen algorithm).
If the checksum can't be written, a warning will be printed and the exit code
of dd_rescue will become non-zero.
.PP
The checksum can be validated using 
.BR chknm=CHKNAME .
The file will be read and ddr_hash will look for an md5sum/sha256sum/...
compatible line with a matching file name to take the checksum from and
compare it to the one computed. If CHKNAME is omitted, the same default
as described above (in outnm=...) will be used. You can also read the
checksum from stdin if you prefer by specifying the
.B check
option.
.br
Note that in any case, the check is only performed after the copy operation
is completed -- a faulty checksum will thus NOT result in the copy not
taking place. However, the exit code of dd_rescue will indicate the
error. (If you want to avoid copying data with a broken checksum into
the final target, use a temporary target that you delete upon error and
only move to the final location if dd_rescue's exit value is 0; you can
of course also copy to /dev/null for testing beforehand, but it might
be too costly reading the input file twice.)
.br
If in addition to
.B chknm
(or
.BR chk_xattr )
the option
.B chkadd
is specified, then a missing checksum will not be reported as an error;
instead, an entry will be added to the checksum file (or xattr). A mismatch
will still be reported as an error and the checksum file will not be
updated.
.PP
You can store the cryptographic hash into the files by using the
.B set_xattr
option. The hash will be stored into the extended attribute user.checksum.ALG
by default (user.hmac.ALG if HMAC is enabled), but you can override the name
of the attribute by specifying
.B set_xattr=XATTR\.NAME
instead. If the xattr can't be written, an error will be reported, unless
you also specify the 
.B fallb[ack][=CHKNAME]
option. In that case, ddr_hash tries to write the checksum to the CHKNAME
checksums file. (For the default for CHKNAME, see outnm= option above.)
.br
.B chk_xattr
will validate that the computed hash matches the one read from the extended
attribute. The same default attribute name applies and you can likewise override
it with
.BR chk_xattr=XATTR\.NAME .
A missing attribute is considered an error (although the same fallback is
tried if you specify the fallback option). A broken checksum is of course
considered an error as well, but just like with chknm=CHKNAME won't
prevent the copy. See the discussion there.
.PP
Note that for output, outfd=, outnm= and set_xattr, ddr_hash will use the
output file name to attach the checksum to (be it by setting xattr or the
file name used in the checksum file), unless a plugin 
in the chain after ddr_hash indicates that it changes the data.
In that case, it will warn and associate the checksum with the input file
name, unless there's another plugin before ddr_hash in the chain which 
indicates data transformation as well. In that case, there is no file that
the checksum could be associated with and ddr_hash will report an error.
.br
Likewise, for chknm=, check and chk_xattr, ddr_hash will use the input file
name to get the checksum (be it by reading the xattr or by looking for
the input file name in a checksums file) unless there's a plugin in the
chain before ddr_hash that indicates that it changes the data. The output
file name will then be used, unless there's another plugin after ddr_hash 
indicating data change as well, in which case there's no file we could
get the checksum for and thus an error is reported.
.PP
If your system supports extended attributes, those have the advantage
of traveling with the files; thus a rename or copy (with dd_rescue -p)
will maintain the checksum. Checksum files on the other hand can be
handled everywhere (including the transfer via ftp or http) and can
be cryptographically signed with PGP/GnuPG.
.PP
Please note that the md5 algorithm is NOT recommended any more for
good protection against malicious attempts to hide data modification;
it's not considered strong enough any more to prevent hash collisions.
sha1 is a bit better, but has been broken as well as of 2017.
The recommendation is to use the SHA-2 family of hashes.
On 32bit machines, I'd recommend sha256, while on 64bit machines, sha512
is faster and thus the best choice. Note that there is hardware acceleration
on some x86-64 and most armv8/aarch64 CPUs for sha256, so it is faster
there than sha512. dd_rescue detects and uses this acceleration (since
1.99.16).
.PP
ddr_hash also supports using the HMAC code and hashes for deriving
keys from passwords using the PKCS5 PBKDF2 (password-based key derivation
function) that allows you to improve the protection from mediocre passwords
by using a salt and a relatively expensive key stretching operation. This
is only meant for testing and may be removed in the future. It's thus 
not documented in this man page. See
the built-in help function for a brief summary on the usage.
.
.SS crypt
The crypt plugin allows data to be encrypted and decrypted on the fly.
It currently supports a variety of AES ciphers.
See the
.BR ddr_crypt (1)
man page for more details.
.
.SS lzo
The lzo plugin allows data to be compressed and decompressed using liblzo2.
lzo is an algorithm that is faster than most other algorithms but
does not compress as well.
See the
.BR ddr_lzo (1)
man page for more details.
.
.SS ddr_lzma
The lzma plugin provides xz compression and decompression using liblzma.
It supports most of the popular
options from the xz utils. lzma is an algorithm that produces highly
compressed files at the cost of being expensive in terms of CPU and
memory consumption during compression. Decompression is fast.
See the
.BR ddr_lzma (1)
man page for more details.
.
.SH EXIT STATUS
On successful completion, 
.B dd_rescue
returns an exit code of 0.
Any other exit code indicates that the program has aborted because of an 
error condition or that copying of the data has not been entirely successful.
.PP
.\"TODO: Better documentation of the error codes!
.
.SH EXAMPLES
.TP
.BI dd_rescue\ \-k\ \-P\ \-p\ \-t\ infile\ outfile
copies
.IR infile
to
.IR outfile
and truncates the output file on opening (deleting any previous data
in it), copies mode, times and ownership at the end, uses fallocate to
reserve the space for the output file and uses the efficient in-kernel splice
copy method.
.TP
.BI dd_rescue\ \-A\ \-d\ \-D\ \-b\ 512\ /dev/sda\ /dev/sda
reads the contents of every sector of disk sda and writes it back to the
same location. Typical hard disks reallocate flaky and faulty sectors on 
writes, so this operation may result in the complete disk being usable
again when there were errors before. Unreadable blocks however will contain
zeroes after this.
.TP
.BI dd_rescue\ \-2\ /dev/urandom\ \-M\ outfile
overwrites the file
.IR outfile
twice; once with good pseudo random numbers and then with zeroes.
.TP
.BI dd_rescue\ \-t\ \-a\ image1.raw\ image2.raw
copies a file system image and looks for empty blocks to create a
sparse output file to save disk space. (If the source file system
has seen some use, creating a large file filled with zeroes on it and
removing it again prior to this operation will result in more sectors
with zeroes.
.BI dd_rescue\ \-u\ /dev/zero\ DUMMY
will achieve this.)
.TP
.BI dd_rescue\ \-ATL\ hash=md5:output,lzo=compress:bench,MD5:output\ in\ out.lzo
copies the file
.IR in
to
.IR out.lzo
using lzo (lzo1x_1) compression and calculating an md5 hash
(checksum) of both files. The md5 hashes for both are also written
to stdout in the md5sum output format.
Note that the compress parameter to lzo is not strictly required
here; the plugin could have deduced
it from the file names. This example shows that you can specify multiple
plugins with multiple parameters; the plugins form a filter
chain. You can specify the same plugin multiple times.
.TP
.BI dd_rescue\ \-L\ hash=sha512:set_xattr:fallb,null=change\ infile\ /dev/null
reads the file
.IR infile
and computes its sha512 hash. It stores the hash in the input file's
user.checksum.sha512 attribute (and falls back to writing it to
CHECKSUMS.sha512 if xattrs can't be written). Note the use of the null
plugin with the change parameter, which fakes a data change; this causes
the hash plugin to write to the input file, which it would not normally
do. Of course this will fail if you have neither the privileges to write
xattrs to infile nor to write the checksum to CHECKSUMS.sha512.
.PP
See also README.dd_rescue and ddr_lzo(1) to learn about the possibilities.
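.PP
The sparse-file example above relies on detecting blocks that contain only
zeroes and seeking over them in the output instead of writing them. A minimal
Python sketch of that idea follows. It is an illustration only and not the
implementation used by dd_rescue, which has optimized detection code; the
function name copy_sparse and its block_size parameter are hypothetical.
.PP
.nf
```python
import os


def copy_sparse(src_path, dst_path, block_size=4096):
    """Copy src_path to dst_path, leaving holes for all-zero blocks.

    Returns the number of bytes that were skipped instead of written.
    Illustrative sketch only; not how dd_rescue itself is implemented.
    """
    zero_block = bytes(block_size)
    skipped = 0
    # Opening with "wb" truncates the output, comparable to using -t.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero_block[:len(block)]:
                # Seek forward instead of writing; this leaves a hole
                # on file systems that support sparse files.
                dst.seek(len(block), os.SEEK_CUR)
                skipped += len(block)
            else:
                dst.write(block)
        # Fix the final length in case the input ended with zeroes.
        dst.truncate(dst.tell())
    return skipped
```
.fi
.PP
Whether the skipped ranges really occupy no disk space depends on the file
system holding the output file.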
.
.SH TESTING
Untested code is buggy, almost always.
I happen to have a damaged hard disk that I use for testing dd_rescue from
time to time. But to allow for automated testing of error recovery, it's
better to have predictable failures for the program to deal with. So there
is a fault injection framework.
.br
Specifying
.B -F\ 5w/1,17r/3,42r/-1,80-84r/0
on the command line will cause the 5th block (counted in hardblocksize) to
fail to be written once (from which dd_rescue should recover, as it
tries a second time for failed writes). Block 17 will fail to be read 3 times,
block 42 will read fine once but fail afterwards, whereas blocks 80
through 83 are completely unreadable (they fail an infinite number of times).
Note that the range excludes the last block (80-84 means 4 blocks starting
at 80).
.br
Block offsets are always counted in absolute positions, so starting in
the middle of a file with -s or reverse copying won't affect the absolute
position that is hit with the fault injection. (This has changed since
1.98.)
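.PP
The specification format can be illustrated with a small parser. The
following Python sketch is not part of dd_rescue and does not reflect its
actual implementation; the names Fault and parse_faults are hypothetical,
and the grammar is inferred only from the example given in this section.
.PP
.nf
```python
import re
from typing import List, NamedTuple


class Fault(NamedTuple):
    first: int  # first affected block, counted in hardbs units
    last: int   # one past the last affected block (exclusive)
    op: str     # "r" for read faults, "w" for write faults
    count: int  # repetition count exactly as given in the spec


# One entry looks like 5w/1 or 80-84r/0 (assumed grammar).
_ENTRY = re.compile("^([0-9]+)(?:-([0-9]+))?([rw])/(-?[0-9]+)$")


def parse_faults(spec: str) -> List[Fault]:
    """Split a comma-separated fault spec into Fault tuples."""
    faults = []
    for entry in spec.split(","):
        match = _ENTRY.match(entry.strip())
        if match is None:
            raise ValueError("bad fault entry: " + entry)
        first = int(match.group(1))
        # A single block N is treated as the range N to N+1 (exclusive).
        last = int(match.group(2)) if match.group(2) else first + 1
        faults.append(Fault(first, last, match.group(3), int(match.group(4))))
    return faults
```
.fi
.PP
With the specification shown in this section, the last entry parses to the
half-open range from 80 to 84, which covers blocks 80 through 83.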

.SH BUGS/LIMITATIONS
The source code uses the 64bit functions provided by glibc for file
positioning. However, your kernel might not support them, so you might be
unable to copy partitions larger than 2GB into a file.
.br
This program has been written using Linux and only tested on a couple of
Linux systems. People have reported using it successfully on
other Un*xish systems (such as xBSD or M*cOS), but these systems get little
regular test coverage; so please be advised to test properly (possibly
using the make check test suite included with the source distribution)
before relying on dd_rescue on non-Linux systems.
.br
Currently, the escape sequence for moving the cursor up is hard coded in the
sources. It's fine for most terminal emulations (including vt100 and linux),
but it should use the terminal description database instead.
.br
Since dd_rescue-1.10, non-seekable input or output files are supported,
but there are of course limits to error recovery in such cases.
.PP
dd_rescue does not automate the recovery of faulty files or partitions
by automatically keeping a list of copied sectors and approaching bad spots
from both sides. There is a helper script dd_rhelp from LAB Valentin that
does this. Integration of such a mode into 
.B dd_rescue
itself is non-trivial and due to the complexity of the source code might
not happen.
.br
There also is a tool, GNU ddrescue, that is a reimplementation of this
tool and which contains the capabilities to automate recovery of bad
files in the way dd_rhelp does. It does not have the feature richness
of dd_rescue, but is reported to be easier to operate for error recovery
than dd_rescue with dd_rhelp.
.PP
If your data is very valuable and you are considering sending your disk
to a data recovery company, you might be better off NOT trying to use
imaging tools like dd_rescue, dd_rhelp or GNU ddrescue. If you're unlucky,
the disk has suffered some mechanical damage (e.g. by having been dropped),
and continuing to use it may make the head damage the surface further.
You may be able to detect this condition by rapidly rising error counts
in the SMART attributes or by a clicking noise.
.PP
Please report bugs to me via email.
.
.SS Data destruction considerations
The modes that overwrite data with pseudo random numbers to securely
delete sensitive data deliberately implement only a limited number of
overwrites. While Peter Gutmann's classic analysis concludes that the
then-current hard disk technology requires more overwrites to be really
secure, the author believes that modern hard disk technology does not
allow data restoration of sectors that have been overwritten with the
--shred4 mode. This is in compliance with the recommendations from
BSI GSDS M7.15.
.br
Overwriting whole partitions or disks with random numbers is a fairly safe
way to destroy data, unless the underlying storage device does too much
magic. SSDs are doing fancy stuff in their Flash Translation Layer (FTL),
so this tool might be insufficient to get rid of data. Use
SECURITY_ERASE (via hdparm) there or -- if available -- encrypt data with
AES256 and safely destroy the key.
Normal hard disks have a small risk of leaking a few sectors
due to reallocation of flaky sectors.
.br
For securely destroying single files, your mileage may vary. The more advanced
your file system, the less likely dd_rescue's destruction will be effective.
In particular, journaling file systems may carry old data in the journal.
File systems that do copy-on-write (COW), such as btrfs, are very likely to have
old copies of your supposedly erased file. It might help somewhat to fill the
file system with zeroes (dd_rescue -u /dev/zero /path/to/fs/DUMMYNAME) to force
the file system to release and overwrite non-current data after overwriting
critical files with random numbers. If you can, it is better to destroy a whole
partition or disk.
.
.SH SEE ALSO
.BR README.dd_rescue 
.BR README.dd_rhelp 
.BR ddr_lzo (1)
.br
.BR wipe (1)
.BR shred (1)
.BR ddrescue (1)
.BR dd (1)
.
.SH AUTHOR
Kurt Garloff <kurt@garloff.de>
.
.SH CREDITS
Many little issues were reported by Valentin LAB, the author of 
.BR dd_rhelp .
.br
The RC4 PRNG (frandom) is a port from Eli Billauer's kernel mode PRNG.
.br
A number of recent ideas and suggestions came from Thomas.
.SH COPYRIGHT
This program is protected by the GNU General Public License (GPL) 
v2 or v3 - at your option.
.SH HISTORY
Since version 1.10, non seekable input and output files are supported.
.br
Splice copy -k is supported since 1.15.
.br
A progress bar exists since 1.17.
.br
Support for preallocation (fallocate) -P exists since 1.19.
.br
Since 1.23, we default to -y0, enhancing performance.
.br
The Pseudo Random Number modes were introduced with 1.29.
.br
Write avoidance -W has been implemented in 1.30.
.br
Multiple output files -Y have been added in 1.32.
.br
Long options and man page came with 1.33.
.br
Optimized sparse detection (SSE2, armv6, armv8 asm, AVX2) has
been present since 1.35 and was enhanced up to 1.43.
.br
We support copying extended attributes since 1.40 using
libxattr.
.br
Removing and (fs)trimming the output file's file system
exists since 1.41. Support for compilation with bionic
(Android's C library) with most features enabled also 
came with 1.41.
.br
Plugins exist since 1.42, the MD5 plugin came with 1.42, the
lzo plugin with 1.43. 1.44 renamed the MD5 plugin to hash and
added support for the SHA-2 family of hashes. 1.45 added SHA-1
and the ability to store and validate checksums.
.br
1.98 brought encryption and the fault injection framework, 
1.99 support for ARMv8 crypto acceleration.
1.99.5 brought ratecontrol.
1.99.6 brought S3 style multipart checksums.
.PP
Some additional information can be found on
.br
http://garloff.de/kurt/linux/ddrescue/
.br
LAB Valentin's 
.B dd_rhelp
can be found on
.br
http://www.kalysto.org/utilities/dd_rhelp/index.en.html