Python package for generating and comparing lists of filenames and hashes

slbelden/md5ls.py

md5ls

Tools for verifying files with hash-list manifests.

Motivation

To verify that my backups restore without errors, I was using this classic Bash one-liner:

find . -type f -exec md5sum {} + | LC_ALL=C sort -k2

Eventually the limitations of this approach became burdensome, so I created a one-to-one Python implementation. When run without additional options, the command:

md5ls create

produces output identical to the Bash command (on all tested systems).
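To make the equivalence concrete, here is a minimal sketch of what the Bash one-liner computes, written in Python. It is not the actual md5ls source; the names `md5_of_file` and `build_manifest` are hypothetical, and it ignores edge cases such as the escaping `md5sum` applies to filenames containing backslashes or newlines.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root="."):
    """Return md5sum-style lines for regular files under root, sorted by path bytes."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # `find -type f` skips symlinks and special files, so do the same here.
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue
            rel = os.path.relpath(full_path, root).replace(os.sep, "/")
            entries.append(("./" + rel, md5_of_file(full_path)))
    # `LC_ALL=C sort -k2` orders lines by the raw bytes of the path field.
    entries.sort(key=lambda entry: entry[0].encode("utf-8", "surrogateescape"))
    return [f"{digest}  {path}" for path, digest in entries]
```

Calling `print("\n".join(build_manifest(".")))` from inside a directory should print one `<hash>  ./<path>` line per file, in the same order the sorted `find` pipeline would give for ordinary filenames.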

The project has grown to include many quality-of-life improvements and additional features.

Benefits

Even if all you need is the behavior of the original Bash command, these quality-of-life features make the Python version worth using:

Speed

Use multithreading to greatly improve performance:

md5ls create -j 8

where 8 can be replaced with the number of CPU cores available to you. In my testing, just 6 threads produced a 10x speedup over the Bash version.
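As a rough illustration of why threads help here, the sketch below hashes a list of files with a thread pool. It assumes nothing about how `-j` is implemented inside md5ls; `hash_files` and `md5_of_file` are hypothetical names. Threads are a reasonable fit because `hashlib` releases the GIL while hashing large buffers and file reads spend much of their time waiting on I/O.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_files(paths, jobs=8):
    """Hash files concurrently; results are returned in the same order as paths."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map preserves input order even though work finishes out of order.
        return list(pool.map(md5_of_file, paths))
```

Because `pool.map` keeps the input order, the digests can be zipped back onto the path list and sorted afterwards without extra bookkeeping.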

Ease of Use

The Bash version requires that you cd into the directory you are generating a manifest for, since find . must use the current working directory as its root to get consistent relative filepaths in the output. The -r option allows you to run the command from anywhere by specifying the directory:

md5ls create -r /path/dir/folder/

Cross-Platform Support

Use the -o flag to generate a consistent manifest on all* systems:

md5ls create -o /folder/file.out

The file will always have Unix-style line endings and use Unix-style folder separators, even when run on Windows. This allows easy diff comparisons between two manifests, even between different platforms.

*tested on Windows 10 & 11, Ubuntu 22.04
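The two normalizations described in this section can be sketched with the standard library alone. This is an illustration under assumptions, not the md5ls implementation: `portable_line` and `write_manifest` are hypothetical helpers, and the `./` prefix simply mirrors the output of the original `find .` command.

```python
from pathlib import PureWindowsPath


def portable_line(digest, relative_path):
    """Format one manifest line with forward slashes, whatever the host OS uses."""
    # as_posix() turns Windows separators ("\\") into "/".
    return f"{digest}  ./{relative_path.as_posix()}"


def write_manifest(lines, out_path):
    """Write manifest lines with "\n" endings on every platform."""
    # newline="\n" disables the "\r\n" translation text mode applies on Windows.
    with open(out_path, "w", encoding="utf-8", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")


# Example: a Windows-style relative path becomes a Unix-style manifest entry.
example = portable_line(
    "d41d8cd98f00b204e9800998ecf8427e", PureWindowsPath("dir\\sub\\file.txt")
)
```

Here `example` holds `d41d8cd98f00b204e9800998ecf8427e  ./dir/sub/file.txt`, the same line a Linux run would produce for the same file.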

Better diff

You can use a basic diff command to compare two manifests:

diff file1.out file2.out

but the output can be hard to read, especially if there are many differences. Instead, I've created md5ls diff to produce more human-readable output. By default, the output adds headings and sorts changes into sections:

md5ls diff file1.out file2.out

Use -s to generate only a summary of changes, without the full list of lines:

md5ls diff file1.out file2.out -s
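For a sense of how two manifests can be sorted into sections, here is a small sketch that groups paths into added, removed, and changed. The section names and the functions `parse_manifest` and `compare_manifests` are assumptions made for illustration; the real `md5ls diff` output and categories may differ.

```python
def parse_manifest(text):
    """Map each path to its hash, assuming md5sum-style '<hash>  <path>' lines."""
    entries = {}
    for line in text.split("\n"):
        if not line:
            continue
        # Split on the first two-space separator so spaces in paths survive.
        digest, _sep, path = line.partition("  ")
        entries[path] = digest
    return entries


def compare_manifests(old_text, new_text):
    """Group paths by how they differ between the old and new manifest."""
    old, new = parse_manifest(old_text), parse_manifest(new_text)
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(path for path in shared if old[path] != new[path]),
    }
```

A summary in the spirit of `-s` could then be built from the length of each list, without printing the individual paths.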

Installation

  1. Navigate to a good temporary directory of your choice:

    cd ~

  2. Clone this repository:

    git clone https://github.com/slbelden/md5ls.py.git

  3. Change directory into the folder git just downloaded:

    cd md5ls.py

  4. Install (substitute pipx for pip if your system complains):

    pip install .

  5. Make sure the install location is on your PATH (you are on your own here, good luck), then run:

    md5ls

Usage

Get basic usage help with -h:

md5ls -h

and subcommand help in the same manner:

md5ls create -h
