This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent d5707f3, commit 38e544f. Showing 7 changed files with 279 additions and 394 deletions.
92 changes: 49 additions & 43 deletions
collection-one/modules/command-line/command-line-utilities/data-transfer.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,143 +1,149 @@ | ||
|
||
# Data Transfer Techniques in the Command Line | ||
# Advanced Command Line Data Transfer Techniques | ||
|
||
## Overview | ||
|
||
Data transfer utilities are crucial for efficiently moving and synchronizing data between local and remote systems. This lesson delves deep into advanced use-cases for `scp`, `rsync`, `wget`, and `ftp`. | ||
Data transfer is akin to logistical operations, where moving resources efficiently and securely is paramount. This lesson focuses on the command line tools that serve as the backbone for data movement and synchronization between systems. | ||
|
||
## Table of Contents | ||
|
||
1. [Introduction](#introduction) | ||
2. [`scp`](#scp) | ||
3. [`rsync`](#rsync) | ||
4. [`wget`](#wget) | ||
5. [`ftp`](#ftp) | ||
2. [`scp` - Secure Copy Protocol](#scp---secure-copy-protocol) | ||
3. [`rsync` - Remote Sync](#rsync---remote-sync) | ||
4. [`wget` - Web Get](#wget---web-get) | ||
5. [`ftp` - File Transfer Protocol](#ftp---file-transfer-protocol) | ||
6. [Rate Limiting and Throttling](#rate-limiting-and-throttling) | ||
7. [Best Practices](#best-practices) | ||
|
||
--- | ||
|
||
## Introduction | ||
|
||
Being able to move data securely and efficiently is a skill often overlooked but crucial for any software engineer. | ||
Mastering data transfer utilities ensures that you can move and manage data with precision and security, essential skills in software engineering and system administration. | ||
|
||
--- | ||
|
||
## `scp` | ||
## `scp` - Secure Copy Protocol | ||
|
||
### Overview | ||
|
||
`scp` (Secure Copy) is used for securely transferring files between local and remote hosts. | ||
`scp` copies files between hosts over SSH, so both credentials and data are encrypted in transit. | ||
|
||
### Advanced Usage | ||
|
||
#### Copy with Port Specified | ||
#### Specifying Ports | ||
|
||
Copy files from a remote host to the local host with a specific SSH port. | ||
Transfer files using a non-standard SSH port for added security. | ||
|
||
```bash | ||
scp -P 2222 username@remote:/path/to/file /local/path/ | ||
scp -P 2222 user@remote:/path/to/file /local/directory | ||
``` | ||
|
||
#### Copying Entire Directories | ||
#### Recursive Copying | ||
|
||
Copy entire directories recursively, preserving the directory structure; add `-p` to also preserve modification times and permission bits. | ||
|
||
```bash | ||
scp -r username@remote:/path/to/folder /local/path/ | ||
scp -r user@remote:/directory/ /local/directory | ||
``` | ||
|
||
--- | ||
|
||
## `rsync` | ||
## `rsync` - Remote Sync | ||
|
||
### Overview | ||
|
||
`rsync` is for syncing data locally or remotely, often used for backups. | ||
`rsync` is the logistical coordinator for data, optimizing the transfer process for efficiency and integrity. | ||
|
||
### Advanced Usage | ||
|
||
#### Synchronize Remote to Local with Compression | ||
#### Efficiency with Compression | ||
|
||
Minimize bandwidth usage by compressing data during transfer. | ||
|
||
```bash | ||
rsync -avz username@remote:/path/to/folder /local/path/ | ||
rsync -avz user@remote:/source /local/destination | ||
``` | ||
|
||
#### Exclude Files | ||
#### Precision in Exclusions | ||
|
||
Exclude specific files or directories during the sync. | ||
Target your transfer by excluding non-essential data. | ||
|
||
```bash | ||
rsync -av --exclude 'tmp/*' source/ destination/ | ||
rsync -av --exclude 'path/to/exclude' /source /destination | ||
``` | ||
|
||
#### Bandwidth Limit | ||
#### Bandwidth Management | ||
|
||
Limit the data transfer rate. | ||
Control the operation's impact on your network resources. | ||
|
||
```bash | ||
rsync --bwlimit=1000 source/ destination/ | ||
rsync --bwlimit=1000 /source /destination | ||
``` | ||
|
||
--- | ||
|
||
## `wget` | ||
## `wget` - Web Get | ||
|
||
### Overview | ||
|
||
`wget` is a non-interactive downloader. | ||
`wget` facilitates the retrieval of data from web servers, acting as a digital supply line. | ||
|
||
### Advanced Usage | ||
|
||
#### Download in the Background | ||
#### Background Operations | ||
|
||
Download large files in the background, minimizing disruption. | ||
|
||
```bash | ||
wget -b url | ||
``` | ||
|
||
#### Retry Downloads | ||
#### Handling Disruptions | ||
|
||
Automatically retry the download in case of a failure. | ||
Ensure successful downloads by configuring retries and timeouts. | ||
|
||
```bash | ||
wget --retry-connrefused --waitretry=seconds --timeout=seconds url | ||
wget --retry-connrefused --waitretry=10 --timeout=60 url | ||
``` | ||
|
||
--- | ||
|
||
## `ftp` | ||
## `ftp` - File Transfer Protocol | ||
|
||
### Overview | ||
|
||
`ftp` (File Transfer Protocol) is used for transferring files between local and remote file systems. | ||
`ftp` supports basic file transfers but sends credentials and data unencrypted, so it is suitable only for non-sensitive data. | ||
|
||
### Advanced Usage | ||
|
||
#### Switch to Passive Mode | ||
#### Enhancing Throughput | ||
|
||
Use passive mode so that the client opens the data connection itself, which generally works better through firewalls and NAT. | ||
|
||
```bash | ||
ftp -p host | ||
``` | ||
|
||
#### Auto-login and Batch Processing | ||
#### Streamlining Operations | ||
|
||
Use a `.netrc` file to store credentials and run FTP commands from a script. | ||
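Automate transfers with a command script. A `.netrc` file can hold credentials, but it stores them in plain text, so restrict its permissions (for example `chmod 600 ~/.netrc`). Note that the `-s:` option shown here is the Windows `ftp` client's syntax. | ||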
|
||
```bash | ||
ftp -s:ftp_commands.txt host | ||
ftp -s:script.txt host | ||
``` | ||
|
||
--- | ||
|
||
## Rate Limiting and Throttling | ||
|
||
Learn how to control your data transfer speed to prevent bottlenecking network resources. | ||
Managing your data transfer rates is crucial to avoid overloading network capabilities, much like managing supply lines to avoid congestion. | ||
|
||
- `rsync --bwlimit=1000` to limit rsync bandwidth to 1000 KB/s. | ||
- `wget --limit-rate=300k` to limit wget download speed to 300 KB/s. | ||
- `rsync` (`--bwlimit`) and `wget` (`--limit-rate`) provide options to limit transfer speeds, ensuring network resources are utilized judiciously. | ||
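
To make the cost of throttling concrete, here is a small arithmetic sketch. The 1 GiB file size is an illustrative assumption, and the rate is taken as 300 KiB/s to roughly match `wget --limit-rate=300k`; none of the variable names come from the lesson itself.

```shell
#!/bin/sh
# Estimate how long a throttled download takes (illustrative numbers only).
size_kib=$((1024 * 1024))   # hypothetical 1 GiB file, expressed in KiB
rate_kib=300                # assumed cap, as in `wget --limit-rate=300k`

seconds=$((size_kib / rate_kib))   # integer division, rounds down
minutes=$((seconds / 60))

echo "about ${seconds} s (~${minutes} min)"   # prints: about 3495 s (~58 min)
```

A cap that looks generous can still turn a large transfer into an hour-long job, so it may be worth estimating the duration before picking a limit.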
|
||
--- | ||
|
||
## Best Practices | ||
|
||
- Always validate the integrity of transferred files using checksums. | ||
- For mission-critical transfers, prefer utilities that offer resume capabilities. | ||
- Use compression flags when network bandwidth is a limiting factor. | ||
- **Integrity Checks**: Always verify the integrity of your data post-transfer. | ||
- **Resumption Capability**: For critical operations, use tools that can resume interrupted transfers. | ||
- **Efficiency**: Utilize compression to reduce bandwidth usage, crucial in bandwidth-constrained environments. |
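
As a minimal sketch of the integrity check recommended above, the following compares SHA-256 checksums of a source file and its copy. It assumes GNU coreutils `sha256sum` is available; the file names are hypothetical, and a local `cp` stands in for the real `scp` or `rsync` transfer.

```shell
#!/bin/sh
# Compare checksums before and after a transfer (local cp stands in for scp/rsync).
workdir=$(mktemp -d)
printf 'payload\n' > "$workdir/source.dat"
cp "$workdir/source.dat" "$workdir/received.dat"

# sha256sum prints "<hash>  <file>"; keep only the hash field.
src_sum=$(sha256sum "$workdir/source.dat" | cut -d' ' -f1)
dst_sum=$(sha256sum "$workdir/received.dat" | cut -d' ' -f1)

if [ "$src_sum" = "$dst_sum" ]; then
  result="match"
else
  result="MISMATCH"
fi
echo "checksum $result"
```

For a real remote transfer, the source checksum would be computed on the sending host and compared with the one computed on the receiving host.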
97 changes: 48 additions & 49 deletions
collection-one/modules/command-line/command-line-utilities/file-compression.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,141 +1,140 @@ | ||
|
||
# File Compression Techniques | ||
# Advanced Command Line File Compression Techniques | ||
|
||
## Overview | ||
|
||
Mastering file compression techniques can significantly speed up tasks and optimize resources. This in-depth guide discusses advanced topics in `zip`, `tar`, `gzip`, and `bzip2`. | ||
File compression is the digital equivalent of efficient packing for deployment: it maximizes storage space and minimizes transfer times. This guide explores the nuances of `zip`, `tar`, `gzip`, and `bzip2`, offering insights into their optimal use cases. | ||
|
||
## Table of Contents | ||
|
||
1. [Introduction](#introduction) | ||
2. [`zip`](#zip) | ||
3. [`tar`](#tar) | ||
4. [`gzip`](#gzip) | ||
5. [`bzip2`](#bzip2) | ||
6. [Comparison of Algorithms](#comparison-of-algorithms) | ||
7. [Best Practices](#best-practices) | ||
2. [`zip` - Packaging for Efficiency](#zip---packaging-for-efficiency) | ||
3. [`tar` - The Digital Quartermaster](#tar---the-digital-quartermaster) | ||
4. [`gzip` - Optimizing for the Long Haul](#gzip---optimizing-for-the-long-haul) | ||
5. [`bzip2` - The Heavy Lifter](#bzip2---the-heavy-lifter) | ||
6. [Comparison of Compression Algorithms](#comparison-of-compression-algorithms) | ||
7. [Best Practices in File Compression](#best-practices-in-file-compression) | ||
|
||
--- | ||
|
||
## Introduction | ||
|
||
File compression is not just about saving disk space; it's also about optimizing file transfers and even computational performance. | ||
Understanding file compression is akin to mastering supply chain logistics: it's about optimizing what you pack (file sizes), how fast you move (transfer speeds), and how much you can carry (storage efficiency). | ||
|
||
--- | ||
|
||
## `zip` | ||
## `zip` - Packaging for Efficiency | ||
|
||
### Overview | ||
|
||
`zip` is a utility for packaging and compressing files. | ||
`zip` is like a versatile utility knife, ideal for packaging and compressing files for easy sharing and storage. | ||
|
||
### Advanced Usage | ||
|
||
#### Exclude Files | ||
#### Precision Exclusions | ||
|
||
Exclude specific files from a zip archive. | ||
Exclude non-essential items to keep your package lean. | ||
|
||
```bash | ||
zip archive.zip -r folder/ -x \*.git\* | ||
zip archive.zip -r target_folder/ -x \*exclude_pattern\* | ||
``` | ||
|
||
#### Update Mode | ||
#### Dynamic Updates | ||
|
||
Update an existing zip file with new files. | ||
Refresh your package with new or updated items without starting from scratch. | ||
|
||
```bash | ||
zip -u archive.zip newfile.txt | ||
zip -u archive.zip updated_file.txt | ||
``` | ||
|
||
--- | ||
|
||
## `tar` | ||
## `tar` - The Digital Quartermaster | ||
|
||
### Overview | ||
|
||
`tar` is used primarily for archiving files, and can be combined with various compression algorithms. | ||
`tar` acts as your digital quartermaster, organizing and bundling supplies (files) efficiently for storage or deployment. | ||
|
||
### Advanced Usage | ||
|
||
#### Incremental Backups | ||
#### Streamlined Backups | ||
|
||
Create incremental backups to save only changed files since the last backup. | ||
Implement incremental backups, capturing only what has changed, much like updating supply caches. | ||
|
||
```bash | ||
tar --listed-incremental=/path/to/snapshot.file -cvzf backup.tar.gz /path/to/folder | ||
tar --listed-incremental=snapshot.file -cvzf backup.tar.gz target_directory/ | ||
``` | ||
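
The incremental workflow can be sketched end to end as follows. This assumes GNU `tar` (BSD `tar` does not support `--listed-incremental`), and the directory and file names are hypothetical. The same snapshot file is passed to both runs, so the second archive holds only what changed after the first.

```shell
#!/bin/sh
# Level-0 (full) backup followed by a level-1 (incremental) backup with GNU tar.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir data
echo "first" > data/old.txt
sleep 1   # make sure the file's timestamp clearly predates the snapshot

# First run: the snapshot file does not exist yet, so everything is archived.
tar --listed-incremental=snapshot.file -czf full.tar.gz data

# Add a file, then run again with the same snapshot file.
echo "second" > data/new.txt
tar --listed-incremental=snapshot.file -czf incr.tar.gz data

# List the incremental archive to see what it captured.
tar -tzf incr.tar.gz
```

Restoring means extracting the full archive first and then each incremental archive in order.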
|
||
#### Remote Archiving | ||
#### Secure Remote Deliveries | ||
|
||
Archive a directory and pipe it through SSH to another machine. | ||
Directly ship your bundled assets to a remote location securely over SSH. | ||
|
||
```bash | ||
tar czf - /path/to/dir | ssh user@host "cat > backup.tar.gz" | ||
tar czf - target_directory/ | ssh user@remote "cat > remote_backup.tar.gz" | ||
``` | ||
|
||
--- | ||
|
||
## `gzip` | ||
## `gzip` - Optimizing for the Long Haul | ||
|
||
### Overview | ||
|
||
`gzip` is optimized for high compression ratios. | ||
`gzip` favors speed, offering fast compression with good (though not the highest) ratios, which makes it a common choice for transfers over constrained networks. | ||
|
||
### Advanced Usage | ||
|
||
#### Compression with a Name Suffix | ||
#### Custom Identifiers | ||
|
||
Compress files and add a suffix. | ||
Mark your compressed files with custom suffixes for easy recognition. | ||
|
||
```bash | ||
gzip -S .archive large_file.txt | ||
gzip -S .custom_suffix large_file | ||
``` | ||
|
||
#### Concatenating Archives | ||
#### Efficient Archiving | ||
|
||
Multiple `.gz` files can be concatenated into one. | ||
Combine multiple archives into a single streamlined package. | ||
|
||
```bash | ||
cat file1.gz file2.gz > combined.gz | ||
cat archive_part1.gz archive_part2.gz > combined_archive.gz | ||
``` | ||
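
A short sketch of why concatenation works: each `.gz` file is a complete gzip member, and `gzip -d` decompresses the members of a combined file one after another. The file names and contents below are hypothetical.

```shell
#!/bin/sh
# Concatenate two gzip files and confirm the result decompresses to both payloads.
workdir=$(mktemp -d)
printf 'alpha\n' > "$workdir/part1.txt"
printf 'beta\n'  > "$workdir/part2.txt"

gzip -c "$workdir/part1.txt" > "$workdir/part1.gz"
gzip -c "$workdir/part2.txt" > "$workdir/part2.gz"

# Plain byte-level concatenation; no recompression is needed.
cat "$workdir/part1.gz" "$workdir/part2.gz" > "$workdir/combined.gz"

combined=$(gzip -dc "$workdir/combined.gz")
printf '%s\n' "$combined"   # prints "alpha" then "beta"
```

The decompressed output is a single stream, so the original file boundaries are not recoverable from the combined file; use `tar` first if they matter.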
|
||
--- | ||
|
||
## `bzip2` | ||
## `bzip2` - The Heavy Lifter | ||
|
||
### Overview | ||
|
||
`bzip2` usually offers better compression ratios compared to `gzip`. | ||
`bzip2` excels in heavy-duty compression, providing superior efficiency at the cost of speed, suitable for large-scale archival. | ||
|
||
### Advanced Usage | ||
|
||
#### Decompress to STDOUT | ||
#### Direct Output | ||
|
||
Decompress directly to the standard output. | ||
Stream decompressed data for immediate use or further processing. | ||
|
||
```bash | ||
bzip2 -dc file.bz2 | ||
bzip2 -dc archive.bz2 > output_file | ||
``` | ||
|
||
#### Parallel Compression | ||
#### Accelerated Compression | ||
|
||
Use `pbzip2` for parallel and faster compression. | ||
Utilize parallel processing to compress large files more quickly. | ||
|
||
```bash | ||
pbzip2 -p4 large_file.txt | ||
pbzip2 -p4 massive_file | ||
``` | ||
|
||
--- | ||
|
||
## Comparison of Algorithms | ||
## Comparison of Compression Algorithms | ||
|
||
- **Deflate**: Used in `zip` and `gzip`, offers fast compression but somewhat lower ratios. | ||
- **Bzip2**: Slower but offers better compression ratios. | ||
- **Deflate** (used by `zip` and `gzip`): Fast and efficient for everyday use. | ||
- **Bzip2**: Trades speed for superior compression, ideal for large archives. | ||
|
||
--- | ||
|
||
## Best Practices | ||
## Best Practices in File Compression | ||
|
||
- Consider the CPU cost when choosing a compression level. | ||
- For long-term storage, use stable and well-supported formats. | ||
- Always check the integrity of compressed archives before and after transferring them. | ||
- **Resource Management**: Balance compression ratio and CPU usage to match your operational needs. | ||
- **Archival Integrity**: Use robust formats and verify archives to ensure data integrity over time. | ||
- **Strategic Selection**: Choose the compression tool and level based on your specific requirements, considering factors like speed, size, and computational resources. |
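
One way to verify an archive, sketched here under the assumption that it is gzip-compressed, is `gzip -t`, which tests the compressed data without writing any output. The truncated copy below is a hypothetical stand-in for an interrupted transfer.

```shell
#!/bin/sh
# Test a good archive and a deliberately truncated one with `gzip -t`.
workdir=$(mktemp -d)
seq 1 1000 > "$workdir/data.txt"
gzip -c "$workdir/data.txt" > "$workdir/good.gz"

# Simulate an interrupted transfer by keeping only the first 20 bytes.
head -c 20 "$workdir/good.gz" > "$workdir/truncated.gz"

if gzip -t "$workdir/good.gz" 2>/dev/null; then good_status="ok"; else good_status="corrupt"; fi
if gzip -t "$workdir/truncated.gz" 2>/dev/null; then bad_status="ok"; else bad_status="corrupt"; fi

echo "good.gz: $good_status, truncated.gz: $bad_status"
```

`bzip2 -t` and `tar -tzf` offer comparable checks for their own formats.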