Backs up hosts using rdiff-backup.
Chef >= 12.18
yum
(for installing rdiff-backup)yum-epel
(for installing rdiff-backup)
sudo
(for reading files to back up and so the nrpe user can read the status of repos)nrpe
(for nagios alerts regarding backup statuses)
nagios
(for nagios alerts regarding backup statuses)
- Can back up clients managed by Chef
- Can back up clients not managed by Chef
- Can send Nagios alerts if backups fail, start late, run too long, grow too quickly, or become corrupted
- [Planned] Can back up MySQL and PostgreSQL databases
Jobs can be specified through node definitions, roles, cookbooks, and/or
databags for both the rdiff-backup server and each rdiff-backup client. Jobs
run within a certain time period once every night. One rdiff-backup server may
find and back up all hosts on the same chef server (or just within the same
environment) via Chef search, and is capable of backing up non-chef hosts as
well, as long as a databag is created for it. When Chef runs on the
rdiff-backup server, jobs are created for each directory on each host to be
backed up. Each job has an entry added to the crontab (at
/etc/cron.d/rdiff-backup
) and a backup script added to
/home/rdiff-backup-server/scripts
. These simple bash scripts are generally
run by cron, but may be run manually at any time to start a job if it is not
already running. A job will back up the source directory on the given host to
/data/rdiff-backup
(or other specified directory) on the rdiff-backup server.
Additional log files are generated at /var/log/rdiff-backup
, and include both
a general log for all backups as well as more verbose logs for each individual
job.
The rdiff-backup server may also act as a client and back up its own local directories. This is useful for backing up locally-mounted network filesystems. Backing up the rdiff-backup data directory is not recommended.
It is worth noting that rdiff-backup is run with --exclude-other-filesystems
by default, which means that subdirectories on different volumes/filesystems
will not be backed up. For example, if /var and /var/www are on different
filesystems, both must be specified as backup dirs explicitly to back both of
them up, as rdiff-backup will not recurse into /var/www from within the /var
backup.
If Nagios alerts are enabled, checks will be added for new backups after they run for the first time, after which they will alert if the backups fail, start late, run too long, grow too quickly, or become corrupted.
node['rdiff-backup']['server']['start-hour']
- Earliest hour of the day to schedule jobs, default "13"node['rdiff-backup']['server']['end-hour']
- Latest hour of the day to schedule jobs, default "23"node['rdiff-backup']['server']['restrict-to-own-environment']
- Whether to back up all rdiff-backup clients or only the ones in the same environment as the server, default "true"node['rdiff-backup']['server']['restrict-to-environments']
- Ifrestrict-to-own-environment
is false, the rdiff-backup server will only back up clients in this given list of environments, defaults to all environments (effectively disabling all environment restrictions)node['rdiff-backup']['server']['mailto']
- The email address(es) (comma delimited) to mail cron reports to, default "" (no mail is sent)node['rdiff-backup']['server']['nagios']['alerts']
- Whether to provide Nagios alerts for the status of each job, default "true" (must be enabled for any alerts to be created)node['rdiff-backup']['server']['user']
- User to run backups with on the server side, default "rdiff-backup-server"node['rdiff-backup']['server']['jobs']
- Map of jobs to run for all clients, where each key is a source dir to back up (or 'default') and each value is a hash of attributes specific to that job, defaults below
node['rdiff-backup']['client']['ssh_keys']
- Array of public ssh keys. default[]
node['rdiff-backup']['client']['ssh-port']
- SSH port to connect to, default "22"node['rdiff-backup']['client']['user']
- User to run backups with on the client side, default "rdiff-backup-client"node['rdiff-backup']['client']['jobs']
- Map of jobs to run for a particular client, where each key is a source dir to back up (or 'default') and each value is a hash of attributes specific to that job, default empty
Job attributes may be set on either the server or the client. When set on the server, the attributes will be used as the defaults for all jobs unless overridden by job attributes that are set on a client.
-
node['rdiff-backup']['<server|client>']['jobs']['default']
- Map of optional attributes to be used as defaults for all jobs on this client, defaults below -
node['rdiff-backup']['<server|client>']['jobs']['default']['destination-dir']
- Location to store backups on the server side, default "/data/rdiff-backup" -
node['rdiff-backup']['<server|client>']['jobs']['default']['exclude-dirs']
- A set of files and directories not to ignore (not back up), default empty -
node['rdiff-backup']['<server|client>']['jobs']['default']['retention-period']
- String defining how long to keep backups, default "3M" (see rdiff-backup manual for --remove-older-than format) -
node['rdiff-backup']['<server|client>']['jobs']['default']['additional-args']
- String of additional arguments to pass to rdiff-backup, default "" -
node['rdiff-backup']['<server|client>']['jobs']['default']['nagios']['alerts']
- Whether to provide Nagios alerts for the status of the backup, default "true" (no effect if server has all alerts disabled) -
node['rdiff-backup']['<server|client>']['jobs']['default']['nagios']['max-change']
- How many megabytes the backup repo can change by from a single backup before a warning alert is sent, default 1024 -
node['rdiff-backup']['<server|client>']['jobs']['default']['nagios']['max-late-start']
- How late (in hours) the job can start before a critical alert is sent, default 2 -
node['rdiff-backup']['<server|client>']['jobs']['default']['nagios']['max-late-finish-warning']
- How long (in hours) the job can run before a warning alert is sent, default 4 -
node['rdiff-backup']['<server|client>']['jobs']['default']['nagios']['max-late-finish-critical']
- How long (in hours) the job can run before a critical alert is sent, default 8 -
node['rdiff-backup']['<server|client>']['jobs']['default']['retention-period'] = '3M'
- Example of how to set the default retention-period for all jobs -
node['rdiff-backup']['<server|client>']['jobs']['/etc']['retention-period'] = '2W'
- Example of how to set the retention-period for the/etc
job, overriding the default
Attributes for jobs are applied in the following order, with later attributes overriding earlier ones:
-
Cookbook defaults, as specified above as well as at the top of recipes/server.rb and recipes/client.rb. These are the base defaults.
-
Server node default attributes, as specified by
['rdiff-backup']['server']['jobs']['default']
in the server's node definition (usually set through roles). -
Server databag default attributes, as specified by
['rdiff-backup']['server']['jobs']['default']
in thedata_bags/rdiff-backup_hosts/myserver_mydomain_com.json
databag. (Optional) -
Client node default attributes, as specified by
['rdiff-backup']['client']['jobs']['default']
in the client's node definition (usually set through roles). -
Client databag default attributes, as specified by
['rdiff-backup']['client']['jobs']['default']
in thedata_bags/rdiff-backup_hosts/myclient_mydomain_com.json
databag. (Optional) -
Server node job-specific attributes, as specified by
['rdiff-backup']['server']['jobs']['/path/to/back/up']
in the server's node definition (usually set through roles). (Not usually used) -
Server databag job-specific attributes, as specified by
['rdiff-backup']['server']['jobs']['/path/to/back/up']
in thedata_bags/rdiff-backup_hosts/myserver_mydomain_com.json
databag. (Optional, not usually used) -
Client node job-specific attributes, as specified by
['rdiff-backup']['client']['jobs']['/path/to/back/up']
in the client's node definition (usually set through roles). -
Client databag job-specific attributes, as specified by
['rdiff-backup']['client']['jobs']['/path/to/back/up']
in thedata_bags/rdiff-backup_hosts/myclient_mydomain_com.json
databag. (Optional)
In general, it may be easier to remember that databag attributes override node attributes, client attributes override server attributes, and job-specific attributes override 'default' job attributes. It is also possible to simply ignore most of the order and set up backups entirely through one level, such as entirely through roles or entirely through databags.
To set up an rdiff-backup server, first generate an ssh keypair (such as with ssh-keygen
) and add the public key to
node['rdiff-backup']['client']['ssh_keys']
Example:
node.default['rdiff-backup']['client']['ssh_keys'] =
[
'no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,command=\"sudo rdiff-backup --server --restrict-read-only /\" ssh-rsa aTjnzpFeQ1kE69Vi3krV58YM1ZcUg7JgbYR337eE== rdiff-backup client'
]
include_recipe 'rdiff-backup::client'
To allow two rdiff-backup servers to run independently of each other in different environments, each with their own keys and clients, you must use different usernames.
Include recipe[rdiff-backup::server]
in your node's run_list
, and copy the
ssh private key to the rdiff-backup server's ~/.ssh/id_rsa
.
Include recipe[rdiff-backup]
or recipe[rdiff-backup::client]
in your node's
run_list
. If the server has the jobs you want specified, those will be used.
Otherwise, set the node['rdiff-backup']['client']['jobs']
attribute for it
(preferably in a role or databag). All other attributes are optional, and the
defaults of the cookbook and/or rdiff-backup server will be used.
Install rdiff-backup on the client, create the backup user, give it
passwordless sudo access, and add the server user's pubkey to its
authorized_keys
. Then, in the rdiff-backup_hosts
databag, create an entry
with the client's info. At the very least, each entry must have an id that
corresponds to its fqdn (with underscores replacing all periods), but they may
also specify rdiff-backup client attributes, as well as an "environment" to
make sure the host gets backed up by the right server. If not specified in the
databag, the environment will default to _default
and the rdiff-backup client
attributes will default to whatever client attributes the server has. Managed
hosts can also have databag entries if managing their attributes through node
definitions is not ideal. Note that if a managed host has a databag entry, any
attributes set in the host's node definition will be ignored entirely.
Example data_bags/rdiff-backup_hosts/myclient_mydomain_com.json
:
{
"id": "myclient_mydomain_com",
"chef_environment": "dev",
"rdiff-backup": {
"client": {
"ssh-port": 2222,
"user": "my-special-rdiff-backup-user",
"jobs": {
"default": {
"retention-period": "3M",
"exclude-dirs": [
"/etc/keys",
"/var/log/apache",
"/var/log/nginx"
]
},
"/etc": {},
"/home": {},
"/var/log": {
"retention-period": "2W"
}
}
}
}
}
Additionally, a databag for the rdiff-backup server may contain both server attributes (to be used as defaults for all jobs) and client attributes (for backing up itself).
- ChefDK 2.5.3 or later
- Terraform
- kitchen-terraform
- OpenStack cluster
Ensure you have the following in your .bashrc
(or similar):
export TF_VAR_ssh_key_name="$OS_SSH_KEYPAIR"
This cookbook contains a kitchen-terraform
environment that can be used to test multi-node configurations. You can spin up
this environment using kitchen test multi-node
.
If you want to create/converge only 1 VM you can use any of these commands:
VM | Command |
---|---|
client | terraform apply -target openstack_compute_instance_v2.client |
server | terraform apply -target openstack_compute_instance_v2.server |
You can login to a VM by using one of these commands:
$ ssh centos@$(terraform output client)
$ ssh centos@$(terraform output server)
When you are done testing the provisioning environment run
kitchen destroy multi-node
.
Here is a table showing what each IP points to:
Host | IP |
---|---|
client | 192.168.60.11 |
server | 192.168.60.12 |
All of these nodes are configured using a Chef Server which is a container running chef-zero. You can interact with the chef-zero server by doing the following:
$ CHEF_SERVER="$(terraform output chef_zero)" knife node list -c test/chef-config/knife.rb
client
server
In addition, on any node that has been deployed, you can re-run chef-client
like you normally would on a production system. This should allow you to do
development on your multi-node environment as needed. Just make sure you
include the knife config otherwise you will be interacting with our production
Chef server!
You do not need to use kitchen-terraform directly if you're just doing development. It's primarily useful for testing the multi-node cluster using inspec. You can simply deploy the cluster using terraform directly by doing the following:
# Sanity check
$ terraform plan
# Deploy the cluster
$ terraform apply
# Destroy the cluster
$ terraform destroy