Chapter 7

Backups & Snapshots

Crash-consistent vs. application-aware copies, and the 3-2-1 rule.

Learning objectives

  • Distinguish snapshots from backups and know when to use each
  • Configure Proxmox Backup Server or vzdump schedules
  • Define RPO and RTO for Workshop Co. workloads

Snapshots are not backups

A snapshot captures a VM's disk state at a moment in time and is stored on the same storage pool as the VM. It is fast, which makes it a good safety net before a risky change. It is not a backup because:

  • Same disk failure kills snapshots and VM together
  • Snapshots grow and can hurt performance if kept for weeks
  • Ransomware on the host can encrypt snapshot chains

A backup copies data to separate media — another disk, Proxmox Backup Server (PBS), or off-site object storage. Workshop Co. keeps snapshots for hours, backups for months.

Production rule

Never rely on a month-old snapshot as your only recovery path. Marcus deletes pre-upgrade snapshots within 48 hours, once he has verified that the upgrade succeeded.

RPO and RTO

Term | Meaning | Workshop Co. target
RPO (Recovery Point Objective) | Max acceptable data loss | Database: 1 hour; website: 24 hours
RTO (Recovery Time Objective) | Max downtime to restore | Website: 4 hours; booking db: 2 hours
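One way to read the RPO targets above is to compare them with how often a backup runs. The sketch below is a minimal check under a simplifying assumption: worst-case data loss is roughly the time between two backups. The helper name `meets_rpo` is hypothetical. Under that assumption, a nightly job alone covers the website's 24-hour RPO but not the database's 1-hour RPO; the chapter does not say which more frequent mechanism covers that gap.

```shell
# Minimal sketch: compare a backup interval against an RPO target (both in hours).
# Assumption: worst-case data loss is roughly the time between two backups.
# meets_rpo is a hypothetical helper, not a Proxmox command.
meets_rpo() {
  interval_hours=$1
  rpo_hours=$2
  if [ "$interval_hours" -le "$rpo_hours" ]; then
    echo "ok"
  else
    echo "too infrequent"
  fi
}

meets_rpo 24 24   # website: nightly backup vs 24-hour RPO
meets_rpo 24 1    # database: nightly backup vs 1-hour RPO
```

The first call prints `ok` and the second prints `too infrequent`.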

Worked example — Proxmox backup job

Marcus schedules a nightly vzdump job for all production VMs, writing to a Swift Host storage box over NFS:

# /etc/pve/vzdump.cron or Datacenter → Backup → Add
# Daily at 02:00 Edmonton time: after classes, before morning prep
# --mode snapshot backs up running VMs without shutting them down
vzdump 110 120 130 --mode snapshot --storage nfs-backup-swift \
  --compress zstd --mailto admin@workshopco.ca --notes-template '{{guestname}}'

Every week he verifies that a restore actually works: he restores the VM 120 backup to an isolated test VM (ID 220), runs pg_verifybackup, then deletes the test VM.
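The weekly restore test could be scripted roughly as follows. This is a sketch, not Marcus's actual procedure: the archive path, the target storage `local-lvm`, the interface name `net0`, and the base backup path inside the guest are all placeholders, and it assumes the QEMU guest agent is installed in the VM so that `qm guest exec` works. With `DRY_RUN=1` (the default here) it only prints the commands.

```shell
# Sketch of a weekly restore test. DRY_RUN=1 (the default) only prints commands.
# Assumptions: archive path, storage "local-lvm", NIC "net0" and the base backup
# path in the guest are placeholders; the QEMU guest agent runs in the VM.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restore_test() {
  archive=$1      # vzdump archive of VM 120 (placeholder path)
  test_vmid=$2    # isolated test VM ID, e.g. 220

  # Restore, drop the NIC so the copy stays off the network, then start it.
  # A real run should wait for the guest agent before the verification step
  # (for example by polling `qm guest cmd <vmid> ping`).
  run qmrestore "$archive" "$test_vmid" --storage local-lvm --unique 1 &&
  run qm set "$test_vmid" --delete net0 &&
  run qm start "$test_vmid" &&
  run qm guest exec "$test_vmid" -- pg_verifybackup /var/backups/pg_base
  status=$?

  # Always clean up the test VM, whatever the verification result was.
  run qm stop "$test_vmid"
  run qm destroy "$test_vmid" --purge 1
  return "$status"
}

restore_test /mnt/pve/nfs-backup-swift/dump/vzdump-qemu-120-EXAMPLE.vma.zst 220
```

Keeping the cleanup outside the `&&` chain means a failed verification still removes the test VM, while the function's exit status reports the failure.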

Snapshot workflow — PostgreSQL upgrade

  1. Notify staff: maintenance window Saturday 6 AM
  2. Stop the app on the web VM; run systemctl stop postgresql on the db VM
  3. qm snapshot 120 pre-pg16 --description "Before PG 16"
  4. Run the upgrade (test the booking flow on staging first)
  5. Success → remove snapshot; failure → qm rollback 120 pre-pg16
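The snapshot, upgrade, and rollback steps above can be sketched as a small wrapper. The function name `guarded_upgrade` is hypothetical, and the upgrade command is a placeholder for whatever actually performs the PostgreSQL upgrade. With `DRY_RUN=1` (the default here) the `qm` commands are only printed; note that the placeholder upgrade command itself still runs.

```shell
# Sketch: snapshot, upgrade, then remove the snapshot or roll back.
# DRY_RUN=1 (the default) prints the qm commands instead of running them.
# The upgrade command passed in is a placeholder and is always executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

guarded_upgrade() {
  vmid=$1
  snap=$2
  shift 2       # remaining arguments: the upgrade command

  run qm snapshot "$vmid" "$snap" --description "Before PG 16" || return 1

  if "$@"; then
    run qm delsnapshot "$vmid" "$snap"    # success: remove the snapshot
  else
    run qm rollback "$vmid" "$snap"       # failure: return to the pre-upgrade state
    return 1
  fi
}

guarded_upgrade 120 pre-pg16 true    # "true" stands in for a successful upgrade
```

Passing `false` instead of `true` exercises the rollback branch, which is a cheap way to rehearse the failure path before the maintenance window.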

ESXi backup notes

On the Calgary MSP ESXi cluster, Workshop Co.'s DR replica uses Veeam or native VMware backup APIs. Marcus receives monthly restore-test reports, the same discipline he applies on Proxmox.

Try it yourself

Workshop Co.'s Nextcloud holds irreplaceable instructor PDFs. Design a backup policy that covers:

  1. Snapshot frequency (if any)
  2. Backup frequency and retention
  3. Off-site copy location (Canadian preference)

Sample policy

  • Snapshots: only before major Nextcloud upgrades; delete within 24 hours
  • Backups: nightly full, 30 daily + 12 weekly retained
  • Off-site: Proxmox Backup Server replica to Swift Host Montreal region, under a PIPEDA-aligned contract
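If the nightly backups are taken with vzdump, the retention line of the sample policy could map onto its `--prune-backups` option, which accepts `keep-daily` and `keep-weekly` values. The helper `prune_spec` below is hypothetical and only builds the option string; `<vmid>` and `<storage>` are left as placeholders. Treat it as a sketch of the mapping, since the exact set of backups kept depends on how Proxmox applies the two rules together.

```shell
# Sketch: build a vzdump --prune-backups value from a retention policy.
# prune_spec is a hypothetical helper; keep-daily and keep-weekly are
# vzdump prune options.
prune_spec() {
  daily=$1
  weekly=$2
  echo "keep-daily=${daily},keep-weekly=${weekly}"
}

# 30 daily + 12 weekly, as in the sample policy
echo "vzdump <vmid> --storage <storage> --prune-backups $(prune_spec 30 12)"
```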

Check your understanding

  1. Can you restore a vzdump backup if the Proxmox host SSD dies?
  2. Why stop PostgreSQL before a crash-consistent snapshot?

Answers

  1. Yes. Backups on NFS or off-site storage are independent of a local pool failure, provided you have tested that they restore.
  2. A crash-consistent snapshot captures the disks as if power had been cut mid-write, which risks inconsistent database pages on restore. Stop (quiesce) the database, or use a guest agent freeze, to get an application-consistent copy.