Internal architecture, operation modes, Veeam-style DR replication, NBD-based instant recovery, cross-hypervisor conversion (vSphere/Hyper-V → PVE), and the security model with TLS fingerprint pinning.
Technical companion to the PodHeitor Proxmox plugin page.
1. The problem: stock Bacula is hypervisor-blind
Bacula Community in its stock form has no hypervisor awareness. Backing up Proxmox VMs without a plugin typically falls into one of three options, all bad:
- Host filesystem backup — captures `.qcow2`/`.raw` files in an inconsistent state, no quiesce, no CBT. Restored VMs frequently boot into corruption.
- `vzdump` + directory dump, then Bacula — doubles the storage footprint (1× PVE dataset + 1× `vzdump` output), and every incremental retransmits the entire dump because `vzdump` does not emit native deltas.
- Bacula Enterprise PVE plugin — exists, but at enterprise pricing, with no cross-site DR replication and no cross-hypervisor conversion.
The PodHeitor Proxmox Plugin closes all three gaps in a single binary: VM-aware backup with CBT, Veeam-style cross-node DR replication, and cross-hypervisor restore (vSphere/Hyper-V → PVE).
2. Architectural model
The plugin follows the PodHeitor pattern of cdylib + standalone backend, communicating over PTCOMM (length-tagged framing on stdin/stdout). The motivation is threefold:
- Crash isolation. A panic in the NBD or QMP engine kills the backend, not `bacula-fd`. The cdylib observes EOF on the pipe, reports the job as failed, and the FD keeps serving other jobs.
- Parallelism freedom. The backend can open PVE REST + NBD + QMP connections in parallel without violating Bacula’s “one thread per `bpContext`” contract.
- License firewall. Since v2.0.0, no Bacula AGPLv3 source is statically linked.
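PTCOMM’s exact wire format is not published here; as a sketch, length-tagged framing over stdin/stdout reduces to a fixed-width length prefix per message. The 4-byte big-endian prefix below is an assumption, not the protocol’s documented layout:

```python
import struct

def write_frame(stream, payload: bytes) -> None:
    # Length-tagged framing: 4-byte big-endian length prefix, then payload.
    stream.write(struct.pack(">I", len(payload)) + payload)

def read_frame(stream) -> bytes:
    header = stream.read(4)
    if len(header) < 4:
        # EOF mid-header: the peer died. The cdylib maps this to a failed job.
        raise EOFError("peer closed the pipe")
    (length,) = struct.unpack(">I", header)
    return stream.read(length)
```

Framing like this is what lets the cdylib detect a backend crash cheaply: a short read on the prefix is unambiguous, with no in-band sentinel needed.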
2.1 Process topology
```
bacula-fd
 └─ podheitor-proxmox-fd.so (cdylib, ~600 LoC)
     └─ podheitor-proxmox-backend (Rust, ~4500 LoC)
         ├─ PVE REST API (HTTPS, TLS pinned)
         ├─ NBD client (disk I/O)
         └─ QMP client (snapshot + dirty bitmap)
```
The backend hosts five engines: BackupEngine, RestoreEngine, ReplicationSender, ReplicationReceiver, and InstantRecoveryEngine. The active engine is selected via the plugin string’s `mode=` parameter.
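For orientation, a FileSet entry selecting the backup engine could look like the sketch below. Only `mode=`, `pve_fingerprint=`, and `pve_insecure=` appear in this document; the plugin-string prefix and the `vm=` parameter are illustrative guesses, so check the plugin’s reference documentation for the real names:

```
FileSet {
  Name = "pve-vm-100"
  Include {
    Options { signature = MD5 }
    # Hypothetical names except mode=, pve_fingerprint=, pve_insecure=
    Plugin = "podheitor-proxmox: mode=backup vm=100 pve_fingerprint=AA:BB:CC:... pve_insecure=no"
  }
}
```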
3. Operation modes
| Mode | Function | Engine |
|---|---|---|
| `backup` | VM-aware backup (Full/Inc/Diff) with CBT via QEMU dirty bitmaps | BackupEngine |
| `seed` | Initial full sync; auto-provisions the replica VM on the DR target | ReplicationSender |
| `incremental` (DR) | CBT-only incremental replication — dirty-bitmap deltas | ReplicationSender |
| `receiver` | DR-target receiver daemon — listens on TCP `dr_port` (9190) | ReplicationReceiver |
| `failover-exec` | Boots the replica on DR (planned failover) | ReplicationSender |
| `failback-pre` | Returns the replica to standby | ReplicationSender |
4. CBT via QEMU dirty bitmaps
Instead of re-sending 100 GB every night, the plugin installs a persistent dirty bitmap in PVE’s QEMU through the QMP command `block-dirty-bitmap-add`. Every incremental:

- Takes a consistent snapshot (with `quiesce=yes` via the QEMU Guest Agent when available).
- Reads only the blocks flagged dirty since the last backup, via NBD `BLOCK_STATUS`.
- Streams those blocks over PTCOMM with offsets preserved.
- Resets the bitmap once the backup terminates OK.
A 100 GB VM with 2 GB modified transfers only 2 GB. Without CBT (stock Bacula on the host filesystem), it would be 100 GB every night.
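The savings can be sketched with hypothetical extent data: the backend receives (offset, length) pairs for the dirty regions and streams only those bytes. The extent values below are made up for illustration:

```python
DISK_SIZE = 100 * 1024**3  # 100 GiB virtual disk

# Hypothetical dirty extents reported via NBD BLOCK_STATUS since the last
# backup: (offset, length) pairs totalling 2 GiB.
dirty_extents = [
    (0, 512 * 1024**2),              # first 512 MiB rewritten
    (10 * 1024**3, 1536 * 1024**2),  # 1.5 GiB modified at the 10 GiB mark
]

def bytes_to_stream(extents):
    # Only the dirty ranges are read over NBD and framed onto PTCOMM;
    # offsets travel with the data so the restore side can patch in place.
    return sum(length for _, length in extents)
```

Here `bytes_to_stream(dirty_extents)` is 2 GiB — 2% of the disk — matching the 100 GB → 2 GB figure above.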
5. Veeam-style DR replication (v1.1.0)
The plugin implements a cross-node PVE-1 → PVE-2 DR pipeline with semantics close to Veeam Replication, but driven entirely from the Bacula Director (FileSet/Job).
5.1 Phases
- Seed (`mode=seed`): initial full sync. Auto-provisions the replica VM on the DR target (cores, RAM, NICs, SCSI controllers, storage spec).
- Continuous incremental (`mode=incremental`): dirty-bitmap deltas only.
- Restore points: snapshots auto-rotated on the replica (default 7 points).
- Verify (`verify_sample_blocks=N`): FNV-1a-64 hash of N sample blocks compared source ↔ DR. A mismatch fails the job (not silently OK).
- Planned failover (`mode=failover-exec`): a single run boots the replica on DR.
- Planned failback (`mode=failback-pre`): returns the replica to standby.
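The sampling scheme is not specified here, but FNV-1a-64 itself is a standard, well-defined hash; a minimal Python reference for the per-block digest might be:

```python
FNV64_OFFSET = 0xcbf29ce484222325  # FNV-1a 64-bit offset basis
FNV64_PRIME = 0x100000001b3        # FNV-1a 64-bit prime

def fnv1a_64(data: bytes) -> int:
    # FNV-1a order: XOR the byte in first, then multiply by the prime mod 2^64.
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h
```

FNV-1a is a reasonable choice for integrity sampling: it is fast, dependency-free, and adequate for detecting accidental corruption, though it is not cryptographic and would not resist a deliberate forgery.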
5.2 DR channel authentication
| Method | Parameters | When to use |
|---|---|---|
| PSK token (HMAC) | `dr_auth_token` | Default; quick setup between two controlled sites |
| TLS mutual auth | `dr_auth_cert` + `dr_auth_key` | Compliance / multi-tenant; rustls + PEM |
| Both | all three | Defense in depth |
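The PSK path is described only as “HMAC”; a plausible challenge-response keyed by the shared `dr_auth_token` looks like the sketch below. The hash choice (SHA-256) and the message flow are assumptions, not the plugin’s documented protocol:

```python
import hashlib
import hmac

def make_auth_response(token: bytes, challenge: bytes) -> bytes:
    # HMAC over the receiver's random challenge, keyed by the shared token:
    # proves possession of dr_auth_token without sending it on the wire.
    return hmac.new(token, challenge, hashlib.sha256).digest()

def verify_auth(token: bytes, challenge: bytes, response: bytes) -> bool:
    expected = make_auth_response(token, challenge)
    # Constant-time comparison to avoid a timing side channel.
    return hmac.compare_digest(expected, response)
```

A fresh random challenge per connection is what prevents replay: capturing one valid response is useless against the next challenge.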
5.3 Standalone receiver daemon
The DR target does not need to run bacula-fd. The package installs the systemd template [email protected]; the receiver listens on dr_port (default 9190) and accepts authenticated streams, writing disks over NBD to the local PVE’s dr_storage. This reduces attack surface on DR and simplifies partial air-gap.
6. Instant Recovery via NBD overlay
Traditional recovery of a 500 GB VM can take hours of disk-write time before the service is back. The InstantRecoveryEngine bypasses this:
- Boots the VM in PVE pointed at a virtual disk served over NBD by the backend, reading directly from the Bacula restore stream.
- Overlay writes (`ir_overlay_storage`) capture guest changes on fast local storage.
- In the background, with `ir_auto_migrate=yes`, the disk is migrated to final storage (`ir_target_storage`) without downtime.

Typical RTO drops from hours to minutes. Key parameters: `ir_nbd_bind`, `ir_nbd_port`, `ir_overlay_storage`, `ir_target_storage`, `ir_timeout` (default 3600 s).
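The overlay read path can be sketched as a copy-on-write map in front of the restore stream. Block size and data structures below are illustrative, not the plugin’s actual Rust NBD server:

```python
class OverlayDisk:
    """Instant-recovery read path sketch: guest writes land in a local
    overlay; reads of untouched blocks fall back to the backup stream."""

    BLOCK = 4096

    def __init__(self, backing_read):
        self.backing_read = backing_read  # reads from the Bacula restore stream
        self.overlay = {}                 # block index -> locally written data

    def write_block(self, idx: int, data: bytes) -> None:
        # Guest change is captured on fast local storage (ir_overlay_storage).
        self.overlay[idx] = data

    def read_block(self, idx: int) -> bytes:
        if idx in self.overlay:
            return self.overlay[idx]      # modified block: serve the overlay
        return self.backing_read(idx)     # untouched block: serve the backup
```

Background migration then only has to merge overlay blocks over the restored base image, which is why it can proceed while the guest keeps running.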
7. Cross-hypervisor conversion
The plugin reads backups produced by sister plugins PodHeitor vSphere and PodHeitor Hyper-V and restores them directly into PVE — no manual reconversion:
| Source | Source disk format | Conversion |
|---|---|---|
| VMware vSphere | VMDK | VMDK → qcow2/raw via in-process library (not a shell-out) |
| Hyper-V | VHDX | VHDX → qcow2/raw via in-process library |
Lab-validated: Job 805 (Hyper-V → PVE) and Job 865 (VMware → PVE) restored VMs with successful boot on the destination PVE.
8. Security model
8.1 TLS Fingerprint Pinning (PVE API)
Stock Bacula typically trusts the system trust store; PVE certificates are self-signed by default. The plugin enforces explicit SHA-256 fingerprint pinning via pve_fingerprint=AA:BB:CC:..., with pve_insecure=no as the default. To obtain the fingerprint:
```
openssl s_client -connect pve-host:8006 </dev/null 2>/dev/null \
  | openssl x509 -noout -fingerprint -sha256 \
  | sed 's/SHA256 Fingerprint=//'
```
A mismatch (rotated cert, MITM, swapped host) aborts the job with `ERROR: TLS fingerprint mismatch. Expected AA:BB:... got CC:DD:...`.
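The pin check itself is just a SHA-256 digest of the DER-encoded certificate, formatted like the `pve_fingerprint=` value. A sketch (the plugin’s real implementation is Rust/rustls; this only illustrates the comparison):

```python
import hashlib

def sha256_fingerprint(cert_der: bytes) -> str:
    # Colon-separated uppercase SHA-256 digest of the DER certificate,
    # the same shape as the pve_fingerprint=AA:BB:CC:... parameter.
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

def check_pin(cert_der: bytes, pinned: str) -> None:
    actual = sha256_fingerprint(cert_der)
    if actual != pinned.upper():
        raise ConnectionError(
            f"TLS fingerprint mismatch. Expected {pinned} got {actual}")
```

Pinning the leaf certificate directly sidesteps the system trust store entirely, which is exactly what a self-signed PVE deployment needs.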
8.2 Credentials
- Passwords and tokens are passed via the FileSet plugin string (not stored on disk by the plugin).
- `bacula-dir.conf` must have permissions `600`, owner `bacula`.
- For production: integrate an external vault (HashiCorp Vault, AWS Secrets Manager) and template the plugin string from the Director.
9. Lab validation
| Metric | Result |
|---|---|
| Sequential Bacula Jobs executed | 1,290+ JobIDs |
| Backup/Restore Full/Inc/Diff (same-host + cross-host) | OK |
| Cross-hypervisor Hyper-V → PVE (Job 805) | OK |
| Cross-hypervisor VMware → PVE (Job 865) | OK |
| Replication seed 100 GB | 107 GB in 93.7 min, 19 MB/s sustained |
| Replication incrementals | 15 / 15 back-to-back, 14.5 s avg |
| Integrity verify | 150 sample blocks, 0 mismatches |
| mTLS DR channel | v3 cert with IP SAN — handshake + integrity OK |
| Planned failover + failback | One-command, exit 0 |
| Bacula-driven JobId 3448 | Termination=OK |
Environment: Director Oracle Linux 9.6 + Bacula 15.0.3; PVE Site A Debian 12 + PVE 8.4.18; PVE Site B Debian 13 + PVE 9.x.
10. Documented anti-patterns
- Do not disable `pve_fingerprint` in production. Setting `pve_insecure=yes` accepts any certificate the PVE host presents — including MITM certs. Lab use only.
- Do not run `quiesce=yes` without the QEMU Guest Agent installed and active in the VM. Without the agent, the plugin auto-degrades to crash-consistent — but the operator must know this happened (check the job log).
- Do not run the receiver and sender on the same PVE host. Both open the same `dr_port`, and the second one fails to bind.
- Do not confuse `backup_type=incremental` (Bacula level) with `mode=incremental` (DR mode). The first is the FileSet’s backup level; the second is the replication engine mode. They are orthogonal.
11. License posture
Since v2.0.0, the plugin ships under LicenseRef-PodHeitor-Proprietary. No Bacula AGPLv3 source is statically linked into the .so. The cdylib is built from the pure-PodHeitor plugin-proxmox crate in the PodHeitor Rust cdylib workspace, with independent extern "C" bindings via the bacula-fd-abi crate.
Ready to evaluate?
30-day free trial for production Proxmox VE fleets. Guaranteed discount of at least 50% vs Bacula Enterprise, Veeam, or Commvault, with more capabilities included (DR replication, instant recovery, cross-hypervisor conversion).
Heitor Faria — Founder, PodHeitor International
✉ [email protected]
☎ +1 (789) 726-1749 · +55 (61) 98268-4220 (WhatsApp)
🔗 PodHeitor Proxmox plugin page