Technical whitepaper — PodHeitor File Replication (BRC) for Bacula

SD-side plugin that intercepts backup data in real time and replicates files to a local or network-mounted filesystem — files immediately available for instant recovery, no traditional restore job. Mirror, versioned retention, multi-site fan-out, BLAKE3 skip-unchanged, AES-256-GCM at rest, RPO/RTO compliance reporting.

Companion technical document to the PodHeitor BRC plugin page.

1. The problem: instant recovery requires restore

In stock Bacula Community, making a backed-up file available at a DR site requires (a) running a backup job, (b) replicating the Bacula volume to another SD, (c) running a restore job. Total time: minutes to hours, depending on size. For sub-minute DR SLAs, that’s impractical.

PodHeitor BRC inverts the order: files are replicated during the backup, not after it. The SD plugin intercepts the data stream before the volume write and forks a copy to a target filesystem (local, NFS, SMB, or DR disk). Result: the file is available for immediate use at the destination, while the traditional Bacula backup continues for long-term retention.

2. Architecture — SD cdylib + Rust backend

┌──────────┐    ┌──────────────┐    ┌──────────────────┐    ┌───────────────┐
│  File    │───▶│  Storage     │───▶│  Rust cdylib     │───▶│  Rust Backend │
│  Daemon  │    │  Daemon      │    │  (SD Plugin v13) │    │  (binary)     │
└──────────┘    └──────────────┘    └──────────────────┘    └───────┬───────┘
                                                                    │
                                              ┌─────────────────────┼──────────┐
                                              ▼                     ▼          ▼
                                    ┌──────────────┐    ┌─────────────┐  ┌──────────┐
                                    │ File Writer  │    │ RPO/RTO     │  │Dashboard │
                                    │ (target FS)  │    │ Reports     │  │ JSON     │
                                    └──────────────┘    └─────────────┘  └──────────┘
  • podheitor-replica-sd.so — pure Rust cdylib loaded by bacula-sd. Implements the Bacula SD Plugin API v13 binary contract in clean-room Rust. No Bacula source or header is copied or linked.
  • podheitor-replica-sd-backend — standalone binary that reads volume blocks from the configured FIFO, parses Bacula records, drives all replication / retention / encryption / dashboard logic.
  • Config: /opt/bacula/etc/podheitor-replica.conf (key=value)
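
To make the key=value format concrete, here is a minimal Python sketch of a loader for such a file. It is illustrative only and is not the backend's parser (the backend is Rust). The "#" comment syntax and the whitespace trimming are assumptions; the keys in the sample are the ones named elsewhere in this document.

```python
def parse_conf(text: str) -> dict[str, str]:
    """Parse key=value lines. Blank lines and '#' comments are skipped (assumed syntax)."""
    settings: dict[str, str] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values such as 'bwlimit=1M' survive intact.
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected key=value, got {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical sample values, for illustration only.
EXAMPLE = """\
# podheitor-replica.conf
skip_unchanged = yes
delete_removed = yes
targets = /mnt/dr-site-a;/mnt/dr-site-b:bwlimit=1M
"""

conf = parse_conf(EXAMPLE)
```

Splitting on the first "=" matters because option values such as the per-target bwlimit contain "=" themselves.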

Both components are 100% Rust. Building the package requires only cargo — no C compiler, no Bacula source tree.

3. Three pillars — B · R · C

Pillar           Capability
B (Backup)       Native Full / Incremental / Differential. Metadata fidelity (ACL, xattr, sparse, ownership, timestamps). FIFO zero-volume mode.
R (Replication)  Mirror and retention modes. Multi-site fan-out. BLAKE3 skip-unchanged. Bandwidth throttling. Consistency groups. Failover automation.
C (Conversion)   Stream decompression (zlib/LZ4). AES-256-GCM at rest. RPO/RTO compliance reporting (JSON + Markdown).

4. Replication modes

Mode       Behaviour                                                         Use case
mirror     1:1 dataset replica; delete_removed=yes removes orphans on Full   Warm DR site, dev/test refresh
retention  Per-file historical versions, controlled by retention_versions=N  Compliance, audit trail, time-travel
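
The two modes reduce to simple set and list operations. The Python sketch below shows the decision logic only, under stated assumptions: the function names are hypothetical, versions are identified by an integer timestamp, and the on-disk naming of versions is not described in this document.

```python
def orphans_to_delete(destination_files: set[str], full_backup_files: set[str]) -> set[str]:
    """mirror + delete_removed=yes: destination files that the Full no longer contains."""
    return destination_files - full_backup_files


def versions_to_prune(version_timestamps: list[int], retention_versions: int) -> list[int]:
    """retention mode: keep the newest N versions, return the older ones (oldest first)."""
    if retention_versions < 1:
        raise ValueError("retention_versions must be at least 1")
    newest_first = sorted(version_timestamps, reverse=True)
    return sorted(newest_first[retention_versions:])
```

For example, with versions stamped 100, 200, 300 and 400 and retention_versions=2, the sketch would prune 100 and 200 and keep the two newest.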

5. BLAKE3 skip-unchanged

With skip_unchanged=yes, the backend computes the BLAKE3 hash of the existing destination file before writing and compares it with the hash of the incoming data. If the hashes match, the write is skipped, so the destination filesystem receives no redundant write I/O.

  • BLAKE3 chosen for throughput: ~6 GB/s/core on x86_64 with AVX-512, vs ~1 GB/s for SHA-256.
  • Critical in mirror mode for large datasets where most files don’t change.
  • Pair with verify_after=yes for post-replication BLAKE3 verification (cross-check).
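
The skip decision described in this section can be sketched in a few lines of Python. This is an illustration, not the backend's code. Python's standard library has no BLAKE3, so the sketch swaps in BLAKE2b as a stand-in hash. The function names are hypothetical, and the sketch reads whole files into memory where a real implementation would stream.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(data: bytes) -> bytes:
    # Stand-in hash: BLAKE2b from the standard library, NOT BLAKE3.
    return hashlib.blake2b(data).digest()


def replicate_file(dest: Path, incoming: bytes, skip_unchanged: bool = True) -> bool:
    """Write `incoming` to `dest`. Return True if a write happened, False if it was skipped."""
    if skip_unchanged and dest.is_file():
        if digest(dest.read_bytes()) == digest(incoming):
            return False  # destination already holds identical content
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(incoming)
    return True


# Demo: first write, unchanged rewrite (skipped), changed rewrite.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "dr" / "report.txt"
    results = [
        replicate_file(target, b"v1"),
        replicate_file(target, b"v1"),
        replicate_file(target, b"v2"),
    ]
```

Note that the skip costs a read of the destination file; the saving is in writes, which is why it pays off mainly when most files are unchanged.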

6. FIFO mode — zero local I/O overhead

With fifo_path= pointing to a named pipe, the backend reads volume blocks directly from the FIFO instead of mirroring via the local filesystem. Result: zero local I/O overhead on the SD host. Bytes go from the File Daemon (FD) socket directly to the replication destination and never land on disk.

Use case: SD with tight I/O budget (ingest IOPS bound), replicating to SAN destination. FIFO mode avoids the “write to volume → read from volume → write to destination” round-trip.

7. Multi-site fan-out

The targets parameter accepts multiple destinations separated by ;, with per-target options:

targets = /mnt/dr-site-a;/mnt/dr-site-b:bwlimit=1M;/mnt/dr-site-c:bwlimit=500K
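
As an illustration of the syntax above, here is a small Python sketch that splits a targets value into paths and per-target options. It is not the backend's parser. The Target class and parse_targets name are hypothetical, and chaining several options with ":" is an assumption (the example above shows only one option per target). Option values are kept as raw strings because the units of bwlimit are not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    path: str
    options: dict[str, str] = field(default_factory=dict)


def parse_targets(value: str) -> list[Target]:
    """Split 'path[:key=val...];path...' into Target entries."""
    targets: list[Target] = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # First ':'-separated field is the path, the rest are key=value options.
        path, *raw_options = entry.split(":")
        options: dict[str, str] = {}
        for opt in raw_options:
            key, sep, val = opt.partition("=")
            if not sep:
                raise ValueError(f"malformed option {opt!r} in target {entry!r}")
            options[key] = val
        targets.append(Target(path, options))
    return targets


parsed = parse_targets("/mnt/dr-site-a;/mnt/dr-site-b:bwlimit=1M;/mnt/dr-site-c:bwlimit=500K")
```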

Each target runs in its own tokio task; bandwidth limits are per-target. Failure on one site doesn’t block the others — reported in job log and metrics.
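
The isolation property can be sketched with Python's asyncio as a rough analogue of one tokio task per target. This is only a model of the behaviour, not the backend's code: replicate_to is a hypothetical placeholder that simulates an outage on one site, and no real I/O is performed.

```python
import asyncio


async def replicate_to(target: str, payload: bytes) -> int:
    """Placeholder per-target writer. Fails for one path to simulate an outage."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    if target == "/mnt/dr-site-b":
        raise OSError(f"{target}: destination unreachable (simulated)")
    return len(payload)


async def fan_out(targets: list[str], payload: bytes) -> dict[str, str]:
    # One task per target. return_exceptions=True keeps one failure
    # from cancelling or hiding the results of the other targets.
    results = await asyncio.gather(
        *(replicate_to(t, payload) for t in targets), return_exceptions=True
    )
    return {
        t: f"failed: {r}" if isinstance(r, Exception) else f"ok: {r} bytes"
        for t, r in zip(targets, results)
    }


report = asyncio.run(
    fan_out(["/mnt/dr-site-a", "/mnt/dr-site-b", "/mnt/dr-site-c"], b"x" * 1024)
)
```

In this model, sites a and c complete normally while the failure on site b is captured as a per-target result, which is the shape of outcome the job log and metrics would need to report.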

8. Encryption at rest — AES-256-GCM

With encrypt_key= set to a 64-char hex key (256 bits), the backend encrypts each file with AES-256-GCM before writing to the destination. The nonce derives from path + timestamp; the auth tag is appended. Any tampering with the stored file makes tag verification fail, so it is detected on restore.

Important: the key sits in podheitor-replica.conf (mode 0600). For production, integrate with KMS via encrypt_key_cmd= (out-of-band lookup).
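
A short Python sketch of the key handling follows. It covers only validation and lookup, not the encryption itself. Two points are assumptions rather than documented behaviour: that the command given in encrypt_key_cmd prints the hex key on standard output, and that it takes precedence over encrypt_key when both are set. The function names are hypothetical.

```python
import shlex
import string
import subprocess


def decode_key(hex_key: str) -> bytes:
    """Validate a 64-character hex string and return the 32-byte AES-256 key."""
    hex_key = hex_key.strip()
    if len(hex_key) != 64 or any(c not in string.hexdigits for c in hex_key):
        raise ValueError("encrypt_key must be exactly 64 hex characters (256 bits)")
    return bytes.fromhex(hex_key)


def load_key(conf: dict[str, str]) -> bytes:
    # Assumption: encrypt_key_cmd prints the hex key on stdout and wins over encrypt_key.
    cmd = conf.get("encrypt_key_cmd")
    if cmd:
        out = subprocess.run(shlex.split(cmd), capture_output=True, text=True, check=True)
        return decode_key(out.stdout)
    return decode_key(conf["encrypt_key"])
```

Rejecting a malformed key at startup is cheaper than discovering at restore time that files were encrypted with a truncated or mistyped key.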

9. RPO/RTO compliance reporting

With rpo_report_dir= set, the backend emits JSON + Markdown reports per job:

Metric         Computation
Observed RPO   now() - last_successful_replication per file
RPO SLA        sla_rpo_secs (default 14400 = 4h)
Estimated RTO  Dataset size / destination bandwidth + mount/check overhead
RTO SLA        sla_rto_secs (default 3600 = 1h)
SLA breach     Files that exceeded the RPO/RTO target

Markdown output is directly consumable in audit reports; JSON feeds Grafana or Bacularis dashboards.
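
The computations in the table can be worked through in a few lines of Python. This is a sketch of the arithmetic only. The JSON field names, the build_report function, and the sample figures are hypothetical and do not describe the backend's actual report schema.

```python
import json


def build_report(
    now: int,
    last_success: dict[str, int],
    dataset_bytes: int,
    bandwidth_bytes_per_sec: int,
    overhead_secs: int,
    sla_rpo_secs: int = 14400,
    sla_rto_secs: int = 3600,
) -> dict:
    # Observed RPO: age of the last successful replication, per file.
    observed_rpo = {path: now - ts for path, ts in last_success.items()}
    # Estimated RTO: transfer time plus mount/check overhead.
    estimated_rto = dataset_bytes / bandwidth_bytes_per_sec + overhead_secs
    return {
        "sla_rpo_secs": sla_rpo_secs,
        "sla_rto_secs": sla_rto_secs,
        "observed_rpo_secs": observed_rpo,
        "estimated_rto_secs": estimated_rto,
        "rpo_breaches": sorted(p for p, age in observed_rpo.items() if age > sla_rpo_secs),
        "rto_breach": estimated_rto > sla_rto_secs,
    }


# Sample figures: 360 GB dataset over a 125 MB/s link with 300 s of overhead.
report = build_report(
    now=100_000,
    last_success={"/data/a": 99_000, "/data/b": 80_000},
    dataset_bytes=360_000_000_000,
    bandwidth_bytes_per_sec=125_000_000,
    overhead_secs=300,
)
report_json = json.dumps(report, indent=2)
```

With these sample figures the estimated RTO is 2880 s of transfer plus 300 s of overhead, 3180 s in total, which is inside the 1h default. /data/b was last replicated 20000 s ago, so it breaches the 4h RPO default.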

10. Snapshot integration — pre-replication

With snapshot_backend=lvm|zfs|btrfs, the backend snapshots the destination filesystem before starting replication. If the job fails mid-write, rolling back to that snapshot restores the last consistent state.

  • lvm: lvcreate --snapshot --name podheitor-pre-{jobid}
  • zfs: zfs snapshot pool/dataset@podheitor-pre-{jobid}
  • btrfs: btrfs subvolume snapshot
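
The per-backend commands can be assembled as argument lists, as in the Python sketch below. It only builds the commands and does not run them. The snapshot_command helper, the origin volume, the LVM snapshot size, and the btrfs destination directory are hypothetical parameters that the list above does not spell out; how the backend actually fills them in is not documented here.

```python
def snapshot_command(
    backend: str,
    jobid: int,
    source: str,
    snapshot_dir: str = "",
    lvm_size: str = "1G",
) -> list[str]:
    """Build the pre-replication snapshot command for one backend (not executed here)."""
    name = f"podheitor-pre-{jobid}"
    if backend == "lvm":
        # source is the origin LV, e.g. vg0/replica (hypothetical); size is an assumed default.
        return ["lvcreate", "--snapshot", "--name", name, "--size", lvm_size, source]
    if backend == "zfs":
        # source is pool/dataset.
        return ["zfs", "snapshot", f"{source}@{name}"]
    if backend == "btrfs":
        # source is the subvolume path; -r makes the snapshot read-only.
        return ["btrfs", "subvolume", "snapshot", "-r", source, f"{snapshot_dir}/{name}"]
    raise ValueError(f"unsupported snapshot_backend: {backend!r}")
```

Building an argument list rather than a shell string avoids quoting problems if a dataset or path name contains spaces.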

11. Failover automation

Operational hooks via healthcheck_cmd, promote_cmd, demote_cmd:

  • healthcheck_cmd — exit 0 = primary healthy; backend keeps replicating.
  • promote_cmd — invoked when healthcheck fails N times; promotes replica to primary (mount, IP takeover, DNS update).
  • demote_cmd — invoked on failback; demotes the previous primary.
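
The hook sequence above amounts to a small state machine, sketched here in Python. It models the decisions only and runs no commands. The FailoverMonitor class is hypothetical, the threshold N is passed in as max_failures, and treating the first healthy check after a promotion as the failback trigger is an assumption.

```python
class FailoverMonitor:
    """Track consecutive healthcheck failures and decide which hook to run."""

    def __init__(self, max_failures: int):
        self.max_failures = max_failures
        self.consecutive_failures = 0
        self.promoted = False

    def observe(self, healthcheck_exit_code: int) -> str:
        """Return the action for this check: 'replicate', 'wait', 'promote' or 'demote'."""
        if healthcheck_exit_code == 0:
            self.consecutive_failures = 0
            if self.promoted:
                # Assumed failback trigger: primary is healthy again after a promotion.
                self.promoted = False
                return "demote"
            return "replicate"
        self.consecutive_failures += 1
        if not self.promoted and self.consecutive_failures >= self.max_failures:
            self.promoted = True
            return "promote"
        return "wait"


monitor = FailoverMonitor(max_failures=3)
actions = [monitor.observe(code) for code in [0, 1, 1, 0, 1, 1, 1, 1, 0]]
```

Counting consecutive failures, rather than total failures, keeps a single transient error from accumulating towards a promotion.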

12. Documented anti-patterns

  • Don’t use delete_removed=yes without a snapshot backend. In mirror mode, a Full that runs incorrectly can delete genuine files at the destination. An LVM/ZFS snapshot is the safety net.
  • Don’t run FIFO mode without sizing the pipe. The Linux default of 64 KB can be a bottleneck — use fcntl F_SETPIPE_SZ or raise /proc/sys/fs/pipe-max-size.
  • Don’t trust skip_unchanged=yes alone for integrity. BLAKE3 collision is theoretically possible (astronomically improbable); pair with verify_after=yes on critical datasets.
  • Don’t run encryption without KMS in production. A key stored in a config file is acceptable only for a proof of concept.
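
For the pipe-sizing point above, the Python sketch below shows the F_SETPIPE_SZ call on an anonymous pipe. It is Linux-only, and the resize_pipe helper is hypothetical. The numeric fallbacks 1031 and 1032 are the Linux values of the two fcntl commands, used only when the running Python predates the named constants. The same call applies to a file descriptor opened on a named FIFO.

```python
import fcntl
import os

# Linux-specific fcntl commands; Python's fcntl module names them from 3.10 onwards.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def resize_pipe(fd: int, requested_bytes: int) -> int:
    """Ask the kernel for a larger pipe buffer and return the capacity actually granted."""
    fcntl.fcntl(fd, F_SETPIPE_SZ, requested_bytes)
    return fcntl.fcntl(fd, F_GETPIPE_SZ)


read_fd, write_fd = os.pipe()
try:
    default_size = fcntl.fcntl(write_fd, F_GETPIPE_SZ)
    new_size = resize_pipe(write_fd, 256 * 1024)
finally:
    os.close(read_fd)
    os.close(write_fd)
```

An unprivileged process can only grow a pipe up to the limit in /proc/sys/fs/pipe-max-size, so larger requests need that limit raised first.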

13. License posture

The plugin ships under LicenseRef-PodHeitor-Proprietary. No Bacula AGPLv3 source is statically linked. The SD Plugin API v13 is reimplemented in clean-room Rust — only the function-pointer binary contract is honoured, no header copy.

Ready to evaluate?

Free 30-day trial for DR and instant recovery workloads. We guarantee a discount of at least 50% vs Bacula Enterprise, Veeam or Commvault, with more features included.

Heitor Faria — Founder, PodHeitor International
☎ +1 (789) 726-1749 · +55 (61) 98268-4220 (WhatsApp)
🔗 PodHeitor BRC plugin page
