Technical whitepaper — PodHeitor PostgreSQL for Bacula

Five backup modes (dump / parallel_dump / pitr / pitr_block / cdp), replication-slot and subscription capture, automated 6-phase restore, and PG17 block-level via pg_basebackup --incremental.

Technical companion to the PodHeitor PostgreSQL plugin page.

1. The problem: stock Bacula + PostgreSQL is fragile

PostgreSQL backup with stock Bacula typically reduces to one of three options:

  • Filesystem-level backup of $PGDATA with no coordination — captures torn-write files, restore is a WAL-replay lottery.
  • pg_dump in RunBeforeJob — duplicated staging on disk; no PITR; no real incremental.
  • Bacula Enterprise PostgreSQL plugin — Perl wrappers around pg_dump/pg_basebackup, no PITR_BLOCK, no CDP, no replication-state capture.

The PodHeitor PostgreSQL Plugin delivers five native backup modes, replication-slot/subscription capture, and a 6-phase automated restore — all selectable from the Bacula Director with no external scripts.

2. Architectural model

The plugin is a Bacula metaplugin: the FD cdylib (a Rust .so) and the Rust backend speak PTCOMM over stdin/stdout. The backend orchestrates every PostgreSQL-aware step: pg_dump, pg_backup_start/pg_backup_stop, WAL archive management, per-file enumeration of $PGDATA, tablespace walks, logical replication-slot capture, automated recovery-configuration writing, and post-restore verification.

v2.0.0 (April 2026) — the cdylib is built from the plugin-postgresql crate in the PodHeitor Rust cdylib workspace. No Bacula AGPLv3 source is statically linked. The legacy C++ shim that linked pluginlib/metaplugin.o has been removed.

3. Five backup modes

Mode | Function | Output namespace
dump (default) | Logical per-database pg_dump | @postgresql/dump/<db>.dump
parallel_dump | Multi-DB dump with worker pool | @postgresql/dump/<db>.dump
pitr | Physical per-file: pg_backup_start/stop + $PGDATA walk + WAL window | @postgresql/pitr/{pgdata/, tblspc/, wal/, backup_label, tablespace_map, _manifest.json}
pitr_block | PG17+ block-level via pg_basebackup --incremental + pg_combinebackup | @postgresql/pitr_block/
cdp | Continuous WAL streaming (continuous data protection) | @postgresql/cdp/wal/
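
The output namespaces listed above lend themselves to prefix-based routing. The following is a minimal sketch in Python, not the plugin's Rust code: the prefixes come from the table, while the function name classify_vpath and the kind labels are illustrative assumptions.

```python
from typing import Tuple

# Prefixes are taken from the "Output namespace" column.
# The kind labels on the right are hypothetical names chosen for this sketch.
_ROUTES = [
    ("@postgresql/pitr/pgdata/", "pgdata_file"),
    ("@postgresql/pitr/tblspc/", "tablespace_file"),
    ("@postgresql/pitr/wal/", "wal_segment"),
    ("@postgresql/pitr_block/", "block_backup"),
    ("@postgresql/cdp/wal/", "cdp_wal_segment"),
    ("@postgresql/dump/", "logical_dump"),
]
_PITR_ROOT = "@postgresql/pitr/"
_CONTROL_FILES = {"backup_label", "tablespace_map", "_manifest.json"}


def classify_vpath(vpath: str) -> Tuple[str, str]:
    """Return (kind, path relative to the matched prefix) for a virtual path."""
    # Control files sit directly under the pitr root, so check them first.
    if vpath.startswith(_PITR_ROOT) and vpath[len(_PITR_ROOT):] in _CONTROL_FILES:
        return "control_file", vpath[len(_PITR_ROOT):]
    for prefix, kind in _ROUTES:
        if vpath.startswith(prefix):
            return kind, vpath[len(prefix):]
    raise ValueError(f"unrecognized vpath: {vpath}")
```

For example, classify_vpath("@postgresql/pitr/wal/000000010000000000000001") yields the kind wal_segment and the bare segment name.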

3.1 VLDB split — COPY-range chunking

Tables that have a numeric PK and exceed large_table_threshold (default 10G) are automatically split into PK ranges and processed by parallel workers in parallel_dump mode. For databases dominated by one very large table, which pg_dump would otherwise dump in a single thread, this can cut an 8-hour job to roughly 1 hour with parallel_tables=8.
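
To make the range-splitting idea concrete, here is a minimal Python sketch. It assumes equal-width ranges over an integer PK, which only balances the work when key density is roughly uniform; the plugin's actual splitting strategy is not described here. The helper names pk_ranges and copy_statement are hypothetical.

```python
from typing import List, Tuple


def pk_ranges(pk_min: int, pk_max: int, chunks: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [pk_min, pk_max] into contiguous,
    non-overlapping inclusive ranges (at most `chunks` of them)."""
    if chunks < 1 or pk_max < pk_min:
        raise ValueError("need chunks >= 1 and pk_max >= pk_min")
    total = pk_max - pk_min + 1
    chunks = min(chunks, total)          # never produce empty ranges
    base, extra = divmod(total, chunks)  # the first `extra` ranges get one more key
    ranges = []
    lo = pk_min
    for i in range(chunks):
        hi = lo + base + (1 if i < extra else 0) - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


def copy_statement(table: str, pk: str, lo: int, hi: int) -> str:
    """Build one COPY query per range. Identifiers are interpolated as-is,
    so this sketch assumes trusted, already-quoted table and column names."""
    return f"COPY (SELECT * FROM {table} WHERE {pk} BETWEEN {lo} AND {hi}) TO STDOUT"
```

With pk_ranges(1, 100, 8), each of eight workers would receive one range of 12 or 13 keys and run its own COPY statement.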

4. Replication-state capture

With track_replication_state=true, the plugin captures in the per-job _manifest.json:

  • Physical + logical replication slots from pg_replication_slots (name, type, plugin, active, restart_lsn, confirmed_flush_lsn).
  • Subscriptions from pg_subscription with synthesized CREATE SUBSCRIPTION ... WITH (slot_name=..., create_slot=false, enabled=...) DDL. LEFT JOIN pg_database ensures each subscription appears once with its actual target DB (v1.3.0 dedup fix).
  • Source role: primary or standby, derived via pg_is_in_recovery(). Accessors that are valid only on a primary or only on a standby are guarded with CASE so the same query works on both.

At restore time, with restore_replication_state=true (default), the plugin parses the manifest and recreates missing slots via pg_create_{physical,logical}_replication_slot() against the target cluster. Subscriptions that were disabled are recreated with enabled=false by default — the restore never accidentally turns on replication that was off at backup time.

5. Automated 6-phase restore

The restore flow runs six idempotent phases that honor dry_run_restore=true:

  1. Pre-flight — autodetect systemd unit, stop PG, optional mv PGDATA → PGDATA.old.<UTC-ts> (5-second rollback), create $PGDATA + wal_restore_dir with correct ownership.
  2. Per-file receipt — dispatch incoming FNAME by vpath prefix: PGDATA files, tablespace entries, WAL segments, backup_label, tablespace_map, _manifest.json.
  3. Recovery config — append a clearly-delimited PodHeitor-managed block to $PGDATA/postgresql.auto.conf with restore_command + operator-supplied recovery_target_{time,lsn,xid,name} + recovery_target_action (promote/pause/shutdown). Touch recovery.signal.
  4. Start + monitor — optional systemctl start; poll pg_is_in_recovery() + pg_last_wal_replay_lsn() until target action converges or restore_timeout fires.
  5. Verify — post-recovery SQL sanity checks; optional pg_checksums --check; optional operator-supplied SQL script.
  6. Cleanup — purge PGDATA.old.<ts> older than pgdata_backup_retain_days; optionally remove wal_restore_dir.
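
Phase 3 above can be sketched as a pure text transformation. This Python sketch is an illustration only: the marker strings and the function names render_block and apply_block are assumptions, not the plugin's actual delimiters or API, and touching recovery.signal is left out. The parameter names written into the block are standard PostgreSQL settings.

```python
from typing import Optional

# Hypothetical marker text; the plugin's real delimiters are not documented here.
BEGIN = "# BEGIN PodHeitor-managed recovery block"
END = "# END PodHeitor-managed recovery block"

TARGET_KINDS = ("time", "lsn", "xid", "name")
ACTIONS = ("promote", "pause", "shutdown")


def _quote(value: str) -> str:
    """Quote a value for postgresql.auto.conf (double backslashes and single quotes)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def render_block(restore_command: str, target_kind: Optional[str] = None,
                 target_value: Optional[str] = None, action: str = "promote") -> str:
    """Render the delimited block of recovery settings."""
    if action not in ACTIONS:
        raise ValueError(f"unknown recovery_target_action: {action}")
    lines = [BEGIN, f"restore_command = {_quote(restore_command)}"]
    if target_kind is not None:
        if target_kind not in TARGET_KINDS or not target_value:
            raise ValueError("target_kind must be time/lsn/xid/name with a value")
        lines.append(f"recovery_target_{target_kind} = {_quote(target_value)}")
        lines.append(f"recovery_target_action = {_quote(action)}")
    lines.append(END)
    return "\n".join(lines) + "\n"


def apply_block(conf_text: str, block: str) -> str:
    """Replace an existing delimited block, or append one if none is present."""
    start = conf_text.find(BEGIN)
    end = conf_text.find(END)
    if start != -1 and end > start:
        end += len(END)
        if conf_text[end:end + 1] == "\n":
            end += 1  # swallow the newline that followed the old block
        return conf_text[:start] + block + conf_text[end:]
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + block
```

Because apply_block replaces a previous block rather than appending a second one, running the phase twice leaves the file unchanged, which matches the idempotence described for the restore flow.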

6. Backup-from-standby + multi-cluster

Parameter | Default | Function
backup_from_standby | false | Connects to a read-only standby via PGTARGETSESSIONATTRS=read-only; full primary offload
cluster_id | (empty) | Multi-cluster namespace prefix: @postgresql/cluster_<id>/...
track_replication_state | false | Capture slots + subscriptions in manifest
compress | zstd | zstd, lz4, gzip, none
archive_dir | auto | Auto-derived from SHOW archive_command (PITR only)
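
The parameters above can be modeled as a small settings object. The sketch below is a Python illustration only: the defaults and allowed values come from the table, while the class name PluginParams and its two helper methods are assumptions about how the values might be consumed.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PluginParams:
    """Defaults mirror the parameter table; the class itself is hypothetical."""
    backup_from_standby: bool = False
    cluster_id: str = ""
    track_replication_state: bool = False
    compress: str = "zstd"
    archive_dir: str = "auto"

    def __post_init__(self) -> None:
        if self.compress not in ("zstd", "lz4", "gzip", "none"):
            raise ValueError(f"unsupported compress value: {self.compress}")

    def namespace_root(self) -> str:
        """Prefix for virtual paths; cluster_id adds a per-cluster level."""
        if self.cluster_id:
            return f"@postgresql/cluster_{self.cluster_id}"
        return "@postgresql"

    def libpq_env(self) -> Dict[str, str]:
        """Extra libpq environment used when backing up from a standby."""
        if self.backup_from_standby:
            return {"PGTARGETSESSIONATTRS": "read-only"}
        return {}
```

With cluster_id="prod", namespace_root() returns @postgresql/cluster_prod, so backups of several clusters taken by the same FD do not collide.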

7. Compatibility

Component | Supported versions
PostgreSQL | 12, 13, 14, 15, 16, 17 (incremental & VLDB require 17)
Bacula Community | 15.0.3+ (tested on Oracle Linux 9.6)
Bacula Enterprise | 18.0+ (with open_bpipe + FD_PLUGIN_INTERFACE_VERSION patches)
OS (plugin host) | glibc 2.17+ or musl-static: CentOS 7, Rocky/Alma/OL 8/9, RHEL 8/9, Debian 11+, Ubuntu 20.04+
Arch | x86_64

Backend binary is musl static-pie ~500 KB — portable across distros without specific glibc dependency.

8. Bacula Enterprise vs PodHeitor

Feature | Bacula Enterprise PostgreSQL | PodHeitor v2.0.0
DUMP mode (pg_dump) | |
Parallel DUMP multi-DB | |
VLDB split COPY-range | |
PITR Full + Diff + Inc (WAL-only) | |
PITR_BLOCK (PG17+) | |
CDP (continuous WAL) | |
Backup from standby | |
Replication slot capture | |
Subscription DDL capture | |
Automated 6-phase restore | ⚠️ (manual) | ✅ (with dry_run_restore)
Recovery-target params (time/lsn/xid/name) | |
Tablespace remap | ⚠️ | ✅ (OID:/path)
Rollback safety (mv before wipe) | |
Prometheus metrics | |
Multi-cluster namespace | |

9. Documented anti-patterns

  • Don’t run Differential or Incremental without Accurate = yes on the Job. Bacula needs the Accurate walker to filter unchanged files; without it, “incremental” sends everything.
  • Don’t use start_postgresql_after_restore=true by accident in prod. Default is false precisely to avoid auto-starting a restored cluster in the wrong environment.
  • Don’t confuse archive_dir with wal_restore_dir. The first is where the source wrote WALs; the second is where the restore puts them for replay.
  • Don’t disable the wipe_pgdata safety rename. Default false is the safety; flipping to true without understanding the 5-second mv PGDATA → PGDATA.old.<ts> rollback can lead to data loss.
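
The anti-patterns above can be checked mechanically before a job runs. The following Python sketch assumes a flat dictionary of job settings; the function name lint_job and the key names level and accurate are hypothetical. Comparing archive_dir with wal_restore_dir for equality is only a heuristic for spotting the two being confused.

```python
from typing import Dict, List


def lint_job(job: Dict[str, object]) -> List[str]:
    """Return a warning for each documented anti-pattern found in `job`."""
    warnings = []
    level = str(job.get("level", "Full")).lower()
    if level in ("differential", "incremental") and not job.get("accurate", False):
        warnings.append("Differential/Incremental level without Accurate = yes")
    if job.get("start_postgresql_after_restore", False):
        warnings.append("start_postgresql_after_restore=true will auto-start the restored cluster")
    archive_dir = job.get("archive_dir")
    wal_restore_dir = job.get("wal_restore_dir")
    if archive_dir and wal_restore_dir and archive_dir == wal_restore_dir:
        warnings.append("archive_dir and wal_restore_dir point to the same path")
    if job.get("wipe_pgdata", False):
        warnings.append("wipe_pgdata=true bypasses the PGDATA.old.<ts> safety rename")
    return warnings
```

An empty list means none of the four documented patterns was detected; it does not prove the job is otherwise safe.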

10. License posture

Single-license proprietary, AGPL-clean since v2.0.0. Both binaries (backend + FD cdylib) are proprietary. No Bacula AGPLv3 source is statically linked. Releases ≤ v1.3.0 shipped a C++ shim that statically linked Bacula Community objects — that shim was removed in v2.0.0.

Ready to evaluate?

30-day free trial for production PostgreSQL fleets (including PG17 with PITR_BLOCK and VLDB split). A discount of at least 50% versus Bacula Enterprise, Veeam, or Commvault is guaranteed, with more capabilities included.

Heitor Faria — Founder, PodHeitor International
[email protected]
☎ +1 (789) 726-1749 · +55 (61) 98268-4220 (WhatsApp)
🔗 PodHeitor PostgreSQL plugin page
