The first Bacula plugin built natively for HPC. It covers parallel filesystems (Lustre, GPFS / IBM Spectrum Scale, BeeGFS, CephFS, WekaFS), billion-file namespaces, Slurm/PBS/LSF-aware scheduling, AI/ML checkpoint-aware deduplication, and restripe-on-restore. Written in pure Rust, with process isolation via PTCOMM; no Bacula AGPL code is statically linked.

Why a dedicated HPC plugin?

Bacula’s stock File Daemon walks files single-threaded (findlib/find_one.c: readdir + sequential save_file()). On a Lustre filesystem with 1 billion files this is a non-starter: the bottleneck is readdir+lstat, not bandwidth.
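
As a rough illustration of the alternative to a single-threaded walk, the sketch below reads each level of a directory tree with several threads at once. It uses only the Rust standard library. The names `parallel_count` and `scan_dirs` and the level-by-level strategy are assumptions made for this example, not the plugin's actual walker.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical helper: counts non-directory entries under `root`,
/// scanning each level of the tree with up to `workers` threads.
pub fn parallel_count(root: &Path, workers: usize) -> io::Result<u64> {
    let workers = workers.max(1);
    let mut level: Vec<PathBuf> = vec![root.to_path_buf()];
    let mut files: u64 = 0;

    while !level.is_empty() {
        // Split this level's directories into roughly equal chunks,
        // one chunk per worker thread.
        let chunk_len = (level.len() + workers - 1) / workers;
        let results: Vec<io::Result<(u64, Vec<PathBuf>)>> = thread::scope(|s| {
            let handles: Vec<_> = level
                .chunks(chunk_len)
                .map(|dirs| s.spawn(move || scan_dirs(dirs)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker thread panicked"))
                .collect()
        });

        // Merge per-worker counts and gather the next level's directories.
        let mut next = Vec::new();
        for result in results {
            let (count, subdirs) = result?;
            files += count;
            next.extend(subdirs);
        }
        level = next;
    }
    Ok(files)
}

/// Reads a batch of directories sequentially inside one worker.
fn scan_dirs(dirs: &[PathBuf]) -> io::Result<(u64, Vec<PathBuf>)> {
    let mut files = 0;
    let mut subdirs = Vec::new();
    for dir in dirs {
        for entry in fs::read_dir(dir)? {
            let entry = entry?;
            // file_type() does not follow symlinks, similar to lstat.
            if entry.file_type()?.is_dir() {
                subdirs.push(entry.path());
            } else {
                files += 1;
            }
        }
    }
    Ok((files, subdirs))
}
```

Calling `parallel_count(Path::new("/some/dir"), 8)` returns the number of non-directory entries found. A level-by-level scan is the simplest shape to reason about; a work-stealing pool would keep threads busier on unbalanced trees, at the cost of more involved termination handling.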

Bacula Enterprise 18.2 ships dedicated plugins for HDFS, Quobyte, NDMP, NetApp, and Nutanix — but zero native support for the parallel filesystems that actually run HPC. This plugin closes that gap.

Innovative features (the differentiators)

  • Parallel namespace walker — rayon work-stealing, one worker per Lustre MDT / GPFS NSD / BeeGFS metadata target. Replaces the FD’s single-threaded find_one_file. 10–100× metadata throughput.
  • Namespace sharding (Shard=N/M) — splits the namespace into N shards by hash-of-inode or subtree pinning, so N concurrent Bacula jobs run against N SD streams. Bacula has no built-in within-job stream multiplexing — sharding is the only way to saturate HPC fabric.
  • Filesystem-native incrementals — Lustre ChangeLogs, GPFS mmapplypolicy, CephFS rstats+rctime, BeeGFS metadata-shard scan. True “changed since” without a billion stat() calls.
  • Stripe-aware parallel reader — reads Lustre OSTs / GPFS NSDs in parallel, using llapi_layout_get_by_path to look up the stripe layout on Lustre; reassembles in-order through PTCOMM. Naive sequential reads leave ≥80% of HPC bandwidth on the floor.
  • Slurm/PBS/LSF orchestration — submits the scan as a Slurm job-step on a compute node; quiesces competing jobs; JobComp hook for AI/ML checkpoint capture. Backup runs on the fast fabric, not the login node.
  • AI/ML checkpoint-aware dedup — pluggable into PodHeitor Global Deduplication with content-defined chunking tuned to tensor-stride boundaries. Training checkpoints differ by ~5% per epoch — 95%+ dedup ratio is realistic.
  • Restripe-on-restore — persists original Lustre layout as a RestoreObject; restore recreates striping before writing. Preserves performance characteristics, not just bytes.
  • Namespace-only “metadata snapshot” mode — fast nightly inode + ACL + xattr capture; bulk data weekly. Catastrophic recovery needs the namespace fast; bulk can stream from tape.
  • HSM-aware — Lustre HSM integration (archive/release/restore as a tier). Backup becomes a first-class HSM action, not a hostile scan.
  • Bandwidth shaping by Slurm load — reads live cluster utilization; bursts during idle, throttles during high-priority jobs. Static QoS doesn’t cut it on shared HPC.
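
The hash-of-inode sharding mentioned above can be sketched in a few lines. This is a minimal example under stated assumptions, not the plugin's code: it reads a selector such as `2/8` as "zero-based shard 2 of 8", and the `Shard` type, its methods, and the `mix` function are hypothetical names. The mixer uses the SplitMix64 finalizer constants so that every job computes the same assignment; Rust's `DefaultHasher` is avoided because its algorithm is not guaranteed to stay the same across releases.

```rust
/// Hypothetical shard selector, parsed from a string such as "2/8".
/// Assumption: the index is zero-based and must be less than the total.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Shard {
    pub index: u64,
    pub total: u64,
}

impl Shard {
    /// Parses "index/total"; returns None for malformed or out-of-range input.
    pub fn parse(spec: &str) -> Option<Shard> {
        let (i, t) = spec.split_once('/')?;
        let index: u64 = i.trim().parse().ok()?;
        let total: u64 = t.trim().parse().ok()?;
        if total == 0 || index >= total {
            return None;
        }
        Some(Shard { index, total })
    }

    /// True when this shard is responsible for the given inode number.
    pub fn owns(&self, inode: u64) -> bool {
        mix(inode) % self.total == self.index
    }
}

/// SplitMix64 finalizer: a fixed, seedless bit mixer, so sequential inode
/// numbers spread evenly and all jobs agree on the result.
fn mix(mut x: u64) -> u64 {
    x ^= x >> 30;
    x = x.wrapping_mul(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x = x.wrapping_mul(0x94d049bb133111eb);
    x ^ (x >> 31)
}
```

Each job would parse its own selector and skip any file for which `owns` returns false, so every inode is backed up by exactly one job. Subtree pinning, the other strategy named in the list, would replace `owns` with a path-prefix check.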

Commercial differentiators

| Feature | Bacula Community | Bacula Enterprise / Veeam | PodHeitor HPC |
| --- | --- | --- | --- |
| Native Lustre / GPFS / BeeGFS / CephFS / WekaFS | No | No | Yes |
| Parallel walker (10–100× stock FD) | No | No | Yes |
| Namespace sharding + N SD streams | No | No | Yes |
| Filesystem-changelog incrementals | No | Partial | Yes |
| Slurm/PBS/LSF orchestration | No | No | Yes |
| Restripe-on-restore | No | No | Yes |
| HSM-as-tier | No | No | Yes |
| Cost | Free (no support) | $$$$ | ≥50% cheaper than Enterprise/Veeam |

Compatibility

  • Bacula Community 15.0.3+
  • Filesystems: Lustre 2.14+, IBM Spectrum Scale (GPFS) 5.x, BeeGFS 7.x, CephFS, WekaFS
  • Schedulers: Slurm 22.05+, PBS Pro, LSF, OpenPBS
  • Distros: RHEL/Oracle/Rocky/Alma 9.x, Debian 12+, Ubuntu 22.04+
  • Architecture: x86_64 (musl static-pie binary)
  • Rust toolchain: 1.95+ (build), no runtime dependency

Installation

Install from the official .deb or .rpm package; nothing needs to be compiled on production hosts:

# RHEL / Oracle Linux / Rocky / Alma 9.x
sudo dnf install podheitor-hpc-plugin-0.1.0-1.el9.x86_64.rpm
# Optional sub-package on hosts with Lustre client:
sudo dnf install podheitor-hpc-plugin-lustre-0.1.0-1.el9.x86_64.rpm

# Debian / Ubuntu
sudo dpkg -i podheitor-hpc-plugin_0.1.0-1_amd64.deb
sudo dpkg -i podheitor-hpc-plugin-lustre_0.1.0-1_amd64.deb

The packages install libpodheitor_hpc_fd.so into /opt/bacula/plugins/, along with the podheitor-hpc-backend binary, the podheitor-hpc CLI, the systemd unit, and configuration examples. The post-install script restarts bacula-fd automatically.

Technical whitepaper

📘 Read the full technical whitepaper — internal architecture, parallelism model, sharding, changelog drivers, restripe-on-restore, Phase 10 benchmarks and deployment topologies.

📄 Executive version (PDF): PodHeitor HPC Whitepaper PDF

Ready to switch?

Bring us your renewal or new-contract proposal from Bacula Enterprise, Veeam, Commvault or NetBackup. We commit to a minimum 50% discount, with more capabilities included.

Heitor Faria — Founder, PodHeitor International
✉ heitor@opentechs.lat
☎ +1 (786) 726-1749 · +55 (61) 98268-4220 (WhatsApp)
Free 30-day commercial trial for qualifying workloads.
