VM-level backup of Nutanix AHV via Prism v3+v4 with native Changed Regions Tracking, vendor-neutral PHCBT01 replication, inbound cross-hypervisor restore from Proxmox/vSphere/Hyper-V, and disk-only / alternate-cluster restore — all absent from the Bacula Enterprise 18.2.3 AHV plugin.
Companion technical document to the PodHeitor Nutanix plugin page.
1. Gaps in the Bacula Enterprise Nutanix plugin
Bacula Enterprise 18.2.3 ships a Nutanix AHV plugin — JVM-based, Prism v2/v3, no v4 CRT, no cross-restore, no vendor-neutral replication, no disk-only restore, no alternate-cluster restore. For customers running pc.2024.3+ who want multi-vendor DR, that leaves four large operational gaps:
- No v4 CRT. The `compute-changed-regions` API in pc.2024.3+ is faster and more granular than the legacy v3 `changed_regions`. BEE doesn't consume it.
- No cross-restore. AHV backups restore only on AHV. Leaving Nutanix requires manual V2V.
- Replication coupled to Nutanix Protection Policies. Doesn’t work cross-vendor.
- JVM latency. GC pauses during streaming of large disks are measurable.
The PodHeitor Nutanix Plugin is a Rust sibling of podheitor-proxmox, podheitor-vsphere and podheitor-hyperv — reusing their on-wire formats byte-for-byte so restores are fully cross-compatible.
2. Architecture — two-process Rust
The plugin follows the PodHeitor pattern: a cdylib (in this case a C++ shim of ~120 LOC, constants-only, linked against the metaplugin framework) plus a standalone Rust backend, communicating via PTCOMM length-tagged framing on stdin/stdout (a framing sketch follows the diagram below).
┌──────────────────────────────────────────────────────────────────────────┐
│ Bacula File Daemon (bacula-fd) │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ podheitor-nutanix-fd.so (metaplugin C++ shim, ~120 LOC) │ │
│ │ - PLUGINNAMESPACE="@nutanix" │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ PTCOMM over pipe (stdin/stdout) │
└──────────────────────────────┼────────────────────────────────────────────┘
▼
┌──────────────────────────────────────────────────────────────────────────┐
│ podheitor-nutanix-backend (Rust binary) │
│ ┌──────────────┬──────────────┬──────────────┬──────────────────────┐ │
│ │ prism v3/v4 │ snapshot │ iscsi │ disk_reader │ │
│ │ REST client │ RAII guard │ attach/detach│ (O_DIRECT /dev/sdX) │ │
│ ├──────────────┼──────────────┼──────────────┼──────────────────────┤ │
│ │ crt (CBT) │ backup.rs │ restore.rs │ replication.rs │ │
│ └──────────────┴──────────────┴──────────────┴──────────────────────┘ │
└────┬─────────────────────────┬────────────────────────┬───────────────────┘
│ HTTPS (9440) │ iSCSI (DSIP:3260) │ TLS (9848)
▼ ▼ ▼
┌──────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Prism Central│ │ Nutanix CVMs │ │ DR Receiver │
└──────────────┘ └──────────────────┘ └──────────────────┘
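To make the pipe concrete, here is a minimal framing sketch in Rust. The header layout (one ASCII type byte, a six-digit decimal length, a newline) is an illustrative assumption; the actual wire format is defined by the Bacula metaplugin PTCOMM protocol, not by this sketch.

```rust
// Minimal sketch of length-tagged framing between shim and backend over
// stdin/stdout. ASSUMPTION: one packet-type byte, six decimal length
// digits, '\n', then the payload; the real layout is fixed by PTCOMM.
use std::io::{self, Read, Write};

fn write_frame<W: Write>(w: &mut W, kind: u8, payload: &[u8]) -> io::Result<()> {
    write!(w, "{}{:06}\n", kind as char, payload.len())?; // header
    w.write_all(payload)?;                                // body
    w.flush()
}

fn read_frame<R: Read>(r: &mut R) -> io::Result<(u8, Vec<u8>)> {
    let mut hdr = [0u8; 8]; // type(1) + length(6) + '\n'(1)
    r.read_exact(&mut hdr)?;
    let len: usize = std::str::from_utf8(&hdr[1..7])
        .ok()
        .and_then(|s| s.parse().ok())
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "bad frame header"))?;
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok((hdr[0], payload))
}

fn main() -> io::Result<()> {
    // Backend side of the pipe: read one frame, echo it back.
    let (kind, payload) = read_frame(&mut io::stdin().lock())?;
    write_frame(&mut io::stdout().lock(), kind, &payload)
}
```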
2.1 Two deployment modes, one binary
| Mode | Where backend runs | When to choose |
|---|---|---|
| `proxy_mode=external` | FD host outside the AHV cluster | Default — requires network route to Prism:9440 + DSIP:3260 and open-iscsi on FD host |
| `proxy_mode=in_cluster` | Linux VM inside the AHV cluster | Maximum throughput: data plane is local virtual-NIC to DSIP |
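A minimal sketch of how the backend could resolve the mode from its parameter string. The `proxy_mode` values are the documented ones; the enum and function names are hypothetical.

```rust
// Hypothetical parameter handling for the mode switch above; not the
// plugin's actual API.
#[derive(Debug, PartialEq)]
enum ProxyMode { External, InCluster }

fn parse_proxy_mode(params: &str) -> ProxyMode {
    params
        .split_whitespace()
        .find_map(|kv| kv.strip_prefix("proxy_mode="))
        .map(|v| if v == "in_cluster" { ProxyMode::InCluster } else { ProxyMode::External })
        .unwrap_or(ProxyMode::External) // external is the documented default
}

fn main() {
    assert_eq!(parse_proxy_mode("vm=web01 proxy_mode=in_cluster"), ProxyMode::InCluster);
    assert_eq!(parse_proxy_mode("vm=web01"), ProxyMode::External);
}
```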
3. Full backup — step by step
- PTCOMM handshake; receive JobInfo + Plugin params.
- Cluster discovery: PC v4 with v3 fallback, returns PE IP + 15-min JWT (cookie `NTNX_IGW_SESSION`).
- POST `/api/nutanix/v3/vms/{uuid}/snapshot` (or v4 equivalent). RAII `SnapshotGuard` guarantees delete on drop.
- Clone snapshot disks to a temporary Volume Group (Prism API).
- Attach VG to proxy/FD via iSCSI: `iscsiadm -m discovery` + `iscsiadm -m node --login`.
- Enumerate `/dev/disk/by-path/...` via sysfs scan; map disk ↔ block device by LUN.
- Emit FNAME packets: `@nutanix/<cluster>/<vm-uuid>/vm-metadata.json`, `@nutanix/.../disks/disk-<idx>-<id>.raw`.
- Stream bytes via O_DIRECT 1 MiB / 4 KiB-aligned reads → D-packets (see the read-loop sketch after this list).
- Logout iSCSI, delete VG, delete snapshot (unless it is the CBT reference).
- Persist CBT state: `reference_recovery_point_ext_id` (v4) or `snapshot_uuid` (v3) at `/var/lib/podheitor-nutanix/bitmap/<cluster>/<vm-uuid>.json`.
- PTCOMM F (end-of-data), wait FD ack, T (terminate).
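The read-loop sketch referenced in the list: open the attached block device with O_DIRECT and stream it in 1 MiB requests from a 4 KiB-aligned buffer. It assumes the `libc` crate; the device path and the `sink` callback (standing in for D-packet emission) are illustrative.

```rust
// O_DIRECT read loop: 1 MiB requests from a 4 KiB-aligned buffer.
// Needs the `libc` crate (libc = "0.2" in Cargo.toml).
use std::alloc::{alloc, dealloc, Layout};
use std::fs::OpenOptions;
use std::io::{self, Read};
use std::os::unix::fs::OpenOptionsExt;

const BUF_SIZE: usize = 1 << 20; // 1 MiB per read request
const ALIGN: usize = 4096;       // O_DIRECT requires block-aligned memory

fn stream_device(path: &str, mut sink: impl FnMut(&[u8]) -> io::Result<()>) -> io::Result<u64> {
    let mut dev = OpenOptions::new()
        .read(true)
        .custom_flags(libc::O_DIRECT) // bypass the page cache
        .open(path)?;

    // Vec<u8> gives no alignment guarantee, so allocate the buffer manually.
    let layout = Layout::from_size_align(BUF_SIZE, ALIGN).unwrap();
    let ptr = unsafe { alloc(layout) };
    assert!(!ptr.is_null(), "allocation failed");
    let buf = unsafe { std::slice::from_raw_parts_mut(ptr, BUF_SIZE) };

    let mut total = 0u64;
    let result = loop {
        match dev.read(buf) {
            Ok(0) => break Ok(total), // end of device
            Ok(n) => {
                if let Err(e) = sink(&buf[..n]) { break Err(e); }
                total += n as u64;
            }
            Err(e) => break Err(e),
        }
    };
    unsafe { dealloc(ptr, layout) };
    result
}

fn main() -> io::Result<()> {
    let n = stream_device("/dev/sdb", |chunk| {
        let _ = chunk; // here the backend would emit a D-packet
        Ok(())
    })?;
    println!("streamed {n} bytes");
    Ok(())
}
```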
4. Incremental — v4 CRT with v3 fallback
Step 5 of the full flow becomes:
- Create new snapshot (current). Keep previous reference snapshot.
- Per disk: call `compute-changed-regions` (v4) or `/data/changed_regions` (v3) with (reference, current).
- Paginate: up to 10,000 regions per response, follow `nextOffset` until exhausted.
- Emit changed extents in PHCBT01 format (magic + original_size + region_count + {offset, length, data} × N) over D-packets (serialization sketched after this list).
- Hybrid path: dense extents → iSCSI attach + read; sparse extents → REST range-GET. A density heuristic chooses per disk.
- On success: rotate. Delete old reference, promote current to new reference.
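A sketch of the PHCBT01 serialization referenced in the list, following the layout named above. Field widths, little-endian byte order, and the padding byte after the 7-byte magic are assumptions made for illustration.

```rust
// PHCBT01 layout per the bullet above: magic + original_size + region_count
// + {offset, length, data} × N. ASSUMED: little-endian fields, u64
// offsets/lengths, u32 count, NUL pad after the magic.
use std::io::{self, Write};

struct Region { offset: u64, data: Vec<u8> }

fn write_phcbt01<W: Write>(w: &mut W, original_size: u64, regions: &[Region]) -> io::Result<()> {
    w.write_all(b"PHCBT01\0")?;                             // magic
    w.write_all(&original_size.to_le_bytes())?;             // size of the source disk
    w.write_all(&(regions.len() as u32).to_le_bytes())?;    // region_count
    for r in regions {
        w.write_all(&r.offset.to_le_bytes())?;              // offset
        w.write_all(&(r.data.len() as u64).to_le_bytes())?; // length
        w.write_all(&r.data)?;                              // data
    }
    w.flush()
}

fn main() -> io::Result<()> {
    let regions = vec![Region { offset: 0x10_0000, data: vec![0xAB; 4096] }];
    let mut out = Vec::new();
    write_phcbt01(&mut out, 64 << 30, &regions)?; // 64 GiB source disk
    println!("{} bytes of PHCBT01 payload", out.len());
    Ok(())
}
```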
5. RAII guards — deterministic cleanup
Orphan Nutanix snapshot = capacity leak in the cluster. Orphan VG = zombie LUN target. Open iSCSI session = zombie device file on the FD. The backend prevents all three via a reverse-ordered chain of Rust Drop guards:
| Guard | Resource | Drop action |
|---|---|---|
| `SnapshotGuard` | Prism recovery point | Delete via API |
| `VolumeGroupGuard` | Temporary VG | Delete via API |
| `IscsiSessionGuard` | iSCSI session | `iscsiadm -m node --logout` |
| `CleanupGuard` | Catch-all | Catches Drop panics; logs errors but never propagates |
Drop ordering is load-bearing: iSCSI logout → VG delete → snapshot delete. Rust’s reverse-declaration drop order gives that for free, but inserting a guard between existing ones silently swaps the order — convention is encoded as a comment in back_up_vm.
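A minimal illustration of that convention: guards are declared in the opposite order of the required teardown, so reverse-declaration drop yields iSCSI logout → VG delete → snapshot delete. The guard names come from the table; the bodies are print stubs.

```rust
// Declaration-order convention from back_up_vm, as print stubs. Locals drop
// in reverse declaration order. Note the `_name` bindings: a bare `_` would
// drop the guard immediately instead of at end of scope.
struct SnapshotGuard;     // deletes the Prism recovery point on drop
struct VolumeGroupGuard;  // deletes the temporary VG on drop
struct IscsiSessionGuard; // runs `iscsiadm -m node --logout` on drop

impl Drop for SnapshotGuard     { fn drop(&mut self) { println!("3: delete snapshot"); } }
impl Drop for VolumeGroupGuard  { fn drop(&mut self) { println!("2: delete VG"); } }
impl Drop for IscsiSessionGuard { fn drop(&mut self) { println!("1: iSCSI logout"); } }

fn back_up_vm() {
    // DO NOT reorder: declare in the opposite order of the required teardown.
    let _snapshot = SnapshotGuard;
    let _vg = VolumeGroupGuard;
    let _session = IscsiSessionGuard;
    // ... clone VG, attach, stream disks ...
} // guards drop here, printing: 1, 2, 3

fn main() { back_up_vm(); }
```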
6. Inbound cross-restore — Proxmox/vSphere/Hyper-V → AHV
Detection at FNAME scan (first file of the job):
| FNAME prefix | Pipeline |
|---|---|
| `@proxmox/<vmid>/disks/*.raw` | raw → qcow2 (`qemu-img convert`) → Image Service upload |
| `@vsphere/<vm>/disks/*.vmdk` | vmdk → qcow2 → Image Service upload |
| `@hyperv/<vm>/disks/*.vhdx` | vhdx → qcow2 → Image Service upload |
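The prefix dispatch fits in a few lines of Rust. The prefixes are the ones from the table; the enum and function names are illustrative.

```rust
// Prefix dispatch per the table above.
#[derive(Debug, PartialEq)]
enum SourcePipeline { Proxmox, Vsphere, Hyperv, NativeAhv }

fn detect_pipeline(first_fname: &str) -> Option<SourcePipeline> {
    // The first FNAME of the job decides the conversion pipeline.
    if first_fname.starts_with("@proxmox/") { Some(SourcePipeline::Proxmox) }
    else if first_fname.starts_with("@vsphere/") { Some(SourcePipeline::Vsphere) }
    else if first_fname.starts_with("@hyperv/") { Some(SourcePipeline::Hyperv) }
    else if first_fname.starts_with("@nutanix/") { Some(SourcePipeline::NativeAhv) }
    else { None }
}

fn main() {
    assert_eq!(detect_pipeline("@proxmox/101/disks/disk-0.raw"), Some(SourcePipeline::Proxmox));
    assert_eq!(detect_pipeline("@nutanix/prod/uuid/vm-metadata.json"), Some(SourcePipeline::NativeAhv));
}
```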
VM config (.conf / .vmx / .vmcx XML) is translated to AHV vm_spec_v3:
| Source field | AHV target |
|---|---|
| Memory (MB) | resources.memory_size_mib |
| vCPUs | num_sockets × num_vcpus_per_socket |
| Firmware BIOS/UEFI | resources.boot_config.boot_type |
| SCSI/IDE disks | disk_list[].device_properties.device_type=DISK, adapter_type=SCSI |
| NIC MAC+VLAN | nic_list[].mac_address + subnet_reference via network_map |
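A hedged sketch of that translation using the serde_json crate. The JSON shape below is a simplified assumption based on the mapping table, not the complete Prism v3 VM schema.

```rust
// Illustrative build of the vm_spec_v3 fields from the table above.
// Requires serde_json = "1" in Cargo.toml.
use serde_json::{json, Value};

fn build_vm_spec(name: &str, memory_mib: u64, vcpus: u32, uefi: bool,
                 mac: &str, subnet_uuid: &str) -> Value {
    let boot_type = if uefi { "UEFI" } else { "LEGACY" };
    json!({
        "spec": {
            "name": name,
            "resources": {
                "memory_size_mib": memory_mib,
                "num_sockets": vcpus,      // simplest mapping: one vCPU per socket
                "num_vcpus_per_socket": 1,
                "boot_config": { "boot_type": boot_type },
                "disk_list": [{
                    "device_properties": {
                        "device_type": "DISK",
                        "disk_address": { "adapter_type": "SCSI", "device_index": 0 }
                    }
                }],
                "nic_list": [{
                    "mac_address": mac,
                    "subnet_reference": { "kind": "subnet", "uuid": subnet_uuid }
                }]
            }
        }
    })
}

fn main() {
    let spec = build_vm_spec("restored-web01", 8192, 4, true,
                             "50:6b:8d:aa:bb:cc",
                             "00000000-0000-0000-0000-000000000000");
    println!("{}", serde_json::to_string_pretty(&spec).unwrap());
}
```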
7. Vendor-neutral replication (PHCBT01 over TLS)
Unlike Nutanix Protection Policies, which require Nutanix at both ends, PodHeitor replication operates over PHCBT01-over-TLS on port 9848 — same format as the Proxmox/vSphere/Hyper-V plugins. The receiver can be Nutanix, Proxmox, or any host with a PodHeitor peer FD.
- Seed: initial full via the same backup path, marked as reference.
- Bitmap-push: periodic cycles read the CRT delta and send it via TLS to the receiver (one cycle is sketched after this list).
- Failover modes: planned, unplanned, test, undo, permanent.
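The bitmap-push cycle referenced in the list, sketched over a generic writer (in the real plugin this is the TLS stream to port 9848). `compute_changed_regions` is a stub standing in for the paginated CRT client, and the extent framing is abbreviated; the full PHCBT01 header appears in the section 4 sketch.

```rust
// One bitmap-push cycle over a generic writer; all names are illustrative.
use std::io::{self, Write};

struct Cycle { reference: String } // last acked recovery point

fn compute_changed_regions(_reference: &str, _current: &str) -> Vec<(u64, Vec<u8>)> {
    Vec::new() // placeholder: the paginated v4 CRT client would live here
}

impl Cycle {
    fn run<W: Write>(&mut self, tls: &mut W, current: String) -> io::Result<()> {
        // 1. Delta between the reference and the fresh recovery point.
        let delta = compute_changed_regions(&self.reference, &current);
        // 2. Ship each changed extent to the receiver.
        for (offset, data) in &delta {
            tls.write_all(&offset.to_le_bytes())?;
            tls.write_all(&(data.len() as u64).to_le_bytes())?;
            tls.write_all(data)?;
        }
        tls.flush()?;
        // 3. Rotate: the shipped snapshot becomes the next cycle's reference.
        self.reference = current;
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut wire = Vec::new(); // stand-in for the TLS connection
    let mut cycle = Cycle { reference: "rp-0001".to_string() };
    cycle.run(&mut wire, "rp-0002".to_string())?;
    println!("shipped {} bytes; new reference: {}", wire.len(), cycle.reference);
    Ok(())
}
```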
8. Disk-only and alternate-cluster restore
Two modes absent from BEE AHV v18:
- Disk-only restore: restores only `disk-N.raw` to an arbitrary device path on the FD host (no Image Service upload, no VM creation). Use case: forensics, single-file recovery via manual mount.
- Alternate-cluster restore: the `target_cluster` param redirects the restore to a different PE than the source. Combine with `restore_vm_name=` and `network_map=` to avoid IP collision. (Parameter resolution for both modes is sketched after this list.)
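A hypothetical sketch of how both modes could be resolved from restore parameters. `target_cluster` and `restore_vm_name` are the documented parameters; the enum, the function, and the /dev-path convention are illustrative.

```rust
// Hypothetical resolution of the two restore modes; not the plugin's API.
#[derive(Debug)]
enum RestoreTarget {
    DiskOnly { device_path: String },                // no Image Service, no VM
    Cluster { pe: String, vm_name: Option<String> }, // possibly a different PE
}

fn resolve_target(where_path: Option<&str>, target_cluster: Option<&str>,
                  restore_vm_name: Option<&str>, source_pe: &str) -> RestoreTarget {
    match where_path {
        // A block-device destination selects disk-only mode.
        Some(dev) if dev.starts_with("/dev/") =>
            RestoreTarget::DiskOnly { device_path: dev.to_string() },
        _ => RestoreTarget::Cluster {
            pe: target_cluster.unwrap_or(source_pe).to_string(),
            vm_name: restore_vm_name.map(str::to_string),
        },
    }
}

fn main() {
    println!("{:?}", resolve_target(Some("/dev/loop7"), None, None, "pe-src"));
    println!("{:?}", resolve_target(None, Some("pe-dr"), Some("web01-dr"), "pe-src"));
}
```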
9. Documented anti-patterns
- Don’t use `prism_insecure=true` in production. It exists for Nutanix CE (default self-signed cert). Import the PC CA into the FD trust store instead of bypassing verification.
- Don’t invert guard Drop order. Snapshot delete before iSCSI logout leaves zombie device files on the FD.
- Don’t run `proxy_mode=external` without an explicit `dsip=`. The fallback to `cluster_name` works in the lab but is fragile in production (DNS, multi-DSIP).
- Don’t run replication against a cluster with Protection Policies active on the same VM. Snapshot conflict; the current flow doesn’t auto-detect it.
10. License posture
The plugin ships under LicenseRef-PodHeitor-Proprietary. The backend is a standalone Rust binary — no Bacula AGPLv3 source is statically linked. The C++ shim is minimal (~120 LOC, constants-only) and dynamically links against the Bacula metaplugin framework.
Ready to evaluate?
Free 30-day trial for Nutanix AHV clusters (Prism Central pc.2024.3+ recommended, pc.2023.x supported via v3 fallback). We guarantee at least a 50% discount versus Bacula Enterprise, Veeam, or Commvault, with cross-restore and vendor-neutral replication included.
Heitor Faria — Founder, PodHeitor International
✉ [email protected]
☎ +1 (789) 726-1749 · +55 (61) 98268-4220 (WhatsApp)
🔗 PodHeitor Nutanix plugin page