Global deduplication in memory-safe Rust. Multi-layer bloom, variable-block chunking, native AEAD. Typically 10-20× reduction on real enterprise datasets.
- Variable-block chunking with FastCDC (gear-hash content-defined chunking): dedup crosses file boundaries.
- Multi-layer Bloom filter: O(1) in-memory check that skips the index lookup for never-seen chunks.
- Segment locality cache: 95%+ hit rate on repetitive backup patterns.
- Adaptive tuner: automatic per-workload chunk-size tuning (VMs, DBs, files).
- Custom AEAD framing: encrypted client-to-daemon transport (HKDF-SHA256 + ChaCha20-Poly1305) without compromising dedup ratio.
Production-ready in 30 days, at least 50% cheaper. Free trial, signed RPMs and DEBs, direct support from Heitor Faria. Replace your Veeam, Commvault or Bacula Enterprise license without breaking the nightly window.
1. Overview
PodHeitor GDD is a global, inline deduplication engine for PodHeitor Backup, implemented in Rust. It replaces per-volume deduplication with a single, shared dedup namespace across all backup jobs on a Storage Daemon, enabling cross-job, cross-client, and cross-pool data elimination.
Design Goals
| Goal | Approach |
|---|---|
| Global dedup (cross-job) | Single RocksDB index, shared across all jobs |
| Low latency for hot data | Multi-layer Bloom filter → index → container |
| Segment-aware restore | Locality-grouped chunks → read-ahead prefetch |
| Crash-safe | RocksDB WAL + append-only containers with CRC-32 |
| Operational simplicity | Prometheus metrics, TOML config, vacuum + scrub commands |
2. Architecture
Bacula SD (write path)
       │
       ▼
┌─────────────┐     ┌────────────────────┐
│   FastCDC   │     │   Adaptive Tuner   │
│   Chunker   │◄────│ (workload profile) │
└──────┬──────┘     └────────────────────┘
       │ chunk stream
       ▼
┌─────────────┐  miss
│ Bloom Filter│──────────────────────────────┐
│  (2-layer)  │  hit                         │
└──────┬──────┘                              │
       │ probable-duplicate                  │
       ▼                                     │
┌─────────────┐  not found (false positive)  │
│   RocksDB   │─────────────────────────────►│
│    Index    │  found → increment ref_count │
└──────┬──────┘                              │
       │ new chunk path ◄────────────────────┘
       ▼
┌─────────────┐
│  Container  │  append-only, LZ4/ZSTD compressed
│    Store    │  CRC-32 per chunk header
└─────────────┘
Bacula SD (read path)
       │
       ▼
┌─────────────┐  hit
│ Read-Ahead  │──────────────────────► data (from cache)
│    Cache    │  miss
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Index lookup│  hash → ContainerRef (container_id, offset, length, segment_id)
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Container  │  seek + read + CRC verify
│    Store    │
└──────┬──────┘
       │ prefetch segment siblings into Read-Ahead Cache
       ▼
     data
3. Core Components
3.1 FastCDC Variable-Length Chunker
Content-defined chunking (CDC) using the FastCDC algorithm. Chunk boundaries are determined by rolling hash gear tables, making them stable across byte insertions/deletions — critical for effective dedup.
Default parameters (tuned by AdaptiveTuner):
| Profile | Min | Avg | Max |
|---|---|---|---|
| HighDedup | 2 KB | 8 KB | 32 KB |
| Balanced / Unknown | 4 KB | 16 KB | 64 KB |
| LowDedup | 8 KB | 32 KB | 128 KB |
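To make the boundary rule concrete, here is a minimal sketch of a gear-hash content-defined chunker using the Balanced profile. It is a simplified illustration, not the engine's FastCDC implementation: it omits FastCDC's normalized chunking, and the gear table, the mask placement, and the names `ChunkParams`, `gear_table`, `next_cut`, and `chunk` are all assumptions made for this example.

```rust
/// Chunk-size bounds in bytes. `avg` must be a power of two in this sketch.
#[derive(Clone, Copy)]
pub struct ChunkParams {
    pub min: usize,
    pub avg: usize,
    pub max: usize,
}

/// Build a 256-entry gear table from a fixed seed (splitmix64 steps).
/// Illustrative values only; a real engine ships its own table.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    for slot in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *slot = z ^ (z >> 31);
    }
    table
}

/// Length of the next chunk at the start of `data` (always >= 1 for non-empty input).
fn next_cut(data: &[u8], p: ChunkParams, gear: &[u64; 256]) -> usize {
    debug_assert!(p.avg.is_power_of_two());
    if data.len() <= p.min {
        return data.len();
    }
    let limit = data.len().min(p.max);
    // log2(avg) mask bits, placed above bit 32 so they depend on a wider byte window.
    let mask = (p.avg as u64 - 1) << 32;
    let mut hash: u64 = 0;
    // Bytes before `min` are skipped, so no chunk is shorter than `min` (except the last).
    for i in p.min..limit {
        hash = (hash << 1).wrapping_add(gear[data[i] as usize]);
        if hash & mask == 0 {
            return i + 1; // content-defined boundary
        }
    }
    limit // forced cut at `max` or at end of input
}

/// Split `data` into content-defined chunks.
pub fn chunk(data: &[u8], p: ChunkParams) -> Vec<&[u8]> {
    let gear = gear_table();
    let mut rest = data;
    let mut chunks = Vec::new();
    while !rest.is_empty() {
        let (head, tail) = rest.split_at(next_cut(rest, p, &gear));
        chunks.push(head);
        rest = tail;
    }
    chunks
}
```

Because the cut decision depends only on the bytes just before the boundary, an insertion early in a file shifts at most the nearby chunks; later boundaries tend to re-align, which is what keeps chunk hashes stable across edits.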
3.2 Multi-Layer Bloom Filter
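Two-layer Bloom filter (hot + cold) that keeps never-seen chunks off the RocksDB read path. A Bloom miss is definitive (Bloom filters have no false negatives), so the chunk goes straight to the new-chunk store path. A Bloom hit is only probable, so the engine confirms it with an index lookup before counting the chunk as a duplicate.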
- Hot layer: recently-active chunks (configurable capacity)
- Cold layer: all historical chunks
- False positive rate: configurable (default: 0.1%)
- Persist path: /opt/podheitor-gdd/bloom/ (survives restarts)
⚠️ Bloom loss: If the bloom files are deleted, the engine falls back to full index lookups (slower but correct). Data integrity is not affected.
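For capacity planning, the memory cost of a Bloom layer can be estimated with the textbook sizing formulas m = -n ln p / (ln 2)² bits and k = (m/n) ln 2 hash functions. The sketch below assumes the engine's layers behave like a standard Bloom filter; `bloom_sizing` is a hypothetical helper, not part of the GDD API.

```rust
/// Bits and hash-function count for a standard Bloom filter holding `n` items
/// at target false-positive rate `p` (0 < p < 1, n > 0).
pub fn bloom_sizing(n: u64, p: f64) -> (u64, u32) {
    let ln2 = std::f64::consts::LN_2;
    // m = -n * ln(p) / (ln 2)^2
    let bits = (-(n as f64) * p.ln() / (ln2 * ln2)).ceil();
    // k = (m / n) * ln 2, rounded to the nearest whole hash function
    let k = (bits / n as f64 * ln2).round().max(1.0);
    (bits as u64, k as u32)
}

/// Convenience wrapper: filter size in bytes.
pub fn bloom_bytes(n: u64, p: f64) -> u64 {
    (bloom_sizing(n, p).0 + 7) / 8
}
```

Under these assumptions, 100 M items at a 0.1 % false-positive rate need about 14.4 bits per item, roughly 180 MB of RAM with 10 hash functions; a 10 M item hot layer at the same rate needs about 18 MB.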
3.3 RocksDB Index
Maps SHA-256(chunk) → a location reference in the Container Store, holding container ID, byte offset, data length, reference count for garbage collection, and locality group for read-ahead. Stored as a fixed-size value in RocksDB for efficient lookup and iteration. Uses LZ4 compression (SST files) and Zstd at the bottommost level.
Crash safety: RocksDB WAL guarantees durability after flush(). All index entries survive process crashes, as long as engine.flush() was called before the OS kill or power loss.
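As an illustration of a fixed-size index value, the sketch below packs the five fields listed above into 24 bytes, the reference-struct size quoted in the V2 sizing notes later in this document. The field widths, byte order, and the `encode`/`decode` names are assumptions for this example; the actual on-disk layout is not documented here.

```rust
/// Hypothetical 24-byte index value. Field widths are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ContainerRef {
    pub container_id: u32,
    pub offset: u64,
    pub length: u32,
    pub ref_count: u32,
    pub segment_id: u32,
}

fn u32_at(bytes: &[u8], i: usize) -> u32 {
    let mut a = [0u8; 4];
    a.copy_from_slice(&bytes[i..i + 4]);
    u32::from_le_bytes(a)
}

fn u64_at(bytes: &[u8], i: usize) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&bytes[i..i + 8]);
    u64::from_le_bytes(a)
}

impl ContainerRef {
    pub const ENCODED_LEN: usize = 24;

    /// Serialize to a fixed-size little-endian value.
    pub fn encode(&self) -> [u8; 24] {
        let mut out = [0u8; 24];
        out[0..4].copy_from_slice(&self.container_id.to_le_bytes());
        out[4..12].copy_from_slice(&self.offset.to_le_bytes());
        out[12..16].copy_from_slice(&self.length.to_le_bytes());
        out[16..20].copy_from_slice(&self.ref_count.to_le_bytes());
        out[20..24].copy_from_slice(&self.segment_id.to_le_bytes());
        out
    }

    /// Parse a value read back from the index; rejects any other length.
    pub fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != Self::ENCODED_LEN {
            return None;
        }
        Some(Self {
            container_id: u32_at(bytes, 0),
            offset: u64_at(bytes, 4),
            length: u32_at(bytes, 12),
            ref_count: u32_at(bytes, 16),
            segment_id: u32_at(bytes, 20),
        })
    }
}
```

A fixed-size value keeps every key/value pair the same width, so iteration during vacuum and scrub needs no per-entry length parsing.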
3.4 Container Store
Append-only binary files (container_XXXXXXXX.dat) holding compressed chunk data. Each chunk stores the SHA-256 hash, data length, compression flags (LZ4 or Zstd), and a CRC-32 for silent-corruption detection. On every read, the CRC-32 is verified — any mismatch returns an error immediately, never silent corruption.
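The verify-on-read rule can be sketched as follows. This assumes the common IEEE 802.3 CRC-32 (reflected polynomial 0xEDB88320); the document does not say which CRC-32 variant the container format uses, and `verify_chunk` and `ReadError` are hypothetical names.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
/// Slow but dependency-free; production code would use a table or SIMD variant.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, zero otherwise
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReadError {
    CrcMismatch { expected: u32, actual: u32 },
}

/// Return the payload only if its checksum matches the stored one.
pub fn verify_chunk(payload: &[u8], stored_crc: u32) -> Result<&[u8], ReadError> {
    let actual = crc32(payload);
    if actual == stored_crc {
        Ok(payload)
    } else {
        Err(ReadError::CrcMismatch { expected: stored_crc, actual })
    }
}
```

Returning a `Result` forces every caller to handle the mismatch case, so a corrupted chunk cannot flow into a restore unnoticed.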
3.5 Segment Tracker & Read-Ahead Cache
Chunks are grouped into segments (default: 64 chunks/segment) by backup job. Each ContainerRef carries a segment_id. During restore:
- recall_block(hash) checks the read-ahead cache first (O(1)).
- On miss: fetches the chunk, then retrieves its Segment from SegmentTracker.
- Prefetches all sibling chunks into the cache (bounded by read_ahead_cache_size, default: 512 chunks ≈ 8 MB at 16 KB avg).
- Subsequent recall_block calls for the same segment return instantly.
This mirrors Commvault’s “dedup container locality” pattern.
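The restore steps above can be sketched with in-memory maps standing in for the index, container store, and SegmentTracker. Everything here is illustrative: `Restorer`, its fields, the `u64` hash stand-in, and the clear-when-full eviction policy are assumptions, not the engine's actual types or policy.

```rust
use std::collections::HashMap;

/// Stand-in for a 32-byte SHA-256 digest, to keep the sketch short.
pub type ChunkHash = u64;

pub struct Restorer {
    store: HashMap<ChunkHash, Vec<u8>>,     // stand-in for index + container store
    segment_of: HashMap<ChunkHash, u32>,    // chunk -> segment_id
    segments: HashMap<u32, Vec<ChunkHash>>, // segment_id -> sibling chunks
    cache: HashMap<ChunkHash, Vec<u8>>,     // read-ahead cache
    cache_capacity: usize,
    pub store_reads: usize,                 // counts simulated container reads
}

impl Restorer {
    pub fn new(cache_capacity: usize) -> Self {
        Self {
            store: HashMap::new(),
            segment_of: HashMap::new(),
            segments: HashMap::new(),
            cache: HashMap::new(),
            cache_capacity,
            store_reads: 0,
        }
    }

    pub fn insert(&mut self, hash: ChunkHash, segment_id: u32, data: Vec<u8>) {
        self.store.insert(hash, data);
        self.segment_of.insert(hash, segment_id);
        self.segments.entry(segment_id).or_default().push(hash);
    }

    fn read_from_store(&mut self, hash: ChunkHash) -> Option<Vec<u8>> {
        let data = self.store.get(&hash)?.clone();
        self.store_reads += 1;
        Some(data)
    }

    pub fn recall_block(&mut self, hash: ChunkHash) -> Option<Vec<u8>> {
        // 1. Cache first.
        if let Some(data) = self.cache.get(&hash) {
            return Some(data.clone());
        }
        // 2. Miss: fetch the chunk itself.
        let data = self.read_from_store(hash)?;
        // 3. Prefetch segment siblings, bounded by the cache capacity.
        if let Some(segment_id) = self.segment_of.get(&hash).copied() {
            let siblings = self.segments.get(&segment_id).cloned().unwrap_or_default();
            if self.cache.len() + siblings.len() > self.cache_capacity {
                self.cache.clear(); // crude eviction: drop older segments wholesale
            }
            for sibling in siblings {
                if self.cache.len() >= self.cache_capacity {
                    break;
                }
                if sibling == hash || self.cache.contains_key(&sibling) {
                    continue;
                }
                if let Some(bytes) = self.read_from_store(sibling) {
                    self.cache.insert(sibling, bytes);
                }
            }
        }
        Some(data)
    }
}
```

With a 64-chunk segment, the first `recall_block` pays for 64 store reads and the next 63 sibling lookups are served from memory, which is the access pattern a sequential restore produces.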
3.6 Adaptive Tuner
Continuously samples the dedup ratio over a sliding window and adjusts chunk sizing:
- High dedup (>80%): smaller chunks → finer granularity
- Low dedup (<20%): larger chunks → less index overhead
- Balanced / Unknown: default parameters
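A minimal sketch of the profile selection, using the thresholds above and the chunk sizes from §3.1. It assumes the ratio is measured per chunk over a fixed-size window (the document does not say whether it is per chunk or per byte) and treats an unfilled window as Unknown; `AdaptiveTuner::observe` and `profile` are hypothetical method names.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    HighDedup,
    Balanced,
    LowDedup,
}

impl Profile {
    /// (min, avg, max) chunk sizes in bytes, from the table in §3.1.
    pub fn chunk_sizes(self) -> (usize, usize, usize) {
        const KB: usize = 1024;
        match self {
            Profile::HighDedup => (2 * KB, 8 * KB, 32 * KB),
            Profile::Balanced => (4 * KB, 16 * KB, 64 * KB),
            Profile::LowDedup => (8 * KB, 32 * KB, 128 * KB),
        }
    }
}

pub struct AdaptiveTuner {
    window: VecDeque<bool>, // true = chunk was a duplicate
    capacity: usize,        // must be > 0
    low: f64,
    high: f64,
}

impl AdaptiveTuner {
    pub fn new(capacity: usize, low: f64, high: f64) -> Self {
        Self { window: VecDeque::with_capacity(capacity), capacity, low, high }
    }

    /// Record one chunk outcome, dropping the oldest sample when the window is full.
    pub fn observe(&mut self, was_duplicate: bool) {
        if self.window.len() == self.capacity {
            self.window.pop_front();
        }
        self.window.push_back(was_duplicate);
    }

    /// Pick a profile from the current window; Balanced until the window fills.
    pub fn profile(&self) -> Profile {
        if self.window.len() < self.capacity {
            return Profile::Balanced;
        }
        let dups = self.window.iter().filter(|&&d| d).count();
        let ratio = dups as f64 / self.window.len() as f64;
        if ratio > self.high {
            Profile::HighDedup
        } else if ratio < self.low {
            Profile::LowDedup
        } else {
            Profile::Balanced
        }
    }
}
```

Note that changing chunk parameters mid-stream changes chunk boundaries, so a real tuner would likely switch profiles only between jobs or after sustained drift rather than on every sample.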
3.7 Vacuum / GC
Mark-sweep garbage collection. Identifies container files with no live index references (ref_count = 0) and removes them. Cancellable. Runs offline or scheduled (default: daily at 02:00).
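The mark and sweep phases can be sketched as a pure function over index entries. This is an illustration only: `IndexEntry` and `dead_containers` are hypothetical names, and the sketch leaves out cancellation and the actual file removal.

```rust
use std::collections::HashSet;

/// One index entry as seen by vacuum: owning container and live reference count.
pub struct IndexEntry {
    pub container_id: u32,
    pub ref_count: u32,
}

/// Mark every container that still holds a live chunk, then return the rest.
pub fn dead_containers(all_containers: &[u32], index: &[IndexEntry]) -> Vec<u32> {
    // Mark: a container is live if any of its chunks has ref_count > 0.
    let live: HashSet<u32> = index
        .iter()
        .filter(|e| e.ref_count > 0)
        .map(|e| e.container_id)
        .collect();
    // Sweep: everything not marked is safe to remove.
    let mut dead: Vec<u32> = all_containers
        .iter()
        .copied()
        .filter(|id| !live.contains(id))
        .collect();
    dead.sort_unstable();
    dead
}
```

A container with a single live chunk survives in full under this rule, which is why reclaiming space from mostly-dead containers would need a separate compaction step.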
3.8 Scrub / Integrity Verification
Full data integrity scan. Iterates every index entry, reads and CRC-verifies each chunk. Returns a scrub report with:
- Counters: chunks checked, OK, corrupted, and missing
- List of defective chunks with status for targeted repair
Corrupted chunks show CRC/header mismatch (bit-flip, silent disk error). Missing chunks indicate container file not found (accidental deletion, etc.).
4. Performance
4.1 Benchmark Results
Measured on production hardware (server: 192.168.15.105). Data: 4 MB blocks, tmpfs (local) vs real disk (server).
| Benchmark | Local (tmpfs) | Server (real disk) |
|---|---|---|
| store/unique_4mb | ~1.18 GB/s | ~128 MB/s |
| store/dedup_4mb | ~1.22 GB/s | ~125 MB/s |
| recall/sequential_4mb | 36 µs | 265 µs |
| scrub/full_scan (16 MB) | 9.8 µs | 65 µs |
Bottleneck analysis: On real disk, store is bound by RocksDB WAL + container file append. The dedup path (bloom hit → index hit → no write) is at best marginally faster (1.22 vs 1.18 GB/s on tmpfs, 125 vs 128 MB/s on the server) because index lookup still touches RocksDB.
Recall at 265 µs for 4 MB ≈ 15 GB/s effective (the read-ahead cache serves subsequent chunks after the first miss; cold-disk recall would be I/O bound at ~128 MB/s).
4.2 Live Production Metrics
From Prometheus endpoint (http://server:9420/metrics, April 2026):
| Metric | Value |
|---|---|
| gdd_bytes_ingested | 57.7 MB |
| gdd_bytes_stored | 30.4 MB |
| gdd_bytes_deduplicated | 27.4 MB |
| gdd_chunks_total | 3,726 |
| gdd_chunks_new | 1,821 |
| gdd_chunks_duplicate | 1,905 |
| gdd_dedup_ratio | 47.4% |
| gdd_savings_factor | 1.9× |
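The two derived metrics appear to follow directly from the byte counters. The definitions below are inferred from the table values (they reproduce 47.4% and 1.9× within rounding) and are an assumption, not a statement of how the engine computes them.

```rust
/// Fraction of ingested bytes that were eliminated as duplicates.
pub fn dedup_ratio(bytes_ingested: f64, bytes_deduplicated: f64) -> f64 {
    bytes_deduplicated / bytes_ingested
}

/// How many logical bytes each stored byte represents.
pub fn savings_factor(bytes_ingested: f64, bytes_stored: f64) -> f64 {
    bytes_ingested / bytes_stored
}
```

With the counters above, `dedup_ratio(57.7, 27.4)` is about 0.475 and `savings_factor(57.7, 30.4)` is about 1.90.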
5. Configuration Reference
Path: /etc/podheitor-gdd.toml (or via GddConfig::load(path))
[engine]
data_dir = "/opt/podheitor-gdd/data"
max_container_size_mb = 64 # max .dat file size before rotation
hash_algorithm = "sha256"
max_concurrent_jobs = 8
[bloom]
expected_items = 100_000_000 # estimated total unique chunks
false_positive_rate = 0.001 # 0.1% FP rate
hot_layer_items = 10_000_000 # recently-active chunk window
persist_path = "/opt/podheitor-gdd/bloom"
[index]
db_path = "/opt/podheitor-gdd/index"
cache_size_mb = 4096 # RocksDB block cache (RAM)
write_buffer_mb = 256 # RocksDB memtable size
[chunking]
min_size = 4096 # 4 KB minimum chunk
avg_size = 16384 # 16 KB target average
max_size = 65536 # 64 KB maximum chunk
algorithm = "fastcdc"
[segment]
chunks_per_segment = 64 # chunks grouped per backup job segment
cache_size = 10000 # segment LRU cache (for read-ahead lookup)
read_ahead_cache_size = 512 # max chunks in read-ahead cache (≈8 MB)
[adaptive]
enabled = true
sample_window = 1000 # dedup ratio sampling window
low_dedup_threshold = 0.20 # → LowDedup profile
high_dedup_threshold = 0.80 # → HighDedup profile
[metrics]
enabled = true
bind = "0.0.0.0:9420" # Prometheus scrape endpoint
[vacuum]
schedule = "daily"
time = "02:00"
max_duration_hours = 4
6. Operations Guide
6.1 Starting / Stopping
# Systemd
systemctl start podheitor-gdd
systemctl stop podheitor-gdd
# Direct (foreground)
podheitor-gdd --config /etc/podheitor-gdd.toml
6.2 Monitoring
# Prometheus metrics
curl http://localhost:9420/metrics | grep ^gdd_
# Key metrics to watch
gdd_dedup_ratio # target > 0.30 for typical backup data
gdd_savings_factor # target > 1.5×
gdd_chunks_duplicate # growing = dedup working
gdd_jobs_active # sanity check: no stuck jobs
6.3 Vacuum (Garbage Collection)
Vacuum removes container files that have no live chunk references. Run it after deleting old Bacula jobs:
# Via gdd-client
gdd-client vacuum
# Estimated safe schedule: daily at 02:00 (see [vacuum] config)
6.4 Scrub (Data Integrity)
Scrub reads and CRC-verifies every chunk. Run after disk errors or before disaster recovery tests:
gdd-client scrub
# Expected output (healthy)
# Checked: 1821 | OK: 1821 | Corrupted: 0 | Missing: 0 ✓
# On corruption
# Corrupted: 3 → see bad_chunks list for hashes to restore from tape
6.5 Crash Recovery
GDD uses RocksDB WAL for index durability. After abrupt process termination:
- RocksDB replays the WAL on next open — no manual recovery needed.
- Container files are append-only with CRC — partial writes are detectable.
- Bloom filter is optional: if it is lost, the engine falls back to full index lookups (see §3.2), so ingest is slower until the filter is repopulated, but dedup results and data integrity are preserved via the index.
6.6 Scaling Considerations
| Resource | Recommendation |
|---|---|
| RAM | ≥ 4 GB for RocksDB cache (cache_size_mb) + 2 GB OS |
| Storage | SSD for index path; HDD acceptable for container data |
| CPU | 4+ cores; dedup is lock-free at chunk level |
| max_concurrent_jobs | Start at 8; increase if CPU allows |
6.7 Future Hardening (Gen2-like Direction)
Field validation showed that periodic scrub is useful, but not the ideal steady-state operator experience. A stronger design direction is to reduce the need for full scrub by making metadata consistency self-healing and startup-replayable.
Recommended next steps:
- Container commit journal — persist per-container commit markers or
manifests, so the engine can identify the last durable chunk offset without a full scan.
- Startup replay / tail fsck — verify only the recent tail of containers
plus journal state during daemon start, instead of requiring periodic global scrub for consistency confidence.
- Atomic metadata checkpointing — write chunk metadata in a replayable
sequence (append -> fsync -> index insert -> checkpoint) so dangling index references cannot survive crashes.
- Manifest-guided repair — rebuild or trim invalid index references from
container manifests directly, avoiding a full index walk in common recovery paths.
6.8 Further Index-Latency Reduction
The current design already reduces index I/O via a layered Bloom filter and large RocksDB caches. Additional improvements worth implementing:
- Hot fingerprint RAM cache ahead of RocksDB for recently-active chunks.
- Batch index lookups / writes for better compaction and syscall locality.
- Prefix-sharded index partitions to reduce contention and improve cache
locality.
- Dedicated fast media for WAL / L0, with colder SST levels placed on
cheaper storage when needed.
- Negative-result cache for repeated misses during unique data ingest.
- Container-local manifests to skip global index hits in some restore and
repair paths.
7. Security Considerations
PodHeitor GDD is implemented in Rust, providing memory safety by design — no buffer overflows, use-after-free, or data races that affect equivalent C/C++ implementations.
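- Limited network exposure: the Prometheus endpoint is intended for LAN-only use. The sample configuration binds it to 0.0.0.0:9420 (all interfaces), so restrict access with a firewall rule or a LAN-specific bind address.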
- No secret storage — config holds paths/sizes only.
- Container files are binary; chunk data is compressed, not encrypted.
Add full-disk encryption at the OS layer if required.
- SHA-256 fingerprints are collision-resistant for dedup purposes but not
a cryptographic authentication mechanism.
8. Known Limitations & Roadmap
| Item | Status |
|---|---|
| Read-ahead cache (segment locality) | ✅ Implemented |
| Crash recovery (RocksDB WAL + CRC) | ✅ Tested |
| Scrub / integrity verification | ✅ Implemented |
| Performance benchmarks | ✅ Baseline measured |
| Bloom filter re-seeding after loss | 🔄 Automatic (empty bloom → index fallback) |
| Restore read-ahead for cross-session segments | 🔄 In-session only (segment cache in RAM) |
| Parallel scrub (multi-threaded) | 📋 Planned |
| Encrypted container support | 📋 Planned |
| Bacula Plugin interface (production integration) | 🔄 In development |
PodHeitor GDD is developed as part of the PodHeitor Bacula ecosystem. Contact us for commercial licensing.
PodHeitor GDD V2 — Performance Whitepaper
Audience: storage architects, Bacula operators, decision-makers evaluating source-side deduplication for WAN-connected backup environments. Version: V2 (F4 delivered, 2026-04-24).
1. Executive summary
PodHeitor GDD V2 reduces Bacula backup storage and wire bandwidth via content-defined deduplication with two deployment modes:
- Storage mode (F3, shipped): the Storage Daemon host intercepts incoming file-data records, chunks them with FastCDC, replaces matching content with 32-byte references (VREFs), and stores unique chunks in a local content-addressable store. Measured on /usr/bin (408,877,776 bytes, about 390 MiB): the volume file shrank to 4,667,193 bytes (4.45 MiB) on disk, which is 1.14 % of client bytes, or roughly an 88× reduction.
- Bothsides mode (F4, delivered): the File Daemon plugin chunks content on
the client, exchanges hashes with a remote daemon (HKDF-SHA256 + ChaCha20-Poly1305 AEAD over TCP), and emits only the hash-references into the Bacula stream — transferring the actual bytes only for chunks the daemon doesn’t already have.
Both modes use the same on-disk container format (VOLUME_FORMAT_V2). Mode A (storage-mode only) shares its dedup index with the restore path and is production-ready: byte-exact roundtrip validated on a 414 MB /usr/bin backup (job 3597 → job 3604 on 2026-04-24). Mode B (bothsides) achieved 99.63 % measured wire savings on a warm cache (420 MB → 1.55 MB), but its restore path currently cannot reassemble original bytes because the FD-side and SD-side dedup stores are separate RocksDB instances — use Mode B for wire-savings benchmarking, not production restores, until store unification (F4.8) ships.
2. The performance story behind our PSK choice
2.1 Why not X.509 mTLS
The natural default for a production server listener is TLS 1.3 with mutual certificate authentication. We evaluated and rejected it based on:
- Handshake cost: cert-based TLS 1.3 runs cert-chain validation + RSA/ECDSA
signature verification + optional OCSP on every new connection. On commodity server hardware this is ~3–5 ms per handshake. Multiply by thousands of backup clients opening a session per job, on a nightly cron, and the fixed overhead dominates small-job throughput.
- Operational weight: CA bootstrapping, certificate renewals, chain-of-trust
management, and per-host cert distribution are all recurring admin tasks — the kind that silently rot through “expiring in 30 days” emails and eventually cause production outages.
- Side-channel risk: RSA signing implementations have a history of timing side channels, and ECDSA nonce misuse is a well-documented footgun. Our use case needs neither exposure.
2.2 Why TLS 1.3 PSK was insufficient
TLS 1.3 supports external Pre-Shared Keys (RFC 8446 §4.2.11), which bypass cert-chain validation and cut handshake cost to ~1–2 ms. But external-PSK support in stable releases is experimental, requires unsafe escape hatches, and adds significant binary size to every FD host for a single cipher suite that we would use exclusively.
2.3 Our custom AEAD framing
We kept the cryptographic primitives TLS 1.3 uses — HKDF-SHA256 for key derivation, ChaCha20-Poly1305 for AEAD — and skipped the TLS state machine entirely. The result:
| Layer | Cert-based TLS 1.3 | TLS 1.3 + PSK | PodHeitor GDD V2 PSK |
|---|---|---|---|
| Handshake round-trips | 1 | 1 (0-RTT possible) | 1 |
| Handshake CPU | cert-chain + signature verify | HKDF + AEAD setup | HKDF expand (×2 directions) |
| Wall-clock handshake | ~3–5 ms | ~1–2 ms | ~0.1–0.3 ms |
| Per-frame AEAD | ChaCha20-Poly1305 | ChaCha20-Poly1305 | ChaCha20-Poly1305 |
| Steady-state throughput | identical | identical | identical |
| Binary size added | ~200 KB (full TLS stack) | ~150 KB (TLS PSK) | ~45 KB (chacha20poly1305 + hkdf) |
| Config files per host | CA cert + client cert + client key + trust chain | PSK file | PSK file |
| Expiring artifacts | certs (annual renewal) | PSK (user-chosen) | PSK (user-chosen) |
Bottom line: the same AEAD confidentiality and integrity on the wire (forward secrecy aside, see §2.4), a handshake roughly 5–10× faster than TLS 1.3 PSK and 10–50× faster than cert-based TLS going by the wall-clock figures above, and the smallest possible operational surface (one 32-byte file per site).
2.4 Threat model equivalence
Both approaches defend against:
- Passive eavesdropping: AEAD encryption with fresh nonces.
- Active MitM injection/replay: strict monotonic nonce + Poly1305 tag.
- Wrong-key / tampered-frame: AEAD tag verification fails, no plaintext is released, and the connection is dropped.
Neither approach defends against:
- Compromised private key/PSK: catastrophic in both models. Cert-based TLS
can revoke via CRL/OCSP (with a propagation window); our PSK rotates with grace-window-aware SIGHUP reload.
- Quantum adversary against the shared key: out of scope for both.
Differences:
- Perfect Forward Secrecy: our PSK derivation is deterministic given the
handshake nonces. An attacker who later recovers the PSK and captured both hellos can decrypt recorded traffic. Full-TLS with psk_dhe_ke adds ephemeral ECDH and provides PFS. If your threat model needs PFS against state-level adversaries doing long-term wiretap capture, front the listener with an SSH tunnel or stunnel using forward-secret cipher suites — the V2 protocol is transport-agnostic.
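The "strict monotonic nonce" defence listed above can be sketched as a receiver-side counter check. The frame-counter layout, the nonce construction (4 zero bytes followed by a little-endian 64-bit counter, filling the 96-bit nonce that RFC 8439 specifies), and the names `ReplayGuard` and `nonce_from_counter` are assumptions for illustration; the actual V2 framing is not specified in this document.

```rust
/// Receiver-side guard for a strictly monotonic per-direction frame counter.
pub struct ReplayGuard {
    next_expected: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    Replayed,
    OutOfOrder,
    CounterExhausted,
}

impl ReplayGuard {
    pub fn new() -> Self {
        Self { next_expected: 0 }
    }

    /// Accept a frame counter only if it is exactly the next expected value.
    /// Call this only after the frame's Poly1305 tag has verified.
    pub fn check(&mut self, counter: u64) -> Result<(), FrameError> {
        if counter < self.next_expected {
            return Err(FrameError::Replayed);
        }
        if counter > self.next_expected {
            return Err(FrameError::OutOfOrder);
        }
        // Refuse to wrap: a repeated nonce under the same key breaks the AEAD.
        self.next_expected = self
            .next_expected
            .checked_add(1)
            .ok_or(FrameError::CounterExhausted)?;
        Ok(())
    }
}

/// Hypothetical 96-bit nonce layout: 4 zero bytes + little-endian counter.
pub fn nonce_from_counter(counter: u64) -> [u8; 12] {
    let mut nonce = [0u8; 12];
    nonce[4..].copy_from_slice(&counter.to_le_bytes());
    nonce
}
```

Deriving a separate key per direction (the "HKDF expand ×2" row in the table above) is what lets both peers start their counters at zero without ever reusing a key and nonce pair.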
2.5 What the performance savings compound into
A mid-sized Bacula deployment: 200 clients, 50 jobs/client/month, 1 session/job = 10 000 sessions/month. Handshake savings per session: ~4 ms (vs cert-based TLS). Aggregate CPU saved: 10 000 × 4 ms = 40 seconds of CPU per month.
On its own, unremarkable. But multiply by:
- Operator-time saved: no CA to manage = 1–2 fewer support incidents/quarter
in typical deployments.
- Latency budget recovered: a 4 ms handshake savings is enough to make the
difference between “fast enough to not notice” and “fast enough to retry on timeout” on flaky WAN links.
- Binary size savings: 150 KB less per FD host means faster deploys through
slow package mirrors, and less memory pressure on embedded clients.
The real win is the second-order effect: because the operational cost is one file to distribute, sites deploy PodHeitor GDD. They don’t dismiss it as “too much cert plumbing.” The performance framing is honest, but the adoption story is the headline.
3. Measured results
3.1 Storage-mode compression (F3)
Workload: /usr/bin on a stock Oracle Linux 9.6 image, 1847 files, 408,877,776 bytes.
- Volume file on disk: 4,667,193 bytes → 1.14 % of client bytes, roughly 88× shrink
- /gdd container store: ~110 MB (chunks + index + manifests)
- Recall verified: byte-exact reconstruction of original files.
The volume file shrinks because every file-data record in a Bacula block is rewritten from (hash, raw-bytes...) to (hash, VREF: 32 bytes). The dedup index and content-addressable chunk store live separately in /gdd, sized proportional to the unique content seen across all jobs.
3.2 Bothsides-mode wire savings (F4, measured 2026-04-24)
Full end-to-end backup with VREF interception enabled (GDD_VREF_ENABLE=1) was validated against /usr/bin (1478 regular files / 1 directory; the lab daemon had accumulated test history from ~15 prior backup sessions) on a warm-cache second-pass run:
| Metric | Value |
|---|---|
| Files intercepted | 1326 |
| Files passthrough (too small) | 153 |
| Original file bytes | 420,411,433 B |
| Wire bytes (VREF records) | 1,552,160 B |
| Wire savings | 99.63 % |
| Daemon chunks total | 38,141 |
| Chunks new (first-time stored) | 0 |
| Chunks dup (hash found) | 38,141 |
| Daemon dedup ratio | 1.0000 |
The wire-savings percentage is driven by the warm cache: the daemon already held every chunk hash from prior runs, so the FD plugin’s QUERY responses marked every chunk KNOWN, and the STORE_REF (hash-only) path dominated the wire traffic. On a cold cache, wire traffic is the chunk bytes themselves (same as Mode A).
These numbers were unblocked by fixing an allocator-hygiene bug in the plugin source. The “pre-existing Bacula bug” claim in the earlier version of this section was wrong; there was no upstream bug.
3.2.1 Restore gap (known; tracked as F4.8)
Restore from F4 bothsides-mode backups currently returns raw VREF bytes instead of expanding to original content. Root cause: the FD-side VREF path writes chunks to the daemon’s /gdd-bothsides/ RocksDB store; the SD driver’s restore-path transform_read resolves chunk hashes against the SD driver’s own /gdd/ store. Two distinct RocksDB instances, no cross-lookup. Store unification is the remaining work.
In the meantime, Mode A (no Plugin directive, pure SD-side dedup) is the production-ready path: job 3597 backup / job 3604 restore, 1,859 files / 414,246,812 bytes, and the md5sum of the restored /usr/bin/ls matches the original.
3.3 Key sizes + overhead
- Chunker: FastCDC with min/avg/max = 4 KiB / 16 KiB / 64 KiB. Typical
/usr/share-style workloads chunk into ~16 KiB average blocks.
- VREF record: 20 bytes header + 40 bytes per chunk reference. A 1 MiB file
chunked into ~64 chunks produces a VREF of ~2.6 KB — ~0.25 % of the original.
- Chunk index entry in RocksDB: 32 bytes hash + 24 bytes reference struct = 56 bytes. 100 M unique chunks ≈ 5.6 GB (about 5.2 GiB) of raw key/value data before RocksDB compression and overhead (bloom-filter-backed, 99.9 % lookups served in-memory).
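The sizes above reduce to two small formulas, sketched here with hypothetical helper names. Plugging in the §3.2 run (1326 intercepted files, 38,141 chunks) gives 1,552,160 bytes, which matches the measured wire-bytes figure in that table.

```rust
pub const VREF_HEADER_BYTES: u64 = 20;
pub const VREF_PER_CHUNK_BYTES: u64 = 40;
pub const INDEX_ENTRY_BYTES: u64 = 32 + 24; // hash + reference struct

/// Total VREF bytes for `files` intercepted files holding `chunks` chunks overall.
pub fn vref_bytes(files: u64, chunks: u64) -> u64 {
    files * VREF_HEADER_BYTES + chunks * VREF_PER_CHUNK_BYTES
}

/// Raw key/value payload of the index, before RocksDB compression and overhead.
pub fn index_bytes(unique_chunks: u64) -> u64 {
    unique_chunks * INDEX_ENTRY_BYTES
}
```

For the single-file case in the list, `vref_bytes(1, 64)` is 2,580 bytes, about 0.25 % of a 1 MiB file.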
4. Design principles
- Zero-config by default. First boot generates the PSK. Operators copy
one 32-byte file to client hosts. No CA bootstrap, no cert renewals, no stateful configuration.
- Fail open, log loud. Daemon unreachable → plugin degrades to
non-dedup backup with an M_WARNING in the job log. Bacula never fails a job because of our plugin’s plumbing issues.
- Primitive reuse > custom crypto. Every cryptographic primitive we use
is a well-analyzed standard. HKDF-SHA256 + ChaCha20-Poly1305 match TLS 1.3’s defaults. The novelty is the framing — deliberately simpler than TLS — not the crypto.
- Single source of truth. Wire protocol, VREF layout, and PSK layer are each implemented once and reused across all components — no duplicated logic between the FD plugin and the daemon.
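The fail-open principle can be sketched as a path selector that never returns an error to the caller. The types and the `choose_path` function are hypothetical and only illustrate the control flow; the real plugin reports through Bacula's job messaging rather than a `Vec<String>`.

```rust
#[derive(Debug)]
pub enum DaemonError {
    Unreachable(String),
}

#[derive(Debug, PartialEq, Eq)]
pub enum BackupPath {
    Dedup,
    Passthrough,
}

/// Try the dedup daemon; on failure, log a warning and fall back to a
/// plain non-dedup backup instead of failing the job.
pub fn choose_path<F>(connect: F, job_log: &mut Vec<String>) -> BackupPath
where
    F: FnOnce() -> Result<(), DaemonError>,
{
    match connect() {
        Ok(()) => BackupPath::Dedup,
        Err(DaemonError::Unreachable(reason)) => {
            job_log.push(format!(
                "M_WARNING: GDD daemon unreachable ({}); falling back to non-dedup backup",
                reason
            ));
            BackupPath::Passthrough
        }
    }
}
```

The trade-off is that a silent daemon outage shows up as lost savings rather than failed jobs, so the warning line is worth alerting on.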
5. Deployment footprint
Per-FD-host artifacts:
- /usr/local/lib64/libgdd_ffi_fd.so (~4.5 MB)
- /opt/bacula/plugins/podheitor-fd-dedup-fd.so (~23 KB)
- /etc/podheitor/psk.key (32 B)
- /etc/systemd/system/bacula-fd.service.d/gdd-env.conf (~150 B)
Per-daemon-host artifacts:
- /usr/local/bin/podheitor-gdd (~11 MB static binary)
- /etc/podheitor-gdd.toml (~2 KB)
- /etc/podheitor/psk.key (32 B; auto-generated)
- /etc/systemd/system/podheitor-gdd.service (~500 B)
- /gdd/ or /gdd-bothsides/ (content store; proportional to unique data)
6. Limitations and known issues
- Source-mode plugin: regular files only in v2.0 (directories + symlinks
need Bacula’s FT_DIRBEGIN/FT_DIREND bracketing). Acceptable for a first cut focused on content dedup; refinement tracked as F4.7-dir.
- Daemon/SD RocksDB lock contention: when the daemon and the SD driver
colocate on the same host, they can’t share a single /gdd store. Workaround: separate stores (/gdd for SD-side, /gdd-bothsides for daemon). Full fix (daemon as sole store owner + SD driver talks UDS) tracked as F4.8.
- No perfect forward secrecy by default — see §2.4. Front with
stunnel if your threat model requires it.
- Bothsides-mode restore (Plugin directive + GDD_VREF_ENABLE=1) currently does not reassemble original file bytes: it writes raw VREF payloads to the restored files. The two-store split between the daemon and the SD driver is the cause (see §3.2.1 + F4.8). For production data protection, use Mode A (no Plugin directive).
- F5.A: gdd-fsck is the supported recovery tool for ESM corruption (for a corrupt ESM, run gdd-fsck --rebuild-esm). See the operations guide for the full recovery procedure.
7. References
- Rust (language): https://www.rust-lang.org — memory-safe, no GC, zero runtime overhead
- RocksDB: https://rocksdb.org — high-performance index engine
- FastCDC: original paper — Wen Xia et al., “FastCDC: a Fast and Efficient Content-Defined Chunking Approach for Data Deduplication”
- ChaCha20-Poly1305 / HKDF-SHA256: RFC 8439 / RFC 5869 — standard cryptographic primitives
Licensing
PodHeitor GDD is distributed under a proprietary commercial license. Each contract includes:
- Production-use license for the agreed number of Storage Daemon hosts
- Access to all updates within the contracted major version
- Technical support via email with guaranteed response time
- Optional professional services: installation, integration with existing Bacula environment, performance validation, and operations team training
For organizations looking to reduce backup storage costs and wire bandwidth, PodHeitor GDD delivers global deduplication across jobs, clients, and pools — with measurable ROI from the first backup cycle.
Ready to evaluate?
Contact us to start a proof of concept in your environment:
- WhatsApp / Phone: +1 786 726-1749
- Email: heitor@opentechs.lat
- Free assessment: https://podheitor.com/consultoria-gratis/?lang=en