AEAD envelope

Steganography hides that a message exists. AEAD (authenticated encryption with associated data) hides what it says — and cryptographically binds the envelope's metadata so a tamperer can't swap scheme, density, or body length without the recipient noticing. rsteg's default is XChaCha20-Poly1305 sealed with a key derived from the user's passphrase via Argon2id.

What's in the envelope

offset  size  field         notes
0       1     kdf_id        2 = Argon2id
1       1     kdf_version   1 = OWASP 2023 params
2       16    salt          fresh OS-RNG bytes per seal
18      24    nonce         fresh OS-RNG bytes per seal
42      N     ciphertext    XChaCha20 keystream XOR plaintext; len == plaintext_len
42+N    16    poly1305_tag  MAC over ciphertext + AAD

Overhead: 58 bytes per payload regardless of plaintext size. The whole thing becomes the body of a PayloadHeader-framed message with flags.encrypted = 1 and crypto_fourcc = b"XCA1".
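The layout can be sketched as a slicing routine. This is a minimal illustration rather than rsteg's actual parser: the Envelope struct, the constant names, and split_envelope are hypothetical, and only the offsets and sizes come from the table.

```rust
// Field sizes taken from the envelope table; the names are illustrative.
const SALT_LEN: usize = 16;
const NONCE_LEN: usize = 24;
const PREAMBLE_LEN: usize = 1 + 1 + SALT_LEN + NONCE_LEN; // 42
const TAG_LEN: usize = 16;
const OVERHEAD: usize = PREAMBLE_LEN + TAG_LEN; // 58

/// Borrowed view of one envelope (hypothetical type, not rsteg's API).
#[derive(Debug, PartialEq)]
struct Envelope<'a> {
    kdf_id: u8,
    kdf_version: u8,
    salt: &'a [u8],
    nonce: &'a [u8],
    ciphertext: &'a [u8],
    tag: &'a [u8],
}

/// Split raw bytes into fields. Returns None when the input is shorter
/// than the fixed 58-byte overhead (an empty ciphertext is allowed).
fn split_envelope(bytes: &[u8]) -> Option<Envelope<'_>> {
    if bytes.len() < OVERHEAD {
        return None;
    }
    let salt_end = 2 + SALT_LEN; // 18
    let tag_start = bytes.len() - TAG_LEN; // 42 + N
    Some(Envelope {
        kdf_id: bytes[0],
        kdf_version: bytes[1],
        salt: &bytes[2..salt_end],
        nonce: &bytes[salt_end..PREAMBLE_LEN],
        ciphertext: &bytes[PREAMBLE_LEN..tag_start],
        tag: &bytes[tag_start..],
    })
}
```

Because the ciphertext length is simply the total length minus 58, the envelope needs no length field of its own.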

Seal flow

Inputs:
  passphrase      Zeroizing<Vec<u8>>, from stdin
  salt            16 B from OS RNG
  plaintext body  N bytes, arbitrary
  nonce           24 B from OS RNG, per seal
  AAD (74 B)      outer PayloadHeader (32 B) ‖ inner crypto-header (42 B)

Key derivation: Argon2id with m=65536 (64 MiB), t=3, p=1 turns the passphrase and salt into a 32 B key. The key is held in a Zeroizing buffer and drops on scope exit.

XChaCha20-Poly1305 seal: takes the key, nonce, plaintext, and AAD. XChaCha20 keystream ⊕ plaintext gives the ciphertext (N B); Poly1305 over (ciphertext ‖ AAD) gives the tag (16 B). It is a stream cipher, so there is no block padding.

Output: metadata (42 B), ciphertext (N B), tag (16 B). The AAD binds into the tag: any bit flip in AAD or ciphertext breaks Poly1305 verification and returns BadPassphrase.

Final step: the 42-byte preamble, the N-byte ciphertext, and the 16-byte tag (58 + N bytes in total) become the "body" of the outer PayloadHeader.

Why Argon2id (not PBKDF2 or scrypt)

PBKDF2 is HMAC in a loop — CPU-bound. A modern GPU or ASIC can run tens of millions of PBKDF2-SHA256 candidates per second, because HMAC uses almost no memory. A 600 000-iteration PBKDF2 defence (OWASP 2023) collapses to minutes against a moderately-resourced attacker with a dictionary.

Memory-hard KDFs require the attacker to allocate large per-guess working memory that can't be amortized across guesses. scrypt was the first mainstream design; Argon2 (RFC 9106) is the PHC winner that superseded it.

Within Argon2, the -id variant is the OWASP default. Argon2d has side-channel leaks (its memory access pattern depends on secrets; fine on a server, bad on a shared client). Argon2i resists side channels but has weaker time-memory tradeoffs. Argon2id uses Argon2i-style, data-independent addressing for the first half of the first pass over memory and Argon2d-style, data-dependent addressing for the rest, which gives it most of the benefits of both for password hashing.

parameter  value            note
m_cost     65536            64 MiB memory per derivation; dominates cost
t_cost     3                three iterations (passes over memory)
p_cost     1                single-lane; we don't parallelize a user password
salt       16 B OS entropy  distinct per-file; prevents rainbow tables
output     32 B             XChaCha20 key

At 64 MiB × 3 passes on a modern CPU, one KDF call takes ~300–500 ms. A brute-force attacker pays that per candidate: a 10⁶-word dictionary costs roughly 3.5 to 6 days of compute on one machine. Adding machines divides the wall-clock time, but every parallel guess still needs its own 64 MiB. That's the right ballpark for file-at-rest steganography.
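The arithmetic behind that estimate can be sketched in a few lines. The function names are hypothetical, and the model assumes a fixed per-guess cost and perfect scaling across machines, both of which are simplifications.

```rust
/// Argon2's m_cost is expressed in KiB, so 65536 KiB is 64 MiB.
fn m_cost_to_mib(m_cost_kib: u64) -> u64 {
    m_cost_kib / 1024
}

/// Wall-clock milliseconds to try every candidate, assuming each guess
/// costs `ms_per_guess` and the work splits evenly across `machines`.
fn brute_force_ms(candidates: u64, ms_per_guess: u64, machines: u64) -> u64 {
    candidates * ms_per_guess / machines
}

/// Convert milliseconds to days (86,400,000 ms per day).
fn ms_to_days(ms: u64) -> f64 {
    ms as f64 / 86_400_000.0
}
```

With 10⁶ candidates on one machine, 300 ms per guess comes to about 3.5 days and 500 ms per guess to about 5.8 days.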

Why XChaCha20-Poly1305 (not AES-GCM)

AES-GCM is a fine AEAD. It's fast on CPUs with AES-NI, it's standardized, it's what TLS uses. But it has a nonce-reuse cliff: the same (key, nonce) pair with different plaintexts leaks the XOR of the plaintexts and destroys the MAC. GCM nonces are 96 bits; birthday-collision on random 96-bit values shows up around 2⁴⁸ messages — reachable for a busy server, not reachable for a steganography tool, but not robust to buggy callers.

XChaCha20 uses a 192-bit extended nonce. Birthday bound is 2⁹⁶ — practically unreachable. You can generate a fresh random nonce per seal from the OS RNG with no counter, no state, no collision worry. That's the right property for a library that might be called from a shell pipeline with no persistent storage.
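The two nonce sizes can be compared with the usual birthday approximation p ≈ n² / 2^(b+1), where n is the number of random nonces and b the nonce width in bits. The sketch below works in the log2 domain to stay in integer arithmetic; the function name is hypothetical, and the approximation only holds while the result is well below zero.

```rust
/// log2 of the approximate probability that at least two of 2^log2_n
/// random nonces collide, for nonces that are `nonce_bits` wide.
/// Uses p ≈ n^2 / 2^(bits + 1), i.e. log2(p) ≈ 2*log2(n) - bits - 1.
fn log2_collision_prob(log2_n: i32, nonce_bits: i32) -> i32 {
    2 * log2_n - nonce_bits - 1
}
```

For 2³² seals under one key, this gives about 2⁻³³ with a 96-bit nonce and about 2⁻¹²⁹ with a 192-bit nonce. The estimate approaches one half (log2 of -1) at 2⁴⁸ and 2⁹⁶ messages respectively, which is where the birthday bounds quoted above come from.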

XChaCha20 is also a stream cipher, so there's no block padding and no padding-oracle class of bug. Poly1305 is a one-time universal-hash MAC with a 128-bit tag, computed over the AAD and the ciphertext together with their lengths, and verified in constant time via subtle::ConstantTimeEq in the RustCrypto implementation. Four properties together (large nonce, no padding, fast on commodity CPUs, audited implementation) make it the conservative pick for this use case.
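To illustrate what constant-time verification means, here is a std-only sketch of a tag comparison with no early exit. It is not the RustCrypto code, and tags_equal is a hypothetical name. A compiler is free to optimize a hand-rolled loop like this, so real code should rely on a vetted crate such as subtle rather than on this pattern.

```rust
/// Compare two 16-byte tags without returning at the first mismatch:
/// each byte pair is XORed and the differences are OR-accumulated,
/// so the amount of work does not depend on where the tags differ.
fn tags_equal(a: &[u8; 16], b: &[u8; 16]) -> bool {
    let diff = a
        .iter()
        .zip(b.iter())
        .fold(0u8, |acc, (x, y)| acc | (x ^ y));
    diff == 0
}
```

A naive == on slices may stop at the first differing byte, which can leak through timing how much of a forged tag was correct.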

The AAD binds the outer header into the tag

The associated data for Poly1305 is:

aad = outer_payload_header (32 B)
    || inner_crypto_header (42 B: kdf_id, kdf_version, salt, nonce)
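A minimal sketch of that concatenation, using fixed-size arrays so that the 74-byte length is checked at compile time. The constant names and build_aad are hypothetical; only the sizes and the ordering come from the formula above.

```rust
const OUTER_HEADER_LEN: usize = 32;
const INNER_HEADER_LEN: usize = 42;
const AAD_LEN: usize = OUTER_HEADER_LEN + INNER_HEADER_LEN; // 74

/// Concatenate the outer payload header and the inner crypto header
/// (kdf_id, kdf_version, salt, nonce) into the associated data.
fn build_aad(
    outer: &[u8; OUTER_HEADER_LEN],
    inner: &[u8; INNER_HEADER_LEN],
) -> [u8; AAD_LEN] {
    let mut aad = [0u8; AAD_LEN];
    aad[..OUTER_HEADER_LEN].copy_from_slice(outer);
    aad[OUTER_HEADER_LEN..].copy_from_slice(inner);
    aad
}
```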

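This is the whole point: because the outer header and the crypto header are both authenticated, a tamperer cannot swap the scheme, density, or body length, or substitute a different salt or nonce, without failing tag verification.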

Open flow — and why it returns a single error

open(ciphertext, passphrase, outer_header) runs the seal flow in reverse: parse the 42-byte AEAD preamble, reject unknown kdf_id/kdf_version, reconstruct AAD exactly, run Argon2id with the parsed salt, decrypt-and-verify with XChaCha20-Poly1305.

Any failure (wrong passphrase, tampered ciphertext, tampered AAD, truncated input, tampered nonce) returns the same error, Error::BadPassphrase. No subdivision. This is the spec 06 invariant: the caller cannot distinguish "you typed the password wrong" from "someone flipped a bit in the header". Exposing that distinction would give an adversary an oracle for whether a tamper landed on an authenticated region.
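One way to enforce a single-error invariant like this in Rust is to keep the detailed causes in a private type and collapse them at the API boundary. The sketch below assumes hypothetical names (Failure, check_preamble, open_checks) and uses a boolean in place of the real decrypt-and-verify step; only Error::BadPassphrase and the kdf_id/kdf_version values come from the text.

```rust
/// The only error callers ever see.
#[derive(Debug, PartialEq)]
enum Error {
    BadPassphrase,
}

/// Internal failure causes (hypothetical; never exposed to callers).
#[derive(Debug)]
enum Failure {
    Truncated,
    UnknownKdf,
    TagMismatch,
}

/// Every internal cause collapses to the same public error.
impl From<Failure> for Error {
    fn from(_: Failure) -> Self {
        Error::BadPassphrase
    }
}

/// Length and KDF checks on the raw envelope bytes.
fn check_preamble(bytes: &[u8]) -> Result<(), Failure> {
    if bytes.len() < 58 {
        return Err(Failure::Truncated);
    }
    if bytes[0] != 2 || bytes[1] != 1 {
        return Err(Failure::UnknownKdf);
    }
    Ok(())
}

/// Stand-in for the open path: `tag_ok` replaces the real
/// XChaCha20-Poly1305 verification result.
fn open_checks(bytes: &[u8], tag_ok: bool) -> Result<(), Error> {
    check_preamble(bytes)?; // `?` converts Failure into Error
    if !tag_ok {
        return Err(Failure::TagMismatch.into());
    }
    Ok(())
}
```

Keeping Failure private means a later refactor cannot accidentally surface a more specific error without changing the public Error type.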

What is not protected

Steganographic concealment: the AEAD envelope still has a fixed shape, a 42-byte preamble followed by N + 16 bytes of ciphertext and tag. A strong steganalyser with access to the original, unmodified cover can still detect that something has been written to the LSB plane, and no AEAD fixes that. It's the permuted walk plus a reasonable density choice that handles concealment. AEAD handles confidentiality and tamper resistance given concealment.

Post-compromise: if the passphrase leaks, every past file encrypted with that passphrase is recoverable. rsteg does not implement forward secrecy — which would require per-file asymmetric crypto and is out of scope for a local-file tool.