Every rsteg-written carrier starts its embedded bitstream with a fixed 32-byte frame. The frame answers four questions for the extractor: "is this an rsteg file?", "which scheme was used?", "how many body bytes follow?", and "was it encrypted?" — without which extract would have to guess.
| offset | size | field | role |
|---|---|---|---|
| 0 | 4 | magic | b"RSTG"; false-positive filter on extract |
| 4 | 1 | version | wire version, current = 1 |
| 5 | 1 | flags | bit0 encrypted, bit1 compressed, bit2 permuted |
| 6 | 4 | crypto_fourcc | AEAD scheme id (e.g. b"XCA1"); all-zero for plaintext |
| 10 | 4 | scheme_fourcc | carrier + walk (BLSL, BLSP, WLSL, …) |
| 14 | 1 | density | bits written per sample, 1..=4 |
| 15 | 1 | reserved | must be zero |
| 16 | 4 | body_len | u32 big-endian length of body bytes that follow |
| 20 | 4 | body_crc32 | CRC32-IEEE of plaintext body; zero when encrypted |
| 24 | 8 | reserved | all zero, preserves 32-byte frame for future extensions |
Source of truth: crates/rsteg-core/src/header.rs.
All multi-byte integers are big-endian — picked because RSTG
reads left-to-right in a hex dump and because network byte order is the
default when there's no good reason to pick otherwise.
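As a small illustration of the byte order, here is how a u32 length would serialize using Rust's standard `to_be_bytes` and `from_be_bytes`. The helper names `body_len_bytes` and `body_len_from` are made up for this sketch and are not taken from the crate.

```rust
/// Hypothetical helper: serialize a body length the way the frame stores it.
fn body_len_bytes(len: u32) -> [u8; 4] {
    // Most significant byte first, so 258 (0x0000_0102) shows up as
    // 00 00 01 02 in a hex dump.
    len.to_be_bytes()
}

/// Hypothetical inverse: read the length back from four frame bytes.
fn body_len_from(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}
```

The magic follows the same left-to-right reading: b"RSTG" is the byte sequence 52 53 54 47.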
Build a header with the form below. The page sends a POST request to
/api/algo/header-encode, which calls
PayloadHeader::encode() on the server and returns the 32 bytes
plus field metadata. Every field is colored both in the hex and in the
table so you can line them up visually.
b"RSTG" is a four-byte tag checked first on every extract. It doesn't provide any security, since an attacker who knows the format can reproduce the magic, but it cheaply rules out the overwhelming majority of carrier files that aren't rsteg files. Without it, every extract would have to run the whole rest of the decode pipeline and probably fail deep inside, producing confusing errors.
A 1-byte wire version. Currently 0x01. decode()
rejects anything else with Error::HeaderBadVersion. This gives us
a clean upgrade path: version 2 can add fields by co-opting the reserved
bytes, and old readers will refuse to misinterpret them.
A 1-byte bitfield:
- bit0 FLAG_ENCRYPTED: the body is an AEAD ciphertext; see the AEAD page.
- bit1 FLAG_COMPRESSED: reserved for phase 3; not written yet.
- bit2 FLAG_PERMUTED: embed/extract use passphrase-seeded shuffle order.
- bits 3..7: must be zero; decoder rejects with HeaderReservedBitsSet.
Rejecting unknown flags on read is deliberate. A future writer adding
bit3 must also bump the version; a zero-policy on reserved bits
catches accidental corruption that happens to land in a reserved-bit position.
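A minimal sketch of that reserved-bit check follows. The flag names and bit positions come from the list above; the `check_flags` function and the `FlagError` enum are hypothetical stand-ins, since the real decoder's signature is not shown here.

```rust
// Bit positions as documented: bit0, bit1, bit2.
const FLAG_ENCRYPTED: u8 = 1 << 0;
const FLAG_COMPRESSED: u8 = 1 << 1;
const FLAG_PERMUTED: u8 = 1 << 2;
const KNOWN_FLAGS: u8 = FLAG_ENCRYPTED | FLAG_COMPRESSED | FLAG_PERMUTED;

/// Hypothetical error type; only the variant name appears in the text.
#[derive(Debug, PartialEq)]
enum FlagError {
    HeaderReservedBitsSet,
}

/// Hypothetical helper: accept the flags byte only if bits 3..7 are clear.
fn check_flags(flags: u8) -> Result<u8, FlagError> {
    // Mask away the three known bits; anything left is a reserved bit.
    if flags & !KNOWN_FLAGS != 0 {
        return Err(FlagError::HeaderReservedBitsSet);
    }
    Ok(flags)
}
```

Note that a flipped bit in positions 0 to 2 passes this check; only the reserved positions are covered.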
Two orthogonal identifiers. scheme_fourcc tells the format
adapter how to walk samples (which carrier and linear vs permuted — e.g.
BLSL = BMP LSB linear). crypto_fourcc tells the
crypto registry which seal/open to run; it's all-zero for plaintext payloads.
Separating them lets every carrier × walk × crypto combination compose freely.
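The two identifiers could be modelled roughly as below. This is a sketch under assumptions: encode() reads `scheme_fourcc.0`, which suggests a tuple struct around four bytes, but the derive list, the `as_text` method, and the `is_plaintext` helper are invented for illustration.

```rust
/// Hypothetical newtype around the four scheme bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SchemeFourcc([u8; 4]);

impl SchemeFourcc {
    /// Render the tag for logs, replacing non-printable bytes with '?'.
    fn as_text(&self) -> String {
        self.0
            .iter()
            .map(|&b| if b.is_ascii_graphic() { b as char } else { '?' })
            .collect()
    }
}

/// An all-zero crypto tag marks a plaintext payload.
fn is_plaintext(crypto_fourcc: [u8; 4]) -> bool {
    crypto_fourcc == [0u8; 4]
}
```

Keeping the scheme tag and the crypto tag as separate values means neither lookup needs to know about the other.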
One byte in 1..=4. Determines how many low bits per sample are
overwritten, which in turn determines capacity and perceptual delta. The
decoder enforces the range and returns
Error::DensityOutOfRange for anything else.
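To make the capacity and distortion trade-off concrete, the sketch below computes both for a given density. It assumes the 32-byte frame is embedded at the same density as the body; `body_capacity`, `max_sample_delta`, and the `DensityError` enum are illustrative names, with only the DensityOutOfRange variant taken from the text.

```rust
/// Hypothetical error type carrying the rejected value.
#[derive(Debug, PartialEq)]
enum DensityError {
    DensityOutOfRange(u8),
}

/// Body bytes that fit in `samples` carrier samples after the 32-byte frame.
fn body_capacity(samples: u64, density: u8) -> Result<u64, DensityError> {
    if !(1..=4).contains(&density) {
        return Err(DensityError::DensityOutOfRange(density));
    }
    // Each sample carries `density` bits; eight bits make one payload byte.
    let total_bytes = samples * u64::from(density) / 8;
    Ok(total_bytes.saturating_sub(32))
}

/// Largest change to a single sample value when its low `density` bits are
/// overwritten. Expects a density already validated to be in 1..=4.
fn max_sample_delta(density: u8) -> u16 {
    (1u16 << density) - 1
}
```

For a 640×480 24-bit image (921,600 samples if each colour channel is one sample), density 1 leaves 115,168 body bytes with a worst-case change of 1 per sample, while density 4 leaves 460,768 bytes with a worst-case change of 15.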
body_len is the big-endian u32 length of the
payload that follows. It caps how many post-header bytes the extractor reads
and therefore bounds the walk. body_crc32 is a CRC32-IEEE of
the plaintext body for unencrypted payloads — a cheap integrity hint for
detecting walk-miscount or a wrong density. For encrypted payloads
it's zero: the Poly1305 tag inside the AEAD envelope already
authenticates the ciphertext, and a second integrity check on plaintext
would force the decoder to leak timing info about decryption success.
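For reference, CRC32-IEEE can be computed without any dependency using the bitwise form of the algorithm. This is a generic sketch, not necessarily how rsteg computes it; `header_crc` is a hypothetical helper that applies the zero-when-encrypted rule described above.

```rust
/// Bitwise CRC32-IEEE: reflected polynomial 0xEDB88320, initial value and
/// final XOR both 0xFFFFFFFF.
fn crc32_ieee(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical helper: value to store in body_crc32.
fn header_crc(plaintext_body: &[u8], encrypted: bool) -> u32 {
    if encrypted {
        0 // the AEAD tag covers integrity instead
    } else {
        crc32_ieee(plaintext_body)
    }
}
```

The standard check value for this CRC is 0xCBF43926 for the ASCII input "123456789", which is a convenient self-test.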
One byte at offset 15, eight bytes at offset 24..32. All must be zero on
write and are checked on read. They reserve space for forward-compatible
fields — e.g. an explicit seed_hint or a
content_type hint — without rewriting the frame.
The extractor starts by reading 32 bytes' worth of embedding units from the
carrier: 256 bits, which takes ceil(256 / density) samples for a density in
1..=4 (density 3 does not divide 256 evenly, so the count rounds up). For
a 24-bit BMP that's:
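A rough sketch of that arithmetic, assuming each sample is one 8-bit colour channel so that a 24-bit pixel contributes three samples. The function names are illustrative and not from the crate.

```rust
/// Samples needed to carry the 32-byte (256-bit) frame at a given density.
/// Rounds up, since density 3 does not divide 256 evenly.
/// Expects a density already validated to be in 1..=4 (0 would divide by zero).
fn header_samples(density: u32) -> u32 {
    (256 + density - 1) / density
}

/// Pixels touched in a 24-bit BMP, assuming three samples per pixel.
fn header_pixels_bmp24(density: u32) -> u32 {
    (header_samples(density) + 2) / 3
}
```

Under that assumption, densities 1 through 4 need 256, 128, 86, and 64 samples, which is 86, 43, 29, and 22 pixels respectively.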
The first 32 bytes of that bitstream are decoded as a PayloadHeader.
If the magic isn't RSTG, decode() returns
Error::HeaderMissing and the extractor stops immediately. Cost of
being wrong: ~250 bytes read from a potentially multi-MB carrier. The
chance that random data matches the magic is 2⁻³² per trial, so
mistaking a non-rsteg file for an rsteg one is extremely unlikely,
though not impossible.
```rust
pub fn encode(&self) -> [u8; 32] {
    // Start from all zeros so the reserved bytes need no explicit write.
    let mut out = [0u8; 32];
    out[0..4].copy_from_slice(b"RSTG");
    out[4] = self.version;
    out[5] = self.flags;
    out[6..10].copy_from_slice(&self.crypto_fourcc);
    out[10..14].copy_from_slice(&self.scheme_fourcc.0);
    out[14] = self.density;
    // 15 stays zero (reserved)
    out[16..20].copy_from_slice(&self.body_len.to_be_bytes());
    out[20..24].copy_from_slice(&self.body_crc32.to_be_bytes());
    // 24..32 stay zero (reserved)
    out
}
```
No serde, no proc-macros, no framework. Fixed layout, hand-written, trivial to audit. This is what rsteg means by "supply-chain minimization" — every byte of this wire format is in the open and reproducible from the spec.
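For symmetry, the read side could look roughly like the sketch below. It is not the code in header.rs: the field types, the order of the checks, and the `ReservedNonZero` variant are assumptions, and `scheme_fourcc` is kept as a plain array here for brevity. The other error names are the ones mentioned in the text.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    HeaderMissing,
    HeaderBadVersion,
    HeaderReservedBitsSet,
    DensityOutOfRange,
    ReservedNonZero, // hypothetical name; the text does not name this case
}

#[derive(Debug, PartialEq)]
struct PayloadHeader {
    version: u8,
    flags: u8,
    crypto_fourcc: [u8; 4],
    scheme_fourcc: [u8; 4], // the real type appears to be a newtype
    density: u8,
    body_len: u32,
    body_crc32: u32,
}

fn decode(buf: &[u8; 32]) -> Result<PayloadHeader, Error> {
    // Cheapest check first: bail out on anything that is not an rsteg frame.
    if &buf[0..4] != b"RSTG" {
        return Err(Error::HeaderMissing);
    }
    if buf[4] != 1 {
        return Err(Error::HeaderBadVersion);
    }
    // Bits 3..7 of the flags byte must be clear.
    if buf[5] & 0b1111_1000 != 0 {
        return Err(Error::HeaderReservedBitsSet);
    }
    if !(1..=4).contains(&buf[14]) {
        return Err(Error::DensityOutOfRange);
    }
    // Offset 15 and offsets 24..32 are reserved and must be zero.
    if buf[15] != 0 || buf[24..32].iter().any(|&b| b != 0) {
        return Err(Error::ReservedNonZero);
    }
    // Copy four bytes starting at a fixed offset.
    let four = |at: usize| -> [u8; 4] { [buf[at], buf[at + 1], buf[at + 2], buf[at + 3]] };
    Ok(PayloadHeader {
        version: buf[4],
        flags: buf[5],
        crypto_fourcc: four(6),
        scheme_fourcc: four(10),
        density: buf[14],
        body_len: u32::from_be_bytes(four(16)),
        body_crc32: u32::from_be_bytes(four(20)),
    })
}
```

Like encode(), this needs nothing beyond the standard library, and every rejection maps to exactly one fixed offset in the frame.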