A Vault File That Fails Closed: Encrypt-then-MAC on an MCU
How a hardware password manager authenticates every vault file before it decrypts: encrypt-then-MAC, verify-before-decrypt, and fail-closed reads.

A password vault has one job at rest: hand back exactly the bytes you stored, or refuse. Not “probably,” not “best effort” — a vault that returns almost your password, or that behaves differently when a file has been tampered with, is worse than no vault at all. On a microcontroller, where the encrypted files live on a flash chip an attacker can desolder and read, “refuse” has to be the default for everything that isn’t provably authentic.
This post is about the file format I use in a hardware password manager I’m building (not yet released) — an ESP32-class device that stores credentials on its internal flash. The whole design rests on one rule that is surprisingly easy to get wrong: never decrypt a byte you haven’t authenticated first. Get the order wrong and you’ve built a padding oracle. Get it right and the vault fails closed.
TL;DR — What this format does
- Every vault file is an encrypt-then-MAC envelope: [ver][iv][hmacTag][cipher]. The tag authenticates a context prefix (ver ‖ recordType ‖ slotId ‖ generation) ahead of the IV and ciphertext.
- On read, the tag is recomputed and compared in constant time before any decryption: verify-before-decrypt.
- Binding slot, type, and a freshness counter makes a relocated, retyped, or rolled-back file fail the MAC — even against an attacker with raw flash access but no PIN.
- The freshness counters live in meta.bin, which is itself authenticated; mutations are crash-safe (stage → commit → promote).
- Encryption and authentication use separate keys; every reject path produces no plaintext and wipes its scratch.
Quick glossary (if crypto isn't your background)
- Plaintext · ciphertext: the readable data (your password) and its scrambled, unreadable form once encrypted.
- Encrypt · key: scramble data with a key (a secret) so only someone holding the key can reverse it. Here the key comes from your PIN.
- Block cipher (AES) · padding: AES encrypts in fixed 16-byte chunks; padding is the filler bytes that top up the last chunk.
- MAC · HMAC · tag: a tamper-evident seal. HMAC uses a key to compute a tag (a short fingerprint); change a single byte and the tag stops matching.
- IV (initialization vector): random bytes, a fresh one per write, so encrypting the same data twice gives different results.
- KDF · PBKDF2: a function that “stretches” a weak secret (a PIN) into a key, deliberately slow to frustrate guessers.
- Offline brute force: trying every combination on your own hardware, with a stolen copy of the data and no attempt limit.
- Slot: a numbered place where the vault stores one credential.
Decrypt-first is a footgun
First, the threat. This is a device you can hold in your hand, and the credentials live on a flash chip soldered to the board. An attacker who steals it isn’t limited to typing PINs at the screen: they can desolder the flash and read it on a bench programmer, or dump it over a debug port, and walk away with the raw ciphertext. From then on the attack is offline — no lockout, no rate limit, all the time in the world. The threat model for the firmware treats raw flash readout as a real possibility (mitigated in production by the ESP32’s own flash encryption), so the file format has to assume the attacker is holding the exact bytes it wrote and is free to modify them and feed them back.
That’s what makes decrypt-first dangerous. The intuitive design is the wrong one: you have ciphertext, you want plaintext, so you decrypt — and only then, maybe, you check whether the result “looks right.” To see why that order is fatal, you have to look at what happens inside.
Start with the basics: AES is a block cipher — it only knows how to work on fixed-size chunks, 16 bytes each. Since your data almost never lands on an exact multiple of 16, the last chunk is topped up with filler bytes — the padding — until it fills the block. PKCS#7 does this by a simple rule: if 5 bytes are missing, append five 0x05 bytes; if 11 are missing, eleven 0x0B. When decrypting, the very last thing the algorithm does is look at that padding and check it’s well-formed.
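The padding rule is small enough to write out. Here is a minimal sketch of PKCS#7 pad and unpad; the function names are made up for illustration and this is not the vault's code. Note that this plain unpad branches on the padding bytes, which is fine for showing the rule but is exactly the kind of check that must never run on unauthenticated input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlock = 16;  // AES block size in bytes

// Append PKCS#7 padding: n bytes, each with value n, where 1 <= n <= 16.
// Input that is already block-aligned gains a full extra block of 0x10.
std::vector<uint8_t> pkcs7Pad(std::vector<uint8_t> data) {
    const uint8_t n = static_cast<uint8_t>(kBlock - (data.size() % kBlock));
    data.insert(data.end(), n, n);
    return data;
}

// Strip PKCS#7 padding in place. Returns false (and leaves data untouched)
// if the length or the padding bytes are malformed.
bool pkcs7Unpad(std::vector<uint8_t>& data) {
    if (data.empty() || data.size() % kBlock != 0) return false;
    const uint8_t n = data.back();          // last byte claims the pad length
    if (n == 0 || n > kBlock) return false;
    for (std::size_t i = 0; i < n; ++i)
        if (data[data.size() - 1 - i] != n) return false;
    data.resize(data.size() - n);
    return true;
}
```

An 11-byte input is 5 bytes short of a block, so it gains five 0x05 bytes; a 16-byte input gains a whole block of 0x10, which is why the unpad step always has something to inspect.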
Now the part that makes it dangerous. AES in CBC mode chains the blocks: each ciphertext block is mixed (with an XOR) into the next one’s decryption. Think of a row of dominoes — nudge one and you move its neighbor. That hands the attacker a lever: tamper with one ciphertext block and you control, byte by byte, what comes out of the block after it — including that final padding the decryptor is about to inspect.
And here’s the trap: whether the padding is valid is something the system gives away. If the attacker can tell a “bad padding” failure from any other — a different error message, a log line, or simply a reply that takes a hair longer — they’ve got an oracle: a box they can ask one yes/no question per try, and it always answers truthfully. It’s like a chess opponent who, without meaning to, winces whenever your move helps them: that tell, repeated, is enough to reconstruct the game. With that single “is the padding valid?” bit, asked enough times, they peel the plaintext apart one byte at a time, without ever holding the key:
| Step | Kind | Action |
|---|---|---|
| 1 | attacker action | Steal the ciphertext block they want to read (the target block) and the block right before it. |
| 2 | repeated step | Tamper with one byte of the preceding block and resubmit the modified ciphertext for decryption. |
| 3 | information leaked | A distinguishable 'bad padding' vs. 'other error' response reveals one intermediate byte — and from it, one plaintext byte. |
| 4 | attacker action | Repeat across the 256 possible values, then all 16 positions, then every block, until the whole file is recovered. |
| 5 | blocked by design | Encrypt-then-MAC: the MAC check fails first, so the decrypt + unpad path is never reached. The oracle never gets to answer. |
Each query yields one bit — 'padding valid?' — and that bit is enough to recover a byte, then a block, then the file. Encrypt-then-MAC removes the oracle by never reaching the decrypt step for a tampered file.
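To make the steps concrete, here is a hedged, self-contained sketch of the attack against a single block. Everything in it is illustrative: `toyEncryptBlock` and `toyDecryptBlock` are a deliberately insecure stand-in for AES (the attack treats the cipher as a black box, so the stand-in changes nothing about the logic), and `PaddingOracle` models a decryptor that reveals only whether the padding was valid.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Block = std::array<uint8_t, 16>;

// TOY invertible block transform standing in for AES. NOT secure.
Block toyEncryptBlock(const Block& key, const Block& in) {
    Block out{};
    for (int i = 0; i < 16; ++i)
        out[i] = static_cast<uint8_t>((in[i] ^ key[i]) + key[(i + 5) % 16]);
    return out;
}
Block toyDecryptBlock(const Block& key, const Block& in) {
    Block out{};
    for (int i = 0; i < 16; ++i)
        out[i] = static_cast<uint8_t>(static_cast<uint8_t>(in[i] - key[(i + 5) % 16]) ^ key[i]);
    return out;
}

// The victim: holds the key and answers one question per query,
// "does D(target) XOR prev end in valid PKCS#7 padding?"
struct PaddingOracle {
    Block key;
    mutable unsigned queries = 0;
    bool operator()(const Block& prev, const Block& target) const {
        ++queries;
        Block p = toyDecryptBlock(key, target);
        for (int i = 0; i < 16; ++i) p[i] ^= prev[i];
        const uint8_t n = p[15];
        if (n == 0 || n > 16) return false;
        for (int i = 0; i < n; ++i)
            if (p[15 - i] != n) return false;
        return true;
    }
};

// The attacker: recovers the plaintext block from yes/no answers alone.
Block recoverBlock(const PaddingOracle& oracle, const Block& prev, const Block& target) {
    Block inter{};  // D(target), learned one byte at a time from the right
    for (int pos = 15; pos >= 0; --pos) {
        const uint8_t pad = static_cast<uint8_t>(16 - pos);
        Block forged{};
        for (int j = pos + 1; j < 16; ++j)   // force known bytes to decrypt to 'pad'
            forged[j] = static_cast<uint8_t>(inter[j] ^ pad);
        for (int g = 0; g < 256; ++g) {
            forged[pos] = static_cast<uint8_t>(g);
            if (!oracle(forged, target)) continue;
            if (pos == 15) {
                // Rule out an accidental longer pad such as "02 02".
                Block probe = forged;
                probe[14] ^= 0x01;
                if (!oracle(probe, target)) continue;
            }
            inter[pos] = static_cast<uint8_t>(g ^ pad);
            break;
        }
    }
    Block plain{};
    for (int i = 0; i < 16; ++i) plain[i] = static_cast<uint8_t>(inter[i] ^ prev[i]);
    return plain;
}
```

The attacker code never touches `key`; it only calls the oracle, at most 256 times per byte plus a handful of disambiguation probes for the last byte. Put a MAC check in front of that oracle and every forged block is rejected before the padding is ever inspected.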
That is the padding oracle attack, first described by Serge Vaudenay in 2002, and it has been breaking real protocols for two decades: Lucky Thirteen against TLS in 2013, POODLE against SSL 3.0 in 2014.
Moxie Marlinspike distilled the lesson into The Cryptographic Doom Principle: “if you have to perform any cryptographic operation before verifying the MAC on a message you’ve received, it will somehow inevitably lead to doom.” Decryption is a cryptographic operation. So the MAC check has to come first.
Three ways to combine a cipher and a MAC
Encrypting solves confidentiality — nobody can read the secret — but not integrity — knowing nobody tampered with it along the way. That’s the MAC’s job. Think of it as the wax seal on an old letter: the sender stamps the seal with a signet only they hold (the key), and the recipient checks it’s still intact before opening; if someone intercepted and rewrote the letter, the seal no longer matches and it’s discarded unread. HMAC is that seal in cryptographic form: with a key it produces a short tag from the message, and changing a single byte invalidates it.
Authenticated encryption therefore needs both pieces — a cipher (for confidentiality) and a MAC (for integrity) — but there are exactly three ways to bolt them together, and they are not equally safe. The canonical analysis is Bellare and Namprempre’s Authenticated Encryption: Relations among Notions and Analysis of the Generic Composition Paradigm (J. Cryptology, 2008), reinforced for secure channels by Hugo Krawczyk’s The Order of Encryption and Authentication for Protecting Communications.
| Composition | What it does | Notably used by | Verdict |
|---|---|---|---|
| Encrypt-and-MAC (E&M) | MAC the plaintext, encrypt the plaintext, send both | SSH | Not generically secure — the MAC can leak plaintext, and you must decrypt to verify |
| MAC-then-Encrypt (MtE) | MAC the plaintext, then encrypt plaintext+MAC together | TLS (CBC ciphersuites) | You must decrypt before you can check — the doom path; source of Lucky Thirteen |
| Encrypt-then-MAC (EtM) | Encrypt the plaintext, then MAC the ciphertext | IPsec | Generically secure. The MAC gate-keeps the ciphertext, so you verify without decrypting |
Encrypt-then-MAC is the only one where you can authenticate the message without touching the cipher — the tag is computed over the ciphertext, so checking it requires no decryption. That property is exactly what kills the padding oracle: a tampered file never reaches the decrypt path. So that is what the vault uses.
Put the two orderings side by side and the doom is obvious. The composition isn’t an academic taxonomy — it’s a decision about which step runs first when an attacker hands you bytes:
| Path | Step | Action | Status |
|---|---|---|---|
| MAC-then-Encrypt (the doom path) | 1 | Receive a possibly-tampered file | step |
| MAC-then-Encrypt (the doom path) | 2 | Decrypt it — touches attacker bytes, runs PKCS#7 unpad | unsafe — attacker bytes touched before the check |
| MAC-then-Encrypt (the doom path) | 3 | Only now check the MAC — already too late | unsafe — attacker bytes touched before the check |
| Encrypt-then-MAC (the vault) | 1 | Receive a possibly-tampered file | step |
| Encrypt-then-MAC (the vault) | 2 | Recompute & check the MAC over the ciphertext first | safe — verified before any decryption |
| Encrypt-then-MAC (the vault) | 3 | Decrypt only if the tag verified | safe — verified before any decryption |
MAC-then-Encrypt forces a decryption on attacker-controlled input before the integrity check; Encrypt-then-MAC gate-keeps with the tag, so a forged file is rejected before the cipher ever runs.
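The ordering can be stated as a testable property: on a forged input, the decrypt routine is never called. The sketch below is an illustration only. `Primitives` and `openEncryptThenMac` are hypothetical names, and the two hooks stand in for the real MAC check and cipher so the example stays self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical hooks standing in for the real primitives.
struct Primitives {
    std::function<bool(const Bytes& cipher, const Bytes& tag)> tagMatches;
    std::function<bool(const Bytes& cipher, Bytes& plainOut)> decrypt;
};

// Encrypt-then-MAC read: the tag gate runs first, and the cipher only ever
// sees input that already verified. Any failure leaves plainOut empty.
bool openEncryptThenMac(const Primitives& p, const Bytes& cipher,
                        const Bytes& tag, Bytes& plainOut) {
    plainOut.clear();                              // fail closed by default
    if (!p.tagMatches(cipher, tag)) return false;  // forged input stops here
    if (!p.decrypt(cipher, plainOut)) {
        plainOut.clear();
        return false;
    }
    return true;
}
```

A unit test can wire `decrypt` to a counter and assert that it stays at zero for a tampered ciphertext, which turns "verify before decrypt" from a convention into something a regression test enforces.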
The envelope: [ver][iv][hmacTag][cipher]
Every encrypted file in the vault — each credential, each TOTP secret, the listing index — is one self-contained envelope. Picture a postal envelope: the outside carries, in the clear, just enough to handle and verify it (the version, the IV, and the tag that acts as the seal), and the encrypted letter rides inside. They all share the same fixed prefix:
| Field | Offset | Size | Contents |
|---|---|---|---|
| ver | @0 | 1 B | format version |
| IV | @1 | 16 B | random per write |
| hmacTag | @17 | 32 B | HMAC-SHA256 |
| cipher | @49+ | … | AES-256-CBC |
The HMAC tag is stored inline, between the IV and the ciphertext — no sidecar files.
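As a rough sketch of what that layout means in code, the helper below splits a raw file into its four fields using the offsets listed above. The names (`EnvelopeView`, `splitEnvelope`, the constants) are invented for this example, and the minimum size of one cipher block is my assumption, based on PKCS#7 always adding at least one byte of padding. These are structural checks only; nothing the function returns has been authenticated yet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Offsets and sizes from the envelope layout: [ver][iv][hmacTag][cipher].
constexpr std::size_t kVerOffset = 0;
constexpr std::size_t kIvOffset = 1;
constexpr std::size_t kTagOffset = 17;
constexpr std::size_t kCipherOffset = 49;
constexpr std::size_t kAesBlock = 16;

// Non-owning view of one envelope (hypothetical helper type).
struct EnvelopeView {
    uint8_t ver = 0;
    const uint8_t* iv = nullptr;      // 16 bytes
    const uint8_t* tag = nullptr;     // 32 bytes
    const uint8_t* cipher = nullptr;  // cipherLen bytes
    std::size_t cipherLen = 0;
};

// Split a raw file into fields. On failure the view is left zeroed.
bool splitEnvelope(const uint8_t* file, std::size_t fileSize, EnvelopeView& out) {
    out = EnvelopeView{};
    // Assumption: a valid file holds at least one full cipher block.
    if (file == nullptr || fileSize < kCipherOffset + kAesBlock) return false;
    const std::size_t cipherLen = fileSize - kCipherOffset;
    if (cipherLen % kAesBlock != 0) return false;  // CBC needs whole blocks
    out.ver = file[kVerOffset];
    out.iv = file + kIvOffset;
    out.tag = file + kTagOffset;
    out.cipher = file + kCipherOffset;
    out.cipherLen = cipherLen;
    return true;
}
```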
The on-disk layout above is the whole file — but the tag authenticates a little more than the file holds. Ahead of the IV and ciphertext, the firmware prepends a 7-byte authenticated context prefix and HMACs the lot (over SHA-256) with the file’s MAC sub-key:
// authPrefix = ver(1) ‖ recordType(1) ‖ slotId(1) ‖ generation(4, LE)
// authMsg = authPrefix ‖ iv ‖ cipher
std::array<uint8_t, kEnvelopeContextSize + kIvSize + MaxCipher> message{};
size_t n = writeEnvelopeContext(message.data(), type, slot, generation); // 7 B
std::memcpy(message.data() + n, iv, kIvSize); n += kIvSize;
std::memcpy(message.data() + n, cipher, cipherLen); n += cipherLen;
VaultCrypto::hmacSha256(macKey, macKeyLen, message.data(), n, tagOut.data());
VaultCrypto::secureWipe(message.data(), message.size());

That context prefix is never written to the file: the on-disk version byte stays 0x01 and the layout doesn’t change a single byte. recordType and slotId come from where the file lives (which path, which slot); generation comes from the authenticated meta.bin. So the tag proves that this exact ciphertext, in this exact slot, of this exact type, at this exact revision belongs here, untouched, and not merely that someone who knew the key produced some ciphertext. Why those three extra fields earn their keep is its own section; first, the basics of writing and reading one.
The IV (initialization vector) is the 16 random bytes that seed CBC’s chaining for the first block; with a fixed or reused IV, two records whose plaintext starts with the same block would encrypt to the same first ciphertext block, leaking equality. The vault draws a fresh IV from the ESP32 hardware RNG on every write, so saving the same password twice produces two completely different ciphertexts, and an attacker reading flash can’t even tell that two slots hold the same value. The IV isn’t secret (it’s stored in the clear, right there in the envelope), but it must be unpredictable and unique per write, and authenticating it under the tag stops anyone from quietly substituting one.
Writing a record: encrypt, then MAC
The write path is encrypt-then-MAC in the literal order of its name. The plaintext is first serialized to a packed binary record (more on that later) and encrypted with AES-256-CBC under the encryption sub-key; then the tag is computed over the context prefix, IV, and ciphertext under the MAC sub-key; finally the four parts are written out:
// Encrypt-then-MAC: encrypt the record, then authenticate the ciphertext.
VaultCrypto::encrypt(keys.enc.data(), plainBuf.data(), plainLen,
iv.data(), cipherBuf.data(), &cipherLen);
computeTag(keys, id, generation, iv.data(), cipherBuf.data(), cipherLen, tag);
// Persist [ver][iv][hmacTag][cipher].
const uint8_t ver = kCredFormatVersion;
f.write(&ver, kCredVerSize);
f.write(iv.data(), kIvSize);
f.write(tag.data(), kCredHmacTagSize);
f.write(cipherBuf.data(), cipherLen);
// Every transient buffer is wiped before this function returns.
VaultCrypto::secureWipe(cipherBuf.data(), cipherBuf.size());
VaultCrypto::secureWipe(iv.data(), iv.size());
VaultCrypto::secureWipe(tag.data(), tag.size());

Note the wiping. The plaintext, the keys, and the scratch buffers all hold secret material, and on a device that can be powered down and probed, leaving them in RAM is a liability. Every buffer is cleared with a zeroize primitive that the compiler is not allowed to optimize away: mbedtls_platform_zeroize on the device, which exists precisely because a plain memset can be elided as a “dead store.”
Reading a record: verify, then decrypt
This is the section the whole format exists for. When the vault loads a credential, it does five things in a strict order, and any failure at any step returns nothing:
// Fail-closed: wipe the caller's record up front, so EVERY reject path
// (bad size, version, MAC, decrypt, or decode) leaves no residue behind.
out.wipe();
// 1. Structural checks: plausible size, ciphertext is a whole number of blocks.
if (fileSize < kCredMinFileSize || fileSize > kCredMaxFileSize) return false;
const size_t cipherLen = fileSize - kCredHeaderSize;
if ((cipherLen % kAesBlockSize) != 0) return false;
// 2. Read [ver][iv][storedTag][cipher]; reject an unknown format version.
// ... (reads omitted) ...
if (ver != kCredFormatVersion) return false;
// 3. Verify-before-decrypt: recompute the tag and compare in CONSTANT TIME.
computeTag(keys, id, generation, iv.data(), cipherBuf.data(), cipherLen, expectedTag);
if (!VaultCrypto::constantTimeEqual(expectedTag.data(), storedTag.data(),
kCredHmacTagSize)) {
// MAC mismatch → tampered or corrupt. Wipe scratch, produce no plaintext.
return false;
}
// 4. Tag verified — only now is it safe to decrypt.
VaultCrypto::decrypt(keys.enc.data(), iv.data(), cipherBuf.data(),
cipherLen, plainBuf.data(), &plainLen);
// 5. Decode the packed plaintext with a strict, fail-closed codec.
return decodeCredential(plainBuf.data(), plainLen, out);

Two details make this safe rather than merely sequential.
First, the constant-time compare. Imagine a combination lock that gave a little click, and took a hair longer to answer, each time you got a digit right: guessing it blind would stop being hopeless, because the lock itself keeps whispering “warmer, warmer.” An ordinary byte comparison leaks exactly that hint. “Constant-time” here means something concrete: the comparison does the same amount of work regardless of the data, so its duration tells an observer nothing about the secret. A naïve memcmp does the opposite: it returns the instant it finds the first differing byte. Feed it a guessed tag and the time it takes to say “no” reveals how many leading bytes you got right: a tag matching the first 3 bytes returns measurably later than one that’s wrong at byte 0. An attacker who can measure that turns tag forgery into a byte-at-a-time search, ~256 tries per byte (about 8,192 for a 32-byte tag) instead of 2^256 for the whole tag. The vault never uses memcmp on secret-dependent data; it uses a branch-free comparison that ORs every byte’s XOR difference into one value and only checks that value at the very end, so it performs the same number of operations whether the tags first differ at byte 0 or byte 31:
// The vault calls Mbed TLS's mbedtls_ct_memcmp; this is the idea it implements:
bool constantTimeEqual(const uint8_t* a, const uint8_t* b, size_t len) {
volatile uint8_t diff = 0; // 'volatile' stops the optimizer
for (size_t i = 0; i < len; i++) // always touch EVERY byte
diff |= a[i] ^ b[i];
return diff == 0; // one check, at the very end
}

The vault doesn’t hand-write that loop; it calls Mbed TLS’s mbedtls_ct_memcmp, the library’s constant-time comparison primitive, which does exactly this. If you’ve never seen why this matters, Coda Hale’s A Lesson In Timing Attacks is the classic five-minute explanation; BearSSL’s constant-time notes go deeper.
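Timing is awkward to demonstrate reliably, so the sketch below counts work instead of measuring it. Both functions are instrumented illustrations with invented names, not library code: one models an early-exit compare and reports how many bytes it examined before answering, the other models the accumulating compare. A real memcmp may compare wider words at a time, but the data-dependent exit is the same.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Early-exit compare, instrumented: returns how many bytes it examined
// before it could answer. The run time of such a loop tracks this count.
std::size_t earlyExitBytesExamined(const uint8_t* a, const uint8_t* b, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i)
        if (a[i] != b[i]) return i + 1;  // stops at the first mismatch
    return len;
}

// Accumulating compare, instrumented the same way: it always examines
// every byte and decides only once, after the loop.
std::size_t accumulatingBytesExamined(const uint8_t* a, const uint8_t* b,
                                      std::size_t len, bool& equal) {
    uint8_t diff = 0;
    std::size_t examined = 0;
    for (std::size_t i = 0; i < len; ++i) {
        diff |= static_cast<uint8_t>(a[i] ^ b[i]);
        ++examined;
    }
    equal = (diff == 0);
    return examined;
}

// Worst-case guesses for a byte-at-a-time search of a tag this long.
constexpr unsigned long byteAtATimeGuesses(unsigned long tagBytes) {
    return tagBytes * 256UL;
}
```

For a 32-byte tag, a guess that is wrong at byte 0 costs the early-exit compare one step and a guess that is wrong at byte 3 costs four; the accumulating compare costs 32 steps either way, so there is nothing for the attacker to measure.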
Second, fail-closed. The record is wiped before anything is read, so there is no code path — not a bad size, not a bad version, not a MAC mismatch, not a decrypt failure, not a malformed record — that can leave a partially-populated credential in the caller’s buffer. The function either returns true with a complete record or false with nothing.
Every reject path produces no plaintext — the read returns a complete credential or nothing at all.
| Check | On pass, continue to | On failure |
|---|---|---|
| Size & alignment | Known version | Fail closed |
| Known version | MAC tag matches | Fail closed |
| MAC tag matches | Decrypts | Fail closed |
| Decrypts | Record decodes | Fail closed |
| Record decodes | Return credential | Fail closed |
What the tag really binds: slot, type, and freshness
A tag over ver ‖ iv ‖ cipher proves the ciphertext is genuine — but it says nothing about three things the format also needs to promise: which slot a file belongs to, what kind of record it is, and how recent it is. An attacker with raw flash write (a programmer or the debug port) but no PIN can turn each silence into an attack — and each maps to something mundane: dropping a letter in the wrong mailbox (wrong slot), relabeling a box so it passes for another kind (wrong type), or sliding last month’s receipt back in as if it were today’s (a stale revision):
| Attack (raw flash write, no PIN) | Why it worked |
|---|---|
| Cross-slot substitution: copy cred_05.bin over cred_03.bin | Both sealed with the same key, so the relocated file verified and served slot 5’s secret as slot 3 (one site’s password typed into another’s form) |
| Cross-type confusion: drop a credential file onto a TOTP path | Same key and version byte; only the record decoder’s shape check stood in the way, not an authenticated discriminator |
| Selective rollback: restore an older, authentic file for one slot | Nothing recorded how fresh a file should be, so a stale-but-genuine envelope (a password you just rotated away) still verified |
The fix binds the missing context into the tag itself. This is exactly the role of associated data in an AEAD scheme — context that must be authenticated but not encrypted — except here it’s folded directly into the encrypt-then-MAC tag rather than passed to a single-pass cipher. The 7-byte prefix is centralized in one place, so every record module — and the re-key path — computes it bit-identically:
// ver(1) ‖ recordType(1) ‖ slotId(1) ‖ generation(4, little-endian)
inline size_t writeEnvelopeContext(uint8_t* out, EnvelopeRecordType type,
uint8_t slot, uint32_t generation) {
out[0] = kEnvelopeVersion; // 0x01 — unchanged on disk
out[1] = static_cast<uint8_t>(type); // Credential / Totp / Index
out[2] = slot; // the slot this file belongs to
out[3] = generation & 0xFF; // per-slot freshness counter,
out[4] = (generation >> 8) & 0xFF; // little-endian
out[5] = (generation >> 16) & 0xFF;
out[6] = (generation >> 24) & 0xFF;
return kEnvelopeContextSize; // 7
}

So a relocated file carries the wrong slotId, a retyped file the wrong recordType, and a rolled-back file an old generation. In each case the tag the firmware recomputes no longer matches the one on disk, so the read fails closed before a single byte is decrypted. It is the same verify-before-decrypt gate, guarding three more promises:
| Bound field | Cross-slot move | Cross-type swap | Selective rollback |
|---|---|---|---|
| recordType Credential / Totp / Index | — | Closes | — |
| slotId which slot the file is | Closes | — | — |
| generation per-slot freshness | — | — | Closes |
recordType and slotId close type-confusion and relocation outright; generation closes selective rollback — one record reverted while the rest of the vault moves on.
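To see the binding at work without pulling in an HMAC implementation, it is enough to look at the message the MAC is computed over: if the message differs, the tag differs (up to a negligible collision chance). The sketch below rebuilds that message in a self-contained form. `buildAuthMessage` is a name made up for this example, and the numeric enumerator values are assumptions, since the post names the record types but not their encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Enumerator values are assumed for illustration only.
enum class RecordType : uint8_t { Credential = 1, Totp = 2, Index = 3 };

// authMsg = ver ‖ recordType ‖ slotId ‖ generation (little-endian) ‖ iv ‖ cipher
std::vector<uint8_t> buildAuthMessage(RecordType type, uint8_t slot, uint32_t generation,
                                      const std::vector<uint8_t>& iv,
                                      const std::vector<uint8_t>& cipher) {
    std::vector<uint8_t> msg = {
        0x01,                                             // envelope version
        static_cast<uint8_t>(type),
        slot,
        static_cast<uint8_t>(generation & 0xFF),
        static_cast<uint8_t>((generation >> 8) & 0xFF),
        static_cast<uint8_t>((generation >> 16) & 0xFF),
        static_cast<uint8_t>((generation >> 24) & 0xFF),
    };
    msg.insert(msg.end(), iv.begin(), iv.end());
    msg.insert(msg.end(), cipher.begin(), cipher.end());
    return msg;
}
```

Take one IV and ciphertext sealed as (Credential, slot 5, generation 9). Reading the same bytes from slot 3, from a TOTP path, or at generation 10 each produces a different message on the verifier's side, so the stored tag cannot match.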
The root of trust: authenticating meta.bin
Every check so far has leaned on something already being trustworthy — the keys, the freshness counters. That trust has to bottom out somewhere: in a root of trust, the one piece you don’t get to question, because if it were forged everything built on top would inherit the lie. Here that piece is meta.bin.
The generation counters have to live somewhere, and that somewhere has to be tamper-evident — otherwise an attacker would simply lower a counter to make a stale file look current. They live in meta.bin: the small header the device reads first, holding the salts, the PIN verifier, and a per-slot generation table — all sealed under its own HMAC tag.
| Field | Offset | Size | Contents |
|---|---|---|---|
| magic | @0 | 2 B | "KV" |
| ver | @2 | 1 B | 0x02 |
| kdfSalt | @3 | 16 B | PBKDF2 salt |
| pinVerifier | @19 | 32 B | HMAC |
| hmacSalt | @51 | 16 B | salt for the MAC sub-key |
| genTable | @67+ | … | u32 / slot |
| metaTag | @~95 | 32 B | HMAC, last |
Unlike index.bin, meta.bin is never rebuilt from the record files — it is the root of trust. The trailing tag authenticates every preceding byte, including the generation table.
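A sketch of the first, purely structural pass over that header might look like the following. The helper names are hypothetical, and two things are assumptions I am making for the example: the number of generation-table entries is passed in by the caller because the post does not state it, and the table entries are read as little-endian to match the context prefix. Nothing read this way is trustworthy until the trailing tag has verified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Offsets from the meta.bin layout.
constexpr std::size_t kVerOffset = 2;
constexpr std::size_t kGenTableOffset = 67;
constexpr std::size_t kMetaTagSize = 32;
constexpr uint8_t kMetaVersion = 0x02;

// Total file size for a table of entryCount u32 counters.
// entryCount is an ASSUMED parameter; the post does not give the real count.
constexpr std::size_t metaFileSize(std::size_t entryCount) {
    return kGenTableOffset + 4 * entryCount + kMetaTagSize;
}

// Shape check only: magic, version, exact length. No authentication here.
bool metaShapeOk(const uint8_t* file, std::size_t size, std::size_t entryCount) {
    if (file == nullptr || size != metaFileSize(entryCount)) return false;
    if (file[0] != 'K' || file[1] != 'V') return false;
    return file[kVerOffset] == kMetaVersion;
}

// Read one counter (little-endian ASSUMED). Call only after the shape check
// has passed and the meta tag has been verified.
uint32_t readGeneration(const uint8_t* file, std::size_t entry) {
    const uint8_t* p = file + kGenTableOffset + 4 * entry;
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}
```

With seven entries the file comes to 127 bytes and the tag starts at offset 95, which is consistent with the approximate offset in the layout; that count is a guess used for the arithmetic, not a documented value.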
Because the metadata is authenticated, the unlock sequence becomes a careful ladder: nothing in the file is trusted until the tag checks out, and a wrong PIN derives a different master key — so the metadata tag and the PIN verifier both miss at once.
1. Parse meta.bin: magic · ver · length
2. PBKDF2: PIN + kdfSalt
3. Derive macKey: master key + hmacSalt
4. Verify metaTag: constant-time
5. Check pinVerifier: the unlock gate
6. Unlocked: generation table trusted
A wrong PIN derives a different master key, so the meta tag and the PIN verifier both mismatch — unlock fails closed before the generation table or salts are ever trusted.
An old or format-mismatched meta.bin — say a formatVer 0x01 from an earlier version — simply fails to authenticate, and the vault is re-provisioned. There is no migration and no dual-read: one on-flash format, ever. (The device isn’t in production, so there’s no real data to migrate.)
Crash-safe anti-rollback
Selective rollback is a replay attack — the attacker re-presents an authentic-but-stale file — and the standard defense is a freshness value: a monotonic counter the verifier expects to only ever increase, so a replayed older record fails the check. But a freshness counter is only as good as its update story. Every mutation — add, edit, or delete — bumps the slot’s generation, and the record is rewritten under the new value. That creates a torn-write hazard: lose power between writing the record (under generation N+1) and writing meta.bin (which records N+1), and the two disagree. Resolve that wrong and you either brick a valid slot or quietly re-open the rollback you just closed.
The fix is to pick one single, indivisible moment that “counts” — like the signature on a contract: before the pen lifts nothing is binding, after it everything is. The firmware makes the meta.bin swap that moment — written to a temp file, then atomically renamed into place, so a reader always sees the old meta.bin or the new one, never a torn half-write — and recovers deterministically around it:
1. STAGE: record → staging, at generation N+1
2. COMMIT: meta.bin rename
3. PROMOTE: staged → canonical
4. CLEANUP: remove marker
The meta.bin rename is the one linearization point. On reboot, recovery recomputes the staged record's tag at the current generation: it either verifies (finish the promote) or it doesn't (discard the orphan, keep the old value).
On the next boot, recoverPendingMutation recomputes the staged record’s tag at the current meta.gen[type][slot]. If it verifies, the commit happened (the crash landed after the rename) → finish the promote. If it doesn’t, the commit never happened → discard the orphan and keep the canonical record under the old generation. No window bricks a slot or silently re-opens rollback, and each crash window has its own native test. Because a delete bumps the generation too, a deleted entry can’t be resurrected by replaying its old file.
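The recovery decision itself is small. Here is a minimal sketch of it, separated from all file I/O so it can be tested on the host; `decideRecovery`, `RecoveryAction`, and the `stagedTagVerifiesAt` hook are illustrative names of mine, not the firmware's, and the hook stands in for "recompute the staged record's tag with this generation and compare in constant time."

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

enum class RecoveryAction { NothingPending, FinishPromote, DiscardOrphan };

// Decide what to do with a leftover staged record after a reboot.
// committedGeneration is the value read from the authenticated meta.bin.
RecoveryAction decideRecovery(bool stagedFileExists, uint32_t committedGeneration,
                              const std::function<bool(uint32_t)>& stagedTagVerifiesAt) {
    if (!stagedFileExists) return RecoveryAction::NothingPending;
    // The staged record was sealed at N+1. It verifies at the committed
    // generation only if the meta.bin rename (the commit) already happened.
    return stagedTagVerifiesAt(committedGeneration) ? RecoveryAction::FinishPromote
                                                    : RecoveryAction::DiscardOrphan;
}
```

Suppose the slot was at generation 7 and the staged record was sealed at 8. A crash before the rename leaves meta.bin at 7, the staged tag fails at 7, and the orphan is discarded. A crash after the rename leaves meta.bin at 8, the tag verifies, and the promote is finished. No third outcome exists, which is the point of having a single commit moment.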
The honest boundary: open vs _secure
Binding generation closes selective rollback — reverting one record while the rest of the vault moves on. It does not, on the open (developer-flashable) builds, close a full snapshot rollback: revert the entire vault — meta.bin and every record together — to an earlier, internally consistent state, and the counters revert with it, so nothing looks out of place. Catching that needs anti-rollback state living outside the rewritable flash, which is exactly what the hardened _secure builds add.
| Rollback variant | Open builds | _secure builds |
|---|---|---|
| Selective: one record reverted, meta.bin moves on | Closed: fails the generation-bound MAC | Closed |
| Full snapshot: meta.bin + every record reverted together | Not closed: the whole set is self-consistent | Closed: Secure Boot + Flash Encryption + NVS anti-rollback |
I’d rather state that boundary plainly than oversell it. The generation counter is defense-in-depth for the non-secure builds against the far more practical selective attack; the whole-vault time-warp is a hardware problem, solved in hardware on the builds that opt into burning eFuses.
Two keys from one master: key separation
The envelope uses two different keys — one to encrypt, one to MAC — and that is deliberate (the same instinct that says your house key, car key, and mailbox key should be three different keys). The security argument for encrypt-then-MAC assumes the encryption and authentication keys are independent; reusing one key for both jobs voids the guarantee and invites cross-protocol mischief. So the vault derives separate sub-keys from a single master key, each tagged with a distinct domain-separation label, following the long-standing key-separation guidance — NIST SP 800-57 Part 1 puts it plainly: “a single key should be used for only one purpose”:
// One PBKDF2-derived master key fans out into three single-purpose sub-keys.
// encKey = HMAC(masterKey, "vault-enc") -> AES-256-CBC key
// macKey = HMAC(masterKey, "vault-mac" || hmacSalt) -> file HMAC key
// pinVerifier = HMAC(masterKey, "vault-pin") -> unlock check
VaultCrypto::hmacSha256(masterKey.data(), masterKey.size(),
kEncKeyLabel, kEncLabelLen, out.enc.data());
deriveMacKey(masterKey, hmacSalt, out.mac); // label || per-vault hmacSalt

One master key, three single-purpose sub-keys, each derived with a distinct HMAC label so no key ever does two jobs.
The third sub-key, the PIN verifier, is how the device checks the PIN without storing it: unlock derives the master key from the entered PIN, computes the verifier, and compares it — again in constant time — against the stored value. A wrong PIN simply produces a different verifier, with no hint about how wrong it was. (The verifier sits in the meta.bin header alongside the salts; the PIN itself is never written anywhere.)
The point of the split is that each key can do exactly one thing. If the same key both encrypted files and authenticated them, a clever attacker could try to make ciphertext and tags interact — and the clean security proof for encrypt-then-MAC, which assumes the two keys are independent, would simply no longer apply. Keeping them separate means a compromise or misuse of one capability can’t borrow another:
| Sub-key | Encrypt / decrypt files | Authenticate a file (HMAC tag) | Verify the PIN |
|---|---|---|---|
| encKey = HMAC(master, "vault-enc") | Can | Cannot | Cannot |
| macKey = HMAC(master, "vault-mac" ‖ salt) | Cannot | Can | Cannot |
| pinVerifier = HMAC(master, "vault-pin") | Cannot | Cannot | Can |
Three keys, three jobs. The HMAC labels are the domain separation that makes each key usable for one purpose only — the encryption key can never stand in for the MAC key, and neither can verify the PIN.
Defense in depth: constant-time padding + a fail-closed codec
Because the MAC is verified first, a tampered file never reaches the decryption code, which means the classic padding oracle is unreachable by construction. The padding check below is therefore defense in depth, not the primary defense — belt and suspenders for the case where authenticated-but-corrupt data still needs to be rejected cleanly. It validates PKCS#7 padding in constant time and reports a single unified failure, so it leaks nothing through logs or timing regardless:
// The last byte claims the pad length; fold every check into one mask.
uint8_t padByte = plainOut[cipherLen - 1];
uint8_t padInvalid = 0;
padInvalid |= (padByte == 0); // pad length 0 is illegal
padInvalid |= (padByte > kAesBlockSize); // pad length > block is illegal
for (size_t i = 0; i < kAesBlockSize; i++) {
uint8_t mask = (i < padByte) ? 0xFF : 0x00; // examine the whole block
padInvalid |= (plainOut[cipherLen - 1 - i] ^ padByte) & mask;
}
if (padInvalid != 0) { // one message for corruption AND bad padding
LOG_ERROR(kTag, "Decryption failed");
mbedtls_platform_zeroize(plainOut, cipherLen);
return false;
}

The decrypted plaintext is then handed to an in-house binary codec, not a general-purpose parser. The decrypted bytes are the most sensitive range in the whole product, and I don’t want a JSON or CBOR library (with its surface area, its allocations, its surprises) anywhere near them. The decoder is a forward-only cursor where every read is bounds-checked before it copies, every string length is rejected if it exceeds the field’s compile-time cap, and any trailing garbage fails the record:
// need(n): are n more bytes available, without integer-overflow wraparound?
bool need(size_t n) const noexcept { return off_ + n <= len_ && off_ + n >= off_; }
// Read a length-prefixed string: reject len > field-capacity BEFORE copying.
template <size_t N>
bool readStrBody(InplaceString<N>& dst, size_t len) noexcept {
    if (len > N || !need(len)) return false; // fail closed, no partial copy
    std::memcpy(dst.data(), p_ + off_, len);
    dst.data()[len] = '\0'; // NUL-terminate the destination
    dst.resyncLength();
    off_ += len;
    return true;
}
Why CBC+HMAC and not AES-GCM?
A fair question. The modern default for authenticated encryption is an AEAD construction like AES-GCM, which fuses encryption and authentication into one primitive and is harder to assemble incorrectly. If I were starting a greenfield protocol, GCM (or ChaCha20-Poly1305) would be the obvious pick.
The vault uses encrypt-then-MAC with AES-256-CBC and HMAC-SHA256 for a specific, boring reason: backend uniformity. The same code has to produce byte-identical output in two very different environments — a host build (where the unit tests run, against the legacy Mbed TLS API) and the device (where Mbed TLS 4.0’s PSA Crypto API is the only public path). CBC, HMAC, and a hand-verified compare are trivially available and deterministic across both; leaning on GCM’s nonce-management rules across two backends adds a sharper footgun than the one I’m avoiding. Encrypt-then-MAC is the generically secure composition, so building AEAD out of CBC+HMAC this way is sound — it’s how IPsec (in its standard configuration) and KDBX 4 do it too.
AES-GCM: one primitive, one key. Encryption and authentication are fused; less to wire up by hand.
- Modern default; secure as long as a nonce is never reused under the same key
- Nonce reuse is catastrophic (it exposes the authentication key, which enables forgeries, and leaks the XOR of the affected plaintexts)
- Backend and nonce rules must match exactly across host and device
AES-CBC + HMAC (encrypt-then-MAC): two primitives, two keys, explicit order. More moving parts, but each is simple and deterministic.
- Byte-identical on legacy Mbed TLS (host) and PSA (device)
- Encrypt-then-MAC is the generically secure composition
- A fresh random IV per write, authenticated by the tag
One honest footnote on the implementation: PSA’s transition guide suggests psa_mac_verify over “compute a tag, then mbedtls_ct_memcmp.” The vault deliberately keeps the compute-and-compare form because it runs identically on both backends — and the comparison is still constant-time. It’s a trade of one PSA convenience for backend uniformity, made with eyes open.
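The compute-and-compare shape can be sketched portably. This is an illustration under stated assumptions, not the vault's code: `ctEqual` is a hypothetical stand-in for `mbedtls_ct_memcmp`, `tagMatches` and `Tag` are names invented here, and the recomputed tag is assumed to come from whichever HMAC backend the build uses.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper standing in for mbedtls_ct_memcmp in this sketch.
// It visits every byte with no early exit, so the loop length does not
// depend on where the first mismatching byte sits.
inline bool ctEqual(const uint8_t* a, const uint8_t* b, size_t len) noexcept {
    volatile uint8_t diff = 0;  // volatile discourages the compiler from short-circuiting
    for (size_t i = 0; i < len; i++) {
        diff = static_cast<uint8_t>(diff | (a[i] ^ b[i]));
    }
    return diff == 0;
}

constexpr size_t kTagSize = 32;  // HMAC-SHA256 output length in bytes
using Tag = std::array<uint8_t, kTagSize>;

// Compute-and-compare: `computed` is the tag recomputed over the stored bytes
// (by either backend), `stored` is the tag read from the file.
inline bool tagMatches(const Tag& computed, const Tag& stored) noexcept {
    return ctEqual(computed.data(), stored.data(), kTagSize);
}
```

Because only the comparison is shared, the HMAC call itself can differ per backend while the accept/reject decision stays identical on host and device.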
Listing without decrypting everything: the O(1) index
There’s a nice side effect of treating every file as an independent envelope (the “O(1)” in the heading is just jargon for “the cost stays flat no matter how many credentials you have”). Listing the vault doesn’t mean decrypting every credential — that would be slow and would needlessly expose every secret in RAM. Instead a single encrypted index.bin, in the exact same [ver][iv][hmacTag][cipher] envelope, caches only the non-secret metadata each credential needs in a list view: its id, name, username, brand, and a couple of flags — never the password, URL, notes, or TOTP secret.
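As a rough sketch of what one cached entry might hold: the struct below is an assumption for illustration, with hypothetical field names, capacities, and flag meanings, not the actual index.bin layout. The point is what is absent, since there is no password, URL, notes, or TOTP secret field to populate.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical list-view entry: only the non-secret metadata named in the text.
// Field names and capacities are illustrative assumptions.
struct IndexEntry {
    uint16_t id;
    char     name[32];
    char     username[48];
    char     brand[16];
    uint8_t  flags;  // meaning of individual bits is assumed, e.g. "has TOTP"
};

// Copy a C string into a fixed-size field, refusing (not truncating) on overflow.
template <size_t N>
bool setField(char (&dst)[N], const char* src) noexcept {
    const size_t len = std::strlen(src);
    if (len >= N) return false;      // fail closed, leave dst untouched
    std::memcpy(dst, src, len + 1);  // +1 copies the terminating NUL
    return true;
}
```

A list view would then decrypt and decode only the index file, and open an individual credential envelope only when the user selects one.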
Fail closed, by construction
The cipher isn’t what makes this vault trustworthy at rest — the order of operations is. Put plainly: never trust a file until its seal checks out, and when in doubt, hand back nothing. Authenticate the slot, type, freshness, IV, and ciphertext with a separate key; verify that tag in constant time before a single byte is decrypted; authenticate the metadata that holds the freshness counters; wipe everything on the way out; and let every error path converge on the same answer: nothing. Get those right and AES-256 is almost an implementation detail.
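That order of operations can be sketched as control flow. This is a minimal illustration under stated assumptions: `Envelope`, `readRecord`, `wipe`, and the injected `verifyTag` and `decrypt` callables are hypothetical names, and the real MAC and cipher calls are deliberately left out so the sketch stays backend-neutral.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical in-memory view of one vault file: [ver][iv][hmacTag][cipher].
struct Envelope {
    uint8_t ver;
    Bytes iv, tag, cipher;
};

// Injected stand-ins (assumptions): a constant-time MAC check, and a
// CBC decrypt-plus-unpad step that reports success or failure.
using VerifyFn  = std::function<bool(const Envelope&)>;
using DecryptFn = std::function<bool(const Envelope&, Bytes&)>;

// Best-effort wipe for the sketch; a device build would call
// mbedtls_platform_zeroize instead.
inline void wipe(Bytes& b) noexcept {
    volatile uint8_t* p = b.data();
    for (size_t i = 0; i < b.size(); i++) p[i] = 0;
}

// Verify first, decrypt second; every failure returns the same empty result.
std::optional<Bytes> readRecord(const Envelope& env,
                                const VerifyFn& verifyTag,
                                const DecryptFn& decrypt) {
    if (!verifyTag(env)) return std::nullopt;  // decrypt is never reached
    Bytes plain(env.cipher.size());
    if (!decrypt(env, plain)) {
        wipe(plain);                           // no partial plaintext survives
        return std::nullopt;
    }
    return plain;
}
```

The caller sees either a full plaintext or nothing, and cannot tell a bad tag from a bad padding block, which is the property the rest of the format is built to preserve.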
There’s one thing this format doesn’t solve, though. All of it protects the file — but the encryption key still ultimately comes from a short PIN, and a short PIN has very little entropy. On the device that’s fine: a brute-force guard locks out and eventually wipes the vault after a handful of wrong PINs, so online guessing dies quickly. The problem is offline. If someone desolders the flash and copies the ciphertext, the lockout never runs — they can derive keys from PIN guesses on a GPU, with no device in the loop to stop them — and a 4-digit keyspace falls in a fraction of a second. (Production flash encryption raises that bar, but the file format shouldn’t have to depend on it.) Closing that gap needs a different trick: one that mixes a secret only the original chip can produce into the key, so the exfiltrated ciphertext is useless on any other hardware. That’s the next post.
Frequently asked questions
What is encrypt-then-MAC?
Encrypt-then-MAC encrypts the plaintext first, then computes a MAC over the result, so a reader can authenticate a file before decrypting any of it. Each vault file is an envelope, [ver][iv][hmacTag][cipher], and the HMAC tag is verified in constant time before AES runs, so a tampered file is rejected without ever being decrypted.
Why verify the MAC before decrypting?
Decrypting first exposes a padding oracle: the decrypt step inspects PKCS#7 padding, and any difference in timing or error behavior tells an attacker who can feed it modified files whether the padding was valid. Repeating that query lets them recover plaintext without the key. Verifying the tag first means a modified file is rejected before the cipher ever runs, so the vault fails closed.
How does the format stop a rolled-back or relocated vault file?
The HMAC authenticates a context prefix — ver ‖ recordType ‖ slotId ‖ generation — so a file moved to another slot, retyped, or replaced with an older generation fails the tag, even against an attacker with raw flash access and no PIN.
Does encrypt-then-MAC need separate keys?
Yes. Encryption and authentication use independent keys derived from the PIN. Reusing a single key for both the cipher and the MAC weakens the construction and should be avoided.