José Manuel Requena Plens

Your 4-Digit PIN Is Fine: Device-Bound Keys on ESP32-S3

Why PBKDF2 iterations can't protect a 4-digit PIN on a microcontroller, and how an eFuse-HMAC device secret plus HKDF stops offline brute force.


A four-digit PIN has ten thousand possible values. That is not a typo and it is not a placeholder for “a longer one later” — it is the real secret that unlocks the vault on a hardware password manager I’m building (not yet released), an ESP32-class device that keeps your credentials on its internal flash. Ten thousand. A single modern GPU runs through ten thousand of anything before you finish reading this sentence.

So how is that safe? The honest answer is that on a microcontroller it usually isn’t, and the usual fix — crank up the key-derivation iteration count until brute force “takes too long” — quietly fails on this kind of silicon. This post is about the fix that actually works: binding the key to the physical chip so that every offline guess needs a secret the attacker can’t extract, while a small on-device guard handles the guesses someone makes by hand.

TL;DR — Why a 4-digit PIN survives a flash dump

  • A 4-digit PIN is 10⁴ = 10,000 combinations. One RTX 5090 runs PBKDF2-HMAC-SHA256 at ~11.16 MH/s in hashcat's benchmark, so at that rate it sweeps every PIN in just under a millisecond offline.
  • PBKDF2 iteration count can’t save you on an MCU: the count is sized to a ~2-second on-device unlock, and whatever count makes that bearable, a GPU does far faster.
  • The two threats are different. Online guessing (someone holding the device) is bounded by a lockout / wipe-after-10 guard. Offline guessing (someone who dumped the flash) needs a different defense.
  • That defense is device binding: after PBKDF2, mix in a per-chip deviceSecret with HKDF (RFC 5869). On ESP32-S3 the secret is an eFuse-HMAC the CPU can’t read; on classic ESP32 it’s an NVS pepper.
  • A device-bound key only reproduces on the original chip, so a flash dump alone no longer lets a GPU check PIN guesses: the 10,000-key sweep has nothing to run against. The gain is categorical, not linear.
Quick glossary (if crypto isn't your background)
  • Entropy · keyspace: how hard a secret is to guess. A 4-digit PIN has 10,000 combinations (~13.3 bits): very little.
  • Brute force: trying every combination, one by one, until one works.
  • Online vs offline: online = guessing against the device itself, which can slow you down; offline = guessing against a stolen copy, on your own hardware, with no limit.
  • KDF · PBKDF2 · iterations (work factor): a function that turns the PIN into a key by repeating a calculation many times, to make each attempt deliberately slow.
  • GPU · hashrate (H/s): a graphics card tries millions of keys per second; “H/s” is guesses per second.
  • Device binding · device secret: mixing into the key a secret unique to this chip that can’t leave it.
  • eFuse: bits burned into the silicon (once) that the chip can use but that can’t be read back.
  • HKDF: the standard, analyzed method for mixing and deriving keys from a secret.
  • pepper · NVS: the pepper is an extra secret shared across the whole device (a cousin of salt); NVS is the ESP32’s persistent key-value store.

A 4-digit PIN has no entropy

Let’s be blunt about the keyspace. Four decimal digits is 10⁴, which is exactly ten thousand values: roughly 13.3 bits of entropy (the jargon for “how hard it is to guess”) if every PIN were equally likely. They aren’t, because humans pick 1234 and 0000 and their birth year far more often than chance would suggest. A 4-digit PIN is, at heart, a bike lock with 10,000 positions: plenty against someone trying by hand, laughable against a machine that runs through all of them in a blink. There is no clever KDF, no cipher, no amount of salting that adds entropy to a secret that small. The secret is the secret.

The reason a PIN this short is even defensible is that guessing it costs something, and the cost depends entirely on where the attacker is allowed to guess.

That 11.16 MH/s figure is not hypothetical. It comes straight from a published hashcat benchmark of the RTX 5090, where PBKDF2-HMAC-SHA256 (hashcat mode 10900) runs at about 11,157 kH/s on a single card — and raw SHA-256 at roughly 28.35 GH/s (mode 1400). At eleven million guesses per second, ten thousand PINs is 10000 / 11_160_000 ≈ 0.0009 seconds. The PIN, treated as an offline-attackable secret, is already gone.
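The arithmetic above is small enough to check in a few lines. This is a minimal sketch, not firmware code: `entropyBits` and `sweepSeconds` are illustrative helper names, and the guess rate is simply the benchmark figure quoted above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative helpers (hypothetical names, not part of the firmware).

// Entropy in bits of a secret chosen uniformly from `keyspace` values.
double entropyBits(std::uint64_t keyspace) {
    assert(keyspace > 0);
    return std::log2(static_cast<double>(keyspace));
}

// Worst-case time to try every value at `guessesPerSecond`.
double sweepSeconds(std::uint64_t keyspace, double guessesPerSecond) {
    assert(guessesPerSecond > 0.0);
    return static_cast<double>(keyspace) / guessesPerSecond;
}
```

With a keyspace of 10,000 and a rate of 11.16e6 guesses per second, `entropyBits` gives about 13.29 and `sweepSeconds` about 0.0009 s, matching the figures in the text. That is the worst case; on average the attacker needs only half the sweep.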

The difference between online and offline is everything here. Online is the thief standing at your door trying keys: you can slam it shut after a few tries. Offline is the thief who took a copy of the lock back to their workshop, where they file keys unwatched, with all the time in the world. So the only thing that makes a short PIN viable is making sure the attacker can’t get offline — and bounding how fast they can guess online.

The online side is the easy half, and it’s the one most products get right. My firmware tracks consecutive failed attempts in NVS so the count survives a reboot or a deep-sleep cycle, and it escalates: nothing for the first three tries, a 30-second lockout for attempts four through six, a 300-second lockout for seven through nine, and a full vault wipe on the tenth. The lockout deadline is stored against the real-time clock, so yanking the power doesn’t reset the timer.

CPP brute_force_guard.cpp — the online ceiling
static constexpr uint8_t  kShortLockStart = 4;    // first timed lockout
static constexpr uint8_t  kLongLockStart  = 7;    // longer lockout
static constexpr uint8_t  kWipeThreshold  = 10;   // destroy the vault
static constexpr uint32_t kShortLockSec   = 30;
static constexpr uint32_t kLongLockSec    = 300;

uint32_t BruteForceGuard::registerFailure() {
    NvsStore prefs;
    prefs.open(kNvsNs, NvsStore::OpenMode::ReadWrite);

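    // Saturating increment: a corrupted or maxed-out counter must never
    // wrap back to zero and hand the attacker a fresh set of tries.
    const uint8_t stored   = prefs.getU8(kNvsAttempts, 0);
    const uint8_t attempts = (stored < UINT8_MAX) ? static_cast<uint8_t>(stored + 1) : stored;
    prefs.setU8(kNvsAttempts, attempts);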

    if (attempts >= kWipeThreshold) {
        return UINT32_MAX;            // caller MUST wipe the vault
    }
    uint32_t lockoutSec = getLockoutDuration(attempts);
    // ... persist an RTC-relative unlock time so it survives power loss ...
    return lockoutSec;
}
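The guard above calls getLockoutDuration() without showing it. A minimal sketch of how the tiers described in the text could map to seconds follows; `lockoutSeconds` and `onlineSuccessChance` are hypothetical names, and the real firmware function may differ.

```cpp
#include <cassert>
#include <cstdint>

// Constants copied from the guard above.
static constexpr std::uint8_t  kShortLockStart = 4;
static constexpr std::uint8_t  kLongLockStart  = 7;
static constexpr std::uint32_t kShortLockSec   = 30;
static constexpr std::uint32_t kLongLockSec    = 300;

// Hypothetical stand-in for getLockoutDuration(): maps the count of
// consecutive failures to a lockout in seconds. The wipe on the tenth
// failure is handled by the caller, as in registerFailure().
constexpr std::uint32_t lockoutSeconds(std::uint8_t attempts) {
    if (attempts >= kLongLockStart)  return kLongLockSec;   // failures 7..9
    if (attempts >= kShortLockStart) return kShortLockSec;  // failures 4..6
    return 0;                                               // failures 1..3
}

// Chance that `guesses` distinct tries hit a PIN drawn uniformly at
// random from `keyspace` values (an upper bound on entropy, not on luck).
double onlineSuccessChance(std::uint32_t guesses, std::uint32_t keyspace) {
    assert(keyspace > 0 && guesses <= keyspace);
    return static_cast<double>(guesses) / static_cast<double>(keyspace);
}
```

Under these assumptions `lockoutSeconds(5)` is 30 and `lockoutSeconds(8)` is 300, while `onlineSuccessChance(10, 10000)` is 0.001. The uniform-PIN assumption matters: a guard like this bounds the number of tries, not the quality of the PIN.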

This is the canonical defense for a short secret, and it’s what NIST SP 800-63B means when it tells you to rate-limit online guessing. Its revision 4 caps consecutive failed attempts at 100 by disabling the authenticator, and adds a stricter clause aimed squarely at secrets like a PIN: when the authentication secret carries less than 64 bits of entropy, the verifier shall rate-limit the total number of consecutive failed attempts. A 13.3-bit PIN lands deep inside that clause. For an attacker physically tapping the keypad, a 10,000-value PIN behind “you get about ten tries, then I erase everything” is genuinely strong. Ten guesses out of ten thousand is a 0.1% chance against a uniformly random PIN (higher if the owner picked 1234 or a birth year), and you only get one run at it.


You can’t out-iterate a GPU on an MCU

The textbook answer to “my secret has low entropy” is a password-based KDF with a high work factor. PBKDF2 (RFC 8018, the current PKCS#5; NIST SP 800-132 for password-based key derivation) lets you pick an iteration count, and the OWASP Password Storage Cheat Sheet recommends 600,000 iterations of PBKDF2-HMAC-SHA256. Make each guess expensive enough and even a small keyspace becomes annoying to sweep. That’s the theory.

It breaks on a microcontroller for a simple physical reason: the work factor costs the legitimate device far more wall-clock time than it costs the attacker. The iteration count has to be small enough that the user’s unlock is bearable, and on this silicon that ceiling is low. My production builds use 35,000 iterations on the ESP32-S3 and 25,000 on classic ESP32, each sized for roughly a 2-second unlock. The OWASP-recommended 600,000 would push a single unlock to about 32 seconds on the S3, and around 47 seconds on classic ESP32. Nobody is going to wait half a minute to open their password manager, so that count is unusable in practice, not just inconvenient.

The chart shows the only knob iteration count gives you: make the user wait longer. Pushing the S3 from 35,000 to 600,000 turns a 2-second unlock into a 32-second one (the classic ESP32 fares worse, around 47 seconds). Meanwhile the attacker’s side of the ledger barely moves. The GPU’s guess rate falls in proportion to the iteration count too, but it starts so far ahead that even at 600,000 iterations a single RTX 5090 still makes many thousands of PBKDF2-HMAC-SHA256 guesses per second, and a 10,000-key space is still cleared in a fraction of a second. At the production count the MCU does one derivation in two seconds; in those same two seconds the GPU does hundreds of thousands in parallel. You are racing a sprinter while wearing the ankle weights yourself.
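Both sides of that ledger scale linearly with the iteration count, which a short model makes concrete. This is a sketch under stated assumptions, not a measurement: the helper names are hypothetical, the unlock time is extrapolated from the 600,000-iteration figure above, and the iteration count behind the 11.16 MH/s benchmark is not stated in this post, so the 1,000 used in the note below is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model (hypothetical names): PBKDF2 cost is linear in the
// iteration count, for the defender and the attacker alike.

// Unlock time at `iterations`, extrapolated from one measured point.
double unlockSeconds(std::uint32_t iterations,
                     std::uint32_t measuredIterations, double measuredSeconds) {
    assert(measuredIterations > 0);
    return measuredSeconds * iterations / measuredIterations;
}

// Attacker guesses per second at `iterations`, scaled from a benchmark
// rate that was taken at `benchIterations` (an assumed value here).
double gpuGuessesPerSecond(std::uint32_t iterations,
                           double benchRate, std::uint32_t benchIterations) {
    assert(iterations > 0);
    return benchRate * benchIterations / iterations;
}
```

Calibrated on 600,000 iterations taking 32 s, `unlockSeconds(35000, 600000, 32.0)` comes out near 1.87 s, consistent with the roughly 2-second target. If the benchmark ran at about 1,000 iterations, `gpuGuessesPerSecond` gives roughly 319,000 guesses per second at 35,000 iterations and roughly 18,600 at 600,000: either way the whole PIN space falls in well under a second.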


The idea: bind the key to the silicon

If you can’t make each offline guess slow enough, make each offline guess need something the attacker doesn’t have. After PBKDF2 produces the master key from the PIN and salt, I mix in a 32-byte deviceSecret that only the original chip can produce. The verifier stored on flash is then derived from that bound master key. An attacker who dumps the flash has the ciphertext and the salt, but recomputing the key for any PIN guess now requires the per-chip secret — and that secret never appears on the flash they copied.

It’s like requiring, on top of the PIN, an ingredient that exists only in this one chip and can’t be copied: however perfect the flash copy the thief walks off with, without that ingredient their 10,000 tries go nowhere. This is the whole trick, and it changes the attack from linear to categorical. Before binding, the offline attacker needs 10,000 cheap PBKDF2 evaluations and wins. After binding, they need a secret that physically does not leave the chip, so the offline attack doesn’t get slower — it stops existing. The only place a guess can be checked is back on the device, where the lockout-and-wipe guard is waiting.

In the firmware this is one call, dropped into both the provisioning path and the unlock path so they can never drift apart:

CPP vault_meta.cpp — unlock() mirrors provision() exactly
// Tier 1: stretch the PIN into a master key.
VaultCrypto::deriveKey(pin, pinLen, meta.kdfSalt.data(), meta.kdfSalt.size(),
                       masterKey.data(), masterKey.size(), iterations);

// Tier 2: bind the master key to THIS device (a no-op on open builds).
// Must mirror provision() exactly so the verifier reproduces on the same chip.
crypto::bindMasterKey(meta.kdfSalt.data(), meta.kdfSalt.size(),
                      masterKey.data(), masterKey.size());

// Derive the verifier from the (now bound) master key and compare constant-time.
VaultKeys::derivePinVerifier(masterKey, candidate);
const bool match = VaultCrypto::constantTimeEqual(
    candidate.data(), meta.pinVerifier.data(), kPinVerifierSize);
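The verifier comparison is constant-time so that how long a wrong PIN takes to reject does not reveal how many leading bytes matched. The firmware's VaultCrypto::constantTimeEqual is not shown in this post; the following is a hypothetical sketch of the usual XOR-accumulate approach, not the author's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of a constant-time comparison. Every byte is
// visited regardless of where the first mismatch is.
bool constantTimeEqualSketch(const std::uint8_t* a, const std::uint8_t* b,
                             std::size_t len) {
    assert(a != nullptr && b != nullptr);
    // volatile discourages the compiler from turning this into an
    // early-exit loop; it is a mitigation, not a guarantee.
    volatile std::uint8_t diff = 0;
    for (std::size_t i = 0; i < len; ++i) {
        diff = static_cast<std::uint8_t>(diff | (a[i] ^ b[i]));
    }
    return diff == 0;
}
```

In production code a vetted library routine is generally preferable to a hand-written loop, since compilers are free to optimize plain C++ in ways that reintroduce timing differences.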

Binding is a hardened-production feature, gated behind a build flag. Open development builds keep their re-flashable, no-eFuse-burn ergonomics, so on those the bind step is a byte-identical no-op: the derived key equals exactly what PBKDF2 produced. That’s deliberate, and I’ll come back to what it costs you at the end.


Where the device secret comes from

The “secret only this chip can produce” has two implementations, picked at compile time by what the chip can actually do. The deciding capability is whether the SoC has a hardware HMAC peripheral, which ESP-IDF exposes as SOC_HMAC_SUPPORTED. The ESP32-S3 has it; the classic ESP32 does not — and that single fact is why the code carries two backends instead of one.

Device-secret backends
| Backend | Selected when | The device secret is… | Strength |
|---|---|---|---|
| EFuseHmac | Hardened build + ESP32-S3-class (has HMAC peripheral) | HMAC_eFuse(label ‖ kdfSalt) over an eFuse key that’s unreadable by software/JTAG | Strongest: key never leaves hardware |
| NvsPepper | Hardened build + classic ESP32 (no HMAC peripheral) | A random 32-byte pepper generated once, stored in encrypted NVS | Weaker: software-reachable, but defeats naive flash exfil |
| None | Open builds / no capability | Identity: the master key is left unchanged | No binding (re-flashable dev builds) |

The selection is a compile-time cascade. The EFuseHmac path only compiles where ESP-IDF enables its PSA opaque HMAC driver, and ESP-IDF gates that on SOC_HMAC_SUPPORTED, so the same macro that means “this chip has the HMAC peripheral” doubles as the gate for picking the strong backend. A hardened build on a chip without that capability falls back to the pepper; an open build resolves to None.
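The cascade can be modeled as a small constexpr function so it can be checked on a host. This is an illustrative sketch, not the firmware's code: in the real build the decision is made by the preprocessor, `hardenedBuild` stands in for the (unnamed) build flag, and `hasHmacPeripheral` stands in for SOC_HMAC_SUPPORTED.

```cpp
#include <cassert>

enum class Backend { None, NvsPepper, EFuseHmac };

// Illustrative model of the compile-time cascade (hypothetical names).
constexpr Backend selectBackend(bool hardenedBuild, bool hasHmacPeripheral) {
    if (!hardenedBuild)    return Backend::None;       // open builds never bind
    if (hasHmacPeripheral) return Backend::EFuseHmac;  // ESP32-S3 class
    return Backend::NvsPepper;                         // classic ESP32
}

// The three rows of the backend table, checked at compile time.
static_assert(selectBackend(true,  true)  == Backend::EFuseHmac, "S3, hardened");
static_assert(selectBackend(true,  false) == Backend::NvsPepper, "classic, hardened");
static_assert(selectBackend(false, true)  == Backend::None,      "open build");
```

Expressing the rule as one function keeps the priority order explicit: the build flag is checked first, so an open build resolves to None even on a chip that has the HMAC peripheral.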


eFuse-HMAC: a key the CPU can’t read

Think of a sealed black box with a secret inside: you can hand it a message and it returns an answer computed with that secret, but there’s no way to open it and read the secret out. That is almost literally the ESP32-S3’s HMAC peripheral. The device secret is computed by that HMAC peripheral, and the property that matters is right there in Espressif’s own documentation: in “upstream” mode the peripheral computes an HMAC over a message using a key burned into eFuse, and the eFuse key never leaves the module — the HMAC result is returned to software by design, but the key bytes are not. The key block is burned with the purpose ESP_EFUSE_KEY_PURPOSE_HMAC_UP (value 8) and read-protected, so software and JTAG simply cannot read it back. You feed in a message, you get back a 32-byte tag, and the key stays sealed.

ESP-IDF 6.x drives the HMAC peripheral through the PSA Crypto opaque-key interface rather than the old esp_hmac_calculate call, and that’s exactly the path the firmware takes. The trick is psa_import_key of a tiny struct that holds the eFuse key id — a reference to the key block, not key bytes — under the PSA_KEY_LIFETIME_ESP_HMAC_VOLATILE lifetime. PSA then routes psa_mac_compute into the peripheral, which does the HMAC over my message internally and hands back only the result.

CPP device_secret.cpp — computeEFuseSecret() (trimmed)
// Message = domain label || vaultSalt, so the secret is specific to this
// use and per-vault.
std::memcpy(message.data(), kEFuseLabel, kLabelLen);     // "vault-device-secret-v1"
std::memcpy(message.data() + kLabelLen, vaultSalt, vaultSaltLen);

// Import a REFERENCE to the eFuse key block — never the key bytes.
esp_hmac_opaque_key_t opaque = {};
opaque.efuse_key_id = kHmacEfuseKeyId;                   // HMAC_KEY0 by default

// PSA attributes (elided): an opaque, volatile HMAC-SHA256 key.
psa_import_key(&attr, /* reference to opaque */ ..., sizeof(opaque), &keyId);

// Runs entirely inside the HMAC peripheral; out[] receives the 32-byte tag.
psa_mac_compute(keyId, PSA_ALG_HMAC(PSA_ALG_SHA_256),
                message.data(), messageLen, out.data(), out.size(), &macLen);
psa_destroy_key(keyId);
VaultCrypto::secureWipe(message.data(), message.size());

NVS pepper: the classic-ESP32 fallback

The classic ESP32 has no HMAC peripheral, so the strong path simply isn’t available — there’s no hardware that will hold a key and refuse to give it back. The fallback is an old, well-understood idea: a pepper. On first provision the firmware generates a random 32-byte value, stores it once, and reloads it on every unlock thereafter.

CPP device_secret.cpp — computePepper(): generate once, then reload
bool DeviceSecret::computePepper(PepperStore& store,
                                 std::array<uint8_t, kDeviceSecretSize>& out) {
    if (store.load(out)) {
        return true;                       // reload the existing pepper
    }
    // First provision on this device: generate and persist a fresh pepper.
    VaultCrypto::generateRandom(out.data(), out.size());
    if (!store.store(out)) {
        VaultCrypto::secureWipe(out.data(), out.size());
        return false;
    }
    return true;
}

The honest framing is that this is weaker than the eFuse path, and the diagram above says why: the CPU has to read the pepper to use it, so it is software-reachable in a way the eFuse key never is. Its confidentiality rests entirely on Flash Encryption + NVS Encryption being active on hardened builds, and the pepper is never logged. What it does buy you is real: a naive flash exfil — desolder the chip, dump it, attack the copy on a GPU — no longer works, because the pepper lives in encrypted NVS and isn’t present in a plain dump. It’s a meaningful bar against the most common offline attack, just not the categorical hardware seal the S3 gets.


Mixing it in with HKDF, not a hand-rolled HMAC

Once you have a 32-byte device secret and a 32-byte PBKDF2 master key, you need to combine them into a new key. The temptation is to reach for a bare HMAC(deviceSecret, master) and call it done. I didn’t, because there’s a published, analyzed primitive built for exactly this (the certified tool rather than mixing by eye): HKDF (RFC 5869), the extract-then-expand KDF whose security argument was laid out by Hugo Krawczyk in Cryptographic Extraction and Key Derivation: The HKDF Scheme. HKDF turns a non-uniform secret into uniform key material with domain separation, and using it verbatim keeps the whole construction a citable, standard one rather than something I invented on a Tuesday.

The construction is the canonical two-step:

The salt is the device secret and the input key material is the PBKDF2 master, so PRK = HKDF-Extract(salt = deviceSecret, IKM = master) and then boundMaster = HKDF-Expand(PRK, info = "vault-device-bind-v1", L = 32). The versioned info string is domain separation: if I ever change the binding construction, a different info makes the new keys provably distinct from the old ones.

Here’s the engineering beat I’m proudest of. My HKDF is built on the firmware’s own VaultCrypto::hmacSha256, not on a library’s HKDF module — and that’s a feature, not reinvention. The device runs PSA Crypto on mbedTLS 4.0; the native host tests run a legacy mbedTLS. Crucially, mbedTLS 4.0 removed the legacy mbedtls_hkdf module (you’re expected to drive psa_key_derivation_* with PSA_ALG_HKDF instead), so a portable HKDF that calls the old module just won’t compile on the device, and one that calls only the PSA path won’t run on the host. By building HKDF on one HMAC primitive that exists on both backends, I get byte-identical output everywhere and sidestep the 3.6→4.0 API churn entirely.

CPP device_secret.cpp — hkdfSha256() built on one HMAC primitive
// Extract: PRK = HMAC(salt, IKM). RFC 5869 uses a HashLen zero salt for the
// empty-salt case; we fall back to that since an empty HMAC key isn't portable.
std::array<uint8_t, kHashLen> prk{};
VaultCrypto::hmacSha256(hmacSalt, hmacSaltLen, ikm, ikmLen, prk.data());

// Expand: T(0) = empty; T(n) = HMAC(PRK, T(n-1) || info || counter); OKM = T(1)..
for (uint32_t counter = 1; ok && done < outLen; ++counter) {
    /* block = T(n-1) || info || counter */
    ok = VaultCrypto::hmacSha256(prk.data(), prk.size(), block.data(), off, tCur.data());
    if (!ok) break;                       // never copy a T(n) that wasn't computed
    const size_t take = (outLen - done < kHashLen) ? (outLen - done) : kHashLen;
    std::memcpy(out + done, tCur.data(), take);
    done += take;                         // advance the output cursor
    /* ... carry T(n) forward, wipe scratch ... */
}

bindMasterKey() ties it together: for the None backend it returns immediately with the master untouched; otherwise it computes the device secret, runs HKDF, overwrites the master key in place with the bound result, and wipes every intermediate with secureWipe before returning.
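The control flow described above can be sketched with the device secret and HKDF injected as callables, which also makes it runnable on a host. Everything here is hypothetical scaffolding: `bindMasterKeySketch`, `SecretFn`, `HkdfFn`, and `wipe` are illustrative names, and the firmware's real bindMasterKey() takes different parameters.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

constexpr std::size_t kKeySize = 32;
using Key = std::array<std::uint8_t, kKeySize>;

// Hypothetical callables standing in for the firmware's internals.
using SecretFn = std::function<bool(Key& outSecret)>;      // device secret
using HkdfFn   = std::function<bool(const Key& salt,       // HKDF-SHA256 with
                                    const Key& ikm,        // a fixed info label
                                    Key& outOkm)>;

// Best-effort wipe; volatile keeps the stores from being optimized away.
inline void wipe(Key& k) {
    volatile std::uint8_t* p = k.data();
    for (std::size_t i = 0; i < k.size(); ++i) p[i] = 0;
}

// Sketch of the bind step: salt = deviceSecret, IKM = master.
bool bindMasterKeySketch(bool bindingEnabled, const SecretFn& deviceSecret,
                         const HkdfFn& hkdf, Key& master) {
    if (!bindingEnabled) return true;      // None backend: master untouched
    assert(deviceSecret && hkdf);

    Key secret{};
    Key bound{};
    bool ok = deviceSecret(secret);
    ok = ok && hkdf(secret, master, bound);
    if (ok) master = bound;                // overwrite the master in place
    wipe(secret);                          // wipe intermediates on every path
    wipe(bound);
    return ok;
}
```

Two properties are worth noting in this shape: on failure the master key is left unchanged rather than half-written, and the intermediates are wiped whether or not the derivation succeeded.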


Proving it: a known-answer test

A binding that’s wrong is worse than no binding: if provision() and unlock() ever disagreed by a single byte, every unlock would fail and the vault would be bricked. So the construction is pinned by tests, and the load-bearing one is a known-answer test against the published vectors in RFC 5869 Appendix A. A known-answer test checks that our output matches official reference values byte for byte, like checking a result against the answer key. Because the same HKDF code runs on the host and on the device, passing the RFC vector on the native build shows the HKDF logic is byte-exact, and as long as the underlying hmacSha256 agrees across backends, the binding built on it is computed identically in both places.

CPP test_device_secret.cpp — RFC 5869 Appendix A.1 known-answer test
// RFC 5869 Appendix A.1, Test Case 1 (HKDF-SHA256).
static constexpr std::array<uint8_t, 42> kRfcOkm = {
    0x3c, 0xb2, 0x5f, 0x25, 0xfa, 0xac, 0xd5, 0x7a, 0x90, 0x43, 0x4f, 0x64, /* ... */ };

static void test_hkdf_rfc5869_case1_matches_known_answer() {
    std::array<uint8_t, 42> okm{};
    const bool ok = hkdfSha256(kRfcSalt.data(), kRfcSalt.size(),
                               kRfcIkm.data(), kRfcIkm.size(),
                               kRfcInfo.data(), kRfcInfo.size(),
                               okm.data(), okm.size());
    TEST_ASSERT_TRUE(ok);
    TEST_ASSERT_EQUAL_MEMORY(kRfcOkm.data(), okm.data(), okm.size());   // exact bytes
}

The same test file also checks the parts that don’t need silicon: that the mix is deterministic (the same chip and PIN always yield the same bound key), that binding is a byte-identical no-op when disabled, and — via an in-memory PepperStore fake — that the pepper is generated exactly once and reloaded thereafter. If you want the bigger picture of how that bound key then protects each file at rest, that’s the subject of the companion post on the encrypt-then-MAC vault envelope — this key is what fills that envelope.
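The generate-once, reload-thereafter property is easy to pin with a fake store. The sketch below assumes a shape for the storage interface, since the real PepperStore is not shown in this post; `InMemoryPepperStore` and `computePepperSketch` are hypothetical names, and std::random_device stands in for the firmware's VaultCrypto::generateRandom on the host.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

constexpr std::size_t kDeviceSecretSize = 32;
using Secret = std::array<std::uint8_t, kDeviceSecretSize>;

// Assumed shape of the storage seam (not the firmware's actual interface).
struct PepperStore {
    virtual ~PepperStore() = default;
    virtual bool load(Secret& out) = 0;
    virtual bool store(const Secret& in) = 0;
};

// In-memory fake: remembers one pepper and counts writes.
struct InMemoryPepperStore : PepperStore {
    Secret saved{};
    bool   hasValue   = false;
    int    storeCalls = 0;

    bool load(Secret& out) override {
        if (!hasValue) return false;
        out = saved;
        return true;
    }
    bool store(const Secret& in) override {
        ++storeCalls;
        saved    = in;
        hasValue = true;
        return true;
    }
};

// Host-side stand-in for computePepper(): reload if present, else
// generate and persist a fresh value.
bool computePepperSketch(PepperStore& store, Secret& out) {
    if (store.load(out)) return true;          // reload the existing pepper
    std::random_device rng;                    // host stand-in for the device RNG
    for (auto& b : out) b = static_cast<std::uint8_t>(rng());
    if (!store.store(out)) {
        out.fill(0);                           // do not hand back an unsaved pepper
        return false;
    }
    return true;
}
```

Calling `computePepperSketch` twice against the same fake should leave `storeCalls` at 1 and return the same 32 bytes both times, which is the behavior the unlock path depends on.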


The honest costs

Device binding is not free, and pretending otherwise would be its own kind of dishonesty. Tying the key to the chip is precisely what defeats the offline attacker, which means it’s also precisely what you lose if the chip dies.

Trade-offs you're accepting
  • A dead or wiped chip is unrecoverable — by design. The vault is bound to this silicon. If the chip fails or the wipe fires, the key cannot be reconstructed anywhere else. Recovery is the export/backup path you set up beforehand, not key escrow.
  • The eFuse burn is one-time and irreversible. Provisioning the EFuseHmac backend burns an eFuse key block with purpose HMAC_UP, which permanently consumes that block. That’s an owner decision made knowingly on first hardened boot, not something a build flag should do behind your back.
  • Open builds don’t bind at all. Development builds resolve to the None backend, so the binding step is a byte-identical no-op and the firmware stays re-flashable with no eFuse burn. Convenient for hacking on the device; it also means an open build offers zero offline protection.
  • The pepper path is fully testable today — generate, persist, reload, and the HKDF mix all run under the native host harness against the RFC vector.

Wrap: a short PIN, two complementary defenses

Step back and the design is a clean division of labor. A 4-digit PIN is hopeless against an offline GPU and perfectly fine against someone tapping a keypad — so I defend each lane with the control that actually fits it. Online guessing meets the brute-force guard: a handful of tries, escalating lockouts, and a vault wipe at ten. Offline guessing meets device binding: a per-chip secret, sealed in the HMAC peripheral on the S3 or held in encrypted NVS on classic ESP32, mixed into the key with HKDF so the bound master key only ever reproduces on the original silicon.

Neither defense alone is enough. The lockout means nothing if the attacker can guess off-device; the binding means nothing against someone who simply keeps typing PINs. Together, a dumped flash is just ciphertext and a salt for a key the attacker can’t recompute, and a stolen device is ten guesses from being wiped. That’s how ten thousand combinations become genuinely enough.

4-digit PIN, device-bound

An online brute-force guard (escalating lockout, wipe-after-10) bounds on-device guessing, while eFuse-HMAC or NVS-pepper device binding via HKDF makes an offline GPU sweep of the 10,000-key PIN space infeasible. The honest caveat: the vault is tied to the chip, so a dead device is unrecoverable by design — recovery is via export/backup, not escrow.

Frequently asked questions

How can a 4-digit PIN be secure on a hardware password manager?

A 4-digit PIN is only safe because two complementary defenses bound where an attacker can guess. An online brute-force guard allows about ten attempts before wiping the vault, and device-bound key derivation mixes a per-chip secret into the key, so the key can't be recomputed off the original chip and an offline GPU sweep of the 10,000-PIN space has nothing to check against.

Why can't a high PBKDF2 iteration count protect a PIN on a microcontroller?

The iteration count must stay low enough that the legitimate user's unlock is bearable: about 2 seconds, which means 35,000 iterations on the ESP32-S3 and 25,000 on classic ESP32. OWASP's recommended 600,000 would push a single unlock to roughly 32 seconds on the S3 and 47 on classic ESP32, while a GPU would still sweep all 10,000 PINs in a fraction of a second.

What is device binding in key derivation?

Device binding mixes a 32-byte per-chip deviceSecret into the master key after PBKDF2, using HKDF. The verifier stored on flash then only reproduces on the original chip, so an attacker who dumps the flash still can't recompute the key without the secret. The offline brute-force attack doesn't just get slower; it can no longer be run at all.

Where does the per-chip device secret come from on the ESP32?

It has two backends chosen at compile time. On the ESP32-S3 it's an eFuse-HMAC: the HMAC peripheral computes a tag using an eFuse key that's read-protected and never leaves the silicon. On classic ESP32, which has no HMAC peripheral, it's a random 32-byte NVS pepper stored in encrypted NVS — weaker, but it still defeats a naive flash dump.

Why use HKDF instead of a hand-rolled HMAC to mix in the device secret?

HKDF (RFC 5869) is a published, analyzed extract-then-expand KDF that turns a non-uniform secret into uniform key material with domain separation, keeping the construction standard and citable. It's built on one HMAC primitive so output is byte-identical on both host and device, and it's pinned by a known-answer test against the RFC 5869 Appendix A vectors.

What happens to the vault if the device dies?

A dead or wiped chip is unrecoverable by design, because the key is bound to that specific silicon and can't be reconstructed elsewhere. Recovery relies on an export/backup path set up beforehand, not key escrow. The eFuse burn is also one-time and irreversible.