String-Pool Packer
See how a deduplicated, tail-merged string pool with uint16 offsets beats a const char* pointer table for storing UI strings on a 32-bit MCU.
In Lines mode, each line is one index entry: a single (language × ID) cell. Identical strings are stored once, and suffixes are merged into longer strings.
In CSV mode, the first row is the header (id,en,es,…). Every (ID × language) cell is one index entry, just like in the firmware.
Every string — stored or tail-merged — is read by pool + offset, so retrieval is O(1) and identical regardless of merging.
Everything runs in your browser — nothing is uploaded.
About This Tool
Firmware that ships UI text in several languages usually stores it as a 2-D table of const char* pointers — one per (language × string ID) cell. On a 32-bit MCU each pointer is 4 bytes of pure index overhead, before a single character of text. This tool shows what you save by switching to a single packed string pool plus a table of 2-byte uint16 offsets, with identical strings deduplicated and suffixes tail-merged into longer strings.
It is the interactive companion to the article “What the Linker Won’t Do: Packing i18n Strings on an MCU”.
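The two layouts compared here can be sketched in a few lines of C++. This is a minimal illustration, not the generator's actual output: the catalogue, the names LANG_*, STR_*, kPtrTable, kPool, kOffsets and lookup() are all hypothetical, and the pool is assumed to hold NUL-terminated strings (which is what lets a suffix share the bytes of a longer entry).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical two-language, three-string catalogue (illustrative data only).
enum Lang : std::uint8_t { LANG_EN, LANG_ES, LANG_COUNT };
enum StrId : std::uint8_t { STR_OK, STR_OFF, STR_BT_OFF, STR_COUNT };

// Layout A: one pointer per (language x ID) cell.
// On a 32-bit MCU every cell costs 4 bytes before any text is stored.
const char* const kPtrTable[LANG_COUNT][STR_COUNT] = {
    {"OK", "off", "Bluetooth off"},
    {"OK", "apagado", "Bluetooth apagado"},
};

// Layout B: one packed pool of NUL-terminated strings.
// "apagado" and "off" are not stored separately; they are tails of longer entries.
const char kPool[] =
    "Bluetooth apagado\0"  // offset 0; "apagado" starts at offset 10
    "Bluetooth off\0"      // offset 18; "off" starts at offset 28
    "OK";                  // offset 32; the literal's own NUL terminates it

// ...plus one 2-byte offset per cell.
const std::uint16_t kOffsets[LANG_COUNT][STR_COUNT] = {
    {32, 28, 18},  // EN: OK, off, Bluetooth off
    {32, 10, 0},   // ES: OK (shared with EN), apagado, Bluetooth apagado
};

// Build-time guard: every offset must fit in a uint16.
static_assert(sizeof(kPool) <= 65536, "pool too large for uint16 offsets");

// Same O(1) lookup for every cell, whether the string is stored or tail-merged.
inline const char* lookup(Lang lang, StrId id) {
    assert(lang < LANG_COUNT && id < STR_COUNT);
    return kPool + kOffsets[lang][id];
}
```

On this toy data the index shrinks from 24 bytes (6 cells × 4-byte pointers) to 12 bytes, and lookup(LANG_EN, STR_OFF) returns a pointer into the middle of "Bluetooth off".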
Two modes
Lines: paste strings one per line; each line is one index entry (a single language × ID cell).
CSV: the real firmware layout, with a header row id,en,es,… and one row per string ID, one column per language. Every cell becomes an index entry, and an ID × language lookup simulates the firmware’s tr(): pick an ID and a language and the tool shows the resolved offset and string (gen::string(lang, id) → pool + offset).
How do I use it?
Choose Lines or CSV mode and paste your data (use Load example for a starting point).
Pick the pointer width for your target (4 bytes for a 32-bit MCU, 8 bytes for a 64-bit host).
Read the comparison: index-table bytes, string-blob bytes, and the total for both the pointer-table and the packed-pool approach.
Scan the per-string breakdown — including each string’s byte offset — to see which strings were deduplicated and which were tail-merged into a longer string (their offset points partway into that longer entry).
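The comparison in step 3 is plain arithmetic, and a rough sketch of it looks like this. The struct and function names are hypothetical, and the blob size is whatever your own strings add up to (terminators included).

```cpp
#include <cstddef>

// Byte totals for one layout (hypothetical helper, not the tool's own code).
struct LayoutCost {
    std::size_t index_bytes;  // entries x bytes per index entry
    std::size_t blob_bytes;   // string data, including terminators
    std::size_t total_bytes;  // index + blob
};

// entries: number of (language x ID) cells.
// bytes_per_entry: pointer width (4 or 8) for a pointer table, 2 for uint16 offsets.
constexpr LayoutCost cost(std::size_t entries,
                          std::size_t bytes_per_entry,
                          std::size_t blob_bytes) {
    return {entries * bytes_per_entry, blob_bytes,
            entries * bytes_per_entry + blob_bytes};
}
```

For example, 1,000 entries cost 4,000 index bytes as 4-byte pointers and 2,000 as uint16 offsets, whatever the strings themselves contain; deduplication and tail-merging then reduce the blob term.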
What does it compute?
Index table: entries × 4 B (pointers) versus entries × 2 B (uint16 offsets). This halving is the structural win no linker can do for you.
Dedup: identical strings (shared across languages, like “OK”) are stored once.
Tail-merge: a string that is the suffix of another reuses the longer string’s bytes (placed longest-first), exactly what the build-time generator does.
uint16 guard: flags when the pool grows past 64 KiB, at which point offsets no longer fit in a uint16.
Retrieval: every string, stored or tail-merged, is read by pool + offset. A tail-merged string is never stored separately; its offset simply lands inside a longer entry, so lookup stays O(1) and identical for every string.
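The dedup, tail-merge and guard steps above could be implemented roughly as follows. This is a sketch under stated assumptions, not the tool's or the generator's actual code: Packed, ends_with() and pack() are hypothetical names, strings are assumed to be NUL-terminated with no embedded NULs, and the suffix search is a simple quadratic scan.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct Packed {
    std::string pool;                  // NUL-terminated strings, back to back
    std::vector<std::size_t> offsets;  // one per input entry, in input order
    bool fits_uint16 = true;           // false once the pool exceeds 64 KiB
};

inline bool ends_with(const std::string& s, const std::string& tail) {
    return s.size() >= tail.size() &&
           s.compare(s.size() - tail.size(), tail.size(), tail) == 0;
}

inline Packed pack(const std::vector<std::string>& entries) {
    // Dedup: keep each distinct string once.
    std::vector<std::string> uniq(entries);
    std::sort(uniq.begin(), uniq.end());
    uniq.erase(std::unique(uniq.begin(), uniq.end()), uniq.end());

    // Longest first, so any string that could host a suffix is already placed.
    std::stable_sort(uniq.begin(), uniq.end(),
                     [](const std::string& a, const std::string& b) {
                         return a.size() > b.size();
                     });

    Packed out;
    std::unordered_map<std::string, std::size_t> where;  // string -> pool offset
    std::vector<std::size_t> stored;  // indices into uniq of physically stored strings

    for (std::size_t i = 0; i < uniq.size(); ++i) {
        const std::string& s = uniq[i];
        bool merged = false;
        for (std::size_t j : stored) {
            if (ends_with(uniq[j], s)) {
                // Tail-merge: point partway into the longer entry.
                const std::size_t off =
                    where.at(uniq[j]) + (uniq[j].size() - s.size());
                where[s] = off;
                merged = true;
                break;
            }
        }
        if (!merged) {
            where[s] = out.pool.size();
            out.pool += s;
            out.pool.push_back('\0');
            stored.push_back(i);
        }
    }

    for (const std::string& e : entries) out.offsets.push_back(where.at(e));
    out.fits_uint16 = out.pool.size() <= 65536;  // uint16 guard
    return out;
}
```

Packing the six cells "OK", "off", "Bluetooth off", "OK", "apagado", "Bluetooth apagado" yields a 35-byte pool with offsets 32, 28, 18, 32, 10 and 0, compared with 50 bytes if every cell were stored separately with its own terminator.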
Is my data private?
All parsing, deduplication, tail-merging, and byte counting happen entirely in your browser. Nothing you type is transmitted to any server.
Frequently asked questions
Why do const char* pointer tables waste so much flash on an MCU?
Each cell in a 2-D (language × string ID) table holds a pointer, which is 4 bytes on a 32-bit MCU before a single character of text. Switching to a uint16 offset table halves that index overhead, a structural win the linker can't make for you.
How much can a packed, tail-merged string pool save versus a pointer table?
The savings come from three places: halving the index (4-byte pointers to 2-byte uint16 offsets), deduplicating identical strings shared across languages, and tail-merging strings that are suffixes of longer ones. Paste your real strings to see the exact byte total for both layouts.
What is tail-merging in a string pool?
Tail-merging stores a string that is the suffix of a longer one by pointing its offset partway into that longer entry, instead of storing it separately. Strings are placed longest-first so the shorter suffix reuses the same bytes.
When do uint16 offsets stop working for a string pool?
uint16 offsets can only address the first 64 KiB of the pool. The tool flags when your pool grows past that limit, at which point you would need wider offsets to reach later strings.
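If a pool does outgrow the limit, one option is to pick the offset width from the pool size at build time. The helper below is a hypothetical sketch of that decision, assuming pools stay well under 4 GiB.

```cpp
#include <cstddef>

// Smallest offset width, in bytes, that can address every byte of the pool.
// A 65,536-byte pool still fits: its last byte sits at offset 65,535.
constexpr std::size_t offset_width(std::size_t pool_bytes) {
    return pool_bytes <= (std::size_t{1} << 16) ? 2 : 4;
}

static_assert(offset_width(65536) == 2, "64 KiB pool fits uint16 offsets");
static_assert(offset_width(65537) == 4, "larger pools need wider offsets");
```

Moving to 4-byte offsets gives up the index halving on a 32-bit target, so the dedup and tail-merge savings are what remain.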
Does this tool send my strings to a server?
No. All parsing, deduplication, tail-merging, and byte counting happen entirely in your browser, and nothing you type is transmitted anywhere.