A live consumer typing platform with voice narration, 7-language localization across 9 regions, anycast traffic routing across US/Europe/Asia, real-time WebSocket analytics, and AI-generated personalized lessons. Built as a solo developer in under 8 months.
When people visit cosmickeys.app, they see a beautiful typing practice platform with animated finger guides. What they don't see is the production architecture underneath: a multi-region traffic cop routing requests across three continents, a voice narration pipeline that generates audio lessons in seven languages, a real-time WebSocket layer tracking every keystroke, and an AI personalization loop that adapts lessons to each learner's progress.
Anycast traffic cop routes each request to the geographically nearest region. Users in Tokyo hit Asia, users in Frankfurt hit EU, users in California hit US; all transparent, all automatic.
Next.js frontend + FastAPI backend deployed in each region. Shared services (auth, content, analytics) run on private backbone with cross-region replication.
PostgreSQL with a primary write node and regional read replicas. WebSocket gateway streams keystroke events into a time-series pipeline feeding the personalization loop.
This is the part most engineers wave off with "just put it behind Cloudflare." That works until you have stateful services, database write-consistency requirements, and region-specific localization. I designed a deliberate routing strategy that handles all three.
For a real-time typing platform, every 50ms of round-trip latency is user-visible. Anycast routing at the edge sends each user to the geographically nearest data center. A user in Tokyo hits Asia; a user in London hits Europe; a user in Seattle hits US-West. Zero configuration, automatic.
Health checks run continuously against each region's FastAPI health endpoint. When a region fails or degrades, the traffic cop withdraws it from the anycast pool and traffic automatically spills over to the next-closest healthy region. Users see a brief reconnect on WebSocket but the experience stays alive.
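The withdraw-and-spill-over behavior can be sketched as a proximity-ordered preference list per user location; the region names and orderings below are illustrative assumptions, not the production routing table:

```python
# Sketch: pick the nearest healthy region; unhealthy regions are simply
# skipped, which mirrors withdrawal from the anycast pool.
REGION_ORDER = {
    "tokyo":   ["asia", "us-west", "eu"],    # preference by proximity
    "london":  ["eu", "us-east", "asia"],
    "seattle": ["us-west", "us-east", "eu"],
}

def pick_region(user_location: str, health: dict[str, bool]) -> str:
    """Return the closest healthy region for a user, or raise if none."""
    for region in REGION_ORDER[user_location]:
        if health.get(region, False):
            return region
    raise RuntimeError("no healthy region available")
```

When a region's health flips to False, the next poll cycle routes its users to the next entry in their preference list, which is the spillover described above.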
Each region serves localized content: Japanese in Asia, German and French in EU, English variants in US. The traffic cop attaches a region header that the backend uses for content selection. The localization is cache-friendly: each region's CDN caches its own language variants, not a global cache invalidated by any single update.
Localization is usually treated as an afterthought, a post-launch "internationalization" project. I designed it as a first-class architectural concern from day one because retrofitting i18n into a typing platform (where literally every character on the screen has to localize) is a nightmare.
Latin, CJK (Chinese-Japanese-Korean), and Cyrillic handling: each requires distinct rendering, input method editor (IME) support, and keyboard layout logic.
QWERTY, QWERTZ, AZERTY, Dvorak, plus language-specific layouts for non-Latin scripts. Finger animation guides must match the user's actual keyboard.
Typing exercises, sample sentences, and progressive difficulty curves are all localized. Beginner lessons in Japanese use hiragana progressions, not the same lessons as English.
Every instructional prompt has a native-speaker audio variant. The user hears lesson guidance in their language, not English subtitles.
Words-per-minute vs characters-per-minute (critical for CJK where "word" is ambiguous). Dates, numbers, and comparisons formatted per locale.
Color palettes and imagery adapted where colors carry cultural meaning. Error states don't use red in cultures where red signals success.
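The WPM-versus-CPM point above is concrete enough to sketch. The 5-characters-per-word WPM convention is the standard one; the locale codes are assumptions:

```python
def speed_metric(chars_typed: int, seconds: float, locale: str) -> tuple[str, float]:
    """Return (unit, value): characters-per-minute for CJK locales,
    where "word" is ambiguous, else WPM at the conventional 5 chars/word."""
    CJK_LOCALES = {"ja", "zh", "ko"}   # assumed locale codes
    minutes = seconds / 60.0
    if locale in CJK_LOCALES:
        return ("cpm", chars_typed / minutes)
    return ("wpm", (chars_typed / 5.0) / minutes)
```

The same 300 characters in one minute read as 60 WPM for an English user but 300 CPM for a Japanese user; showing a Japanese learner a "WPM" figure would be meaningless.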
One of the features that gets zero credit from first-time visitors is the voice narration system. Every instructional prompt ("place your left index finger on F, your right index on J") has a professionally generated audio variant in every supported language.
Every keystroke a user makes flows into a WebSocket pipeline in real time. This isn't for vanity metrics; it's the data layer that powers adaptive difficulty, personalized curriculum, and the progress dashboards users see live during their sessions.
Character pressed, timing delta, accuracy flag. The foundation of everything else.
Rolling average over the last N keystrokes, streamed to the UI so users see live progress.
Which keys the user mistypes most often, used to adapt lesson difficulty in real time.
Finger-level accuracy heatmap feeding the visual feedback layer (glow intensity per finger).
Lesson completions, new speed records, and streak achievements, dispatched to the notification layer.
Aggregated metrics feed into the AI personalization pipeline for next-lesson selection.
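The rolling-average and error-pattern pieces above can be sketched together as one window over recent keystroke events; the class and method names are illustrative, not the production API:

```python
from collections import Counter, deque

class KeystrokeWindow:
    """Rolling window over the last N keystroke events: live WPM for
    the UI, per-key mistype counts for difficulty adaptation."""
    def __init__(self, n: int = 50):
        self.events = deque(maxlen=n)   # (char, timestamp_seconds, correct)
        self.errors = Counter()         # per-key mistype counts

    def add(self, char: str, t: float, correct: bool) -> None:
        self.events.append((char, t, correct))
        if not correct:
            self.errors[char] += 1

    def wpm(self) -> float:
        """WPM over the window, using the 5-chars-per-word convention."""
        if len(self.events) < 2:
            return 0.0
        span = self.events[-1][1] - self.events[0][1]
        return (len(self.events) / 5.0) / (span / 60.0) if span > 0 else 0.0

    def worst_keys(self, k: int = 3) -> list[str]:
        return [c for c, _ in self.errors.most_common(k)]
```

Streaming `wpm()` to the UI on every event is what makes the progress bar move mid-sentence instead of per-lesson.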
This is the feature that turns CosmicKeys from a typing game into a learning platform. After each lesson, the user's performance flows into an AI decision loop that adapts the next lesson to their weaknesses.
The LLM receives the user's recent performance, the curriculum pool, and the user's declared goals, and returns a ranked list of next-best lessons. The system picks the top choice and serves it: localized, narrated, with difficulty tuned to the user's current level.
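The shape of that loop, sketched around the model call: the prompt assembly and fallback rule below are assumptions about structure, not the actual production prompt or model API:

```python
import json

def build_ranking_prompt(performance: dict, pool: list[dict], goals: list[str]) -> str:
    """Assemble the context the LLM sees (illustrative shape only)."""
    return json.dumps({
        "recent_performance": performance,
        "candidate_lessons": pool,
        "user_goals": goals,
        "task": "return lesson ids ranked best-first",
    })

def pick_next_lesson(ranked_ids: list[str], pool: list[dict]) -> dict:
    """Serve the top-ranked lesson that actually exists in the pool;
    if the model returns only unknown ids, fall back to curriculum order."""
    by_id = {lesson["id"]: lesson for lesson in pool}
    for lesson_id in ranked_ids:
        if lesson_id in by_id:
            return by_id[lesson_id]
    return pool[0]
```

Validating the model's ranking against the real pool matters: an LLM can hallucinate lesson ids, and the fallback keeps the loop deterministic when it does.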
What happens when a region goes down? The short answer: users don't notice. The long answer requires careful design at every layer.
Each region exposes a /health endpoint. The traffic cop polls every few seconds and pulls failing regions out of the anycast pool within seconds.
When a region fails, users in that region are routed to the next-closest healthy region. Keystroke streams reconnect over WebSocket within 1-2 seconds.
Writes go to a primary database region; reads can be served from any replica. During failover, writes are delayed briefly rather than split-brained.
The client keeps a local cache of the current lesson so that a brief connection loss does not interrupt the user's typing session.
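The primary-write / replica-read split with delayed writes during failover, sketched as a routing rule (an illustrative sketch, not the production driver; the write-detection heuristic here is deliberately naive):

```python
class RegionalDB:
    """Route reads to the local replica, writes to the primary, and
    queue writes during failover rather than split-braining."""
    def __init__(self, primary: str, local_replica: str):
        self.primary = primary
        self.local_replica = local_replica
        self.primary_up = True
        self.pending_writes: list[str] = []

    def route(self, sql: str) -> str:
        is_write = sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE"))
        if not is_write:
            return self.local_replica      # reads stay regional and fast
        if not self.primary_up:
            self.pending_writes.append(sql)  # delay, don't split-brain
            return "queued"
        return self.primary
```

Queuing writes trades a brief delay for consistency: two regions accepting writes for the same row during a partition would be far more expensive to reconcile than a few seconds of buffering.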
The uncomfortable truth about "fully global" consumer products: most startups claim to be global on day one but ship single-region, English-only software with a translation dropdown bolted on three months later. Truly global (voice-narrated per locale, latency-optimized per region, keyboard-layout-aware per market) is a different engineering problem.
What this demonstrates: I can take a consumer product from zero to multi-region production with localization as a first-class concern, not an afterthought. That's the difference between "technically international" and "actually usable by a user in Tokyo who speaks no English."
How this would scale inside a company: every architectural component here (the traffic cop, the localization pipeline, the voice generation, the WebSocket analytics, the AI personalization) is a reusable pattern that could be pulled into a platform team to serve many product teams. Multi-region consumer infrastructure is expensive to build once and cheap to replicate. I've done it solo as a proof; the same pattern scales to teams.