How my voice assistant understands me so well — a plain-language explainer

How My Voice Assistant Understands Me So Well

I talk to my computer all day — in a fast mix of Vietnamese and English, full of odd technical names — and it almost always gets it right. People ask how. Here's the whole idea in plain language.

It isn't one piece of magic. It's three simple steps, each one fixing the weakness of the step before it.


Step 1 — A super-fast listener

My voice first goes to a speech-to-text engine (a service called Soniox). Think of it as a lightning-fast stenographer: it writes down what I say almost the instant I say it.

It's excellent with everyday words. But like any stenographer hearing an unfamiliar name for the first time, it sometimes guesses wrong on unusual ones — a tool called "Claude" comes out as "cloud," or "tmux" becomes "tea mux."

So Step 1 gives me speed, but a rough draft.

Step 2 — A smart editor

That rough draft is then handed to a small, cheap AI model that works like an editor. It doesn't just copy the text — it reads the whole sentence, figures out what I meant from the context, and fixes the slips. "Cross code, fix the bug" becomes "Claude Code, fix the bug." It even smooths my mixed Vietnamese-and-English into one clean instruction.

The clever part: because this model only has to tidy up text (not be a genius), a cheap, fast one is more than enough — and "cheap" means it can run on every single sentence without me ever worrying about the bill.

So Step 2 gives me understanding.
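
As a rough illustration of the editor step, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the function names, the prompt wording, and the `call_model` parameter (any function that takes a prompt string and returns the model's reply) are hypothetical, not the actual setup described here. No particular model provider's API is shown.

```python
from typing import Callable, Sequence


def build_editor_prompt(draft: str, known_terms: Sequence[str] = ()) -> str:
    """Wrap the rough transcript in instructions for the editor model.

    The wording is illustrative only; a real prompt would be tuned by hand.
    """
    lines = [
        "You are an editor for speech-to-text output.",
        "Fix misheard words using the context of the whole sentence.",
        "The speaker mixes Vietnamese and English; reply with one clean instruction.",
    ]
    if known_terms:
        # Optional hint list, e.g. unusual tool names the speaker uses a lot.
        lines.append("Terms the speaker often uses: " + ", ".join(known_terms))
    lines.append("Transcript: " + draft)
    return "\n".join(lines)


def edit_transcript(
    draft: str,
    call_model: Callable[[str], str],  # hypothetical: prompt in, reply text out
    known_terms: Sequence[str] = (),
) -> str:
    """Send the rough draft to the editor model and return the cleaned text."""
    prompt = build_editor_prompt(draft, known_terms)
    return call_model(prompt).strip()
```

Passing the model in as a plain function keeps the sketch independent of any one service, and makes it easy to swap in a cheaper or faster model later.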

Step 3 — A personal assistant that learns overnight (the secret sauce)

This is the part that makes it feel like it truly knows me.

Every night, a little helper quietly reviews the whole day's conversations, spots the words it kept getting wrong, and updates a personal cheat-sheet — my names, my projects, my slang, the particular way I pronounce things. The next morning, the system is a touch smarter about exactly how I talk. Over weeks, it has gradually learned my personal vocabulary, all on its own.

Most voice tools are one-size-fits-all. Mine is tailored to me, and it keeps getting better.
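
One way the overnight review could work is sketched below in Python. It assumes the day's log can be reduced to a list of (heard, meant) pairs, for example by comparing the rough draft with the corrected text; that log format, the function name `update_cheat_sheet`, and the `min_count` threshold are all hypothetical choices for the example, not a description of the real helper.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def update_cheat_sheet(
    cheat_sheet: Dict[str, str],
    corrections: Iterable[Tuple[str, str]],
    min_count: int = 2,
) -> Dict[str, str]:
    """Return a new cheat-sheet with the day's repeated mistakes added.

    corrections: (heard, meant) pairs collected during the day (assumed format).
    min_count:   how many times a slip must recur before it is recorded,
                 so that a one-off mishearing does not pollute the sheet.
    """
    # Count each distinct slip, ignoring pairs where nothing actually changed.
    counts = Counter(
        (heard.lower(), meant)
        for heard, meant in corrections
        if heard.lower() != meant.lower()
    )
    updated = dict(cheat_sheet)  # leave the caller's dictionary untouched
    # most_common() goes from most to least frequent, so if one misheard
    # phrase maps to several fixes, the most frequent fix is recorded.
    for (heard, meant), n in counts.most_common():
        if n >= min_count and heard not in updated:
            updated[heard] = meant
    return updated
```

Entries already on the sheet are left alone in this sketch; a real version would probably also need a way to retire entries that turn out to be wrong.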


Why it works so well

The trick isn't a single brilliant component — it's the teamwork:

  • a fast listener for speed,
  • a cheap editor for understanding, and
  • an overnight learner that personalizes everything to me.

Each layer is simple and inexpensive on its own. Stacked together, they add up to a voice assistant that understands my messy, bilingual, jargon-heavy speech — and gets a little better every single day.
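
The way the three layers stack can be sketched in a few lines of Python. The names `make_pipeline`, `transcribe`, and `edit` are hypothetical, and the sketch assumes the cheat-sheet is handed to the editor step; the original setup may feed it in somewhere else.

```python
from typing import Callable, Dict

CheatSheet = Dict[str, str]  # misheard phrase -> intended phrase (assumed shape)


def make_pipeline(
    transcribe: Callable[[bytes], str],      # Step 1: audio in, rough draft out
    edit: Callable[[str, CheatSheet], str],  # Step 2: draft + cheat-sheet in, clean text out
    cheat_sheet: CheatSheet,                 # Step 3: refreshed by the nightly job
) -> Callable[[bytes], str]:
    """Compose the three layers into one function from audio to clean text."""

    def run(audio: bytes) -> str:
        draft = transcribe(audio)        # fast, but may mishear unusual names
        return edit(draft, cheat_sheet)  # context-aware cleanup, personalized

    return run
```

Because each layer is just a function, any one of them can be replaced (a different speech engine, a different editor model) without touching the other two.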
