How Biology Solved Efficiency First (And Why Systems Keep Catching Up)

The Pattern

The most efficient systems we build almost inevitably mirror biology. That is no coincidence: biology already solved the hardest problems under the harshest constraints.

Postal systems led to networking because both solve the same fundamental question:
➡️ How do I reliably get this payload to that recipient through an unreliable, noisy medium?
Humans solved it first: addresses, routes, handoffs, retries. Networks followed the same model because it works.
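
As a toy illustration of that shared pattern, here is a minimal Python sketch of address-and-retry delivery over a lossy channel. The unreliable_send function and its 30% loss rate are invented stand-ins, not any real networking API.

```python
import random

def unreliable_send(payload: str, address: str) -> bool:
    """Toy channel: carries the payload but 'loses' it about 30% of the time."""
    return random.random() > 0.3  # True means the recipient acknowledged receipt

def deliver(payload: str, address: str, max_retries: int = 5) -> bool:
    """Address, hand off, and retry until acknowledged -- the postal pattern networks reuse."""
    for attempt in range(1, max_retries + 1):
        if unreliable_send(payload, address):
            print(f"delivered to {address} on attempt {attempt}")
            return True
    print(f"gave up on {address} after {max_retries} attempts")
    return False

deliver("hello", "10.0.0.7")
```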

Human conversation is leading us toward direct-to-LM architectures because both solve another universal problem:
➡️ How do I fluidly share intent with another person, in real time, even as they’re still speaking?
Conversation is predictive, overlapping, and feedback-rich, a design born of survival needs. Modern audio-first systems mimic this loop not because it’s novel, but because it is the most direct path to natural, efficient communication.


Why This Pattern Repeats

Biology is efficiency under evolutionary pressure. It optimizes for:

  • ⚡ Speed
  • 🔋 Energy conservation where possible, energy investment where necessary
  • 🔧 Fault tolerance
  • 🧠 Multi-task coordination

These are not features; they are necessities. Any system built to survive complexity and scale will eventually converge toward biological patterns.


Core Insights from Brain Metabolism

Communication Costs Dominate

Levy & Calvert (2021) found:

  • Cortex computation ≈ 0.1 W
  • Long-distance cortical communication ≈ 3.5 W

Thus, neural communication consumes ~35x more energy than computation, a sign that biology prioritizes information flow over raw computation.
https://doi.org/10.1073/pnas.2008173118


Why Systems Need to Catch Up

1️⃣ Biology Solved for Reality, Not Theory

Biology’s designs work under the constraints of the real world: noisy environments, unreliable media, and incomplete information. Technology often begins by optimizing for clean lab conditions: CPU cycles, latency benchmarks, throughput. Those metrics are disconnected from how real-time systems need to behave in practice.

2️⃣ Engineering Optimizes the Wrong Thing First

Traditional architectures prioritize:

  • Batch processing
  • Waterfall models
  • Serialization over streaming
  • Accuracy over responsiveness

These choices look efficient on paper but break down in live, interactive, fault-tolerant environments.
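
To make the contrast concrete, here is a rough miniature of the staged approach: every stage blocks until the previous one has completely finished, so each stage's delay adds straight onto end-to-end latency. The stage names and timings below are illustrative placeholders, not a real ASR stack.

```python
import time

def vad_wait_for_silence(audio: str) -> str:
    time.sleep(0.5)   # waits for the speaker to stop before anything else runs
    return audio

def asr_transcribe(audio: str) -> str:
    time.sleep(0.8)   # full-utterance transcription; no partial results
    return f"transcript of [{audio}]"

def nlu_parse(text: str) -> dict:
    time.sleep(0.2)   # re-parses plain text; prosody is already gone
    return {"intent": "demo", "text": text}

def respond(parsed: dict) -> str:
    time.sleep(0.3)   # response planning cannot start until everything above is done
    return f"response to {parsed['intent']}"

start = time.time()
reply = respond(nlu_parse(asr_transcribe(vad_wait_for_silence("user audio"))))
print(reply, f"(total latency ≈ {time.time() - start:.1f}s)")  # every stage's delay accumulates
```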

3️⃣ Communication Is the Bottleneck, Not Computation

Biology spends its energy on keeping communication fast, redundant, and alive, not on minimizing computation cycles. Systems trying to emulate human-like responsiveness, such as voice AI or robotics, eventually rediscover the same rule.

4️⃣ Systems Catch Up Because Biology Is the Proof

When technology meets real world complexity, evolution’s answer outperforms every time:

  • Streaming is better than batch
  • Parallel is better than serial
  • Overlap is better than strict handoffs
  • Communication first is better than computation first
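
A matching sketch of the streaming, overlapping alternative to the staged miniature above, again with made-up chunk sizes and timings: downstream stages begin work on partial input, so useful output starts appearing before the utterance has even finished.

```python
import time

def audio_chunks():
    """Stand-in for a live microphone: yields audio in small pieces."""
    for chunk in ["how", "is the", "weather", "today"]:
        time.sleep(0.1)
        yield chunk

def streaming_understand(chunks):
    """Consumes audio as it arrives and yields a partial hypothesis immediately."""
    heard = []
    for chunk in chunks:
        heard.append(chunk)
        yield " ".join(heard)          # no waiting for silence

def streaming_respond(hypotheses):
    """Starts drafting a reply from partial hypotheses, refining as more arrives."""
    for hypothesis in hypotheses:
        yield f"(draft reply based on: '{hypothesis}')"

start = time.time()
for draft in streaming_respond(streaming_understand(audio_chunks())):
    print(f"{time.time() - start:.1f}s  {draft}")  # output overlaps with input
```

The point is not these particular generators but the shape of the flow: nothing waits for a stage to finish, and every stage hands partial results downstream the moment they exist.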

The Financial Implications

1️⃣ Infrastructure Costs

Systems that mimic biology’s efficiency, built around streaming, low-latency, parallel pipelines, require fewer discrete services:

  • No separate ASR, VAD, NLU, or barge-in layers.
  • Fewer GPUs, fewer service calls, less latency-driven over-provisioning.

Direct-to-LM pipelines consolidate functions, reducing cloud bills, lowering latency, and simplifying architecture.
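
One way to picture that consolidation, as a rough sketch rather than a reference design: a single shared embedding from one (hypothetical) streaming model feeds several lightweight heads, such as intent, sentiment, and speaker change, instead of each function running as its own service. The embedding and head weights below are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared representation from the (hypothetical) streaming audio model...
shared_embedding = rng.normal(size=256)

# ...feeds several lightweight heads instead of separate ASR/VAD/NLU services.
heads = {
    "intent":         rng.normal(size=(4, 256)),   # 4 intent classes
    "sentiment":      rng.normal(size=(3, 256)),   # negative / neutral / positive
    "speaker_change": rng.normal(size=(2, 256)),   # same speaker / new speaker
}

for name, weights in heads.items():
    logits = weights @ shared_embedding
    print(name, "->", int(np.argmax(logits)))      # each extra function is one matrix multiply, not one service
```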

2️⃣ Opportunity Costs

Latency is expensive. Delays:

  • Reduce conversion rates
  • Break trust in voice assistants
  • Undermine user confidence in real time systems

Faster, more fluid systems capture more interactions, more sales, and higher user retention.

3️⃣ Ongoing Maintenance Costs

Brittle, staged pipelines require:

  • More tuning
  • More cross-team coordination
  • More handling of edge cases

Biology’s model of continuous flow, adaptive interruption, and redundancy leads to simpler systems with fewer failure points, reducing operational burden.

4️⃣ Long Term Strategic Risk

Systems that fail to catch up, such as those sticking to waterfall ASR pipelines, will:

  • Be outperformed on cost
  • Be slower to adapt to new modalities, such as AR or robotics
  • Lose competitive edge in user experience

The financial penalty for ignoring biology’s lessons compounds over time.


How Companies Should Think About Metabolism

What Biology Teaches Us

Biology does not minimize energy use everywhere. It spends energy where it creates leverage.

  • Communication between regions of the brain gets priority because speed and coordination create compounding returns.
  • Energy is conserved where it can be, but invested heavily where responsiveness matters.
  • The brain runs lean in some areas, proactive in others, and adaptive throughout.

Companies should treat their technical systems the same way: technical systems are the metabolism of the business.


Technical Systems Capital - The Metabolism That Drives Everything Else

Your architecture is your metabolism. If you waste energy in your technical systems - brittle integrations, slow pipelines, redundant services - the entire organization slows down or works harder to compensate.

Where to Metabolize Energy:

  • Stop fragmenting systems into brittle, serialized services when overlap and streaming would reduce waste.
  • Spend compute and design resources on architectures that preserve context, accelerate decision-making, and enable graceful interruption.
  • Optimize for responsiveness, adaptability, and flow rather than arbitrary modularity.
  • Build systems that metabolize complexity naturally, without requiring human intervention to carry the load.
  • Design infrastructure to keep work flowing, not waiting.

The Core Principle

Technical metabolism determines organizational metabolism. Biology spends energy where it maintains connection, speed, and adaptability. Companies should treat technical capital the same way:

  • Spend where it accelerates outcomes.
  • Conserve where slowing down does no harm.
  • Organize systems to respond dynamically, not through brittle handoffs.

Otherwise, you risk burning time, resources, and opportunity trying to solve the wrong problems.


Direct-to-LM vs. Traditional ASR

| Human Conversation Loop | Direct-to-LM Architectures (Biomimetic) | Traditional ASR Pipeline | Where It Breaks |
| --- | --- | --- | --- |
| Ear → Cochlea → Auditory Nerve: continuous acoustic signal becomes rich neural features carrying pitch, rhythm, stress | Raw audio → embeddings; preserves spectrum and prosody in one pass | VAD → MFCCs → acoustic model → decoder → text | Slices and flattens audio; discards prosody, timing, emotion |
| Superior Temporal Gyrus: early pattern matching, phoneme clustering, speaker-change detection | Built-in VAD and speaker cues in the encoder; detects speakers in the same pass | Separate VAD service | Lag, false positives, brittle heuristics |
| Broca’s Area & Wernicke’s Area: comprehension and response planning overlap with ongoing speech | Streaming LM decoding; predicts tokens mid-utterance | ASR waits for silence | Serial ASR → NLU → DM stages |
| Prefrontal Cortex “task modules”: shared neural representation feeds sentiment, intent, decisions, memory | Classifier heads on embeddings; multi-purpose, efficient | NLU re-parses plain text | Prosody lost, heuristics required |
| Motor Cortex → Vocal Apparatus: speech stops instantly on interruption, responsive to auditory input | TTS conditioned on the live LM; pauses output on fresh input | ASR oblivious to interruption | Requires brittle barge-in plumbing |
| ⬜ No biological equivalent | No artificial text layer; continuous audio | ASR inserts a brittle text layer | Breaks continuity |
| Energy efficiency (CNS): one integrated system, no redundant processing | One model, one GPU; efficient, streaming | Multiple services | VAD, ASR, NLU, DM, barge-in add compute and latency |
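
To make the Motor Cortex row concrete, here is a minimal asyncio sketch of the barge-in behavior a direct-to-LM loop aims for: speaking and listening run concurrently, and fresh input immediately halts output. The reply chunks and timings are placeholders, not any particular TTS or audio API.

```python
import asyncio

async def speak(reply_chunks, interrupted: asyncio.Event):
    """Stream the reply, but stop instantly if the listener flags new input."""
    for chunk in reply_chunks:
        if interrupted.is_set():
            print("[tts] pausing output, user is speaking")
            return
        print(f"[tts] {chunk}")
        await asyncio.sleep(0.2)

async def listen(interrupted: asyncio.Event):
    """Stand-in for the audio frontend: the user barges in mid-reply."""
    await asyncio.sleep(0.5)
    interrupted.set()
    print("[mic] new speech detected")

async def main():
    interrupted = asyncio.Event()
    await asyncio.gather(
        speak(["The weather", "today is", "mostly sunny", "with a high of..."], interrupted),
        listen(interrupted),
    )

asyncio.run(main())
```

Because speaking and listening share one loop, interruption is a single flag check rather than a separate barge-in service wired between stages.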

Final Thought

Direct-to-LM architectures are efficient not because they are clever but because they are biologically aligned. The brain treats speech as a continuous, overlapping, interruptible signal and deploys energy proactively.

Direct-to-LM mirrors this.

Traditional ASR pipelines retrofit brittle text workflows onto dynamic acoustic processes and break under pressure.

Biology shows the shortest path. Technology must catch up.


References

Levy, W. B., & Calvert, V. G. (2021). Communication consumes 35 times more energy than computation in the human cortex, but both costs are needed to predict synapse number. PNAS, 118(18):e2008173118. https://doi.org/10.1073/pnas.2008173118
