Blog
Welcome to the Blog.
Assumption-Led Security Reviews
2026-02-22
Many security reviews fail before they begin because they are framed as checklist compliance rather than assumption testing. Checklists are useful for coverage. Assumptions are where real risk hides.
Every system has assumptions:
When assumptions are wrong, controls built on top of them become decorative.
An assumption-led review starts by collecting claims from architecture, docs, and team memory, then converting each claim into a testable statement. Not “is auth secure?” but “can an untrusted caller obtain action X through path Y under condition Z?”
This shift changes review quality immediately.
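One way to make the claim-to-test conversion concrete is to record each assumption as structured data and generate the question from it. This is a minimal sketch, not a review tool: the `Assumption` class, its field names, and the example values are all hypothetical and exist only to illustrate the "action X through path Y under condition Z" template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumption:
    """A claim collected from docs or team memory (hypothetical structure)."""
    claim: str      # the original, untested statement
    caller: str     # who is attempting the action
    action: str     # action X
    path: str       # path Y
    condition: str  # condition Z

    def as_test_question(self) -> str:
        # Render the claim as a question a reviewer can try to answer "yes" to.
        return (f"Can {self.caller} obtain {self.action} "
                f"through {self.path} under {self.condition}?")


# Illustrative values only; none of these come from a real system.
example = Assumption(
    claim="Only admins can export reports",
    caller="an untrusted caller",
    action="report export",
    path="the internal batch endpoint",
    condition="an expired session token",
)
print(example.as_test_question())
```

Keeping the original claim next to the generated question makes it easy to see which statements have been tested and which are still folklore.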
Prototyping with Failure Budgets
2026-02-22
Most prototype plans assume success too early. Schedules are built around happy-path bring-up, and risk is represented as a vague buffer at the end. In practice, hardware projects move faster when failure is budgeted explicitly from the beginning.
A failure budget is not pessimism. It is resource planning for uncertainty:
Without these budgets, teams call normal engineering iteration “delay.”
The first step is failure classification. Not all failures are equal:
Each class needs a different mitigation strategy, so one generic “debug week” is rarely effective.
Timer Capture Without an RTOS
2026-02-22
One of the most useful embedded skills is measuring external timing accurately without hiding behind a heavy runtime stack. You do not need an RTOS to capture pulse widths, frequency drift, or event latency with high reliability. You need a clear timing model, disciplined interrupt design, and careful data handoff.
Timer input-capture peripherals are built for this job. They latch counter values on configured edges and let firmware process deltas later. The hardware does the precise timestamping; software handles interpretation.
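The "process deltas later" step has one classic trap: the counter wraps. A minimal host-side sketch of wrap-safe delta arithmetic follows. It assumes a free-running 16-bit up-counter (the width and the function names are my assumptions, not taken from any specific MCU), and it is only valid when the interval between two captures is shorter than one full counter period.

```python
COUNTER_BITS = 16                      # assumed counter width
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def capture_delta(previous: int, current: int) -> int:
    """Ticks elapsed between two captures, tolerating a single counter wrap."""
    # Masking the difference emulates unsigned modular subtraction, which is
    # what fixed-width unsigned arithmetic gives you for free in firmware.
    return (current - previous) & COUNTER_MASK


def deltas(captures):
    """Tick counts between consecutive latched capture values."""
    return [capture_delta(a, b) for a, b in zip(captures, captures[1:])]


print(capture_delta(65530, 4))           # wrap between captures -> 10
print(deltas([100, 300, 65500, 20]))     # -> [200, 65200, 56]
```

If an interval can exceed one counter period, the masked difference is silently wrong, so longer measurements need an overflow count tracked alongside the capture value.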
A robust architecture starts with three decisions:
If these are vague, accuracy claims will be vague too.
Choose timer frequency from measurement goals, not convenience. Too slow and quantization error dominates. Too fast and overflow complexity increases, especially on narrow counters. A good target is a tick period clearly below your required resolution, with margin left for jitter analysis.
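The trade-off above is easy to check with arithmetic before touching hardware. The sketch below computes tick period, time to overflow, and how many ticks fit inside the required resolution. The function name and the example numbers (1 MHz timer clock, 16-bit counter, 10 µs required resolution) are assumptions chosen for illustration.

```python
def timer_budget(timer_hz: float, counter_bits: int, required_resolution_s: float):
    """Return (tick period in s, time to overflow in s, ticks per required resolution)."""
    tick_s = 1.0 / timer_hz
    overflow_s = (1 << counter_bits) * tick_s
    ticks_per_resolution = required_resolution_s / tick_s
    return tick_s, overflow_s, ticks_per_resolution


tick_s, overflow_s, margin = timer_budget(1_000_000, 16, 10e-6)
print(f"tick = {tick_s * 1e6:.3f} us")
print(f"overflow every {overflow_s * 1e3:.3f} ms")
print(f"ticks per required resolution = {margin:.1f}")
```

With these assumed numbers, one tick is 1 µs, the counter wraps roughly every 65.5 ms, and there are about ten ticks inside the required resolution. Raising the clock improves the last figure and shortens the overflow period in the same proportion.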
State Machines That Survive Noise
2026-02-22
A lot of embedded bugs are not algorithm failures. They are state-management failures under imperfect signals. Inputs bounce, clocks drift, interrupts cluster, and peripherals report transitional nonsense. Firmware that assumes clean edges and ideal timing eventually fails in the field where noise is normal.
Robust systems treat noise as a design input, not a test surprise.
State machines are sometimes dismissed as “old-school” in modern embedded stacks. That is a mistake. They remain one of the best tools for making behavior explicit under uncertainty:
Most importantly, state machines force you to name ambiguous phases that ad-hoc boolean logic usually hides.
A resilient architecture separates interrupt capture from policy:
SPI Signals That Lie
2026-02-22
SPI looks simple on paper: clock, data out, data in, chip select. Four wires, deterministic timing, done. In real projects, SPI failures often appear as “sometimes wrong bytes,” “first transfer fails,” or “only breaks on production boards.” These are the kinds of bugs that waste days because the bus seems healthy at first glance.
The core lesson is that SPI integrity is not just protocol correctness. It is electrical timing, firmware sequencing, and peripheral state management combined.
Common failure classes:
Any one can produce plausible-but-wrong data.
I start with protocol truth. Confirm CPOL/CPHA mode from datasheets, then verify with logic analyzer captures of command/response boundaries. Do not rely on “it worked with another sensor.” Different devices tolerate different mistakes.
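When reading a capture, it helps to have the CPOL/CPHA meaning written down rather than remembered. The sketch below encodes the widely used convention (mode number = CPOL·2 + CPHA; CPHA = 0 samples on the leading clock edge, CPHA = 1 on the trailing edge). The function name is hypothetical, and some vendors label modes differently, so treat the datasheet timing diagram as the authority.

```python
def spi_mode(cpol: int, cpha: int):
    """Return (mode number, clock idle level, edge on which data is sampled)."""
    mode = (cpol << 1) | cpha
    idle = "high" if cpol else "low"
    # The leading edge is the first transition away from the idle level.
    leading = "falling" if cpol else "rising"
    trailing = "rising" if cpol else "falling"
    sample_edge = leading if cpha == 0 else trailing
    return mode, idle, sample_edge


for cpol in (0, 1):
    for cpha in (0, 1):
        mode, idle, edge = spi_mode(cpol, cpha)
        print(f"mode {mode}: CPOL={cpol} CPHA={cpha}, idle {idle}, sample on {edge} edge")
```

Modes 0 and 3 both sample on the rising edge and differ only in idle level. That is one reason a wrong mode can appear to work with a tolerant device and fail with another.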
Debouncing with Time and State
2026-02-22
Button debouncing is one of the smallest problems in embedded systems and one of the most frequently mishandled. That combination makes it a perfect teaching case. Engineers know contacts bounce, yet many designs still rely on ad-hoc delays or lucky timing. These solutions pass demos and fail in real operation. A robust approach treats debouncing as a tiny state machine with explicit time policy.
Mechanical bounce is not mysterious. On transition, contacts physically oscillate before settling. During that interval, GPIO sampling can see multiple edges. If firmware interprets every edge as intent, one press becomes many events. The correct objective is not “filter noise” in the abstract; it is to infer a human action from unstable electrical evidence with defined latency and false-trigger bounds.
The naive pattern is edge interrupt plus delay_ms(20) inside the handler. This feels simple but causes collateral damage: blocked interrupt handling, jitter in unrelated tasks, and poor power behavior. Worse, fixed delays are often too long for responsive UIs and still too short for worst-case switches. Delays treat symptoms while creating scheduling side effects.
A better pattern separates observation from decision. Observation samples pin state periodically or on edge notifications. Decision logic advances through states: Idle, CandidatePress, Pressed, CandidateRelease. Each transition is gated by elapsed stable time. This design is cheap, deterministic, and testable. It also composes naturally with long-press and double-click features.
Sampling frequency matters less than many assume. You do not need MHz polling for human input. A 1 ms tick is usually enough, and even 2–5 ms can be acceptable with careful thresholds. What matters is consistent sampling and explicit stability windows. If a signal remains stable for N ticks, commit the state transition. If it flips early, reset candidate state.
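The four states and the "stable for N ticks" rule can be modelled on a host machine before they are ported to firmware. This is a minimal sketch under stated assumptions: `tick()` is called once per sample period (for example every 1 ms), `stable_ticks` is N, and the class and method names are mine rather than from any library.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CANDIDATE_PRESS = auto()
    PRESSED = auto()
    CANDIDATE_RELEASE = auto()


class Debouncer:
    def __init__(self, stable_ticks: int = 5):
        if stable_ticks < 2:
            raise ValueError("stable_ticks must be at least 2 in this sketch")
        self.stable_ticks = stable_ticks
        self.state = State.IDLE
        self.count = 0  # consecutive samples that agree with the candidate level

    def tick(self, raw_pressed: bool):
        """Feed one sample; return 'press', 'release', or None."""
        if self.state is State.IDLE:
            if raw_pressed:
                self.state, self.count = State.CANDIDATE_PRESS, 1
        elif self.state is State.CANDIDATE_PRESS:
            if not raw_pressed:
                self.state = State.IDLE          # flipped early: reset candidate
            else:
                self.count += 1
                if self.count >= self.stable_ticks:
                    self.state = State.PRESSED   # stable for N ticks: commit
                    return "press"
        elif self.state is State.PRESSED:
            if not raw_pressed:
                self.state, self.count = State.CANDIDATE_RELEASE, 1
        else:  # State.CANDIDATE_RELEASE
            if raw_pressed:
                self.state = State.PRESSED       # flipped early: reset candidate
            else:
                self.count += 1
                if self.count >= self.stable_ticks:
                    self.state = State.IDLE
                    return "release"
        return None


# A bouncy press followed by a bouncy release (1 = contact closed).
samples = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
debouncer = Debouncer(stable_ticks=5)
events = [e for e in (debouncer.tick(bool(s)) for s in samples) if e]
print(events)  # -> ['press', 'release']
```

Because the logic takes samples as plain inputs, recorded bounce traces from real switches can be replayed through it as regression tests.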
Ground Is a Design Interface
2026-02-22
Many circuit failures are not caused by “bad signals.” They are caused by bad assumptions about ground. Designers often treat ground as a neutral reference that exists automatically once a symbol is placed. In reality, ground is a physical network with resistance, inductance, and shared current paths. If we ignore that, measurements lie, interfaces become unstable, and debugging turns into superstition.
The mental shift is simple but profound: ground is not the absence of design. Ground is part of the design interface. Every subsystem communicates through it, injects noise into it, and depends on its stability. Once you frame ground this way, layout and topology decisions stop feeling cosmetic and start feeling architectural.
A common early mistake is routing sensitive analog return currents through the same narrow paths used by switching loads. The board may pass basic tests, then fail under realistic activity when motor drivers, DC-DC converters, or digital bursts modulate the local reference. The symptom appears as random ADC jitter or intermittent threshold misfires. The root cause is shared impedance, not firmware.
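A quick Ohm's-law estimate shows why shared impedance reaches the ADC. The numbers below are assumptions picked for illustration (20 mΩ of shared return path, a 0.5 A load step, a 12-bit ADC with a 3.3 V reference), and the calculation covers only the resistive drop. Inductance makes fast current edges worse.

```python
def ground_shift_in_lsb(shared_ohms: float, load_step_amps: float,
                        vref_volts: float, adc_bits: int):
    """Return (ground shift in volts, the same shift expressed in ADC LSBs)."""
    shift_v = shared_ohms * load_step_amps   # V = I * R across the shared path
    lsb_v = vref_volts / (1 << adc_bits)     # size of one ADC code step
    return shift_v, shift_v / lsb_v


shift_v, shift_lsb = ground_shift_in_lsb(0.020, 0.5, 3.3, 12)
print(f"ground shift = {shift_v * 1e3:.1f} mV, about {shift_lsb:.1f} LSB")
```

With these assumed values, the reference moves by about 10 mV, or roughly 12 codes on the ADC, each time the load switches. That is enough to look like random jitter in firmware logs.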
Star-ground strategies can help in some low-frequency or mixed-signal contexts, but they are often misapplied as a universal rule. Solid planes usually win for modern digital work because they minimize return path impedance and give high-frequency currents predictable local loops under signal traces. The key is intentional current-path thinking, not slogan-driven layout.
Measurement technique also determines whether you see truth or artifacts. Using long oscilloscope ground clips on fast edges can invent ringing that is mostly probe loop inductance. Engineers then “fix” a problem that exists in the measurement setup. Short ground springs, proper probe compensation, and awareness of reference path are not optional details; they are prerequisites for trustworthy diagnosis.
Debugging Noisy Power Rails
2026-02-22
Noisy power rails cause some of the most frustrating hardware bugs because the symptoms look random while the root cause is often deterministic. A board that “usually works” at room temperature can fail after five minutes under load, pass again after reboot, and mislead you into chasing firmware ghosts for days.
A useful mindset shift is this: unstable power is not a side issue. It is a primary signal path. If voltage integrity is poor, every digital subsystem becomes statistically unreliable, and software symptoms are just the final expression.
My default workflow starts with measurement hygiene before diagnosis:
Bad probing creates fake ripple. Good probing reveals real coupling.
First pass checks are simple:
Why Constraints Matter
2026-02-10
Give a programmer unlimited resources and they’ll build a mess. Give them 640 KB and they’ll build something elegant.
Constraints force creativity. The demoscene proved that artistic expression thrives under extreme limitations. The same principle applies to web design: this site uses no JavaScript, and the CSS-only approach has led to solutions I would never have considered otherwise.
I have seen this pattern in codebases, hardware, writing, and product work: when limits are explicit, quality decisions become visible. You stop saying “we can optimize later” and start choosing what must be fast, simple, and stable right now. Constraints are not a prison. They are a filter.
Not all limits are equal. Bad constraints are random bureaucracy. Good constraints are deliberate boundaries with a clear purpose:
A tight budget often produces better architecture because you are forced to separate “core value” from “nice decoration.” In practice, this means fewer layers, stronger naming, and less accidental complexity.
Restoring an AT 286
2026-02-01
I found a Commodore PC 30-III (286 @ 12 MHz) at a flea market. The power supply was dead, the CMOS battery had leaked, and the hard drive made sounds like a coffee grinder.
After recapping the PSU, neutralizing the alkaline battery leakage with vinegar, and replacing the MFM drive with an XTIDE + CF card adapter, the machine booted into DOS 3.31. The CGA output on a period-correct monitor is a shade of green that no modern display can reproduce.
The restoration looked simple from the outside, but each subsystem had to be proven independently. Old machines fail in clusters: power instability hides logic faults, corrosion causes intermittent behavior, and storage errors can masquerade as software problems.
I treat this like incident response, not hobby magic. Predict expected output, test one hypothesis, compare reality, then decide the next step.
The most fragile part was not the CPU or RAM, but edge connectors and sockets. A careful reseat cycle fixed several “ghost bugs.” Also, DOS 3.31 felt faster than memory suggests once disk latency vanished behind solid-state storage. The machine became practical for retro workflows, not just shelf display.