The Anatomy of a Debugging Strategy

When code breaks, beginners default to voodoo: change a < to a <=, move a line, restart the IDE, sacrifice a coffee to the demo gods. Occasionally the voodoo works, which is the worst possible outcome — the bug is gone and nothing was learned. Debugging deserves better: it is the scientific method applied at close range, and it can be practised as deliberately as any other skill. (Its natural companion is Testing Fundamentals — every diagnosed bug should leave a test behind as its tombstone.)

Debugging as the Scientific Method

graph LR
    O["Observe<br/>the actual behaviour"] --> H["Hypothesise<br/>one specific cause"]
    H --> E["Experiment<br/>targeted test or probe"]
    E --> C{"Hypothesis<br/>survives?"}
    C -->|no| H
    C -->|yes| F["Fix, then add the<br/>regression test"]
    style E fill:#FFC857

  1. Observe precisely. Not "it crashes" but "it throws NullPointerException at OrderService.java:74 on the second request, never the first." The discipline of writing the observation down often solves the bug by itself.
  2. Formulate one falsifiable hypothesis. "The cache returns a stale connection after timeout" is testable; "something's wrong with the cache" is a mood.
  3. Run the cheapest experiment that could disprove it. A unit test, a log line, a debugger breakpoint, a hard-coded value. One variable at a time — change two things and a passing result tells you nothing.
  4. Let the result vote. Disproven? Good — one suspect eliminated, formulate the next. Confirmed? Fix it, then write the regression test that would have caught it, so this bug can never return unannounced.
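
To make the loop concrete, here is a minimal sketch of step 3 for the stale-cache hypothesis from step 2. Everything in it is invented for illustration: TimedCache, its TTL behaviour, and the fake clock are assumptions, not code from a real system. The point is the shape of the experiment: one variable (time) is changed, and the result is a yes/no vote on the hypothesis.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical cache, invented for this example only.
class TimedCache {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> storedAt = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the experiment controls time

    TimedCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(String key, String value) {
        values.put(key, value);
        storedAt.put(key, clock.getAsLong());
    }

    // Returns null once the entry is older than the TTL.
    String get(String key) {
        Long at = storedAt.get(key);
        if (at == null) return null;
        if (clock.getAsLong() - at > ttlMillis) {
            values.remove(key);
            storedAt.remove(key);
            return null;
        }
        return values.get(key);
    }
}

class StaleEntryExperiment {
    // Hypothesis: "the cache returns a stale entry after the timeout".
    // Returns true if the hypothesis survives, false if it is disproven.
    static boolean staleEntrySurvivesTimeout() {
        long[] now = {0L};                    // fake clock: the single variable we change
        TimedCache cache = new TimedCache(1_000, () -> now[0]);
        cache.put("conn", "connection-1");
        now[0] = 1_001;                       // advance just past the TTL
        return cache.get("conn") != null;     // is the stale value still served?
    }

    public static void main(String[] args) {
        System.out.println(staleEntrySurvivesTimeout()
                ? "hypothesis survives: stale entry returned"
                : "hypothesis disproven: entry expired as expected");
    }
}
```

If the hypothesis had survived, the same method, with its assertion inverted, would become the regression test that step 4 asks for.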

The whole loop is the learning feedback loop in miniature (see Learning as a Feedback Loop): prediction, comparison, correction. Guess-and-check skips the prediction, which is why it teaches nothing.

The Binary Search of Code

When the failure is a regression (it worked before and doesn't now), you don't need insight, you need bisection. With \(n\) commits between good and bad, about \(\lceil \log_2 n \rceil\) experiments find the culprit, so 1,000 commits take roughly 10 tests:

git bisect start
git bisect bad                 # current commit fails
git bisect good v2.3.0         # this tag was fine
# git checks out the midpoint; you test and report:
git bisect good   # or: git bisect bad
# ...repeat ~log2(n) times...
git bisect reset

# Fully automatic, if you have a test that reproduces the bug:
git bisect run ./run_repro_test.sh

The same halving logic works inside a single program when history can't help: comment out half the pipeline, hard-code the input to a suspect stage, or feed a half-sized input. Each experiment should eliminate half the remaining search space — if it can't, design a different experiment.
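
The halving logic can be sketched in a few lines. This is an illustration under a stated assumption, the same one git bisect relies on: there is a known-good point, a known-bad point, and once the failure appears it stays. The class name Bisect, the probe counter, and the 1,000-step scenario are all hypothetical.

```java
import java.util.function.IntPredicate;

class Bisect {
    // Number of probes used by the most recent search (for illustration only).
    static int probes;

    // Finds the smallest failing index in (good, bad], assuming index `good`
    // passes, index `bad` fails, and every index after the first failure
    // also fails.
    static int firstBad(int good, int bad, IntPredicate isBad) {
        probes = 0;
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            probes++;
            if (isBad.test(mid)) {
                bad = mid;   // the first failure is at mid or earlier
            } else {
                good = mid;  // the first failure is after mid
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Hypothetical scenario: 1000 pipeline steps, defect introduced at step 613.
        int culprit = firstBad(0, 1000, step -> step >= 613);
        System.out.println("first bad step: " + culprit + ", probes: " + probes);
    }
}
```

In practice the predicate would be the expensive part: a build plus a reproduction test, or a run of the pipeline truncated at the given stage. That cost is why keeping the probe count logarithmic matters.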

Reading the Stack Trace

The wall of red text is not an accusation; it is a map, and it reads top-down:

Exception in thread "main" java.lang.NullPointerException:
    Cannot invoke "Customer.getTier()" because "customer" is null
    at com.shop.PricingEngine.discountFor(PricingEngine.java:42)   ← your code: START HERE
    at com.shop.CheckoutService.total(CheckoutService.java:31)     ← your code: the caller
    at com.shop.Main.main(Main.java:12)                            ← your code: the entry point
  1. First line: what went wrong. Modern runtimes are explicit — here, customer was null when getTier() was invoked. Read the whole sentence; half the answer is usually in it.
  2. Scan down to the first frame in your package. Frames in framework or library code are context, not culprits — the overwhelmingly likely bug is at the boundary where your code handed the library something wrong. That first your-code frame, with its exact file and line, is where you set the breakpoint.
  3. Distinguish the error classes. A NullPointerException in your frame is almost certainly your bug; an OutOfMemoryError usually points to a resource condition (heap size, or a leak worth profiling); a ConnectException usually points to the environment. Each routes to a different investigation (code, resources, or network), and panicking at all three equally wastes the map you were just handed.
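
Steps 2 and 3 can also be done programmatically, for example when triaging logged exceptions. The sketch below uses the standard Throwable.getStackTrace() and StackTraceElement accessors; the class name TraceReader, the routing labels, and the org.example.lib frame in the demo are assumptions made up for illustration, and the three-way routing is deliberately as rough as the rule of thumb it mirrors.

```java
import java.net.ConnectException;
import java.util.Arrays;
import java.util.Optional;

class TraceReader {
    // Returns the topmost frame whose class name starts with the given package prefix.
    static Optional<StackTraceElement> firstOwnFrame(Throwable t, String packagePrefix) {
        return Arrays.stream(t.getStackTrace())
                .filter(frame -> frame.getClassName().startsWith(packagePrefix))
                .findFirst();
    }

    // Rough routing by error class, mirroring the three examples in the text.
    static String route(Throwable t) {
        if (t instanceof OutOfMemoryError) return "resources";
        if (t instanceof ConnectException) return "network";
        return "code";
    }

    // Builds a throwable with a hand-written trace so the demo is reproducible.
    // The library frame is invented; the com.shop frames echo the trace above.
    static Throwable demoFailure() {
        Throwable t = new NullPointerException("\"customer\" is null");
        t.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("org.example.lib.Cache", "lookup", "Cache.java", 10),
            new StackTraceElement("com.shop.PricingEngine", "discountFor", "PricingEngine.java", 42),
            new StackTraceElement("com.shop.Main", "main", "Main.java", 12)
        });
        return t;
    }

    public static void main(String[] args) {
        Throwable t = demoFailure();
        System.out.println("investigate: " + route(t));
        firstOwnFrame(t, "com.shop.")
                .ifPresent(frame -> System.out.println("set breakpoint at: " + frame));
    }
}
```

The trailing dot in the prefix "com.shop." is a small design choice: it avoids matching an unrelated package such as com.shopping.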

The Debugger's Checklist

  • Can you reproduce it deterministically? If not, that is the first bug to fix — capture inputs, seed the randomness, log the timing.
  • What is the smallest input that still fails? Minimisation is hypothesis generation.
  • What changed most recently — code, data, dependency, environment? (git bisect answers the first; lockfiles and infrastructure logs the rest.)
  • Is your mental model or the code wrong? Print the value you are certain about. When debugging stalls, it is almost always because a certainty was false.
  • When it's fixed: regression test written? Root cause noted? A bug understood is a class of bugs prevented; a bug merely gone is a bug scheduled.
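
For the first item on the checklist, "seed the randomness" can be as simple as the sketch below. It assumes the nondeterminism comes from java.util.Random; the class name SeededRepro and the generated input are hypothetical stand-ins for whatever random data the failing code consumes. The idea is to log the seed on every run, so that a failing run can be replayed exactly by passing the same seed back in.

```java
import java.util.Arrays;
import java.util.Random;

class SeededRepro {
    // Generates test input from an explicit seed: same seed, same sequence.
    static int[] randomInput(long seed, int size) {
        Random rng = new Random(seed);
        int[] data = new int[size];
        for (int i = 0; i < size; i++) {
            data[i] = rng.nextInt(100);
        }
        return data;
    }

    public static void main(String[] args) {
        // Replay a failure with the seed given on the command line;
        // otherwise pick a fresh seed and log it so this run can be replayed.
        long seed = args.length > 0 ? Long.parseLong(args[0]) : System.nanoTime();
        System.out.println("seed=" + seed);
        System.out.println(Arrays.toString(randomInput(seed, 5)));
    }
}
```

Timing-dependent failures need different capture (timestamps, thread names, request ordering), but the principle is the same: record whatever the run depended on.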