The Illusion of Simultaneity
We teach programming as a strictly linear sequence — one statement after another, one truth at a time — and then hand students machines with a dozen cores and programs with hundreds of threads. This page is the bridge: what a thread actually is, how the operating system conjures the illusion of everything-at-once, and the vocabulary (concurrency vs. parallelism) that stops the confusion before it starts.
What a Thread Actually Is
A thread is not a magic worker; it is a small bundle of bookkeeping: a program counter (where in the code it is), a stack (its local variables and call chain), and a set of register values. Everything else — the heap, static fields, open files — is shared with every other thread in the process. That single design decision is the source of all the power and all the trouble: threads are cheap to create and communicate through memory at full speed, and precisely because of that, they can trample each other's data at full speed too (see The Shared Mutability Disaster).
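The split between private and shared state can be made concrete with a minimal Java sketch. The class name SharedVsPrivate, the results array, and the arithmetic are all invented for illustration; only the idea (locals live on each thread's own stack, heap data is visible to every thread) comes from the text above.

```java
class SharedVsPrivate {
    // Shared: this array lives on the heap, so every thread in the process can reach it.
    static final int[] results = new int[2];

    static int[] run() {
        Thread[] workers = new Thread[2];
        for (int id = 0; id < workers.length; id++) {
            final int slot = id;
            workers[id] = new Thread(() -> {
                // Private: 'sum' lives on this thread's own stack; the other thread has its own copy.
                int sum = 0;
                for (int i = 1; i <= 100; i++) {
                    sum += i * (slot + 1);
                }
                // Each thread writes only to its own slot, so the two threads never collide here.
                results[slot] = sum;
            });
            workers[id].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for the worker; its writes are visible to us after join()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]);
    }
}
```

The sketch stays safe only because each thread owns a distinct slot. Point both threads at the same slot and the sharing that made communication free becomes the problem.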
The Scheduler's Sleight of Hand
Run 200 threads on 8 cores and at least 192 of them are, at any instant, not running at all. The OS scheduler slices CPU time into short intervals, typically a few milliseconds, and rotates runnable threads through the cores; each switch saves one thread's registers and restores another's. This is a context switch, and although each one costs only microseconds, thousands of them per second add up. The illusion of simultaneity is exactly as real as the illusion of motion in cinema: still frames, switched fast.
Two consequences matter for your code. First, you do not choose when you are interrupted: a context switch can happen between any two machine instructions, even in the middle of what looks like one statement (count++ is a read, an add, and a write). On a multicore machine no switch is even needed, because another thread may be touching the same data at the same instant on another core. Second, scheduling is nondeterministic: run the same program twice and the interleaving differs, which is why concurrency bugs appear "randomly" and vanish under the debugger, whose presence changes the timing.
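The count++ hazard is easy to provoke. The following is a minimal Java sketch under assumed names (LostUpdate, race, and the thread and iteration counts are illustrative, not taken from this page): several threads increment one unprotected shared counter, and updates can be lost whenever two threads read the same old value before either writes back.

```java
class LostUpdate {
    // Shared and deliberately unprotected, to expose the race.
    static int count = 0;

    static int race(int threads, int incrementsPerThread) {
        count = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    count++; // read, add, write: three steps, any of which can interleave
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int threads = 4;
        int perThread = 1_000_000;
        System.out.println("expected " + (threads * perThread) + ", got " + race(threads, perThread));
    }
}
```

On a multicore machine the reported total usually falls short of the expected one, and by a different amount on each run. It can never exceed it, since every write stores some previously read value plus one.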
Concurrency vs. Parallelism
The two words are not synonyms, and the distinction earns its keep daily:
- Concurrency is dealing with many things at once — a structure for a program whose work is interleaved: a server juggling 10,000 connections, a UI staying responsive during a download. You can be concurrent on a single core.
- Parallelism is doing many things at once — a hardware fact: multiple computations literally executing in the same instant on different cores. You can be parallel with no concurrency structure at all (a data-parallel matrix multiply).
A useful test: if the tasks mostly wait (network, disk, user), you want concurrency — interleaving hides the waiting. If the tasks mostly compute, you want parallelism — more cores means more throughput. Games want both at once (see the job system in Engine Architecture), which is why they are concurrency's most demanding classroom.
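The "mostly wait" half of that test can be sketched in Java. Everything named here is an assumption made for illustration: WaitingTasks, the eight tasks, and the 100 ms Thread.sleep standing in for a network or disk call. The point is only that interleaving hides waiting, and that it would do so even on a single core.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class WaitingTasks {
    static final int TASKS = 8;
    static final long WAIT_MILLIS = 100;

    // Stand-in for I/O: the thread is blocked, not computing.
    static void pretendToWait() {
        try {
            Thread.sleep(WAIT_MILLIS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // One task after another: the waits add up.
    static long sequentialMillis() {
        long start = System.nanoTime();
        for (int i = 0; i < TASKS; i++) {
            pretendToWait();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // One thread per task: the waits overlap.
    static long concurrentMillis() {
        ExecutorService pool = Executors.newFixedThreadPool(TASKS);
        long start = System.nanoTime();
        try {
            Runnable task = WaitingTasks::pretendToWait;
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < TASKS; i++) {
                pending.add(pool.submit(task));
            }
            for (Future<?> f : pending) {
                f.get(); // block until this task has finished
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("sequential: " + sequentialMillis() + " ms");
        System.out.println("concurrent: " + concurrentMillis() + " ms");
    }
}
```

Replace the sleep with a CPU-bound loop and the picture changes: the concurrent version can then only beat the sequential one if there are spare cores to run on, which is the parallelism case.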
Why This Breaks Your Objects
Everything in the OOP track quietly assumed one thread: an object's methods run one at a time, so invariants checked at a method's start still hold at its end. Threads void that warranty — two calls to the same method on the same object can be in flight simultaneously, interleaved at any instruction boundary. What that does to state, and what to do about it, is the subject of the next page.