The Trustworthy Software Initiative
"It works on my machine" is an obsolete metric. Software now moves money in microseconds, steers vehicles, allocates medical resources and mediates civic life — contexts where unreliable code is not an inconvenience but a systemic failure of professional ethics (see Professional Ethics). Trustworthiness is not a feature you bolt on in the final sprint; it is an emergent property of your entire engineering pipeline, covering correctness, predictability, safety and transparency.
Why Trust Matters Now
Three shifts have raised the stakes. First, software has become infrastructure: failures propagate through systems the authors never imagined. Second, supply chains have exploded: a typical application executes far more third-party code than first-party code, and every dependency is a delegation of trust. Third, AI-generated code has made it trivially easy to include logic nobody has ever reasoned about. In that world, the question "why should anyone trust this system?" needs an engineering answer, not a shrug.
The Pillars of Trustworthy Software
- Determinism & predictability. Can the system's behaviour be reasoned about under exceptional inputs and stress — not just the happy path? This is a design property: bounded queues, explicit timeouts, defined failure modes, idempotent operations. A system that fails predictably can be trusted further than one that usually succeeds mysteriously (the engineering practices are covered in Risk Management).
- Verifiability. Is there a robust automated regime that demonstrates compliance with the system's claimed constraints? Tests are the executable form of your promises: unit and property tests for logic, contract tests for boundaries, and mutation testing to verify the tests themselves (Testing Fundamentals, Mutation Testing). Unverified claims are marketing, not engineering.
- Auditability. Can a third party reconstruct why the system is the way it is? Clean version control history, documented design trade-offs (ADRs), and clear provenance for every dependency — who wrote it, who maintains it, how it arrives in your build. If the answer to "where did this code come from?" is "a model generated it and we merged it", auditability has already failed.
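The determinism pillar's design properties (bounded queues, explicit timeouts, defined failure modes, idempotent operations) can be illustrated with a minimal Python sketch. Every name here (`PaymentProcessor`, `Overloaded`, `submit`, `drain`) is hypothetical and chosen for illustration; the limits are arbitrary, and a real system would pick them from measured load.

```python
import queue


class Overloaded(Exception):
    """Defined failure mode: the caller is told explicitly that work was rejected."""


class PaymentProcessor:
    """Hypothetical component; not taken from any real library."""

    def __init__(self, max_pending: int = 2, enqueue_timeout_s: float = 0.01):
        # Bounded queue: memory use and backlog have a known upper limit.
        self._pending = queue.Queue(maxsize=max_pending)
        # Explicit timeout: a full queue blocks for a bounded time, never forever.
        self._timeout = enqueue_timeout_s
        # Request ids already applied, used to make processing idempotent.
        self._applied = set()
        self.balance = 0

    def submit(self, request_id: str, amount: int) -> None:
        try:
            self._pending.put((request_id, amount), timeout=self._timeout)
        except queue.Full:
            # Fail predictably instead of growing without bound or hanging.
            raise Overloaded(f"rejected {request_id}: queue full") from None

    def drain(self) -> None:
        while True:
            try:
                request_id, amount = self._pending.get_nowait()
            except queue.Empty:
                return
            if request_id in self._applied:
                continue  # Idempotent: a retried request is a no-op.
            self._applied.add(request_id)
            self.balance += amount
```

In this sketch, submitting the same `request_id` twice changes the balance once, and a third submission to a full queue raises `Overloaded` after the timeout rather than blocking indefinitely. Both behaviours are easy to state as tests, which is what links this pillar to verifiability.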
A Calculus of Trust
Trust in a composed system can be reasoned about semi-formally, and doing so — even roughly — changes decisions. Three useful rules of thumb, drawn from the fuller treatment in our article The Calculus of Trust:
- Trust is multi-dimensional. A component earns separate scores for competence (does it work?), integrity (does it do only what it claims?) and maintenance (will it still work next year?). A brilliant but abandoned library scores high-competence, low-maintenance — a very different risk from a mediocre but active one.
- Trust attenuates through composition. If component \(A\) is trusted to degree \(T_A\) and depends on \(B\) with trust \(T_B\), the chain earns at most \(T_A \times T_B\): with each score treated as a value in \([0, 1]\), trust multiplies like probabilities, and long dependency chains decay accordingly. Every transitive dependency contributes a factor of at most one, so adding a link can never raise the trust of the chain.
- Evidence updates trust; time erodes it. Passing audits, years in production and responsive maintainers push trust up; unpatched CVEs, silent behaviour changes and maintainer churn push it down. Treat trust scores as decaying quantities needing re-evidence, not permanent grades.
The same calculus applies verbatim to AI-generated snippets: they arrive with zero provenance, zero maintenance commitment and unknown competence — i.e. minimal prior trust — and must earn their way in through the verifiability pillar: your tests, your review, your comprehension. Never merge what you could not explain to an auditor.
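The three rules of thumb can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the model from The Calculus of Trust: scores are assumed to lie in [0, 1], the overall score is assumed to be bounded by the weakest dimension, and erosion over time is assumed to follow a half-life. The names `Trust`, `chain_trust` and `decayed`, and the two-year half-life, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Trust:
    """Multi-dimensional trust; each score is assumed to lie in [0, 1]."""
    competence: float   # does it work?
    integrity: float    # does it do only what it claims?
    maintenance: float  # will it still work next year?

    def overall(self) -> float:
        # Assumption: the weakest dimension bounds the whole.
        return min(self.competence, self.integrity, self.maintenance)


def chain_trust(scores: list[float]) -> float:
    """Upper bound on trust in a dependency chain: the product of its links."""
    return math.prod(scores)


def decayed(score: float, years_since_evidence: float,
            half_life_years: float = 2.0) -> float:
    """Erode a score over time; the half-life model is an assumption."""
    return score * 0.5 ** (years_since_evidence / half_life_years)


# A brilliant but abandoned library versus a mediocre but active one.
abandoned = Trust(competence=0.95, integrity=0.9, maintenance=0.2)
active = Trust(competence=0.6, integrity=0.9, maintenance=0.9)

# An AI-generated snippet before any tests or review: minimal prior trust.
unreviewed_snippet = Trust(competence=0.0, integrity=0.0, maintenance=0.0)
```

Under these assumptions, a chain of ten dependencies each trusted at 0.9 is bounded by `chain_trust([0.9] * 10)`, roughly 0.35, and a score of 0.8 with no fresh evidence for two years decays to 0.4. The exact numbers matter less than the shape: long chains and stale evidence both pull trust down, and only new evidence pushes it back up.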
Making It Operational
| Pillar | Pipeline gate |
|---|---|
| Determinism | Chaos/failure-injection tests; timeout and retry policies reviewed with the design |
| Verifiability | Coverage and mutation-score thresholds on changed code; contract tests on every external boundary |
| Auditability | Signed commits, dependency lockfiles + SBOM, licence and vulnerability scanning, ADRs required for architectural change |
None of these gates is exotic; all are covered in CI/CD and Tools Ecosystem. Trustworthiness is what falls out when you run them consistently and refuse the exceptions.
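As a rough sketch of what refusing the exceptions can look like, the gates in the table can be collapsed into a single check that returns every violated gate. The `BuildReport` fields, the `gate_failures` function and the threshold values are all hypothetical; in practice the inputs would come from whatever coverage, mutation-testing and scanning tools the pipeline already runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildReport:
    """Hypothetical summary of one pipeline run; field names are illustrative."""
    changed_line_coverage: float  # fraction of changed lines covered, 0..1
    mutation_score: float         # fraction of mutants killed, 0..1
    has_lockfile: bool
    has_sbom: bool
    unsigned_commits: int
    known_vulnerabilities: int


def gate_failures(report: BuildReport,
                  min_coverage: float = 0.80,
                  min_mutation_score: float = 0.60) -> list[str]:
    """Return a message per failed gate; an empty list means the build may pass.

    The default thresholds are assumptions, not recommendations.
    """
    failures = []
    if report.changed_line_coverage < min_coverage:
        failures.append("verifiability: coverage on changed code below threshold")
    if report.mutation_score < min_mutation_score:
        failures.append("verifiability: mutation score below threshold")
    if not report.has_lockfile:
        failures.append("auditability: dependency lockfile missing")
    if not report.has_sbom:
        failures.append("auditability: SBOM missing")
    if report.unsigned_commits > 0:
        failures.append("auditability: unsigned commits present")
    if report.known_vulnerabilities > 0:
        failures.append("auditability: known vulnerabilities in dependencies")
    return failures
```

The function deliberately has no override parameter: a gate that can be waved through per build stops being evidence. If a threshold is wrong, the fix is to change the threshold for everyone, in a reviewed commit.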