Sources & Bibliography

References, deep dives, and further reading for Vol 2: Stuff We Built on Top.

About Link Rot

Link rot is the phenomenon where hyperlinks break or disappear over time.

To mitigate this, the following steps have been taken:

Every source includes an “Accessed” date indicating when the author last verified the link.
Whenever technically possible, an “Archive” link (hosted by the Wayback Machine) is provided.

00

Intro

1 Topic

Explainers

The Largest Story Ever Written (And Why Software Still Loses)

7 sources

Loose ends, plotholes, speculation and theories (spoilers) - Stephen King Forum
Stephen King Forum 2017 Accessed: 2026-05-10
Source Archive
Discussion about plot holes in Stephen King's work. The book itself is under copyright.
Mistakes/contradictions in the books - A Song of Ice and Fire Forum
Westeros.org 2017 Accessed: 2026-05-10
Source
Similar to King's work, Martin's writing is still protected by copyright.
Robinson Crusoe
Planet Publish Daniel Defoe 1719 Accessed: 2026-05-10
Source Archive
Chapter 4: the removal of clothes appears on page 76; the pocket-stuffing is described on page 77.
LibreOffice Download Other
LibreOffice Accessed: 2026-05-10
Source Archive
One possible place to download the LibreOffice source code as a tar.gz package.
Linux Kernel Repository
GitHub Linus Torvalds Accessed: 2026-05-10
Source Archive
Official repository for the Linux kernel.
Note: Linus is the primary maintainer and approver of changes, but it would be a stretch to consider him the sole author.
Fun fact: 5.2GB out of 6.7GB of the Linux kernel's repository size is the .git folder
Reddit 2024 Accessed: 2026-05-10
Source Archive
Analysis of the ratio between actual code size and change history.
Mahabharata
Wikipedia Accessed: 2026-05-10
Source Archive
Information about the Mahabharata. Size calculations in MB are the author's rough estimates. The article directly compares the Mahabharata to the Odyssey and the Iliad.

Part 6

Fragile Commons

5 Chapters

22

The Chain of Fools

8 Topics

Explainers

Supply Chains – Other People's Work, All the Way Down

6 sources

Making a Sandwich From Scratch Took Man Six Months
Smithsonian Magazine Marissa Fessenden 2015 Accessed: 2026-05-10
Source Archive
Article about Andy George's sandwich experiment.
How to Make a $1,500 Sandwich in Only 6 Months
Time Mark Rivett-Carnac 2015 Accessed: 2026-05-10
Source Archive
Coverage of Andy George's experiment.
We Spoke to the Guy Who Spent Six Months and $1500 Making a Sandwich
Vice Abby Moss 2015 Accessed: 2026-05-10
Source Archive
Interview with Andy George about the sandwich experiment.
TrendForce Report - March 12, 2026
TrendForce 2026 Accessed: 2026-05-10
Source Archive
Report on the mobile component market.
Understanding the Semiconductor Supply Chain and Its Importance
Investopedia Adam Hayes 2025 Accessed: 2026-05-10
Source Archive
Article describing supply chains in the semiconductor industry.
Global Smartphone AP-SoC Market Share: Quarterly
Counterpoint Research 2026 Accessed: 2026-05-10
Source Archive
Quarterly report on global smartphone application processor market share.

Package Managers — Where All Those Dependencies Come From

7 sources

Dependency Solving Is Still Hard, but We Are Getting Better at It
arXiv Pietro Abate, Roberto Di Cosmo, Georgios Gousios, Stefano Zacchiroli 2020 Accessed: 2026-05-10
Source Archive
Review of research on dependency solvers, comparison matrix of package managers, and discussion of CUDF.
Package Managers à la Carte: A Formal Model of Dependency Resolution
arXiv Ryan Gibb, Patrick Ferris, David Allsopp, Thomas Gazagnaire, Anil Madhavapeddy 2026 Accessed: 2026-05-10
Source Archive
Recent paper (February 2026) formalizing dependency resolution semantics as Package Calculus, showing the problem is NP-complete.
Solving Package Management via Hypergraph Dependency Resolution
arXiv Ryan Gibb, Patrick Ferris, David Allsopp, Michael Winston Dales, Mark Elvers, Thomas Gazagnaire, Sadiq Jaffer, Thomas Leonard, Jon Ludlam, Anil Madhavapeddy 2025 Accessed: 2026-05-10
Source Archive
Continuation of the above work, more practical, explains differences across ecosystems.
Semantic Versioning 2.0.0
SemVer Tom Preston-Werner Accessed: 2026-05-10
Source Archive
The only canonical source for SemVer. Written by a GitHub co-founder. Short, precise, with examples.
About semantic versioning
npm Docs 2026 Accessed: 2026-05-10
Source Archive
How npm interprets SemVer in practice (tilde ~, caret ^).
Dependency Resolution Methods
Andrew Nesbitt's Blog Andrew Nesbitt 2026 Accessed: 2026-05-10
Source Archive
The author runs ecosyste.ms and libraries.io. Catalogues resolver algorithms (SAT, PubGrub, ASP, backtracking) with concrete implementations per package manager. Best available synthesis for technical readers.
The nine circles of dependency hell
Sourcegraph Blog Matt Rickard 2021 Accessed: 2026-05-10
Source Archive
Best popular technical text on the topic. Clear typology of dependency circles with technical examples. Written by a Google/Kubernetes engineer.

The Industrial Revolution of Code: Understanding CI/CD

5 sources

Continuous Integration, Delivery and Deployment: A Systematic Review on Approaches, Tools, Challenges and Practices
arXiv Mojtaba Shahin, Muhammad Ali Babar, Liming Zhu 2017 Accessed: 2026-05-10
Source Archive
Continuous Delivery
Martin Fowler's Bliki Martin Fowler 2013 Accessed: 2026-05-10
Source Archive
Short, precise definition from the person who brought these concepts to the mainstream. The book Continuous Delivery (Humble & Farley, 2010) was published in his signature series.
Continuous integration vs. delivery vs. deployment
Atlassian Sten Pittet Accessed: 2026-05-10
Source Archive
Most widely cited practical guide to distinguishing the three phases. Clear, with diagram. Used as a reference standard by most teams.
Continuous Delivery
GitLab Accessed: 2026-05-10
Source Archive
Good definition from one of the most popular CI/CD platforms. Clearly distinguishes continuous deployment (automatic) from continuous delivery (can be deployed).
Continuous Integration, Delivery and Deployment: Key Differences
romenrg.com Romén Rodríguez-Gil 2017 Accessed: 2026-05-10
Source Archive
Technical blog post with a personal response from Martin Fowler in the comments. Best practical text focused solely on distinguishing the three phases.

Case Studies

`left-pad` – Eleven Lines to Rule Them All

9 sources

kik, left-pad, and npm
npm Blog 2016 Accessed: 2026-05-10
Source Archive
Official npm post-mortem. Contains timeline, decision about 'un-unpublishing,' and Laurie Voss quote: 'Un-un-publishing is an unprecedented action.' Ground truth for the entire incident.
I've Just Liberated My Modules
Azer Koçulu's Blog Azer Koçulu 2016 Accessed: 2026-05-10
Source Archive
Author's own statement. Indirectly quoted in most articles, important for understanding his motivation.
How one developer just broke Node, Babel and thousands of projects in 11 lines of JavaScript
The Register Chris Williams 2016 Accessed: 2026-05-10
Source Archive
Live coverage from the day of the incident. Contains left-pad source code, Laurie Voss quote, and data: 'Left-pad was fetched 2,486,696 times in just the last month.' Solid primary press source.
NPM was Broken for 2.5 Hours
InfoQ Abel Avram 2016 Accessed: 2026-05-10
Source Archive
Precise confirmation: Babel, Atom, Ember, React Native as affected projects. Good technical synthesis from the next day.
A single Node of failure
LWN.net Nathan Willis 2016 Accessed: 2026-05-10
Source Archive
Most analytical of the hot-take texts. Confirms React.js, Babel, and Ember.js as affected frameworks. Important observation: 'no major interruptions appear to have hit live services' — builds failed but end users mostly noticed nothing.
NPM & left-pad: Have We Forgotten How To Program?
David Haney's Blog David Haney 2016 Accessed: 2026-05-10
Source Archive
Engineer writing on the day of the incident. Contains direct reference to React: 'Just ask the React team how well their week has been going.' Good as a quotable community voice.
How a Programmer Almost Broke the Internet by Deleting 11 Lines of Code
ScienceAlert Fiona Macdonald 2018 Accessed: 2026-05-10
Source Archive
Confirms Facebook, Netflix, Spotify via Babel and React. Quotes npm CEO Isaac Schlueter from Ars Technica.
Npm left-pad incident
Wikipedia Accessed: 2026-05-10
Source Archive
Wikipedia article describing the incident. Contains source code — note that including empty lines the package had 17 lines, which some articles referenced.
Left-pad Creator Breaks 8-Year Silence: NPM Provided the Script That Broke the Internet
Biggo 2025 Accessed: 2026-05-10
Source
Article published June 11, 2025, shedding different light on the event by revealing that the script to delete all packages was provided to the author by npm.

Log4Shell – The Ghost in the Logging Machine

9 sources

CVE-2021-44228
NVD/NIST 2021 Accessed: 2026-05-10
Source Archive
Official entry in the US government's National Vulnerability Database. CVSS 10.0, technical description, list of affected versions (2.0-beta9 to 2.14.1), patch links. Canonical source for any CVE citation.
Log4Shell: RCE 0-day exploit found in log4j
LunaSec (GitHub) freeqaz 2021 Accessed: 2026-05-10
Source Archive
The original post-mortem — LunaSec coined the name 'Log4Shell' and was first to publish detailed technical analysis. Updated over several weeks as subsequent CVEs were discovered. Cited by CISA, Apache, and practically everyone else.
Note: At the time of compilation, the lunasec.io domain was inactive (DNS issue). The blog sources are available on GitHub. The linked file is the source in MDX format: fully readable, but not the actual published page.
Fun fact: the blog uses roughly the same tech stack as the bibliography you're reading now, except that this site's repo is not public.
Apache Log4j Vulnerability Guidance
CISA 2021 Accessed: 2026-05-10
Source Archive
Official US Cybersecurity and Infrastructure Security Agency page with Emergency Directive 22-02, links to all CVEs, and guidance for federal agencies. Good as institutional source.
What's the Deal with the Log4Shell Security Nightmare?
Lawfare Nicholas Weaver 2021 Accessed: 2026-05-10
Source Archive
Solid analysis for non-specialists by Lawfare (Harvard Law / Brookings). Describes Minecraft as the starting point, critical infrastructure impacts, and SBOM implications. Good for broader context.
'The internet's on fire' as tech races to fix 'Log4Shell' flaw
LA Times Frank Bajak 2021 Accessed: 2026-05-10
Source Archive
Coverage from disclosure day, includes quote from expert Marcus Hutchins about the Minecraft exploit. Good as mainstream confirmation of scale.
Pointing and Calling: Japan's Railway Safety Technique
Nippon.com Nishiue Itsuki 2026 Accessed: 2026-05-10
Source Archive
Japanese source; Nippon.com is published by the Nippon Communications Foundation. Description of the technique, its applications, and spread beyond Japan.
Point it, call it, get it right
Flight Safety Australia 2018 Accessed: 2026-05-10
Source Archive
Article from Australian aviation authority CASA. Good context for application of the technique beyond railways, includes bibliography with original scientific sources (Railway Technical Research Institute 1994).
A Brief History of JavaScript
Auth0 Blog Sebastian Peyrott 2019 Accessed: 2026-05-10
Source Archive
Technically most accurate non-academic description of history: Mocha → LiveScript → JavaScript, role of Netscape/Sun agreement, quotes Brendan Eich. Often linked as canonical source.
JavaScript (History section)
Wikipedia Accessed: 2026-05-10
Source Archive
For the naming footnote, sufficient as additional confirmation — contains quote from Eich himself who called the name change 'a marketing ploy by Netscape.'

React2Shell: When the Frontend Ate the Server

5 sources

CVE-2025-55182
NVD/NIST 2025 Accessed: 2026-05-10
Source Archive
Canonical government entry. CVSS 10.0, technical description, list of affected versions.
Critical Security Vulnerability in React Server Components
React.dev 2025 Accessed: 2026-05-10
Source Archive
Official post from React team (Meta). Contains disclosure timeline, list of affected packages and frameworks, patching instructions, and — notably — list of all follow-up CVEs updated through January 26, 2026. Best single source for the entire saga.
Denial of Service and Source Code Exposure in React Server Components
React.dev 2025 Accessed: 2026-05-10
Source Archive
Official React team post listing all follow-ups: CVE-2025-55183, CVE-2025-55184, CVE-2025-67779 (December), CVE-2026-23864 (January). Also includes a quote that could be used in the book: 'This pattern shows up across the industry, not just in JavaScript. For example, after Log4Shell, additional CVEs were reported as the community probed the original fix.'
China-nexus cyber threat groups rapidly exploit React2Shell
AWS Security Blog CJ Moses 2025 Accessed: 2026-05-10
Source Archive
Primary source for Jackpot Panda and Earth Lamia. Report from MadPot honeypot infrastructure. One of the best intelligence documents on who exploited and how.
React2Shell: Critical React Vulnerability
Wiz Research Gili Tikochinski, Merav Bar, Danielle Aminov 2025 Accessed: 2026-05-10
Source Archive
Wiz researchers published the first PoC and had the earliest cloud exploitation telemetry. Contains data: 39% of cloud environments with vulnerable versions, confirmation of AWS credential harvesting.

Polyfill.io – The Rotten Ingredient

5 sources

Polyfill supply chain attack hits 100K+ sites
Sansec 2024 Accessed: 2026-05-10
Source Archive
Original discoverers' report — Sansec was the first company to publish attack details. Contains timeline, sample malicious script code, list of affected domains, and updates from subsequent days (Namecheap domain suspension, Cloudflare response). Only first-order canonical source for the entire story.
Polyfill.io JavaScript supply chain attack impacts over 100K sites
BleepingComputer Lawrence Abrams 2024 Accessed: 2026-05-10
Source Archive
First extensive press coverage from disclosure day. Contains Google's communication to advertisers, Andrew Betts quotes, and attack mechanism details. BleepingComputer was the first outlet to cover Sansec's report, hence the DDoS on its infrastructure in response.
Polyfill.io Attack Impacts Over 380,000 Hosts, Including Major Companies
The Hacker News Ravie Lakshmanan 2024 Accessed: 2026-05-10
Source Archive
Good article with broader perspective — describes scale after a week, Funnull's restart attempt under polyfill.com domain (also taken down by Namecheap), and long-term implications. Good as supplement to live coverage.
July 2: Polyfill.io Supply Chain Attack — Digging into the Web of Compromised Domains
Censys 2024 Accessed: 2026-05-10
Source Archive
Source for the figure of 384,773 hosts. Also includes a scan of alternative domains taken over by the same operator, historical DNS records, and a list of affected platforms (Hulu, Mercedes-Benz, Warner Bros.). Only credible source for this specific number.
Polyfill Supply Chain Attack Impacting 100k Sites Linked to North Korea
SecurityWeek Eduard Kovacs 2026 Accessed: 2026-05-10
Source
Most widely cited outlet covering the Hudson Rock findings. Contains a summary of forensic evidence (credentials from LummaC2 infostealer, domain configuration conversations), operation goal (cryptocurrency laundering via Suncity Group gambling network), and attribution certainty caveats. Well-balanced between 'this is strong evidence' and 'this is one research firm.'

Version 99.0.0

4 sources

Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies
Medium Alex Birsan 2021 Accessed: 2026-05-10
Source Archive
Original and only canonical source — Birsan himself describes methodology, timeline, code examples, list of vulnerable companies, and technical details. Everything else is secondary to this text.
Open source blind trust the culprit in ethical breach of 35 companies
Cybersecurity Dive Samantha Schwartz 2021 Accessed: 2026-05-10
Source Archive
Solid press coverage from publication day. Quotes Birsan directly, describes Microsoft's response (whitepaper, CVE-2021-24105), and remediation recommendations. Good institutional context.
Dependency Hijacking Software Supply Chain Attack Hits More Than 35 Organizations
Sonatype Ax Sharma 2021 Accessed: 2026-05-10
Source Archive
Technical and published simultaneously with Birsan's report — Sonatype as a supply chain security company had ready analysis on publication day. Best text for readers wanting to understand the mechanism without reading the Medium post.
Researcher breaches Apple, Microsoft, and others with installer attack
AppleInsider Mike Peterson 2021 Accessed: 2026-05-10
Source Archive
Most detailed breakdown of payouts: Microsoft $40,000 (program record), Shopify $30,000, Apple $30,000, PayPal $30,000, total over $130,000. Also quotes Apple confirmation that RCE on servers would have been possible. Good as verifiable source for specific numbers.

23

The Trillion Dollar Volunteer

5 Topics

Explainers

Open Source: Shared Knowledge, Shared Illusions

11 sources

Alexander Fleming Discovery and Development of Penicillin
American Chemical Society Accessed: 2026-05-11
Source Archive
Official ACS landmark document. Confirms 1929 publication, the decade of stagnation, the role of Florey and Chain, and mass production during WWII.
Collective Invention
Journal of Economic Behavior and Organization Robert C. Allen 1983 Accessed: 2026-05-11
Source
Primary academic source for the Cleveland ironworks case. Documents the open exchange of blast furnace performance data between 1850 and 1875. Vol. 4, pp. 1–24.
Collective Invention During the British Industrial Revolution: The Case of the Cornish Pumping Engine
Cambridge Journal of Economics Alessandro Nuvolari 2004 Accessed: 2026-05-11
Source
Primary source for the Cornwall case. Documents the practice of publishing pump efficiency data in Lean's Engine Reporter rather than seeking patent protection. Vol. 28, pp. 347–363.
Selden Patent Overthrown by Federal Court
The Henry Ford Museum 1911 Accessed: 2026-05-11
Source
Primary institutional source — Henry Ford Museum archive. Confirms the court ruling of January 10, 1911 overturning the Selden patent.
Monopoly on Wheels: Henry Ford and the Selden Automobile Patent
Wayne State University Press William Greenleaf 1961
Canonical academic study of the Ford–Selden patent dispute. Confirms the link between the 1911 ruling and the free cross-licensing system adopted by the industry in 1915 and maintained until 1956. ISBN: 0814335128.
Review: Monopoly on Wheels (Project MUSE)
Project MUSE John B. Rae 1961 Accessed: 2026-05-11
Source
Review citing the outcome of the Selden case: the automotive industry adopted a unique system of free cross-licensing in 1915, kept in force until 1956.
The Open Source Definition
Open Source Initiative 2006 Accessed: 2026-05-11
Source Archive
Canonical document defining what 'open source' means in legal and technical terms.
Why Open Source Misses the Point of Free Software
GNU / FSF Richard Stallman Accessed: 2026-05-11
Source Archive
Stallman's original text describing the distinction between 'free software' and 'open source' from the FSF perspective. Essential reading for understanding why the distinction exists at all.
Various Licenses and Comments about Them
GNU / FSF Accessed: 2026-05-11
Source Archive
Comprehensive FSF-annotated list of software licenses, covering GPL, LGPL, AGPL, Apache, MIT, BSD and dozens of others. Authoritative from the free software movement perspective.
TLDR Legal – Plain-Language License Summaries
TLDR Legal Accessed: 2026-05-11
Source Archive
Accessible plain-English summaries of software licenses. Useful contrast to dry legal documentation — ideal for readers who want to quickly understand the difference between MIT and GPL without reading full legal texts.
Licenses & Standards
Open Source Initiative Accessed: 2026-05-11
Source Archive
List of licenses approved by OSI as conforming to the Open Source Definition. Less ideological than FSF, more pragmatic.

Case Studies

Heartbleed (CVE-2014-0160) – 64 Kilobytes of Oops

9 sources

OpenSSL Security Advisory — CVE-2014-0160
OpenSSL 2014 Accessed: 2026-05-11
Source Archive
Official OpenSSL security bulletin issued on the day of disclosure. Short, technical, authoritative.
CVE-2014-0160 – National Vulnerability Database
NVD / NIST 2014 Accessed: 2026-05-11
Source Archive
Official NIST/NVD record for Heartbleed.
Heartbleed Bug
heartbleed.com 2014 Accessed: 2026-05-11
Source Archive
Dedicated site describing the Heartbleed vulnerability — created at the time of disclosure.
The Results of the CloudFlare Challenge
Cloudflare Blog Nick Sullivan 2014 Accessed: 2026-05-11
Source Archive
Cloudflare's Heartbleed Challenge confirmed that private keys could indeed be stolen via the vulnerability.
Attackers Exploit the Heartbleed OpenSSL Vulnerability to Circumvent Multi-factor Authentication on VPNs
Google Cloud Blog Christopher Glyer, Chris DiGiamo 2014 Accessed: 2026-05-11
Source Archive
Google Cloud post describing real-world exploitation of Heartbleed to bypass multi-factor authentication on VPNs.
Heartbleed: Developer Who Introduced the Error Regrets 'Oversight'
The Guardian Alex Hern 2014 Accessed: 2026-05-11
Source Archive
Contains the responsible developer's own explanation of how the bug was introduced.
Why Heartbleed Is the Most Dangerous Security Flaw on the Web
The Verge Russell Brandom 2014 Accessed: 2026-05-11
Source Archive
After 'Catastrophic' Security Bug, the Internet Needs a Password Reset
Wired Robert McMillan 2014 Accessed: 2026-05-11
Source Archive
Heartbleed: OpenSSL Vulnerability That Affects Everyone
The Spokesman-Review Daniel Gayle 2014 Accessed: 2026-05-11
Source

curl – 'Please Fix This by Yesterday'

6 sources

LogJ4 Security Inquiry – Response Required
daniel.haxx.se Daniel Stenberg 2022 Accessed: 2026-05-11
Source Archive
The legendary post in which Stenberg publishes an ultimatum email from a corporation demanding an audit of a Java library that curl does not use. He describes the level of ignorance and incompetence as breathtaking. Best single source illustrating how compliance processes replace actual thinking.
I Will Slaughter You
daniel.haxx.se Daniel Stenberg 2021 Accessed: 2026-05-11
Source Archive
Stenberg publishes a death threat received because curl 'didn't work correctly' in a user's closed system. Illustrates the extreme end of free support expectations — users who treat maintainers' time as their entitlement.
Death by a Thousand Slops
daniel.haxx.se Daniel Stenberg 2025 Accessed: 2026-05-11
Source Archive
Explains why Stenberg shut down the curl Bug Bounty program. He describes AI-generated security reports as a 'DDoS on human attention' — technically plausible-looking but factually wrong reports flooding maintainers at scale.
The Challenge of Maintaining curl
LWN.net Jonathan Corbet 2025 Accessed: 2026-05-11
Source Archive
Companies and Products Known to Ship curl
curl.se Daniel Stenberg Accessed: 2026-05-11
Source Archive
List of companies using curl, maintained by the tool's author.
Commercial curl Support
wolfSSL 2019 Accessed: 2026-05-11
Source Archive
Stenberg's employer, which offers paid commercial support for curl.

faker.js & colors.js – The Zalgo Rebellion

7 sources

faker.js Gets Erased (Reddit Discussion)
Reddit 2022 Accessed: 2026-05-11
Source
Reddit thread containing a screenshot of the faker.js repository after Marak's deletion. Primary contemporaneous record since the original repo no longer exists on GitHub.
faker.js Repository – Web Archive Mirror (January 5, 2022)
Web Archive 2022 Accessed: 2026-05-11
Source
Archived snapshot showing the entire repository history reduced to a single commit.
faker – npm Package
npm Marak Squires 2022 Accessed: 2026-05-11
Source Archive
npm registry entry. Worth noting the version number at the time of the sabotage.
JavaScript Dev Deliberately Screws Up Own Popular npm Packages to Make a Point of Some Sort
The Register Thomas Claburn 2022 Accessed: 2026-05-11
Source Archive
The Story Behind colors.js and faker.js
Revenera Marcus Lucero 2022 Accessed: 2026-05-11
Source Archive
The NPM Libraries 'Colors' and 'Faker' Were Corrupted
Heimdal Security Dora Tudor 2022 Accessed: 2026-05-11
Source Archive
Dev Corrupts NPM Libs 'colors' and 'faker' Breaking Thousands of Apps
BleepingComputer Ax Sharma 2022 Accessed: 2026-05-11
Source Archive

XZ Utils – The Long Con in Plain Sight

7 sources

backdoor in upstream xz/liblzma leading to ssh server compromise
oss-security (Openwall) Andres Freund 2024 Accessed: 2026-05-11
Source Archive
Original disclosure by Andres Freund. The primary source for the entire XZ backdoor story.
Everything I Know About the XZ Backdoor
boehs.org Evan Boehs 2024 Accessed: 2026-05-11
Source Archive
Detailed attack timeline analysis. Widely cited reconstruction of Jia Tan's multi-year social engineering campaign.
Analysis of Vulnerability CVE-2024-3094 in XZ Utils
TH Brandenburg (OPUS4) Conrad Ferneding 2024 Accessed: 2026-05-11
Source
Academic bachelor's thesis with extensive technical breakdown and references. Detailed analysis of the backdoor mechanism and attack chain.
The XZ Backdoor: Everything You Need to Know
Wired Dan Goodin 2024 Accessed: 2026-05-11
Source Archive
Urgent Security Alert for Fedora Linux 40 and Fedora Rawhide Users
Red Hat Blog 2024 Accessed: 2026-05-11
Source Archive
CVE-2024-3094 Regarding xz-utils
Debian Security Announce Salvatore Bonaccorso 2024 Accessed: 2026-05-11
Source
Debian maintainers noted how Jia Tan attempted to manipulate them into ignoring valgrind warnings as 'false positives'. Debian rolled back all XZ packages to the safe 5.4.5 version and revoked Jia Tan's infrastructure access. The incident showed that Debian's 'bureaucracy' functioned as a protective filter.
CVE-2024-3094 – Ubuntu Security
Ubuntu Security 2024 Accessed: 2026-05-11
Source Archive
Canonical confirmed the malicious XZ version reached Noble Numbat repositories but was removed before the official LTS release thanks to Freund's discovery. Had it shipped, millions of corporate servers would have received a backdoor as part of a 'secure' Long-Term Support release.

24

The Rug Pull

4 Topics

Explainers

The First One's Free — A Business Model with a Smile

4 sources

BMW's Heated Seat Subscription
Consumer Rights Wiki Accessed: 2026-05-12
Source Archive
BMW Clarifies Its Rules Around Subscription-Based Heated Seats (and It's Not Quite What You Think)
The Manual Matthew Denis 2023 Accessed: 2026-05-12
Source Archive
BMW Drops Heated Seats Subscription Due to Customer Backlash
The Sun (Malaysia) 2023 Accessed: 2026-05-12
Source
BMW Heated Seats Subscription Is Real And It Costs $18 Per Month
Motor1 Adrian Padeanu 2022 Accessed: 2026-05-12
Source Archive

Case Studies

Docker Desktop — Oxygen Metered by Subscription

10 sources

Updating Docker Product Subscriptions
Docker Blog Giri Sreenivas 2024 Accessed: 2026-05-12
Source Archive
Official announcement of the licensing change (August 31, 2021). Introduces the 250-employee / $10M revenue thresholds, grace period until January 31, 2022, and new subscription tiers.
Docker Revenue Profile
Sacra Accessed: 2026-05-12
Source Archive
Best available source for a private company. Reports ARR of $20M in 2021 (pre-license change), $165M in 2023, $207M in 2024. Secondary source — Docker Inc. is private and does not publish full financials.
Celebrating Our Second Fiscal Year
Docker Blog Scott Johnston 2022 Accessed: 2026-05-12
Source Archive
Official post from Docker Inc. after the first full subscription year. CEO Scott Johnston reports ARR above $50M — over four-fold year-on-year growth. Only primary source with figures directly from the company.
Dockershim Deprecation FAQ
Kubernetes Blog 2020 Accessed: 2026-05-12
Source Archive
Original document announcing dockershim deprecation in Kubernetes 1.20. Explains technical reasons (CRI incompatibility), timeline, and implications for users.
Kubernetes is Moving on From Dockershim
Kubernetes Blog Sergey Kanzhelev, Jim Angel, Davanum Srinivas, Shannon Kularathna, Chris Short, Dawn Chen 2022 Accessed: 2026-05-12
Source Archive
Post accompanying dockershim removal in Kubernetes 1.24. Confirms the decision was technical and long-planned, and that the majority of clusters still used Docker at the time of the licensing announcement.
Docker Desktop Is No Longer Free for Enterprise Users
InfoWorld Scott Carey 2021 Accessed: 2026-05-12
Source Archive
Day-of-announcement coverage. Business context, market reaction, quotes from CEO Scott Johnston.
Kubernetes Proceeding with Deprecation of Dockershim in Upcoming 1.24 Release
InfoQ Matt Campbell 2022 Accessed: 2026-05-12
Source Archive
Connects both threads — dockershim deprecation timeline and the post-license-change market situation. Also explains Mirantis's role as maintainer of cri-dockerd.
A Quick Guide to Docker Licensing
USU Blog Dr. Christian Seeling 2022 Accessed: 2026-05-12
Source Archive
Software Asset Management perspective — practical compliance implications for IT and procurement departments. Useful for the 'compliance panic' context described in the case study.
Open Container Initiative
Open Container Initiative Accessed: 2026-05-12
Source Archive
OCI homepage describing the image-spec and runtime-spec standards. Includes the history of the initiative (founded 2015, emerging from Docker and CoreOS).
OCI Image Format Specification
GitHub / Open Container Initiative Accessed: 2026-05-12
Source Archive
Technical source for the claim about image portability across different container runtimes.

Terraform — Open Until Further Notice

8 sources

HashiCorp Adopts Business Source License
HashiCorp Blog Armon Dadgar 2023 Accessed: 2026-05-12
Source Archive
Primary canonical source — co-founder Dadgar's official post explaining the motivation (hyperscalers monetizing without contributing), scope of the change, and what remains under MPL. August 10, 2023.
IBM to Acquire HashiCorp, Inc.
IBM Newsroom 2024 Accessed: 2026-05-12
Source Archive
Official IBM press release. $6.4 billion enterprise value, $35 per share. April 24, 2024.
Terraform License Change: Impact on Users & Providers
Spacelift Blog Flavius Dinu 2024 Accessed: 2026-05-12
Source Archive
Most comprehensive analysis of practical consequences — full timeline of the saga (August 10, 2023 through OpenTofu GA in January 2024), breakdown of who is affected and how, ecosystem perspective.
Note: When the link was verified, the article had been updated (Feb 2026). The archive link points to the original version of the text.
The Impact of the HashiCorp License Change on Gruntwork Customers
Gruntwork Blog Josh Padnick 2023 Accessed: 2026-05-12
Source Archive
Perspective from one of the largest Terraform ecosystem partners, directly affected by the change. Good example of enterprise reaction and 'compliance panic' described in the case study.
Linux Foundation Launches OpenTofu
Linux Foundation 2023 Accessed: 2026-05-12
Source Archive
Original announcement of the project's adoption by the Linux Foundation. Includes quotes from Jim Zemlin, list of 140+ supporting companies, and commitment of 18+ FTE developers for a minimum of 5 years. September 20, 2023.
Terraform Fork Gets Renamed OpenTofu, and Joins Linux Foundation
TechCrunch Ron Miller 2023 Accessed: 2026-05-12
Source Archive
Journalistic coverage with a quote from Jim Brikman (Gruntwork) explaining the name and project plans.
Terraform Language: Providers (Official Documentation)
HashiCorp Developer Accessed: 2026-05-12
Source Archive
Canonical description of the provider model: 'Terraform relies on plugins called providers to interact with cloud providers, SaaS providers, and other APIs.' Providers are distributed separately from Terraform Core with their own release cycles.
How Terraform Works with Plugins
HashiCorp Developer Accessed: 2026-05-12
Source
Technical description of the plugin architecture — providers as separate binaries communicating with Terraform Core via RPC. Confirms providers can be written by any party. Companion to the Providers doc above.

Setting Your Own House on Fire — Unity Runtime Fee

7 sources

Unity Pricing & Packaging Updates
Unity Blog Matt Bromberg 2024 Accessed: 2026-05-12
Source
This URL originally contained the runtime fee announcement (September 12, 2023). Unity subsequently replaced the content with the cancellation notice — the original announcement is no longer available at this address.
Unity Apologizes, Announces Revised Runtime Fee Criteria
The Register Thomas Claburn 2023 Accessed: 2026-05-12
Source Archive
Best account of the reversal — documents how quickly Unity was forced to retreat, revised thresholds, removal of retroactivity, and general chaos. Includes stock price context ($39 → $32 in one week).
John Riccitiello Retires — and Will Collect More Pay
SFGate Stephen Council 2023 Accessed: 2026-05-12
Source Archive
Documents Riccitiello's departure from Unity and detailed breakdown of his exit package (~$8.4M in stock options). Most credible source for a specific figure used in the book.
Mega Crit Statement on Unity Runtime Fee
X (Twitter) Mega Crit 2023 Accessed: 2026-05-12
Source Archive
Immediate official response from Slay the Spire developer (September 13, 2023): 'We have never made a public statement before. That is how badly you fucked up.' Includes announcement of migration to a new engine.
Terraria Dev 'Unequivocally Condemns' Unity Fee Changes, Donates Over $200,000 to Other Game Engines
GamesRadar Dustin Bailey 2023 Accessed: 2026-05-12
Source Archive
Re-Logic (Terraria developer) declared it would never use Unity again and donated $200,000 to alternative engines including Godot. One of the most financially significant gestures of the protest.
EA Talks 'Worst Company in America' Controversy and How It's Changing
GameSpot Eddie Makuch 2015 Accessed: 2026-05-12
Source Archive
Retrospective quoting EA interim CEO Larry Probst after two consecutive 'Worst Company in America' titles (2012 and 2013). Confirms both titles and management's reaction.
John Riccitiello – Wikipedia
Wikipedia Accessed: 2026-05-12
Source Archive
Secondary reference for EA 'Worst Company in America' titles (2012, 2013). The original Consumerist voting results are archived but the site is defunct — Wikipedia's EA section footnotes point to the original Consumerist articles.

25

The Monoculture

5 Topics

Explainers

Intro — Monoculture and the Irish Potato Famine

5 sources

Monoculture and the Irish Potato Famine: Cases of Missing Genetic Variation
Understanding Evolution (UC Berkeley) Accessed: 2026-05-12
Source Archive
4 Factors That Made the Great Irish Potato Famine So Deadly
The Collector Greg Pasciuto 2026 Accessed: 2026-05-12
Source
How Infection Shaped History: Lessons from the Irish Famine
PubMed Central / NLM William G. Powderly 2019 Accessed: 2026-05-12
Source Archive
Why Are Phytophthora and Other Oomycota not True Fungi?
American Phytopathological Society Amy Y. Rossman, Mary E. Palm 2002 Accessed: 2026-05-12
Source Archive
Phytophthora infestans (Late Blight)
Ephytia / INRAE Accessed: 2026-05-12
Source Archive
French agricultural research agency. Concise and direct: 'Phytophthora infestans which is not a fungus but a water mould also known as oomycete.'

Case Studies

The Attack of the Clones — Browser Engine Monoculture

22 sources

Comparison of Browser Engines
Grokipedia Accessed: 2026-05-12
Source Archive
History of Web Browser Engines from 1990 until Today
eylenburg.github.io Alphonse Eylenburg 2025 Accessed: 2026-05-12
Source Archive
Comprehensive visual timeline of browser engine history and consolidation.
Why Electron
Electron.js (Official Docs) Accessed: 2026-05-12
Source Archive
What is Electron?
Electron.js (Official Docs) Accessed: 2026-05-12
Source Archive
Electron (Software Framework)
Wikipedia Accessed: 2026-05-12
Source Archive
Building Hybrid Applications with Electron
Slack Engineering Blog Anaïs Betts 2016 Accessed: 2026-05-12
Source Archive
Slack's own account of adopting Electron for its desktop application.
Chromium Embedded Framework
Wikipedia Accessed: 2026-05-12
Source Archive
That Native App Is Probably Just an Old Web Browser
How-To Geek Chris Hoffman 2019 Accessed: 2026-05-12
Source Archive
Accessible overview of CEF (Chromium Embedded Framework) prevalence in desktop applications.
Case Study: Analyzing Notion App Performance
3perf.com Ivan Akulov 2020 Accessed: 2026-05-12
Source Archive
Confirms Notion's use of Electron. Detailed performance analysis.
Warcraft 3 Reforged: Chromium, Performance, Sadness
Medium Stepan Zharychev 2020 Accessed: 2026-05-12
Source Archive
Analysis of Warcraft 3 Reforged's use of a Chromium-based renderer and resulting performance issues.
Sea of Thieves — Powered by Coherent Labs
Coherent Labs Accessed: 2026-05-12
Source Archive
Example of a AAA game using a Chromium-based UI rendering solution.
How Browsers Work
MDN Web Docs Accessed: 2026-05-12
Source Archive
Official, technical, authoritative overview of browser architecture and rendering pipeline.
Role of Rendering Engines in Browsers
BrowserStack 2023 Accessed: 2026-05-12
Source Archive
More accessible description of browser rendering engine architecture. Companion to the MDN article above.
What's the Difference Between an Engine and a Framework?
Stack Overflow 2011 Accessed: 2026-05-12
Source Archive
Community discussion with usable definitions distinguishing engines, frameworks, and libraries.
GameDev Glossary: Library vs Framework vs Engine
GameFromScratch 2015 Accessed: 2026-05-12
Source Archive
Accessible explainer with concrete examples distinguishing library, framework, and engine concepts.
'Web Environment Integrity' Is an All-Out Attack on the Free Internet
Free Software Foundation Greg Farough 2023 Accessed: 2026-05-12
Source Archive
FSF's response to Google's Web Environment Integrity API proposal — one of several examples of Google's unilateral influence over web standards.
Google Bins Integrity API That Looked More Than a Bit Like Horrible DRM for Websites
The Register Thomas Claburn 2023 Accessed: 2026-05-12
Source Archive
Google ultimately abandoned the Web Environment Integrity proposal following widespread backlash.
Google Outlines Why They Are Removing JPEG-XL Support From Chrome
Phoronix Michael Larabel 2022 Accessed: 2026-05-12
Source Archive
Documents Google's unilateral decision to drop JPEG-XL from Chrome despite community support for the format.
Google Rekindles Relationship with Jilted JPEG XL Image Format
The Register Thomas Claburn 2026 Accessed: 2026-05-12
Source Archive
Google reversed its JPEG-XL decision. Illustrates the cycle of unilateral moves and reversals.
Google AMP Is Dead! AMP Pages No Longer Get Preferential Treatment in Google Search
Plausible Blog Marko Saric 2021 Accessed: 2026-05-12
Source Archive
Overview of the AMP saga — Google's proprietary web framework that received artificial search ranking boosts, and its eventual demise.
Chrome OS Did Lots of Growing Up in Its First Decade
Fast Company JR Raphael 2021 Accessed: 2026-05-12
Source Archive
History of Chrome OS from its first release to the tenth anniversary. Good quotes about the original vision.
ChromeOS
Wikipedia Accessed: 2026-05-12
Source Archive
Factual reference — dates, 2009 announcement, origin story.

WYSIUE — What You See Is Unreal Engine

15 sources

The Big Game Engine Report of 2025
VGInsights / Sensor Tower 2025 Accessed: 2026-05-12
Source Archive
Primary data source for game engine market share statistics.
Unreal Engine Dominates as the Most Successful Game Engine
Creative Bloq Joe Foley 2025 Accessed: 2026-05-12
Source Archive
More accessible discussion of the VGInsights report findings.
Groundbreaking LED Stage Production Technology Created for The Mandalorian
ILM 2020 Accessed: 2026-05-12
Source Archive
Official ILM press release on StageCraft technology. Cites all partners and describes the virtual production pipeline powered by Unreal Engine.
How ILM's Innovative StageCraft Tech Created a Star Wars Virtual Universe
IndieWire Bill Desowitz 2020 Accessed: 2026-05-12
Source Archive
Narrative account of the StageCraft technology. Good background companion to the ILM press release.
Unreal Engine for Film & Television
Epic Games / Unreal Engine Accessed: 2026-05-12
Source Archive
Official Epic Games overview of Unreal Engine use cases across industries: automotive, film, architecture, broadcast.
Unreal Engine Is Revolutionising Other Creative Industries
The Marketing Society Stephen Barnes 2021 Accessed: 2026-05-12
Source Archive
Covers BMW and other cross-industry Unreal Engine adoption examples.
Beyond Gaming 2024: How Unreal Engine 5 Has Revolutionized the Market
The Native Creation Rick Canfield 2024 Accessed: 2026-05-12
Source Archive
Film, aerospace, architecture, automotive — concrete examples of UE5 adoption beyond gaming.
Cyberpunk 2077 Director Says Studio's Switch from REDengine to Unreal Engine 5 Isn't Starting from Scratch
PC Gamer Wes Fenlon 2023 Accessed: 2026-05-12
Source Archive
Interview with Gabe Amatangelo. Best qualitative account of CD Projekt RED's decision to abandon REDengine for UE5.
Why CD Projekt Red's Switch to Unreal Engine Is a Big Deal
Super Jump Magazine Alex Antra 2022 Accessed: 2026-05-12
Source Archive
Broader industry context — other studios making the same switch, implications for the market.
What Is the Cause of the 'Unreal Look' and How Can I Get Rid of It?
Reddit 2026 Accessed: 2026-05-12
Source
Community discussion confirming that practitioners recognize and debate the 'Unreal look' phenomenon.
Does Unreal Engine 5 Make Remastered Video Games Worth Buying Again? | Corridor Cast
YouTube / Corridor Crew Corridor Crew 2025 Accessed: 2026-05-12
Source
Corridor Crew regularly discusses and critiques the 'Unreal Engine look' — identifying it as an aesthetic where virtual productions appear too sterile, plasticky, or 'cheap' despite high technical quality.
The 'Unreal Engine Look' Is a Myth
YouTube / Criteon Criteon 2026 Accessed: 2026-05-12
Source
Counterargument to the 'Unreal look' thesis. Useful for presenting a balanced view.
'Greatest Slip Backwards': Pirates of the Caribbean Director Says Unreal Engine Makes Films Look Too Much Like Games
Video Games Chronicle Chris Scullion 2026 Accessed: 2026-05-12
Source Archive
Director Gore Verbinski's public criticism of Unreal Engine aesthetics in film production. January 2026.
Director Gore Verbinski Says Unreal Engine Is 'the Greatest Slip Backwards' for Movie CGI
PC Gamer Christopher Livingston 2026 Accessed: 2026-05-12
Source Archive
The REAL Reason Unreal Engine VFX Looks FAKE
YouTube / Joshua M Kerr Joshua M Kerr 2024 Accessed: 2026-05-12
Source

PostgreSQL — The One That Opposes

13 sources

Stack Overflow Developer Survey 2024 — Technology
Stack Overflow 2024 Accessed: 2026-05-12
Source Archive
Official survey results. PostgreSQL ranked #1 most-used database for the second consecutive year.
2024 Stack Overflow Survey Names Postgres the Developers' Favorite Database
EnterpriseDB Blog Marc Linster 2024 Accessed: 2026-05-12
Source Archive
Commentary on the survey results with historical context.
PostgreSQL Core Team
PostgreSQL.org Accessed: 2026-05-12
Source Archive
Official description of the Core Team structure — 7 members, roles, and governance model.
PostgreSQL Community Association (PGCA)
PostgreSQL Community Association Accessed: 2026-05-12
Source Archive
Official page of the non-profit legal entity behind the project.
Bruce Momjian Blog — October 2020 (EDB/2ndQuadrant Crisis)
momjian.us Bruce Momjian 2020 Accessed: 2026-05-12
Source Archive
Documents the EDB/2ndQuadrant conflict and the unwritten 50% contributor rule that emerged from it.
Is It Time to Modernize PostgreSQL Core?
postgresql.fund Álvaro Hernández 2020 Accessed: 2026-05-12
Source Archive
In-depth discussion of PostgreSQL governance, the majority rule, and structural tensions within the project.
Who Is in Charge of Postgres?
EnterpriseDB Blog Bruce Momjian 2024 Accessed: 2026-05-12
Source Archive
Accessible explanation of the decentralized power model in the PostgreSQL project.
JSONB in PostgreSQL Today and Tomorrow
InfoWorld Oleg Bartunov 2022 Accessed: 2026-05-12
Source Archive
History of JSON support from PostgreSQL 9.2 to 9.4, written from the developer perspective. Part of the JSON/JSONB (2012→2014) evolution story.
What's New in PostgreSQL 9.4
PostgreSQL Wiki 2014 Accessed: 2026-05-12
Source Archive
Official documentation of JSONB introduction in PostgreSQL 9.4.
The Evolution of Logical Replication in PostgreSQL: A Firsthand Account
EnterpriseDB Blog Petr Jelinek 2025 Accessed: 2026-05-12
Source Archive
History written by one of the primary contributors — from logical decoding in 9.4 to full logical replication in v10 (2014→2017).
PostgreSQL 10 Released
PostgreSQL.org 2017 Accessed: 2026-05-12
Source Archive
Official press release. Contains Core Team quote about 'years of work' on logical replication.
History of the Potato
Wikipedia Accessed: 2026-05-12
Source Archive
Reference for footnote on potato history — dates, initial European suspicion, spread across the continent.
The Early History of the Potato in Europe
Springer Nature J. G. Hawkes, J. Francisco-Ortega 1993 Accessed: 2026-05-12
Source Archive
Academic source for the potato footnote.

The Seed Bank at the End of the World — GitHub Concentration Risk

3 sources

Microsoft Has Acquired GitHub for $7.5B in Microsoft Stock
TechCrunch Frederic Lardinois, Ingrid Lunden 2018 Accessed: 2026-05-12
Source Archive
Git Turns 20: A Q&A with Linus Torvalds
GitHub Blog Taylor Blau 2025 Accessed: 2026-05-12
Source Archive
Torvalds on Git's decentralization philosophy — ideal source for the irony of the world's decentralized version control system being centralized on a single Microsoft-owned platform.
GitHub Repository Statistics
Gitnux Emilia Santos 2026 Accessed: 2026-05-12
Source
Source for the 500M+ repositories figure and other GitHub scale statistics.
At the time of access, the article had last been updated on May 11.

26

Digital Feudalism

3 Topics

Case Studies

Sherlocking: Your Success Is Our Backlog

12 sources

The Long Story Behind Karelia's New Logo
Karelia Software Blog Dan Wood 2015 Accessed: 2026-05-12
Source Archive
Primary source for the entire Sherlock/Watson story: Dan Wood's own account, including his paraphrase of the conversation with Jobs.
Note: At the time of access, the original article had an invalid SSL certificate. The Web Archive version is recommended.
Karelia Watson
Wikipedia Accessed: 2026-05-12
Source Archive
Concise factual summary of the Watson/Sherlock case.
What Does It Mean When Apple Sherlocks an App?
How-To Geek Justin Pot 2017 Accessed: 2026-05-12
Source Archive
Good overview with the key quote and historical context explaining how the term entered tech vocabulary.
Tile Bashes Apple's New AirTag as Unfair Competition
TechCrunch Sarah Perez 2021 Accessed: 2026-05-12
Source Archive
Tile's reaction on the day of the AirTag announcement. Covers the U1/UWB access issue and unfair competition claims.
Tile Bemoans Apple AirTags Launch, Raises Antitrust Concerns
AppleInsider Mike Peterson 2021 Accessed: 2026-05-12
Source Archive
Fuller context for Tile's antitrust allegations against Apple.
Apple AirTags Compete with Tile and Invites Antitrust Scrutiny
Washington Post Reed Albergotti 2021 Accessed: 2026-05-12
Source
Regulation (EU) 2022/1925 — Digital Markets Act
EUR-Lex 2022 Accessed: 2026-05-12
Source Archive
Full official text of the Digital Markets Act.
American Innovation and Choice Online Act
Wikipedia Accessed: 2026-05-12
Source Archive
Footnote reference for AICOA — factual summary of the proposed US legislation.
AICOA's Failure and the Future of Competition Policy in Congress
Project DisCo 2023 Accessed: 2026-05-12
Source Archive
Footnote reference. Analysis of why AICOA failed to pass.
An Overview of Ultra-WideBand Standards and Organizations
arXiv Dieter Coppens, Adnan Shahid, Sam Lemey, Ben Van Herbruggen, Chris Marshall, Eli De Poorter 2022 Accessed: 2026-05-12
Source Archive
Academic source confirming that not all UWB chips are open to third-party developers, limiting possible applications — the technical basis for the Tile/AirTag competitive asymmetry.
Apple U1 Chip Explained: What Is It and What Can It Do?
Pocket-lint Elyse Betters Picaro 2022 Accessed: 2026-05-12
Source Archive
Accessible explanation of time-of-flight technology, Precision Finding, and the AirTag context.
Confirmed: Apple Developed Exclusive Tech for the U1 Ultra Wideband Radio
iFixit Craig Lloyd 2019 Accessed: 2026-05-12
Source Archive
Technical analysis of the U1 chip including UWB history and the proprietary nature of Apple's implementation.

The 72-Hour Heist — Scale Over Soul

10 sources

Facebook Offered $3 Billion for Snapchat. Evan Spiegel Said No.
Slate Will Oremus 2013 Accessed: 2026-05-12
Source Archive
Primary account of the rejected acquisition offer, based on the original Wall Street Journal report.
Snap Inc.
Wikipedia Accessed: 2026-05-12
Source Archive
Concise factual reference for the $3B offer and other key Snap milestones.
As Calls Grow to Split Up Facebook, Employees Explain Why the Instagram Deal Happened
CNBC Salvador Rodriguez 2019 Accessed: 2026-05-12
Source Archive
Firsthand account of the acquisition process: Zuckerberg invited Systrom to his Palo Alto home, and both completed the deal documents over a weekend with a single lawyer present.
Facebook to Acquire Instagram
TechCrunch 2012 Accessed: 2026-05-12
Source Archive
Original day-of-announcement coverage of the $1 billion acquisition.
Instagram Launches Stories
TechCrunch Josh Constine 2016 Accessed: 2026-05-12
Source Archive
Original launch coverage (August 2, 2016). Includes user numbers at the time: Instagram 500M total, Snapchat 150M DAU.
Snapchat Growth Slowed 82% After Instagram Stories Launched
TechCrunch Josh Constine 2017 Accessed: 2026-05-12
Source Archive
Key source for the 82% growth slowdown figure, based on Snap's IPO documents.
Snapchat vs. Instagram vs. Facebook Stories
Social Media Today 2017 Accessed: 2026-05-12
Source Archive
Source confirming Instagram surpassed Snapchat's daily active story users within eight months of launch.
Instagram Stories Turns 1 as Daily Use Surpasses Snapchat
TechCrunch Josh Constine 2017 Accessed: 2026-05-12
Source Archive
One-year anniversary coverage with user comparison figures in the context of Snap's IPO.
Perplexity to Pay Snap $400M to Power Search in Snapchat
TechCrunch Ivan Mehta 2025 Accessed: 2026-05-12
Source Archive
Snap and Perplexity Partner to Bring Conversational AI Search to Snapchat
Snap Investor Relations 2025 Accessed: 2026-05-12
Source Archive
Official Snap press release for the Perplexity partnership.

The Vampire Strategy: The Extraction of Open Source

13 sources

Worldwide Market Share of Leading Cloud Infrastructure Service Providers
Statista Felix Richter 2026 Accessed: 2026-05-12
Source Archive
Quarterly chart showing AWS + Azure + GCP + Alibaba accounting for approximately 70% of the cloud infrastructure market.
Redis Pulls Back on Open Source Licensing, Citing Stingy Cloud Services
The New Stack Joab Jackson, Lawrence E Hecht 2018 Accessed: 2026-05-12
Source Archive
Redis Labs: 'Cloud providers contribute very little (if anything) to those open source projects.' Core source for the Vampire Strategy mechanism.
AWS Managed Kafka Launches: Is It 'Strip Mining' Open Source?
Tech Monitor 2018 Accessed: 2026-05-12
Source Archive
Redis Labs CMO: 'AWS are simply poaching open source investment.' The 'strip mining' framing.
The Consequences of a Changing Open-Source Software Business Model
CMSWire Virginia Backaitis 2018 Accessed: 2026-05-12
Source Archive
Broader context covering Kafka, Redis, MongoDB and the managed services dynamic.
AWS DocumentDB Is Not MongoDB-Compatible, Says MongoDB Inc
The Register Tim Anderson 2021 Accessed: 2026-05-12
Source Archive
MongoDB CTO: DocumentDB is '34 per cent compatible' and a 'Frankenbase'. Primary source for the Amazon DocumentDB vs MongoDB case.
Comparing Amazon DocumentDB and MongoDB
MongoDB Accessed: 2026-05-12
Source Archive
MongoDB's official comparison — naturally one-sided but useful for specific technical incompatibility claims.
Elastic License Update
Elastic Blog Steve Kearns 2021 Accessed: 2026-05-12
Source Archive
Official Elastic announcement of the license change away from Apache 2.0 (January 2021). Starting point of the Elasticsearch/OpenSearch saga.
AWS Transfers OpenSearch to the Linux Foundation
The New Stack Steven J. Vaughan-Nichols 2024 Accessed: 2026-05-12
Source Archive
History of the AWS OpenSearch fork and its eventual transfer to the Linux Foundation.
Linux Foundation Announces OpenSearch Software Foundation
Linux Foundation 2024 Accessed: 2026-05-12
Source Archive
Official announcement of the OpenSearch Software Foundation under the Linux Foundation (September 2024).
OpenSearch (Software)
Wikipedia Accessed: 2026-05-12
Source Archive
Concise history of the Elasticsearch fork.
Kubernetes
Wikipedia Accessed: 2026-05-12
Source Archive
History of Kubernetes' creation at Google and donation to CNCF. Wikipedia used as reference here given strong community maintenance of open source project articles, and verified against author's professional experience.
Go (Programming Language)
Wikipedia Accessed: 2026-05-12
Source Archive
Reference for Go's origin at Google — part of the 'home field advantage' context.
Building a High Growth Business by Monetizing Open Source Software
Yugabyte Blog Sid Choudhury 2018 Accessed: 2026-05-12
Source Archive
Optional background. Good context for the Databricks/Azure dynamic and open source monetization models generally.

Part 7

Garbage Harvest

4 Chapters

##

Introduction

1 Topic

Part Introduction

Garbage In, Garbage Out — Origin of the Phrase

2 sources

Garbage In, Garbage Out
Wikipedia Accessed: 2026-05-12
Source Archive
History of the phrase, first known print usage (1957), George Fuechsel, context from the early years of computing.
Is This the First Time Anyone Printed 'Garbage In, Garbage Out'?
Atlas Obscura Rob Stenson 2016 Accessed: 2026-05-12
Source Archive
Best narrative account of the phrase's history. Documents the earliest known printed occurrence in the Times Daily of Hammond, Indiana (November 10, 1957) and the Fuechsel story. Most thorough source found on the topic — the author searched press archives and appears to have corrected the existing historical record.

27

Methodology

6 Topics

Explainers

Survey Design & Sampling Bias: How to Accidentally Measure the Wrong Universe

6 sources

Study Bias
StatPearls / PubMed Central Aleksandar Popovic, Martin R. Huecker 2023 Accessed: 2026-05-14
Source
Distinguishes random error from systematic error (bias), defines selection bias and sampling bias in a research context.
Selection Bias
UNC Chapel Hill / ERIC Lorraine K. Alexander, Brettania Lopes, Kristen Ricchetti-Masterson, Karin B. Yeatts 2015 Accessed: 2026-05-14
Source Archive
Classic epidemiological definition of selection bias.
What Is Selection Bias?
Scribbr Kassiani Nikolopoulou 2022 Accessed: 2026-05-14
Source Archive
Definition, types, and examples of selection bias. More accessible than the UNC source.
Sampling Bias: Identifying and Avoiding Bias in Data Collection
Eval Academy Sheldon Kallio 2022 Accessed: 2026-05-14
Source Archive
Practical discussion with concrete examples.
A Compounding Threat: The True Cost of Poor Data Quality
IBM Tom Krantz, Alexandra Jonker 2026 Accessed: 2026-05-14
Source Archive
The Costs of Poor Data Quality
Anodot Accessed: 2026-05-14
Source Archive
References Gartner estimates on the financial impact of poor data quality.

Asking the Question Wrong (And Calling It Data)

6 sources

The Dreaded Double-Barreled Question & How to Avoid It
Qualtrics 2022 Accessed: 2026-05-14
Source Archive
Comprehensive overview of common survey design errors: leading questions, assumptive questions, double-barreled questions.
What Are Double-Barrelled Questions in Survey Design and How to Avoid Them
Kantar Meghan Bazaman 2025 Accessed: 2026-05-14
Source Archive
Clear distinction: double-barreled = ambiguity, leading = bias.
The Double-Barrel Survey Question and Other Survey Mistakes
SurveyMonkey Abigail Matsumoto Accessed: 2026-05-14
Source Archive
Practical examples including loaded questions.
Ascending vs. Descending Order of Response Options
Boise State University OER Seung Youn (Yonnie) Chyung 2025 Accessed: 2026-05-14
Source Archive
Academic discussion of primacy effect, left-side selection bias, and acquiescence bias in Likert scales.
Evidence-Based Survey Design: The Use of Ascending or Descending Order of Likert-Type Response Options
Wiley / Performance Improvement Seung Youn (Yonnie) Chyung, Megan Kennedy, Ingrid Campbell 2018 Accessed: 2026-05-14
Source Archive
Peer-reviewed study demonstrating that response option ordering significantly affects survey results.
Likert Scale Questionnaire
Simply Psychology Saul McLeod 2025 Accessed: 2026-05-14
Source Archive
Accessible overview of Likert scales, response order effects, and their impact on results.

What Are We Actually Measuring?

11 sources

The Need for Mobile Speed
Google / DoubleClick 2019 Accessed: 2026-05-14
Source Archive
Original Google report. Primary source for the 53% abandonment / 3-second threshold statistic. Available as PDF.
The Need for Mobile Speed
Google Ad Manager Blog Alex Shellhammer, Juliette Neel 2016 Accessed: 2026-05-14
Source Archive
Google's own blog post promoting the report findings.
What Is Bounce Rate? Definition, Benchmarks, and How to Reduce It
Kissmetrics 2026 Accessed: 2026-05-14
Source Archive
Shows that bounce rate — the metric underpinning the 53% statistic — cannot distinguish satisfied from frustrated users. Google Analytics 4 introduced 'engaged sessions' precisely because the classic metric was misleading. Good counterpoint to cite alongside the Google report.
Improving Ratings: Audit in the British University System
Cambridge University Press / European Review Marilyn Strathern 1997 Accessed: 2026-05-14
Source Archive
The article in which the popular formulation of Goodhart's Law appears: 'When a measure becomes a target, it ceases to be a good measure.' Published in European Review, Vol. 5(3), pp. 305–321.
The Tyranny of Metrics
Princeton University Press Jerry Z. Muller 2018 Accessed: 2026-05-14
Source
The most widely read contemporary book on metric pathologies. Connects Goodhart's Law to case studies from education, medicine, the military, and business. ISBN: 9780691174952.
Goodhart's Law
Wikipedia Accessed: 2026-05-14
Source Archive
Time to First Byte (TTFB)
Google web.dev Barry Pollard, Jeremy Wagner 2025 Accessed: 2026-05-14
Source Archive
Official Google documentation defining TTFB and its relationship to FCP and LCP. Authoritative and kept up to date.
Core Web Vitals
Google web.dev Accessed: 2026-05-14
Source Archive
Main documentation for Google's UX metric suite: LCP, INP, CLS — each measuring a different aspect of user experience.
Key Web Performance Metrics in 2024
Shopify Engineering Blog Sia Karamalegos 2022 Accessed: 2026-05-14
Source Archive
Practical breakdown of the metric hierarchy (TTFB → FCP → LCP → INP) from an e-commerce engineering perspective. Illustrates precisely the ambiguity of 'loading time' discussed in this chapter.
Don't Be Seduced by the Allure: A Guide for How (Not) to Use Proxy Metrics in Experiments
Analytics at Meta / Medium 2022 Accessed: 2026-05-14
Source Archive
Well-written practical guide on the pitfalls of proxy metrics in A/B testing, written by Meta engineers with strong methodological grounding. Ideal for the 'we measure what we can, not what we want' argument.
On the Tyranny of Metrics and Metric Fixation
Towards Data Science Eryk Lewinson 2020 Accessed: 2026-05-14
Source
Accessible synthesis of Muller, Goodhart's Law, and Campbell's Law with examples. Useful bridge between technical and popular-science audiences.

Case Studies

Ten Million Envelopes to Nowhere — The Literary Digest Poll

6 sources

Landon in a Landslide: The Poll That Changed Polling
History Matters / George Mason University 1936 Accessed: 2026-05-14
Archive
At the time of composing this list, the original article was unavailable (HTTP 404), but its content had been archived by the Wayback Machine.
Contains the original Literary Digest text from October 31, 1936. Most accessible source with the verbatim forecast.
1936 United States Presidential Election
Wikipedia Accessed: 2026-05-14
Source Archive
Election results, Gallup and Digest comparison, with references to primary sources. Acceptable as a supplementary reference.
Why the 1936 Literary Digest Poll Failed
Oxford University Press / Public Opinion Quarterly Peverill Squire 1988 Accessed: 2026-05-14
Source Archive
The most important academic source for this case. The only empirical study of the failure's causes, based on 1937 Gallup data. Concludes that non-response bias — not sampling bias — was the dominant cause. Vol. 52, No. 1, Spring 1988, pp. 125–133. Also available via JSTOR (https://www.jstor.org/stable/2749114) and as PDF (https://criticalthinkingtext.wordpress.com/wp-content/uploads/2017/02/squire-literary-digest.pdf).
Inside the Alluring Power of Public Opinion Polls From Elections Past
Smithsonian Magazine Jackie Mansky 2016 Accessed: 2026-05-14
Source Archive
Good historical context for the Gallup breakthrough. Credible popular-science source.
That Time the Literary Digest Poll Got the 1936 Election Wrong
ProQuest Blog 2016 Accessed: 2026-05-14
Source Archive
References original press archives from the Boston Globe. Interesting detail: Gallup publicly predicted his own accuracy in advance.
The Literary Digest Poll — State-by-State Data
University of Pennsylvania / randomservices.org Accessed: 2026-05-14
Source Archive
Raw state-by-state Digest data. Useful for readers who want to see the scale of the error directly.

Love in the Time of A/B Testing — OKCupid & Facebook

8 sources

We Experiment On Human Beings!
OKTrends (archived by gwern.net) Christian Rudder 2014 Accessed: 2026-05-14
Source
Original blog post, no longer available at the original URL (blog.okcupid.com) — full text archived here. Only source containing the complete original text with all figures: 44%, 2,200 conversations, match counts.
OKCupid Plays With Love in User Experiments
The New York Times Molly Wood 2014 Accessed: 2026-05-14
Source Archive
First major press account of the story. Frequently cited in academic literature.
OkCupid Co-Founder: 'We Experiment on Human Beings…That's How Websites Work'
Newsweek Taylor Wofford 2014 Accessed: 2026-05-14
Source Archive
Direct interview with Rudder. Good quotes for context.
Experimental Evidence of Massive-Scale Emotional Contagion Through Social Networks
PNAS Adam D. I. Kramer, Jamie E. Guillory, Jeffrey T. Hancock 2014 Accessed: 2026-05-14
Source Archive
The original Facebook study. Its publication in PNAS was itself part of the scandal.
Editorial Expression of Concern
PNAS Inder M. Verma 2014 Accessed: 2026-05-14
Source Archive
Official PNAS editorial concern regarding the lack of informed consent from participants.
Facebook, OkCupid User Experiments: Ethics Aside, They Show Us the Limitations of Big Data
Slate David Auerbach 2014 Accessed: 2026-05-14
Source Archive
Useful analysis covering both scandals simultaneously.
The Ethics of OKCupid's Dating Experiment
Luvze Dylan Selterman 2014 Accessed: 2026-05-14
Source Archive
Brief, accessible ethical analysis by a psychologist. Compares both experiments and explains the IRB question.
Letter to OKCupid Requesting IRB Documentation
James Grimmelmann (personal site) James Grimmelmann, Leslie Meltzer Henry 2014 Accessed: 2026-05-14
Source Archive
Formal legal letter requesting access to IRB protocols under Maryland state law. For readers interested in the legal dimension.

When the Algorithm Catches the Flu — Google Flu Trends

6 sources

Detecting Influenza Epidemics Using Search Engine Query Data
Nature Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski, Larry Brilliant 2009 Accessed: 2026-05-14
Source
Original GFT paper. Published in Nature (online November 2008, print February 2009). Primary source for the entire case study. Nature 457, 1012–1014.
When Google Got Flu Wrong
Nature Declan Butler 2013 Accessed: 2026-05-14
Source Archive
February 2013 Nature article that first publicized the scale of GFT's failure. Starting point of the public story. Nature 494, 155–156.
The Parable of Google Flu: Traps in Big Data Analysis
Science David Lazer, Ryan Kennedy, Gary King, Alessandro Vespignani 2014 Accessed: 2026-05-14
Source Archive
The most important academic source for this case. Science article analyzing the causes of failure: algorithm dynamics, big data hubris, absence of CDC data as a corrective signal. Coined the terms 'parable of Google Flu' and 'big data hubris'. Science 343, 1203–1205.
Google Flu Trends Still Appears Sick: An Evaluation of the 2013–2014 Flu Season
Harvard / SSRN David Lazer, Ryan Kennedy, Gary King, Alessandro Vespignani 2014 Accessed: 2026-05-14
Source Archive
Follow-up analysis showing the problems persisted after Google's modifications.
What We Can Learn From the Epic Failure of Google Flu Trends
Wired David Lazer, Ryan Kennedy 2015 Accessed: 2026-05-14
Source Archive
Accessible synthesis by the authors of the key Science paper. Good companion to the academic source.
Google Flu Trends Failure Shows Drawbacks of Big Data
Time Bryan Walsh 2014 Accessed: 2026-05-14
Source Archive
Journalistic account with direct Lazer quotes. Includes the 'Dewey Defeats Truman' comparison.

28

Technical Collection Screwups

9 Topics

Explainers

When Numbers Look Right (But Aren't)

5 sources

International Vocabulary of Metrology (VIM), 3rd Edition
BIPM / JCGM Accessed: 2026-05-14
Source Archive
Official reference vocabulary for metrology. Defines measurement error, systematic error, random error, and uncertainty.
The authoritative document for the science of measurement — unmatched as a reference.
JCGM (Joint Committee for Guides in Metrology) is the international body, working under the auspices of BIPM (Bureau International des Poids et Mesures / International Bureau of Weights and Measures), responsible for this standard.
Observational Error
Wikipedia Accessed: 2026-05-14
Source Archive
Key definition: random error = noise, systematic error = bias. Good overview from a metrology and statistics perspective.
Random vs. Systematic Error
Scribbr Pritha Bhandari 2021 Accessed: 2026-05-14
Source Archive
Clear explanation with examples. Includes the key framing: random error is 'noise' that blurs the true signal of what's being measured.
Random vs. Systematic Error
University of Maryland Physics R. H. B. Exell Accessed: 2026-05-14
Source Archive
Classic academic treatment from a physical sciences and laboratory experiment perspective.
Metrology Part 1: Definition of Quality Criteria
PubMed Central / Critical Care Pierre Squara, Thomas W L Scheeren, Hollmann D Aya, Jan Bakker, Maurizio Cecconi, Sharon Einav, Manu L N G Malbrain, Xavier Monnet, Daniel A Reuter, Iwan C C van der Horst, Bernd Saugel 2020 Accessed: 2026-05-14
Source Archive
Defines bias, systematic error, and random error with medical examples and clear diagrams distinguishing noise from bias.
The only source in this group written strictly for scientists rather than engineers.

The Digital Vacuum Cleaner: Scraping, Crawlers, and the Art of Collecting Everything Except What You Wanted

8 sources

Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery
ResearchGate Soumen Chakrabarti, Martin van den Berg, Byron Dom 2000 Accessed: 2026-05-14
Source Archive
Classic paper on focused-crawler architecture. Foundational for the field and cited in hundreds of subsequent works.
Web Crawling and Scraping: A Survey
IEEE Xplore Gaurav Sharma 2024 Accessed: 2026-05-14
Source Archive
Contemporary academic survey covering definitions, methods, stages, and technologies.
RFC 9309 — Robots Exclusion Protocol
IETF M. Koster, G. Illyes, H. Zeller, L. Sassman 2022 Accessed: 2026-05-14
Source Archive
Official internet standard defining how crawlers communicate with site owners via robots.txt.
Also provides background for the concept of a crawler operating 'politely' — robots.txt is the mechanism of that politeness.
Web Scraping
Wikipedia Accessed: 2026-05-14
Source Archive
History, definitions, methods, and legal and technical considerations.
Scraper vs Crawler: When to Use Each
Firecrawl Blog Bex Tuychiev 2026 Accessed: 2026-05-14
Source
Clear engineering-perspective distinction between a crawler and a scraper.
Web Scraping vs. API: Which Is Best for Your Project?
ZenRows Yuvraj Chandra 2025 Accessed: 2026-05-14
Source Archive
Compares scraping with API as an official data access channel. Explains the fragility of scrapers in the face of HTML changes.
hiQ Labs v. LinkedIn
Wikipedia Accessed: 2026-05-14
Source Archive
Wikipedia article on the landmark case on the legality of scraping public data. Six years of litigation, ended in settlement. Background for the 'other side of a courtroom' framing.
How Google and Yelp Handle Fake Reviews and Policy Violations
Search Engine Land George Nguyen 2021 Accessed: 2026-05-14
Source Archive
Documents the scale of the click farm problem and platform responses.

Case Studies

When the Ocean 'Warmed' Because Someone Changed the Bucket

4 sources

A Large Discontinuity in the Mid-Twentieth Century in Observed Global-Mean Surface Temperature
Nature David W. J. Thompson, John J. Kennedy, John M. Wallace, Phil D. Jones 2008 Accessed: 2026-05-14
Source
Primary source for the entire case. Nature 453, 646–649.
Full text available as PDF via the author's site: https://www.atmos.colostate.edu/~davet/ao/ThompsonPapers/Thompson_etal_Nature2008.pdf
Climate Anomaly Is an Artefact
Nature News Quirin Schiermeier 2008 Accessed: 2026-05-14
Source
Short accessible summary of Thompson's findings for a general audience. Includes quotes from Phil Jones.
Of Buckets and Blogs
RealClimate Gavin Schmidt 2008 Accessed: 2026-05-14
Source Archive
Discussion by climate scientists explaining both the Thompson paper and the broader context of bucket/engine-intake corrections going back decades. Written for an educated non-specialist reader.
Identifying and Correcting the World War 2 Warm Anomaly in Sea Surface Temperature Measurements
EarthArXiv Duo Chan, Peter Huybers 2020 Accessed: 2026-05-14
Source Archive
More recent analysis confirming and refining Thompson's results. Reduces the WW2 warm anomaly by 0.26°C using more advanced statistical methods.

The Case of the Chilly Buoy: How Better Thermometers 'Froze' Global Warming

12 sources

Possible Artifacts of Data Biases in the Recent Global Surface Warming Hiatus
Science Thomas R. Karl, Anthony Arguez, Boyin Huang, Jay H. Lawrimore, James R. McMahon, Matthew J. Menne, Thomas C. Peterson, Russell S. Vose, Huai-Min Zhang 2015 Accessed: 2026-05-14
Source
Primary source for the entire case. The paper that effectively removed the 'hiatus' from the scientific debate. Science 348(6242), 1469–1472.
Assessing Recent Warming Using Instrumentally Homogeneous Sea Surface Temperature Records
Science Zeke Hausfather, Kevin Cowtan, Peter Jacobs, Gavin Schmidt 2017 Accessed: 2026-05-14
Source
Independent verification of Karl's results using buoy, satellite, and Argo data. Includes a clear explanation of the ship/buoy bias mechanism.
Global warming hiatus disproved — again
UC Berkeley News Robert Sanders 2017 Accessed: 2026-05-14
Source Archive
Accessible summary of the Hausfather et al. paper. Includes a brief interview with Zeke Hausfather on YouTube.
No 'Slowdown' in Global Surface Temperatures After All, Study Finds
Carbon Brief Roz Pidcock 2015 Accessed: 2026-05-14
Source Archive
Solid day-of-publication journalism. Quotes multiple independent climate scientists.
NOAA Was Right: We Have Been Underestimating Warming
Skeptical Science Zeke Hausfather, Kevin Cowtan, David C. Clarke, Peter Jacobs, Mark Richardson, Robert Rohde 2017 Accessed: 2026-05-14
Source Archive
Technical explanation of the ship intake vs. buoy data difference with charts. Explains exactly where the 0.12°C offset comes from.
NOAA Scientists Falsely Accused of Manipulating Climate Change Data
Snopes Alex Kasprak 2017 Accessed: 2026-05-14
Source Archive
Documents the Lamar Smith subpoena and NOAA's refusal to comply. Includes quotes from Zeke Hausfather as author of the independent verification study.
Argo, the 'Crown Jewel' of Ocean Observing Systems, Turns 25
NOAA 2024 Accessed: 2026-05-14
Source
Official NOAA source on the Argo program: history, scope, and technical specs — 4,000 floats, 2,000m depth, 10-day cycle.
Wärtsilä-Sulzer RTA96-C
Wikipedia Accessed: 2026-05-14
Source Archive
Footnote reference. 14-cylinder version rated at 80.08 MW (107,390 hp). First commercial deployment on Emma Mærsk (2006).
The World's Most Powerful Engine Enters Service
Wärtsilä 2006 Accessed: 2026-05-14
Source Archive
Footnote reference. Official manufacturer press release confirming 80,080 kW output.
A Review of Waste Heat Recovery from the Marine Engine with Highly Efficient Bottoming Power Cycles
ScienceDirect Sipeng Zhu, Kun Zhang, Kangyao Deng 2019 Accessed: 2026-05-14
Source Archive
Footnote reference. Peer-reviewed source confirming ~50% thermal efficiency, close to the Carnot limit — meaning roughly half the energy is lost as waste heat.
How Do Heat Demand and Energy Consumption Change When Households Transition from Gas Boilers to Heat Pumps in the UK
ScienceDirect Nicola Terry, Ray Galvin 2023 Accessed: 2026-05-14
Source
Footnote reference for household energy consumption context.
Your Average Gas and Electric Bill by House Size and Usage
British Gas Simon Wood 2026 Accessed: 2026-05-14
Source Archive
Footnote reference for household energy consumption benchmarks. Specific calculations are left to the reader.

The Invisible Confetti: When Science Measured Its Own Hands

2 sources

Avoiding and Reducing Microplastic False Positives from Dry Glove Contact
RSC / Analytical Methods Madeline E. Clough, Eduardo Ochoa Rivera, Abbygail M. Ayala, Rebecca L. Parham, Joseph Pennacchio, Henry E. Thurber, Andrew P. Ault, Ambuj Tewari, Anne J. McNeil 2026 Accessed: 2026-05-14
Source Archive
Primary source for the entire case.
Nitrile and Latex Gloves May Cause Overestimation of Microplastics, U-M Study Reveals
Michigan News Morgan Sherburne 2026 Accessed: 2026-05-14
Source Archive
Accessible summary of the Clough et al. paper. Includes quotes from lead author Madeline Clough.

Texas-Sized Assumptions: The Grid and Operational Context

7 sources

February 2021 Cold Weather Outages in Texas and the South Central United States (Final Report)
FERC / NERC 2021 Accessed: 2026-05-14
Source Archive
Federal final report. Confirms that 75.6% of outages resulted from frozen components and fuel supply problems. Describes the gas–power–gas feedback loop.
Note: At the time of writing, I encountered a 'Sorry, you have been blocked' message on the FERC site. I have kept the original link, but had to rely on the Wayback Machine version.
Update to April 6, 2021 Preliminary Report on Causes of Generator Outages and Derates
ERCOT 2021 Accessed: 2026-05-14
Archive
Official ERCOT breakdown of generator outage causes by category.
Note: ERCOT's security service is now blocking direct access to the report. It is still available via the Wayback Machine.
The Timeline and Events of the February 2021 Texas Electric Grid Blackouts
University of Texas at Austin Energy Institute 2021 Accessed: 2026-05-14
Source Archive
Most detailed independent academic analysis. Includes full timeline, frequency drop data, and analysis of nameplate capacity vs. actual winter performance.
Texas Still Not Recognizing the Full Death Toll
BuzzFeed News Peter Aldhous, Zahra Hirji 2022 Accessed: 2026-05-14
Source Archive
CDC excess mortality analysis estimating 700+ deaths as the upper bound of official undercounting.
Winter Storm Uri Death Toll — Texas DSHS Final Report
Insurance Journal 2022 Accessed: 2026-05-14
Source Archive
Official Texas DSHS figure of 246 deaths with breakdown by cause, as cited by Insurance Journal (January 2022).
Average Texas Electricity Prices Were Higher in February 2021
U.S. Energy Information Administration Anodyne Lindstrom, Alex Gorski 2021 Accessed: 2026-05-14
Source Archive
Official EIA data: average price $22/MWh in 2020, $9,000/MWh cap held for 77 hours during the crisis.
Texas' Power Grid Was 4 Minutes and 37 Seconds Away From Collapsing
KUT / NPR Austin Matt Largey 2021 Accessed: 2026-05-14
Source Archive
Based on ERCOT board presentation. Quotes CEO Bill Magness. Explains the 59.4 Hz frequency collapse mechanism.

The Overzealous Vacuum — Google Street View WiFi Sniffing

6 sources

Notice of Apparent Liability — Google Street View Wi-Fi Investigation (DA 12-592)
FCC / Public Intelligence 2012 Accessed: 2026-05-14
Source Archive
Primary source for everything: Engineer Doe, the Fifth Amendment invocation, seven engineers, the $25,000 fine for obstructing the investigation. Full text, partially unredacted.
Google Releases Full Report on Street View Investigation, Finds That Staff Knew About Wi-Fi Sniffing
TechCrunch Peter Ha 2012 Accessed: 2026-05-14
Source Archive
Day-of-publication coverage. Sets Google's PR blog version against the FCC findings.
Google KNEW Street View Cars Were Slurping Wi-Fi
The Register Andrew Orlowski 2012 Accessed: 2026-05-14
Source Archive
Concise and pointed. The subheading 'Wheels fall off one rogue engineer claim' says it all.
Is It Time To Stop Trusting Google?
Slate Farhad Manjoo 2012 Accessed: 2026-05-14
Source Archive
Best narrative analysis of the FCC report. Identifies Engineer Doe as Marius Milner. Describes the senior manager email: 'Are you saying that these are URLs that you sniffed out of Wi-Fi packets?'
Google Will Pay $7 Million to Settle Street View Data Capturing Case
NPR Eyder Peralta 2013 Accessed: 2026-05-14
Source Archive
Settlement with 38 state attorneys general. Quotes AG Schneiderman's statement.
Investigations of Google Street View
EPIC Accessed: 2026-05-14
Source Archive
Chronological compilation of all investigations: UK, Switzerland, Austria, Hungary, Canada, Hong Kong, and others.

Hacking Reality with a Handcart — Simon Weckert's 99 Phones

8 sources

Google Maps Hacks
simonweckert.com Simon Weckert 2020 Accessed: 2026-05-14
Source Archive
Primary source: project description with video and screenshots of specific streets (Schillingbrücke, Tucholskystraße with 'Google Berlin' label, Michaelbrücke).
Google Maps Hacks
YouTube / Simon Weckert Simon Weckert 2020 Accessed: 2026-05-14
Source
Recording of the performance showing streets turning red in real time.
Interview: The (Real) Art of Faking Google Maps Traffic
Android Authority Tristan Rayner 2020 Accessed: 2026-05-14
Source Archive
Interview with Weckert that confirms authenticity and describes a year of planning, 99 Google accounts and SIM cards, and the requirement for the cart to keep moving. Includes Google's reaction.
A Guy Carted 99 Phones Around to Create Traffic Jams on Google Maps
Android Authority Adamya Sharma 2020 Accessed: 2026-05-14
Source Archive
Contains a key technical detail: the jam disappeared when the cart stopped or when a car passed at normal speed. Quotes a Google Maps engineer confirming the hack was possible.
How One Artist Hacked Google Maps to Fake a Traffic Jam
Artnet News Caroline Goldstein 2020 Accessed: 2026-05-14
Source
Good narrative account. Includes Weckert's quote about 'rudimentary systems' and the May Day demonstration origin of the idea.
Berlin Artist Simon Weckert Causes Virtual Traffic Jam on Google Maps
Washington Post Brittany Shammas 2020 Accessed: 2026-05-14
Source
Short account including Google's response ('car or cart or camel').
Origins of the Digital Twin Concept
ResearchGate Michael Grieves 2016 Accessed: 2026-05-14
Source Archive
Footnote reference. Grieves' own account of the concept's origin from 2002, including the original three-component definition: real space, virtual space, and linking mechanism.
What Is a Digital Twin?
IBM Nick Gallagher, Maggie Mae Armstrong 2020 Accessed: 2026-05-14
Source Archive
Footnote reference. Covers NASA/Apollo origins through Grieves (2002) to modern IoT applications. Explains the bidirectional data flow illustrated by the thermostat example.

The $23 Million Fruit Fly — Amazon Algorithmic Pricing Loop

7 sources

Amazon's $23,698,655.93 Book About Flies
michaeleisen.org Michael Eisen 2011 Accessed: 2026-05-14
Source
The primary source for the entire case. Eisen discovered and documented the feedback loop — exact multipliers (1.270589 and 0.9983), timeline from April 8 to 18, and analysis of both bots' strategies.
Amazon Seller Lists Book at $23,698,655.93 — Plus Shipping
CNN John D. Sutter 2011 Accessed: 2026-05-14
Archive
Day-of coverage. Quotes Eisen and a pricing algorithm expert. Includes: 'It's like you put on the gas and didn't have the handbrake.'
Note: At the time of compiling this list, neither the original CNN article nor recent Wayback Machine snapshots were available. The link instead points to a snapshot from May 1, 2011.
Amazon Algorithms Price Bio Book at Over $23m
The Register Rik Myslewski 2011 Accessed: 2026-05-14
Source Archive
Concise technical account. Includes price details after the loop unwound.
The Dark Lesson of a $24 Million Amazon Book
Fast Company Dionysios Demetis 2019 Accessed: 2026-05-14
Source Archive
Analysis that connects the case to the broader question of algorithmic reality construction. References a Journal of the Association for Information Systems publication.
An Empirical Analysis of Algorithmic Pricing on Amazon Marketplace
ACM Digital Library Le Chen, Alan Mislove, Christo Wilson 2016 Accessed: 2026-05-14
Source Archive
Foundational empirical paper documenting the scale of algorithmic pricing on Amazon (~500 bot-using sellers among the top 1,600 products). Identifies 'price jitter' patterns — chaotic price spikes caused directly by bots responding to each other. Academic foundation for the Fruit Fly case. WWW '16, pp. 1339–1349.
Algorithmic Pricing, Price Wars and Tacit Collusion
Wharton / lmusolff.com Leon Musolff 2025 Accessed: 2026-05-14
Source Archive
Empirically documents how pricing algorithms enter mutual feedback loops producing outcomes far from market equilibrium. Provides the formal model for the mechanism described narratively in the text.
Algorithmic Collusion
Stanford Law School Renato Nazzini, James Henderson 2024 Accessed: 2026-05-14
Source Archive
Legal review documenting real cases of algorithmic price-fixing (including a conviction for fixing poster prices on Amazon). Frames bot-to-bot feedback loops as a systemic antitrust risk. Good context for 'consensus is not correctness'.

29

The Data Was Fine Until We Touched It

5 Topics

Explainers

ETL: The Data Janitor's Guide to Not Breaking Everything

5 sources

The Data Warehouse ETL Toolkit
Wiley Ralph Kimball, Joe Caserta 2004
ISBN: 978-0764567575.
Classic data warehousing reference. Kimball is one of the founding figures of the discipline. Defines ETL as an engineering practice.
What Is ETL (Extract, Transform, Load)?
IBM Accessed: 2026-05-14
Source
Accessible but technically grounded overview from a data warehousing pioneer. Covers ETL history from the 1970s through the cloud era.
What Is ETL?
AWS Accessed: 2026-05-14
Source Archive
Practical overview with emphasis on relational databases and data pipelines. Explains the original rationale for ETL.
ETL Process & Tools
SAS Accessed: 2026-05-14
Source Archive
Well-written non-academic overview. Traces ETL from its 1970s origins through data warehousing to the present day, without unnecessary jargon.
Extract, Transform, Load
Wikipedia Accessed: 2026-05-14
Source Archive
Solid definition with history, transformation typology, and data warehousing context. Well-referenced against technical literature.

Case Studies

Austerity by Accident: The Spreadsheet Error That Reshaped the World

6 sources

Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff
Political Economy Research Institute, UMass Amherst Thomas Herndon, Michael Ash, Robert Pollin 2013 Accessed: 2026-05-14
Source Archive
Original critique paper. Contains the row number references (footnote 9, p. 7), the UK/New Zealand example with figures 2.4% / 19 years / -7.6%, and the corrected result of +2.2%. Working Paper No. 322.
Growth in a Time of Debt
NBER Carmen Reinhart, Kenneth Rogoff 2010 Accessed: 2026-05-14
Source Archive
Original paper advancing the 90% debt threshold thesis and the -0.1% growth figure. Published in American Economic Review, Papers and Proceedings, 100(2), 573–578.
FAQ: Reinhart, Rogoff, and the Excel Error That Changed History
Bloomberg Peter Coy 2013 Accessed: 2026-05-14
Source Archive
Concise FAQ from the moment of disclosure. Confirms the -0.1% → +2.2% correction and includes R&R's response.
The Reinhart-Rogoff Error — Or How Not to Excel at Economics
The Conversation Jonathan Borwein, David H. Bailey 2013 Accessed: 2026-05-14
Source Archive
Solid technical account. Lists the omitted countries (Australia, Austria, Belgium, Canada, Denmark) and quotes Paul Ryan.
Growth in a Time of Debt
Wikipedia Accessed: 2026-05-14
Source Archive
Good compilation of political citations: Ryan budget, Olli Rehn, George Osborne. Includes the R&R response from the New York Times.
Reinhart and Rogoff Release Errata
Committee for a Responsible Federal Budget 2013 Accessed: 2026-05-14
Source Archive
Documents R&R's own correction (+0.2% rather than +2.2%) and explains the methodological difference between the two figures.

Biological Capitulation: When Science Renamed Itself to Please Excel

6 sources

Gene Name Errors Are Widespread in the Scientific Literature
Genome Biology Mark Ziemann, Yotam Eren, Assam El-Osta 2016 Accessed: 2026-05-14
Source Archive
Original study establishing the 20% figure. Scanned 3,597 papers from 18 leading genomics journals (2005–2015). Confirms SEPT2 → date and MARCH1 → date conversions. Genome Biology 17, 177.
Gene Name Errors: Lessons Not Learned
PLOS Computational Biology / PubMed Central Mandhri Abeysooriya, Megan Soria, Mary Sravya Kasu, Mark Ziemann 2021 Accessed: 2026-05-14
Source Archive
Follow-up scan of 11,117 papers with gene lists (2014–2020). Result: 30.9% contained errors — the problem grew after 2016 rather than shrinking.
Excel Autocorrect Errors Still Plague Genetic Research
The Conversation Mark Ziemann, Mandhri Abeysooriya 2021 Accessed: 2026-05-14
Source Archive
Written by the original study authors. Illustrates what happens when ETL is replaced by Excel — a natural bridge from the previous case study.
Scientists Are Renaming Human Genes So Microsoft Excel Doesn't Get Confused
The Verge James Vincent 2020 Accessed: 2026-05-14
Source
Quotes HGNC coordinator Elspeth Bruford. Confirms 27 genes renamed, including MARCHF1, SEPTIN1, SEPTIN2. Explains why Microsoft won't change the default settings.
Scientists Are Renaming Dozens of Human Genes So Microsoft Excel Spreadsheets Don't Get Confused
Vice Gavin Butler 2020 Accessed: 2026-05-14
Source
Good coverage with HGNC quotes and historical context for gene naming changes.
Genomics Has a Spreadsheet Problem
Retraction Watch Mandhri Abeysooriya, Mark Ziemann 2023 Accessed: 2026-05-14
Source Archive
Current overview of the problem by the original study authors. Notes ongoing errors in Nature Communications, PLOS ONE, and Scientific Reports.

The Digital Excommunication of Scunthorpe: When 'Clean' Means Deleted

5 sources

AOL Censors British Town's Name!
The Risks Digest Clive Feather 1996 Accessed: 2026-05-14
Source Archive
Original 1996 report of the incident. Cited by Wikipedia and Encyclopedia MDPI as a primary source. Risks Digest 18(7), 25 April 1996.
Scunthorpe Problem
Scholarly Community Encyclopedia (MDPI) 2022 Accessed: 2026-05-14
Source Archive
Solid referenced overview. Includes a chronology of incidents from 1996 to 2021. Cites the Risks Digest as primary source.
Scunthorpe Problem: What Made Scunthorpe Famous
Tedium Ernie Smith 2016 Accessed: 2026-05-14
Source Archive
Thorough long-form account of the problem's history. Includes the AOL spokesperson quote to the Scunthorpe Evening Telegraph, plus examples from Google SafeSearch and iOS.
Monument Record MLS5018 — Scunthorpe
Heritage Gateway Accessed: 2026-05-14
Source
Official archaeological heritage record. Documents Anglo-Saxon settlement traces on the site of present-day Scunthorpe.
Scunthorpe
Wikipedia Accessed: 2026-05-14
Source Archive
Reference for the first recorded appearance of the town's name.

The People Who Never Existed: The 'Null' Surname

4 sources

Hello, I'm Mr. Null. My Name Makes Me Invisible to Computers
Wired Christopher Null 2015 Accessed: 2026-05-14
Source Archive
Well-documented first-person account. Includes specific details about Bank of America, mortgage forms, and the email address null@nullmedia.com.
It Really Sucks to Be Named Jennifer Null
Gizmodo Alissa Walker 2016 Accessed: 2026-05-14
Source Archive
Based on the BBC article. Confirms issues with airline tickets and the IRS. Includes Jennifer's quote: 'I've been asked why I'm calling and when I try to explain the situation, I've been told there's no way that's true.'
My Surname Is NULL
Born SQL 2017 Accessed: 2026-05-14
Source Archive
Solid technical treatment of the problem from a SQL perspective: validation, escaping, form handling. References Christopher Null's article and adds engineering analysis. Good bridge between narrative and mechanics.
Last Name Null Causing Database Confusion
Streamline Verify Frank Strafford 2018 Accessed: 2026-05-14
Source Archive
Documents the case of Angela Johnson Null in the GSA-SAM system. Shows the problem affects multiple people beyond Jennifer and Christopher.

Part 8

Algorithms Gone Wild

4 Chapters

##

Introduction

2 Topics

Explainers

Algorithm — The Guy from Khwarazm

8 sources

Al-Khwarizmi (790–850)
MacTutor History of Mathematics, University of St Andrews J. J. O'Connor, E. F. Robertson 1999 Accessed: 2026-05-14
Source Archive
Authoritative academic source. Covers uncertainty around dates and birthplace, the House of Wisdom connection, and the etymology of 'algebra' and 'algorithm'.
Arabic Numerals
MacTutor History of Mathematics, University of St Andrews J. J. O'Connor, E. F. Robertson 2001 Accessed: 2026-05-14
Source Archive
Academic account of the journey of Hindu-Arabic numerals from India through the Arab world to Europe. Covers al-Khwarizmi's role in popularising the positional system and zero as a placeholder. Cites original Arabic sources and Latin translations.
Episodes in the Mathematics of Medieval Islam
Springer J. L. Berggren 2003
ISBN: 978-0-387-40605-3
Standard academic reference on Islamic mathematics. Covers al-Khwarizmi's contributions in algebra and the decimal system. Cited by MacTutor and other academic sources as a reference text.
Al-Khwarizmi
Britannica Accessed: 2026-05-14
Source Archive
Standard encyclopaedic overview. Cites the Latin translation 'Algoritmi de numero Indorum' as the source of the word 'algorithm'.
How Algorithm Got Its Name
NASA Science 2017 Accessed: 2026-05-14
Source Archive
Accessible but reliable overview. Traces the path from the Arabic name through the Latin 'Algoritmi' to the English 'algorithm'.
Why Are Algorithms Called Algorithms? A Brief History of the Persian Polymath You've Likely Never Heard Of
University of Melbourne / Find an Expert Debbie Passey 2024 Accessed: 2026-05-14
Source Archive
Good popular-science piece with solid academic grounding. Also covers the role of Hindu-Arabic numerals and zero in the context of modern computing.
Introduction to Algorithms (4th ed.)
MIT Press Cormen, Leiserson, Rivest, Stein 2022
ISBN: 978-0-262-03968-5
Universally known as 'CLRS', the standard algorithms textbook used in university courses worldwide. Chapter 2 introduces merge sort; chapters 6–7 cover heapsort and quicksort; chapter 8 covers linear-time sorting. Includes formal complexity analysis for each algorithm.
Introduction to Algorithms (6.006, Spring 2020)
MIT OpenCourseWare 2020 Accessed: 2026-05-14
Source Archive
Official MIT course syllabus. Lists sorting as a central topic and recommends CLRS as the reference text. Free access to lecture materials.

Deterministic — The 'What You See Is What You Get' of Logic

4 sources

Deterministic Algorithm — Glossary
NIST Computer Security Resource Center Accessed: 2026-05-14
Source Archive
Official NIST definition: 'An algorithm that, given the same inputs, always produces the same outputs.'
Deterministic Approach
ScienceDirect Topics Accessed: 2026-05-14
Source Archive
Academic overview with references to scientific literature. Covers determinism in serial and parallel computing contexts, including race conditions.
Difference Between Deterministic and Non-Deterministic Algorithms
GeeksforGeeks Accessed: 2026-05-14
Source Archive
Standard industry overview with examples (Merge Sort as deterministic, Randomized QuickSort as non-deterministic) and illustrative code.
Deterministic Algorithm
Wikipedia Accessed: 2026-05-14
Source Archive
Accessible overview with a good list of non-determinism sources (user input, hardware timer, race conditions) — matching what the text describes as primary sources of non-determinism in IT.

30

Rules Gone Wrong

3 Topics

Case Studies

BATS: The Digital Ouroboros

9 sources

Bad Day for BATS — and for High-Frequency Trading
CNBC Jeff Cox 2012 Accessed: 2026-05-14
Source
Cites the official BATS statement on the cause: 'single match engine [...] encountered a software bug related to IPO auctions, which rendered open customer orders in this symbol range inaccessible.' Confirms the A–BF ticker range and three erroneous AAPL transactions.
Bats Global cancels IPO after its errors derail trading
Los Angeles Times 2012 Accessed: 2026-05-14
Source
Day-of account. Quotes CEO Joe Ratterman. Confirms the $16 offer price, IPO withdrawal, and SEC contact with BATS.
Up to BATS: An IPO Swing and a Miss
Fordham Journal of Corporate and Financial Law Thomas Michael 2012 Accessed: 2026-05-14
Source Archive
Legal analysis of the event. Confirms the sequence of events and consequences for underwriters (~$7.1M in lost fees).
Trading Firm IPO Fizzles in Seconds
The Wall Street Journal Tom Lauricella, Scott Patterson, David Benoit 2012 Accessed: 2026-05-14
Source Archive
Confirms impact on other companies.
Note: article is behind a paywall, but even the headline and summary confirm the key details of the event.
After Failed IPO, When Should BATS Try Again?
Institutional Investor Loch Adamson 2012 Accessed: 2026-05-14
Source Archive
Good description of the failure mechanism: one of 32 matching engines, the 10:45 timing, a bug in the messaging system preventing price updates.
CBOE Holdings Agrees to Acquire Bats Global Markets
Cboe Holdings / PR Newswire 2016 Accessed: 2026-05-14
Source
Official acquisition announcement. $3.2B at announcement (September 2016), transaction closed February 2017 at $3.4B.
Eating Your Own Dog Food
Wikipedia Accessed: 2026-05-14
Source Archive
Footnote reference. Confirms Paul Maritz's 1988 Microsoft email 'Eating our own Dogfood' to Brian Valentine as the origin of the term.
10 Years of Git: An Interview with Git Creator Linus Torvalds
Linux Foundation 2015 Accessed: 2026-05-14
Source Archive
Footnote reference. Torvalds in his own words: Git became self-hosting one day after its announcement (April 7, 2005).
Git
Wikipedia Accessed: 2026-05-14
Source Archive
Footnote reference. Confirms the timeline: development started April 3, 2005; self-hosting achieved April 7 — four days later.

The Zip-Code Executioner: Ofqual's 'Fair' Algorithm

7 sources

A Tale of Two Algorithms: The Appeal and Repeal of Calculated Grades Systems in England and Ireland in 2020
British Educational Research Journal Anthony Kelly 2021 Accessed: 2026-05-14
Source
Solid academic treatment. Analyses the algorithm mechanism, the small-cohort problem, and compares the situations in England and Ireland. Cites official Ofqual documents. Vol. 47(3).
The 2020 GCSE and A-Level 'Exam Grades Fiasco': A Secondary Data Analysis
University of Bristol / Centre for Multilevel Modelling George Leckie, Lucy Prior Accessed: 2026-05-14
Source Archive
Academic data analysis. Confirms that 40% of Centre-Assessed Grades were downgraded by one or more grades.
2020 United Kingdom School Exam Grading Controversy
Wikipedia Accessed: 2026-05-14
Source Archive
Complete chronology: results announced August 13, 'no U-turn' August 15, decision reversed August 17, Ofqual chief executive resigned August 25.
The Great Algorithm Fiasco
BERA (British Educational Research Association) Anthony Kelly 2021 Accessed: 2026-05-14
Source Archive
Solid overview of the mechanism and political context. Quotes Ofqual's deputy chief regulator on the disproportionate impact on students from disadvantaged backgrounds.
What Can We Learn from the Ofqual Algorithm Debacle?
Container Solutions Blog Charles Humble 2021 Accessed: 2026-05-14
Source Archive
Good technical description of the mechanism: the 15-student threshold, three-year school history, ignoring teacher assessments. Quotes Curtis Parfitt-Ford.
'F**k the Algorithm?': What the World Can Learn from the UK's A-Level Grading Fiasco
LSE Impact Blog Daan Kolkman 2020 Accessed: 2026-05-14
Source Archive
Analysis from an algorithmic accountability perspective. Confirms the protest slogan and data on grade increases at private vs. state schools (+4.7 percentage points).
How a Computer Algorithm Caused a Grading Crisis in British Schools
CNBC Sam Shead 2020 Accessed: 2026-05-14
Source
Day-of-crisis coverage. Confirms ~39% of grades were downgraded and the government U-turn.

The Centroid of Doom: Apple's Desert Odyssey

9 sources

Apple Maps Flaw Could Be Deadly, Warn Australian Police
CNN Business Nick Thompson 2012 Accessed: 2026-05-14
Source Archive
Cites the official Mildura police statement. Confirms the 70 km displacement, 46°C heat, and stranded motorists without food or water for up to 24 hours.
'Life-Threatening' Apple Maps Error Is Fixed: Today in Apple History
Cult of Mac Luke Dormehl 2025 Accessed: 2026-05-14
Source Archive
Reproduces the full police statement text. Confirms all key figures.
Apple Maps Mildura, Australia, Glitch Strands Drivers in Dangerous Area
Slate Fruzsina Eördögh 2012 Accessed: 2026-05-14
Source Archive
Adds the data provenance context: the coordinates came from the Australian government gazetteer, which tagged Murray-Sunset National Park as 'Mildura Rural City'.
Apple Maps Fails Again: Alaska Drivers Directed Onto Airport Taxiway
International Business Times Dave Smith 2013 Accessed: 2026-05-14
Source Archive
Documents the Fairbanks, Alaska incident.
Apple Maps: Tim Cook says he is 'extremely sorry'
The Guardian Charles Arthur 2012 Accessed: 2026-05-14
Source Archive
Confirms the September 28, 2012 apology and the recommendation to use Google Maps via the browser.
Note: Apple's official apology page has since been removed from apple.com — which, given the circumstances, feels entirely on brand. The page is also blocked from Web Archive, so no cached copy is available. Screenshots circulate online but cannot be verified for authenticity, so the linked article will have to serve as the primary reference.
Apple Maps
Wikipedia Accessed: 2026-05-14
Source Archive
Confirms acquisitions (Placebase 2009, Poly9 2010, C3 Technologies 2011), Cook's apology of September 28, 2012, the recommendation to use Google Maps via the browser, and the departures of Forstall and Williamson.
Scott Forstall
Wikipedia Accessed: 2026-05-14
Source Archive
Confirms the dismissal mechanism: Forstall refused to sign the apology as the 'directly responsible individual' for Maps.
Steve Jobs
Simon & Schuster Walter Isaacson 2011
ISBN: 1-4516-4853-7
Source for the '$40 billion' quote and 'thermonuclear war' against Android. Foundation of the narrative about Apple's motivation to break its dependency on Google.
Apple's Maps Fiasco and the Mobile Arms Race
Knowledge at Wharton Eric Clemons, Peter Fader 2012 Accessed: 2026-05-14
Source Archive
Covers the strategic background of Apple's decision. Quotes experts on the cost of the political choice to remove Google Maps.

31

Feedback Loops – Amplifying the Absurd

5 Topics

Explainers

What Is a Feedback Loop? (And Why Chaos Doesn't Need AI)

8 sources

Business Dynamics: Systems Thinking and Modeling for a Complex World
McGraw-Hill John D. Sterman 2000
ISBN: 978-0072389159
Standard academic textbook on system dynamics. Formally defines feedback loops, covers reinforcing and balancing loops with mathematical examples.
Thinking in Systems: A Primer
Chelsea Green Publishing Donella H. Meadows 2008
ISBN: 978-1603580557
Widely read popular-science introduction to systems thinking. Defines feedback loops in the first chapter through everyday analogies (thermostat, population, economy). Written for non-specialists.
Introduction to Algorithms (6.006, Spring 2020)
MIT OpenCourseWare Erik Demaine, Jason Ku, Justin Solomon 2020 Accessed: 2026-05-14
Source Archive
Formal treatment of recursion as a foundation of algorithmics, contrasted with iterative (linear) approaches.
Recursion
Khan Academy Accessed: 2026-05-14
Source
Accessible explanation of recursion as 'a function that calls itself.' Everyday analogies, no mathematical jargon.
Sapiens: A Brief History of Humankind
Harper Yuval Noah Harari 2015
ISBN: 978-0062316110
Source for the Level One / Level Two chaos distinction. Examples: weather (Level One), stock markets and revolutions (Level Two).
Harari holds a D.Phil. in history from Oxford (2002), specialising in medieval military history. Sapiens is a popular-science work outside his academic specialisation — which the text acknowledges.
Prof. Yuval Noah Harari
The Hebrew University of Jerusalem Accessed: 2026-05-14
Source
Reference for Harari's academic credentials from his university profile page.
Chaos
Stanford Encyclopedia of Philosophy Accessed: 2026-05-14
Source
Solid philosophical and mathematical treatment of chaos theory. Defines sensitive dependence on initial conditions, nonlinearity, and aperiodicity as the three characteristics of mathematical chaos. Explains why chaos ≠ randomness.
Circa January 1961: Lorenz and the Butterfly Effect
American Physical Society 2003 Accessed: 2026-05-14
Source Archive
History of Lorenz's discovery. Explains the butterfly effect as sensitive dependence on initial conditions, with historical and mathematical context.

Case Studies

The Invisible Hand with a Digital Thumb — RealPage

12 sources

Rent Going Up? One Company's Algorithm Could Be Why
ProPublica Heather Vogell, Haru Coryne, Ryan Little 2022 Accessed: 2026-05-14
Source Archive
Main investigative piece (October 15, 2022). Describes the YieldStar mechanism, Seattle data, Camden Property Trust, and quotes from former RealPage employees.
DOJ Files Antitrust Suit Against RealPage
ProPublica Heather Vogell 2024 Accessed: 2026-05-14
Source Archive
Day-of-filing account (August 23, 2024). Quotes Assistant Attorney General Jonathan Kanter and summarises the substance of the charges.
United States and Plaintiff States v. RealPage, M.D.N.C. 1:24-CV-00710
National Association of Attorneys General 2024 Accessed: 2026-05-14
Source Archive
Case page with docket number, list of defendants, and participating states.
DOJ Settles Its Algorithmic Price-Fixing Case Against RealPage
Wilson Sonsini Jeffrey C. Bank, Brian Smith, Jacob Lozano, Angela L. Brown, Noora Bayrami 2025 Accessed: 2026-05-14
Source Archive
Detailed legal analysis of the November 2025 DOJ settlement and consent decree terms.
The Cost of Anticompetitive Pricing Algorithms in Rental Housing
White House Council of Economic Advisers 2024 Accessed: 2026-05-14
Source Archive
Original CEA report (December 17, 2024). Source for the $3.8B figure and $70/month per unit. Methodology based on ACS, Zillow, and RealPage data.
DOJ and RealPage Agree to Settle Rental Price-Fixing Case
ProPublica Heather Vogell 2025 Accessed: 2026-05-14
Source Archive
Non-monetary DOJ settlement with RealPage (November 26, 2025). Context for parallel settlements with landlords.
America's Largest Landlord Makes Deal With DOJ to Settle Price-Fixing Claims
ProPublica Rent Barons 2025 Accessed: 2026-05-14
Source Archive
Greystar's non-monetary settlement with DOJ (August 2025).
Greystar Agrees to $50 Million Settlement in Rental Price-Fixing Case
GRC Report 2025 Accessed: 2026-05-14
Source Archive
$50M Greystar settlement in the tenant class action (October 2025). $141M total from 26 landlords.
Attorney General Bonta Announces $7 Million Settlement with Greystar
California Attorney General 2025 Accessed: 2026-05-14
Source Archive
Official press release (November 19, 2025). $7M state settlement involving nine attorneys general.
In the Back Office, Revenue Management Software Is Causing a Revolution
Multifamily Executive Joe Bousquin 2009 Accessed: 2026-05-14
Source Archive
Quotes Camden CEO Campo on the paradox of rising income alongside higher tenant turnover. Source for the $10M net income figure.
Landlords Use Pricing Software That Adds Billions to Rental Costs, White House Says
Axios Emily Peck 2024 Accessed: 2026-05-14
Source Archive
Accessible popularisation of the CEA report with original graphics and RealPage's response.
The RealPage Settlement Won't End the Fight Over Revenue Management Software
Propmodo Franco Faraudo 2026 Accessed: 2026-05-14
Source Archive
Industry analysis of what the DOJ settlement changes and what it does not. PropTech sector perspective.

Flash Crash (2010) — When Algorithms Ate Wall Street

10 sources

Findings Regarding the Market Events of May 6, 2010
SEC / CFTC 2010 Accessed: 2026-05-14
Source
Primary regulatory report (104 pages, October 2010). Source for all key figures: Waddell & Reed, $4.1B, hot potato trading, the 5-second pause, and cancelled transactions.
The Flash Crash: The Impact of High Frequency Trading on an Electronic Market
CFTC Andrei Kirilenko, Albert S. Kyle, Mehrdad Samadi, Tugkan Tuzun 2014 Accessed: 2026-05-14
Source Archive
CFTC analytical report (March 2014). Key data on HFT and the hot potato mechanism.
Preliminary Report on the Market Events of May 6, 2010
SEC / CFTC 2010 Accessed: 2026-05-14
Source
Preliminary report issued days after the crash.
Report Examines May's 'Flash Crash,' Expresses Concern Over High-Speed Trading
Washington Post Zachary A. Goldfarb 2010 Accessed: 2026-05-14
Source
Day-of-publication coverage of the SEC/CFTC report release. Quotes from regulators.
The 10th Anniversary of the Flash Crash
SIFMA Katie Kolchin 2020 Accessed: 2026-05-14
Source
Comprehensive industry retrospective with full chronology and dates of regulatory mechanisms introduced after the crash.
The Flash Crash, Explained
NPR Planet Money Jacob Goldstein 2010 Accessed: 2026-05-14
Source Archive
Accessible explanation of the mechanism for a general audience. Good narrative context.
The Flash Crash: High-Frequency Trading in an Electronic Market
Journal of Finance / University of Cambridge Repository Andrei Kirilenko, Albert S. Kyle, Mehrdad Samadi, Tugkan Tuzun 2017 Accessed: 2026-05-14
Source Archive
Peer-reviewed paper using CFTC data. Analysis of HFT's role in the crash.
High-Frequency Trading and the Flash Crash
Hastings Business Law Journal Ian Poirier 2012 Accessed: 2026-05-14
Source Archive
Legal analysis. Covers stub quotes and regulatory mechanisms introduced after the crash.
The Flash Crash of 2010 Offers Warning as AI Automates
IT Brew Billy Hurley 2025 Accessed: 2026-05-14
Source Archive
Current perspective connecting lessons from 2010 to AI automation risk. Good essayistic supplement.
The Flash Crash, Two Years On
NY Fed Liberty Street Economics Adam Biesenbach, Marco Cipriani 2012 Accessed: 2026-05-14
Source Archive
Federal Reserve Bank of New York analysis. Independent regulatory perspective from outside the SEC/CFTC.

PredPol — The Feedback Loop of Crime

11 sources

Crime Prediction Software Promised to Be Free of Biases. New Data Shows It Perpetuates Them
The Markup Aaron Sankin, Dhruv Mehrotra for Gizmodo, Surya Mattu, Annie Gilbertson 2021 Accessed: 2026-05-14
Source Archive
First independent analysis of 5.9 million real PredPol predictions. Key evidence for systematic targeting of Latino and Black neighbourhoods.
How We Determined Crime Prediction Software Disproportionately Targeted Low-Income, Black, and Latino Neighborhoods
The Markup Dhruv Mehrotra, Surya Mattu, Annie Gilbertson, Aaron Sankin 2021 Accessed: 2026-05-14
Source Archive
Methodology behind the investigation. Describes how Gizmodo found 7.8 million predictions on an unsecured AWS server.
Academics Confirm Major Predictive Policing Algorithm Is Fundamentally Flawed
Motherboard / Vice Caroline Haskins 2019 Accessed: 2026-05-14
Source Archive
Interview with researchers. Key quote: PredPol is in practice a 'moving average' of previous arrest locations.
To Predict and Serve?
Significance (RSS) Kristian Lum, William Isaac 2016 Accessed: 2026-05-14
Source Archive
Key simulation study. Running PredPol on Oakland data reveals a self-reinforcing loop — the algorithm repeatedly sends police to the same locations. Vol. 13.
Runaway Feedback Loops in Predictive Policing
arXiv Danielle Ensign, Sorelle A. Friedler, Scott Neville, Carlos Scheidegger, Suresh Venkatasubramanian 2017 Accessed: 2026-05-14
Source Archive
Formal mathematical analysis of the feedback loop in PredPol. Demonstrates the divergence mechanism between predictions and actual crime.
Dirty Data, Bad Predictions
NYU Law Review Online Rashida Richardson, Jason M. Schultz, Kate Crawford 2019 Accessed: 2026-05-14
Source Archive
Analysis of corrupted input data (arrest records) as a source of systemic errors. 94 N.Y.U. L. Rev. Online 15.
Predictive Policing Explained
Brennan Center for Justice Tim Lau 2020 Accessed: 2026-05-14
Source Archive
Overview of predictive policing systems and controversies. Covers LAPD and Chicago implementations.
Predictive Policing: Using Technology to Reduce Crime
FBI Law Enforcement Bulletin Zach Friend 2021 Accessed: 2026-05-14
Source Archive
Law enforcement perspective on the mechanism and deployment. Primary source for the '500-square-foot locations' description in Santa Cruz.
The Activist Dismantling Racist Police Algorithms
MIT Technology Review Tate Ryan-Mosley, Jennifer Strong 2020 Accessed: 2026-05-14
Source Archive
Interview with Hamid Khan (Stop LAPD Spying Coalition). Activist perspective vs. the COVID budget-cut narrative.
LAPD Ditches Predictive Policing Program Accused of Racial Bias
The Next Web Thomas Macaulay 2020 Accessed: 2026-05-14
Source Archive
Coverage of the programme termination announcement (April 21, 2020). Quotes Police Chief Moore on the budget rationale.
Predictive Policing Algorithms Are Racist. They Need to Be Dismantled
MIT Technology Review Will Douglas Heaven 2020 Accessed: 2026-05-14
Source Archive
Broad context for the debate on algorithmic racism in policing following the events of 2020.

ICU Alarm Storm — When Safety Turns into Noise

10 sources

'Alarm Fatigue' Linked to Heart Patient's Death at Mass. General
The Boston Globe Liz Kowalczyk 2010 Accessed: 2026-05-14
Source Archive
Original article describing the January 2010 death of an 89-year-old at MGH. Ten nurses did not hear alarms for 20 minutes. Cited in all subsequent regulatory reports.
Patient Alarms Often Unheard, Unheeded
The Boston Globe Liz Kowalczyk 2011 Accessed: 2026-05-14
Source Archive
Investigation identifying 216 deaths in the US between 2005 and 2010 linked to patient monitor alarm problems. Broad documentation of the scale of the issue.
State Reports Detail 11 Patient Deaths Linked to Alarm Fatigue in Massachusetts
The Boston Globe Liz Kowalczyk 2011 Accessed: 2026-05-14
Source
Detailed descriptions of 11 deaths, including UMass 2010 (alarms ignored for nearly an hour) and UMass 2007 (75 minutes without response).
Sentinel Event Alert Issue 50: Medical Device Alarm Safety in Hospitals
The Joint Commission 2013 Accessed: 2026-05-14
Source
Official regulatory alert. 80 deaths and 13 cases of permanent harm in the TJC database between 2009 and 2012. 85–99% of alarms required no intervention.
Top 10 Health Technology Hazards
ECRI Institute 2026 Accessed: 2026-05-14
Source Archive
ECRI listed alarm hazards as the number one health technology risk for four consecutive years (2012–2015). Annual report series; link leads to current edition.
Harm from Alarm Fatigue
AHRQ / PSNet Michele M. Pelter, Barbara J. Drew 2015 Accessed: 2026-05-14
Source Archive
Synthetic review of cases and mechanisms. Accessible introduction for non-specialist readers.
Monitor Alarm Fatigue: Standardizing Use of Physiological Monitors and Decreasing Nuisance Alarms
National Library of Medicine Kelly Creighton Graham, Maria Cvach 2010 Accessed: 2026-05-14
Source Archive
First published study (2010) to define and document the scale of the phenomenon. Cited in all subsequent reports. doi:10.4037/ajcc2010651. Vol. 19(1), pp. 28–35.
Monitor Alarm Fatigue: An Integrative Review
Biomedical Instrumentation & Technology Maria Cvach 2012 Accessed: 2026-05-14
Source
Integrative review. Data on 350–700 alarms per bed per day and 566 deaths in the FDA database between 2005 and 2010. Vol. 46(4), pp. 268–77.
Insights into the Problem of Alarm Fatigue with Physiologic Monitor Devices
PLOS One Barbara J. Drew, Patricia Harris, Jessica K. Zègre-Hemsey, Tina Mammone, Daniel Schindler, Rebeca Salas-Boni, Yong Bai, Adelita Tinoco, Quan Ding, Xiao Hu 2014 Accessed: 2026-05-14
Source Archive
UCSF study on 461 ICU patients. 2.5 million unique alarms over 31 days; 187 audible alarms per bed per day.
Boston Medical Center Reduces Monitor Alarms
The Boston Globe Liz Kowalczyk 2013 Accessed: 2026-05-14
Source Archive
Example of a successful intervention: BMC reduced alarms from 88,000 to 10,000 per week on a single ward. A useful counterpoint to the narrative of inevitability.

32

When Optimization Becomes the Problem

6 Topics

Explainers

Optimization — Finding the 'Sweet Spot'

10 sources

Local vs. Global Optima
MathWorks (MATLAB Documentation) Accessed: 2026-05-14
Source Archive
Concise technical definition: a local minimum is a point better than its neighbours; a global minimum is better than all possible points. Introduces the concept of 'basins of attraction'.
Local Optimization Versus Global Optimization
Machine Learning Mastery Jason Brownlee 2021 Accessed: 2026-05-14
Source Archive
Accessible article from an ML perspective. Good examples of when local search is sufficient and when global search is needed.
Optimization: Local vs. Global Optima
Baeldung on Computer Science Panagiotis Antoniadis 2024 Accessed: 2026-05-14
Source Archive
Formal definitions with visual examples and discussion of algorithms (gradient descent, simulated annealing, etc.).
Local and Global Optimality
FICO Xpress Optimization Accessed: 2026-05-14
Source
Technical documentation. Good analogy: 'finding a valley in a range of mountains vs finding the deepest valley'.
Local Vs Global Optimum in Uni-Variate Optimization
GeeksforGeeks Accessed: 2026-05-14
Source Archive
Accessible introduction with formal mathematical definitions. Suitable for technical readers.
The Highest Point Above Earth's Center Is the Peak of Ecuador's Mount Chimborazo
NOAA / National Ocean Service 2024 Accessed: 2026-05-14
Source Archive
Official US government source. Explains the three different definitions of 'highest mountain' (above sea level, from Earth's centre, from base to peak).
How Big Are the Hawaiian Volcanoes?
USGS Accessed: 2026-05-14
Source Archive
Official source: total height of Mauna Kea is nearly 33,500 feet (10,211 m), with its base ~6,000 m below sea level.
Summits Farthest from the Earth's Center
Wikipedia Accessed: 2026-05-14
Source Archive
Precise data: Chimborazo is 2,168 m farther from Earth's centre than Everest.
Olympus Mons
Wikipedia Accessed: 2026-05-14
Source Archive
MOLA data: 21.287 km height, approximately 2.5× Everest. Approximately tied with Rheasilvia on Vesta as the tallest mountain in the Solar System.
Mount Kosciuszko
Wikipedia Accessed: 2026-05-14
Source Archive
2,228 m above sea level. Highest peak of continental Australia.

Case Studies

The Alphabet Soup: How Etsy Optimized for Bots and Killed the Human Touch

10 sources

New Guidance for Listing Titles, and a Tool to Help
Etsy Seller Handbook 2026 Accessed: 2026-05-14
Source Archive
Etsy officially acknowledges the problem: 'In the past, Etsy's search placed heavy emphasis on the title, pushing some sellers to pack in keywords. This often leads to longer, harder-to-read titles.' Key admission from the platform itself.
Updates to Etsy Search: Listing Descriptions
Etsy Community Announcements 2022 Accessed: 2026-05-14
Source
Official Etsy announcement that product descriptions would begin influencing search rankings — the starting point for the wave of keyword stuffing in descriptions.
Update from Etsy CEO Josh Silverman
Etsy Seller Handbook 2017 Accessed: 2026-05-14
Source Archive
Silverman announces Context Specific Ranking (CSR) in September 2018 and signals 'less focus on your titles'. Confirms search and discovery as a strategic priority.
SEC Form 8-K — Etsy Inc., Q2 2017
SEC EDGAR 2017 Accessed: 2026-05-14
Source
Silverman, as incoming CEO, lists 'enhancing search and discovery' as one of four main strategic priorities after taking the role in May 2017.
Etsy Search: What Is Keyword Stuffing?
VintageMaineia / Etsy Vintage Seller Encyclopedia 2016 Accessed: 2026-05-14
Source Archive
Contains a verbatim quote from an Etsy admin: 'incentives do exist for sellers to stuff our titles for internal search purposes'. Updates from February and June 2018 track the lack of progress. Well-documented seller-side account of the problem.
Evolution of Etsy Search, November 2017: Context Is Key
CindyLouWho2 Cindy Lou Who 2 2017 Accessed: 2026-05-14
Source Archive
Independent analysis of the Etsy algorithm from one of the first people to reverse-engineer it. Detailed account of 2017 changes including the Blackbird Technologies acquisition and AI search tests.
Etsy's Plans for the Search Algorithm & Promoted Listings in 2018
CindyLouWho2 Cindy Lou Who 2 2018 Accessed: 2026-05-14
Source Archive
Analysis of a Q&A with an Etsy engineer. Details on Context Specific Ranking and its implementation.
Sellers Are Worried About the 'Amazon-ification' of Etsy and Are Considering Other Platforms
Modern Retail Julia Waldow 2024 Accessed: 2026-05-14
Source Archive
Includes a direct quote from Etsy's director of search, Andrew Stanton, acknowledging keyword stuffing as a platform problem. Seller accounts of being crowded out by SEO.
The Biggest Threat to Etsy at 15? Becoming Just Another Marketplace
Craft Industry Alliance Abby Glassenberg 2021 Accessed: 2026-05-14
Source Archive
Analysis of Etsy's 'Amazon-ification' under Silverman. Context of investor pressure vs. the 'handmade' mission.
Etsy CEO Josh Silverman Says Keep Commerce Human
EcommerceBytes Ina Steiner 2017 Accessed: 2026-05-14
Source Archive
Coverage of Silverman's October 2017 'Keep Commerce Human' mission announcement. A useful counterpoint for the irony of the case study.

The Rage Engine: How Facebook Optimized for the End of Civility

11 sources

Bringing People Closer Together
Meta / Facebook Newsroom Adam Mosseri 2018 Accessed: 2026-05-14
Source Archive
Official Facebook announcement describing the pivot to 'meaningful social interactions' (January 11, 2018). Primary source for the strategic context of the entire case.
Five Points for Anger, One for a 'Like': How Facebook's Formula Fostered Rage and Misinformation
Washington Post Jeremy B. Merrill, Will Oremus 2021 Accessed: 2026-05-14
Source
Based directly on internal Facebook documents (Facebook Papers). Reveals the 5× weight for all emoji reactions, internal warnings from 2019, and the sequence of algorithm adjustments.
Likes, Anger Emojis and RSVPs: The Math Behind Facebook's News Feed — And How It Backfired
CNN Business Rachel Metz 2021 Accessed: 2026-05-14
Source Archive
Detailed figures from internal documents: full MSI weight table (Like=1, reactions=5, comments=30). History of algorithm adjustments from 2017 to 2020.
Five Points for 'Angry,' One for 'Like'
The Seattle Times / Washington Post Will Oremus, Jeremy B. Merrill 2021 Accessed: 2026-05-14
Source Archive
Contains the exact 2019 document quote: Angry reactions were 'much more frequent' on content categorised as 'civic low quality news, civic misinfo, civic toxicity, health misinfo, and health antivax content'.
More Internal Documents Show How Facebook's Algorithm Prioritized Anger
Nieman Journalism Lab Shraddha Chakradhar 2021 Accessed: 2026-05-14
Source Archive
Synthesis of multiple documents. Includes the finding that zeroing out the Angry weight reduced misinformation without reducing engagement.
Facebook's Formula Prioritized Anger and Ended Up Spreading Misinformation
The Hill Shirin Ali 2021 Accessed: 2026-05-14
Source Archive
Detailed chronology of algorithm adjustments: 2018 (4×), 2019 (demote mechanism), 2020 (1.5×), September 2020 (0×).
2021 Facebook Leak
Wikipedia Accessed: 2026-05-14
Source Archive
Good overview of the leak. Lists the media outlets involved in publishing the Facebook Papers and the chronology of disclosures.
Former Facebook Employee Frances Haugen Revealed as Whistleblower
Washington Post Cat Zakrzewski, Cristiano Lima-Strong 2021 Accessed: 2026-05-14
Source
Original account of Haugen's identity being revealed. Context of her role in the Civic Integrity Team.
Let Me Rewrite That For You: Washington Post Misinforms You About How Facebook Weighted Emoji Reactions
TechDirt Mike Masnick 2021 Accessed: 2026-05-14
Source Archive
Important critical voice: notes that all reactions (not just Angry) received a 5× weight, and that the Washington Post may have created a misleading picture of intentional anger-boosting. Useful for precise description of the mechanism.
Facebook Rolls Out New Plan for News Feed
NPR Scott Neuman 2018 Accessed: 2026-05-14
Source Archive
Major Change to Facebook News Feed to Improve Well-Being: Mark Zuckerberg
CNBC Jillian D’Onfro 2018 Accessed: 2026-05-14
Source Archive

Sydney 2014: The $100 Escape — When Math Met Terror

9 sources

Uber's Surge Pricing Near Siege in Sydney Sparks Outrage
NBC News 2014 Accessed: 2026-05-14
Source Archive
Day-of account (December 15, 2014). Confirms A$100 minimum and four times the normal rate. Contains original Uber Sydney tweets.
Uber Offers Free Rides During Sydney Hostage Crisis After Surge Pricing Backlash
TechCrunch Catherine Shu 2014 Accessed: 2026-05-14
Source Archive
Account of Uber's reversal. Notes that Uber had a US-only surge cap policy that did not apply in Australia.
Uber Apologises for Surge Pricing During Sydney Siege
IBTimes UK Sean Martin 2014 Accessed: 2026-05-14
Source Archive
Official Uber apology. Quote: 'We didn't stop surge pricing immediately. This was the wrong decision.'
Uber Had the Worst Possible Response to Sydney's Hostage Crisis
Mic Jordan Valinsky 2014 Accessed: 2026-05-14
Source Archive
Contains Uber's statement on refunds. Context of the absence of an Australian policy at the time of the incident.
Uber Sydney Hostage Crisis: It's Time for Uber to Re-Evaluate How It Prices During Emergencies
Slate Alison Griswold 2014 Accessed: 2026-05-14
Source Archive
Thoughtful analysis of the dilemma. Argues for Uber-subsidised free rides as a model. Historical context of previous pricing controversies.
Uber's Prices Surged in Sydney During the Hostage Crisis, and Everyone Is Furious
New Republic Danny Vinik 2014 Accessed: 2026-05-14
Source Archive
Covers both sides of the debate (economists vs. public opinion). Good background for the 'incentive model' argument.
A.G. Schneiderman Announces Agreement With Uber to Cap Pricing During Emergencies
New York Attorney General 2014 Accessed: 2026-05-14
Source Archive
Critical context: this agreement was reached five months before Sydney, showing the policy existed in the US but was not applied globally.
Uber Agrees to Limit Surge Pricing During Emergencies
Washington Post Mark Berman 2014 Accessed: 2026-05-14
Source Archive
Coverage of the NY AG agreement and announcement of a US-wide policy.
Uber Tasks Centralized Response Team With Monitoring Prices During Disasters
The Drive Stephen Edelstein 2018 Accessed: 2026-05-14
Source Archive
Documents the final solution (Global Safety Coordinator team). Lists London 2017 and New York 2016 as subsequent failures after Sydney. Shows it took four years to implement a systematic fix.

The Pink Slip Processor — Amazon's Automated Firing

11 sources

How Amazon Automatically Tracks and Fires Warehouse Workers for 'Productivity'
The Verge Colin Lecher 2019 Accessed: 2026-05-14
Source Archive
Original ADAPT disclosure (April 25, 2019). Based on NLRB documents. Confirms automatic generation of warnings and terminations without manager involvement. Quotes an Amazon lawyer's letter referencing hundreds of terminations at a single facility.
Amazon's System for Tracking Its Warehouse Workers Can Automatically Fire Them
MIT Technology Review Charlotte Jee 2019 Accessed: 2026-05-14
Source Archive
Summary of the disclosure. Includes Amazon's full denial. Good source for presenting both sides.
Internal Documents Show Amazon's Dystopian System for Tracking Workers Every Minute of Their Shifts
Vice Lauren Kaori Gurley 2022 Accessed: 2026-05-14
Source Archive
NLRB documents from JFK8. Specific thresholds: warning at 30 min TOT, termination at 120 min. Example of a manager questioning an employee about time spent in the bathroom.
Leaked Documents Show How Amazon's Automated Systems Force Canadian Workers to Scan Boxes Faster or Face 'Termination'
PressProgress 2021 Accessed: 2026-05-14
Source Archive
Amazon internal wiki. Target set at the 75th percentile — a 'moving goalpost' for workers.
Busting Amazon's Myths About Its Unsafe Warehouses
Foxglove / UK Parliament Briefing 2022 Accessed: 2026-05-14
Source Archive
Compilation of NLRB and Washington State regulator documents. Injury rates 9× higher than the industry average. Formal confirmation of the TOT mechanism.
Leaked Amazon Memo Says It Will Run Out of Workers by 2024
Engadget Amrita Khalid 2022 Accessed: 2026-05-14
Source Archive
Original account of the leaked memo (June 2022). Direct quote from the 2021 memo: 'If we continue business as usual, Amazon will deplete the available labor supply in the US network by 2024'.
Amazon Fears It Could Run Out of US Warehouse Staff by 2024
The Register Richard Currie 2022 Accessed: 2026-05-14
Source Archive
Confirms 150% annual turnover rate and average tenure of 8 months. Regional details (Phoenix, Inland Empire).
Amazon Warehouse Problems: Running Out of Workers to Hire
Fortune Sophie Mellor 2022 Accessed: 2026-05-14
Source Archive
Six levers Amazon was considering. Effect of a $1 wage increase: +7% to the available labour pool.
Amazon Apologizes for Denying That Its Drivers Pee in Bottles
CBS News Aimee Picchi 2021 Accessed: 2026-05-14
Source Archive
Official Amazon apology. Worker context: 'tracked down to the second', bathrooms 5–10 minutes from workstations.
Jeff Bezos Releases Final Letter to Amazon Shareholders
CNBC Annie Palmer 2021 Accessed: 2026-05-14
Source Archive
Source for the letter quotes: 'we need a better vision for how we create value for employees'; 'Earth's Best Employer and Earth's Safest Place to Work'.
Amazon Adjusts Its 'Time Off Task' Metric and Drug Testing Policy
Engadget Richard Lawler 2021 Accessed: 2026-05-14
Source Archive
Announcement of changes to how TOT is measured. Context: published the same day the Washington Post released its injury rate report.

The Mid Staffordshire Scandal

10 sources

Report of the Mid Staffordshire NHS Foundation Trust Public Inquiry
UK Government (GOV.UK) Robert Francis QC 2013 Accessed: 2026-05-14
Source Archive
Primary report. 290 recommendations across four volumes. The foundational source for the entire case study.
The Francis Report and the Government's Response
House of Commons Library Tom Powell 2013 Accessed: 2026-05-14
Source Archive
Parliamentary synthesis. Context for both Francis reports (2010 and 2013) and the government's response.
Mid Staffordshire NHS Foundation Trust (Inquiry) — Parliamentary Debate, 6 February 2013
Hansard / UK Parliament 2013 Accessed: 2026-05-14
Source Archive
Prime Minister and Health Secretary speeches on the day of publication. Source for the 'preoccupation with a narrow set of top-down targets pursued to the exclusion of patient safety' quote.
How Many People Died 'Unnecessarily' at Mid Staffs?
Full Fact 2013 Accessed: 2026-05-14
Source Archive
Key source for the nuance around the 400–1,200 figure: explains its origin, why it is contested, and what the Francis report actually stated.
Stafford Hospital Scandal
Wikipedia Accessed: 2026-05-14
Source Archive
Good chronological overview. Contains quotes from the Healthcare Commission and Francis reports regarding the death toll controversy.
NHS Targets
Wikipedia Accessed: 2026-05-14
Source Archive
History of the 4-hour target from 2000/2004. Documents workaround mechanisms (acute assessment units 'outside [A&E] for statistical purposes') and the threshold change from 98% to 95% in 2010.
Time Patients Spend in the Emergency Department: England's 4-Hour Rule
Annals of Emergency Medicine Suzanne Mason, Ellen J. Weber, Jennifer Freeman 2011 Accessed: 2026-05-14
Source
Academic study showing a growing proportion of patients leaving A&E in the last 20 minutes before the deadline — empirical evidence of 'teaching to the test'.
Performing or Not Performing: What's in a Target?
PubMed Central Julie Eatock, Matthew Cooke, Terry P. Young 2017 Accessed: 2026-05-14
Source Archive
Analysis of the long-term effects of the 4-hour target within the wider NHS system.
Mid Staffordshire NHS Foundation Trust (Francis Review and Inquiry)
Inquests and Inquiries Accessed: 2026-05-14
Source Archive
Solid synthesis of both Francis reports. Quotes on the board's 'obsession' with Foundation Trust status. Lists regulatory consequences including Duty of Candour and Freedom to Speak Up Guardians.
Mid Staffordshire Hospital Report
British Psychological Society Narinder Kapur 2014 Accessed: 2026-05-14
Source Archive
Analysis of the psychological mechanisms behind the culture of fear and silence. Behavioural perspective complementing the regulatory accounts.

Part 9

We Taught the Machine to Guess

6 Chapters

##

Introduction

2 Topics

Explainers

What Actually Counts as Machine Learning

29 sources

The Master Algorithm
Basic Books Pedro Domingos 2015
ISBN: 978-0465065707.
Key reference. Domingos divides ML into five 'tribes': symbolists, connectionists, evolutionaries, Bayesians, and analogizers. The closest spiritual match to the explainer's structure, albeit with a slightly different taxonomy.
Review available at: https://www.timeshighereducation.com/books/review-the-master-algorithm-pedro-domingos-allen-lane
A Taxonomy of Machine Learning Techniques
ResearchGate Radhey Shyam, Ria Singh 2021 Accessed: 2026-05-15
Source
Short survey classifying ML into supervised, unsupervised, semi-supervised, and reinforcement learning. Solid basic academic source.
The Five Tribes of Machine Learning with Pedro Domingos
ACM Tech Talk Pedro Domingos 2015 Accessed: 2026-05-15
Source Archive
Domingos presenting directly to ACM. Companion to the book.
Introduction to Ant Colony Optimization
GeeksforGeeks 2016 Accessed: 2026-05-15
Source Archive
Accessible introduction to ACO with a description of the pheromone mechanism.
An Adapted Ant Colony Optimization for Feature Selection
Taylor & Francis Online Duygu Yilmaz Eroglu, Umut Akcan 2024 Accessed: 2026-05-15
Source
Confirms Dorigo (1992) as the original ACO source.
Ant Colony Optimization
Encyclopedia MDPI Nikola Ivković, Robert Kudelić, Marin Golub 2023 Accessed: 2026-05-15
Source Archive
Encyclopaedic treatment of ACO with description of pheromone trail updates.
A Review on Genetic Algorithm: Past, Present, and Future
PubMed Central / Multimedia Tools and Applications Sourabh Katoch, Sumit Singh Chauhan, Vijay Kumar 2020 Accessed: 2026-05-15
Source Archive
Peer-reviewed survey covering genetic operators, GA variants, and applications. Good academic overview with a broad literature base.
Genetic Algorithms: A Survey
IEEE Computer / ACM Digital Library M. Srinivas, Lalit M. Patnaik 1994 Accessed: 2026-05-15
Source Archive
Classic IEEE survey. Describes the analogy to natural processes and the mechanics of genetic algorithms. A historical authority in the field.
Naturally Selecting Solutions: The Use of Genetic Algorithms in Bioinformatics
PubMed Central / BioData Mining Timmy Manning, Roy D Sleator, Paul Walsh 2012 Accessed: 2026-05-15
Source Archive
Good introduction from a bioinformatics perspective. Explains fitness function, selection, and crossover using the travelling salesman problem as an example.
Artificial Immune System
Wikipedia Accessed: 2026-05-15
Source Archive
Unusually solid Wikipedia article here: covers the history of the field (from Farmer and Perelson, 1986), algorithm taxonomy (negative selection, clonal selection, immune networks), and links to primary literature. Use as orientation, not as a sole source.
Artificial Immune System: Algorithms and Applications
SSARSC Pankaj Chaudhary, Kundan Kumar 2018 Accessed: 2026-05-15
Source Archive
Concise survey of AIS models and algorithms from the past 20 years. Describes clonal selection and negative selection mechanisms.
Artificial Immune Systems
Springer Encyclopedia of Machine Learning Jon Timmis Accessed: 2026-05-15
Source
Encyclopaedia entry from Springer's ML Encyclopaedia. Authoritative definition and taxonomy of AIS. Key bibliographic reference: de Castro & Timmis (2002).
Artificial Immune Systems
arXiv Julie Greensmith, Amanda Whitbrook, Uwe Aickelin Accessed: 2026-05-15
Source Archive
Solid survey covering the field's evolution from 1978 to 2008. Includes diagrams and descriptions of negative selection, clonal selection, and dendritic cell algorithms.
Decision Trees — scikit-learn Documentation
scikit-learn Accessed: 2026-05-15
Source Archive
Official technical documentation. Covers ID3, C4.5, and CART (Breiman et al., 1984), tree advantages and limitations, and overfitting. Authoritative practical source.
Classification and Regression Trees
Wadsworth Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone 1984
ISBN: 978-0412048418.
Canonical original work defining the CART algorithm. Cited by scikit-learn as the primary reference.
Decision Trees Made Simple: Machine Learning Explained
DigitalOcean Shaoni Mukherjee 2025 Accessed: 2026-05-15
Source Archive
Accessible introduction with historical context (Breiman as the creator) and a description of the mechanism.
Alpha–Beta Pruning
Wikipedia Accessed: 2026-05-15
Source Archive
Another unusually strong Wikipedia article: covers the algorithm's history (McCarthy, Knuth & Moore 1975), the minimax mechanism, and concrete figures (node reduction of up to 99.8% for chess with optimal ordering).
Alpha-Beta
Chess Programming Wiki Accessed: 2026-05-15
Source Archive
Specialist technical reference from the chess engine programming community. Cites the original Knuth & Moore (1975) paper and the history of the algorithm's discovery.
Minimax Search and Alpha-Beta Pruning
Cornell University CS312 Accessed: 2026-05-15
Source Archive
Cornell course materials. Reliable explanation of the mechanism with pseudocode and a worked tree example.
Algorithms In Context #7: Decision Trees & Alpha-Beta Pruning
Towards Data Science Can Bayar 2021 Accessed: 2026-05-15
Source Archive
Combines both topics (ML decision trees + alpha-beta) in a single piece. Useful for illustrating the difference between a decision tree in ML and a search tree in game AI.
CPU vs. GPU for Machine Learning
IBM Think Josh Schneider, Ian Smalley Accessed: 2026-05-15
Source Archive
Solid overview without marketing noise. Covers the architecture of both processors and their ML use cases.
GPU vs CPU — Difference Between Processing Units
AWS Accessed: 2026-05-15
Source Archive
Concise comparison from an infrastructure provider.
Compare GPUs vs. CPUs for AI and Machine Learning Use Cases
TechTarget Chris Tozzi 2025 Accessed: 2026-05-15
Source Archive
Broader enterprise-context discussion. Includes a description of the GPU's parallel architecture.
Understanding Parallel Computing: GPUs vs CPUs Explained Simply with the Role of CUDA
DigitalOcean Shaoni Mukherjee 2024 Accessed: 2026-05-15
Source Archive
Practical tutorial with benchmarks. Shows concretely why GPUs win on ML tasks.
Deep Learning
MIT Press Ian Goodfellow, Yoshua Bengio, Aaron Courville 2016 Accessed: 2026-05-15
Source Archive
Canonical reference. Chapters 5 (ML Basics), 6 (Deep Feedforward Networks), and 9 (Convolutional Networks) cover exactly what the explainer describes. Available free online in HTML.
Accelerating AI with GPUs: A New Computing Model
NVIDIA Blog Jensen Huang 2016 Accessed: 2026-05-15
Source Archive
Describes the historical AlexNet breakthrough and the role of GPUs in deep learning.
Why Nvidia Dominates AI: A History of CUDA and Parallel Computing
CRV Science Bryan White 2026 Accessed: 2026-05-15
Source Archive
Detailed history from AlexNet (2012) through subsequent GPU architectures. Confirms the 'same hardware, different application' narrative.
NVIDIA GeForce RTX 4090
NVIDIA Accessed: 2026-05-15
Source Archive
Official product page. 16,384 CUDA cores, Ada Lovelace architecture.
Nvidia GeForce RTX 4090 Announced with 16,384 CUDA Cores
NotebookCheck Anil Ganti 2022 Accessed: 2026-05-15
Source Archive
Trade press coverage at launch. Useful as a historical record.

The Coin-Flip Economy

10 sources

Deep Learning — Chapter 5: Machine Learning Basics
MIT Press / deeplearningbook.org Ian Goodfellow, Yoshua Bengio, Aaron Courville 2016 Accessed: 2026-05-15
Source Archive
Canonical definition of machine learning, generalisation, and the distinction between training and test error.
Note: this is a chapter of the book referenced in the previous explainer.
What Is Machine Learning?
IBM Think Dave Bergmann Accessed: 2026-05-15
Source Archive
Concise industry-level treatment of generalisation as the fundamental goal of ML.
Why Do Machine Learning Algorithms Work on New Data?
Machine Learning Mastery Jason Brownlee 2019 Accessed: 2026-05-15
Source Archive
Accessible but solid piece on why generalisation is ML's 'superpower' and when it fails.
Independent and Identically Distributed Random Variables
Wikipedia Accessed: 2026-05-15
Source Archive
Formal definition of i.i.d. with a section on ML implications. Wikipedia is authoritative here — this is a mathematically precise concept.
Environment and Distribution Shift
Dive into Deep Learning (d2l.ai) Aston Zhang, Zachary Lipton, Mu Li, Alexander J. Smola 2023 Accessed: 2026-05-15
Source Archive
ISBN: 978-1009389433.
Academic online textbook. Detailed description of covariate shift, label shift, and concept shift with concrete examples of deployment failures. Direct counterpart to the 'changed coin' analogy in the explainer.
Data Distribution Shifts and Monitoring
Chip Huyen's Blog Chip Huyen 2022 Accessed: 2026-05-15
Source Archive
Practical engineering perspective on distribution shift, from the author of Designing Machine Learning Systems. Describes when and why models stop working in production.
Causality: Models, Reasoning, and Inference
Cambridge University Press Judea Pearl 2009
ISBN: 978-0521895606.
Foundational work. Formal framework distinguishing association, intervention, and counterfactual. The basis of the entire causal inference field.
The Book of Why
Basic Books Judea Pearl, Dana Mackenzie 2018
ISBN: 978-0465097609.
Popular-science companion to Pearl's Causality. Describes the Ladder of Causation and explains why standard ML operates solely at the level of association.
Causal Inference Is Eating Machine Learning
Towards Data Science Kaushik Rajan 2026 Accessed: 2026-05-15
Source Archive
Solid industry piece connecting Pearl's Ladder to concrete ML failure examples caused by confusing correlation with causation (including the hormone replacement therapy case).
Spurious Correlation, Machine Learning, and Causality
lgmoneda.github.io Luis Moneda 2021 Accessed: 2026-05-15
Source Archive
More technical but accessible piece on what spurious correlation means specifically in ML and how it differs from the classical statistics concept.

33

Garbage In, Gospel Out

7 Topics

Explainers

Data Bias: Learning the Wrong Lesson

7 sources

A Survey on Bias and Fairness in Machine Learning
arXiv / ACM Computing Surveys Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, Aram Galstyan 2021 Accessed: 2026-05-15
Source Archive
Canonical survey with over 1,000 citations. Taxonomy of bias sources (historical, representation, measurement, etc.) and definitions of fairness. Peer-reviewed, available via arXiv and ACM DL.
Bias and Unfairness in Machine Learning Models: A Systematic Review
MDPI Big Data and Cognitive Computing Tiago P. Pagano, Rafael B. Loureiro, Fernanda V. N. Lisboa, Rodrigo M. Peixoto, Guilherme A. S. Guimarães, Gustavo O. R. Cruz, Maira M. Araujo, Lucas L. Santos, Marco A. S. Cruz, Ewerton L. S. Oliveira, Ingrid Winkler, Erick G. S. Nascimento 2023 Accessed: 2026-05-15
Source
PRISMA-compliant systematic review covering 2017–2022. Focuses on bias detection and mitigation techniques. More recent than Mehrabi et al.
What Is Algorithmic Bias?
IBM Think Alexandra Jonker, Julie Rogers Accessed: 2026-05-15
Source Archive
Authoritative industry explanation of how historical bias is inherited by ML models, with concrete examples including predictive policing and Oakland data.
Weapons of Math Destruction
Crown Cathy O'Neil 2016
ISBN: 978-0553418811.
Key popular-science work. Introduces WMD (Weapons of Math Destruction) defined by three characteristics: opacity, scalability, and resistance to challenge. Directly relevant to the idea of encoding bias into math to make it appear objective.
Automating Inequality
St. Martin's Press Virginia Eubanks 2018
ISBN: 978-1250074317.
Companion volume focusing on welfare state systems. Documents how historical bias is amplified by automation and applied at institutional scale.
'Weapons of Math Destruction': Cathy O'Neil on How Unfair Algorithms Perpetuate Inequality
Ford Foundation Jenny Toomey, Lori McGlinchey 2016 Accessed: 2026-05-15
Source Archive
Online summary of O'Neil's arguments with her own commentary. Useful for readers without access to the book.
Towards a Standard for Identifying and Managing Bias in Artificial Intelligence (NIST SP 1270)
NIST Reva Schwartz, Apostol Vassilev, Kristen Greene, Lori Perine, Andrew Burt, Patrick Hall 2022 Accessed: 2026-05-15
Source Archive
Official US government document defining AI bias and recommending management approaches. Its existence as a NIST publication reflects the seriousness of the correction problem.

Data Leakage

8 sources

Leakage in Data Mining: Formulation, Detection, and Avoidance
ACM Transactions on Knowledge Discovery from Data Shachar Kaufman, Saharon Rosset, Claudia Perlich, Ori Stitelman 2012 Accessed: 2026-05-15
Source Archive
Canonical academic paper formalising the concept of leakage. Identifies it as one of the ten biggest mistakes in data mining and describes the 'no-time-machine requirement' — the prohibition on using features that would not be available at prediction time.
Leakage (Machine Learning)
Wikipedia Accessed: 2026-05-15
Source Archive
Solid overview with breakdown into feature leakage and row-wise leakage. Useful as a quick taxonomy reference.
What Is Data Leakage in Machine Learning?
IBM Think Tim Mucci Accessed: 2026-05-15
Source Archive
Authoritative industry explanation of both leakage types (target leakage and train-test contamination) with a description of the normalisation-before-split problem.
How to Avoid Data Leakage When Performing Data Preparation
Machine Learning Mastery Jason Brownlee 2020 Accessed: 2026-05-15
Source Archive
Well-written practical tutorial with code. Explains precisely why normalising the full dataset before the train-test split is a mistake and how to fix it.
The Reusable Holdout: Preserving Validity in Adaptive Data Analysis
Google Research Blog / Science Moritz Hardt 2015 Accessed: 2026-05-15
Source Archive
Seminal paper describing exactly the mechanism of iterative leaderboard overfitting: repeated modification of a model based on holdout set results creates a dependency that invalidates the classical holdout method. Published in Science 349(6248).
A Meta-Analysis of Overfitting in Machine Learning
NeurIPS 2019 Rebecca Roelofs, Vaishaal Shankar, Benjamin Recht, Sara Fridovich-Keil, Moritz Hardt, John Miller, Ludwig Schmidt 2019 Accessed: 2026-05-15
Source Archive
First large meta-analysis of test set reuse across 100+ Kaggle competitions. Results are surprisingly positive (little evidence of widespread overfitting), but the methodology confirms the reality of the mechanism.
Competing in a Data Science Contest Without Reading the Data
mrtz.org (Moritz Hardt's Blog) Moritz Hardt 2015 Accessed: 2026-05-15
Source Archive
Influential post demonstrating how to climb a leaderboard on the Heritage Health Prize without examining the data — purely through algorithmic probing of results. Direct documentation of leaderboard hacking.
Leakage and the Reproducibility Crisis in Machine-Learning-Based Science
Patterns (Cell Press) Sayash Kapoor, Arvind Narayanan 2023 Accessed: 2026-05-15
Source
Review of 294 published papers across 17 scientific fields affected by data leakage. Demonstrates that the problem is neither abstract nor marginal — it has affected hundreds of peer-reviewed publications.

Case Studies

Amazon's Hiring Bot — When AI Learned to Be a Bro

10 sources

Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women
CNBC / Reuters 2018 Accessed: 2026-05-15
Source Archive
Primary report based on five anonymous internal sources. Describes the 1–5 star rating system, Edinburgh team, ten years of training data, the 'executed'/'captured' problem, and the team's disbandment in early 2017.
Amazon Ditched AI Recruitment Software Because It Was Biased Against Women
MIT Technology Review Erin Winick 2018 Accessed: 2026-05-15
Source Archive
Concise and reliable summary of the Reuters report.
Why Amazon's Automated Hiring Tool Discriminated Against Women
ACLU Rachel Goodman 2018 Accessed: 2026-05-15
Source Archive
Confirms details on the verbs 'executed' and 'captured' and the proxy discrimination mechanism. Adds legal context (Title VII).
Automating Discrimination: AI Hiring Practices and Gender Inequality
Cardozo Law Review Lori Andrews, Hannah Bucher 2022 Accessed: 2026-05-15
Source Archive
Legal article describing hindsight bias in recruitment systems. Directly analyses the Amazon case and proxy features in CVs. References the 74% male historical CV pool figure.
Fairness and Bias in Algorithmic Hiring: A Multidisciplinary Survey
ACM Transactions on Intelligent Systems and Technology Alessandro Fabris, Nina Baranowska, Matthew J. Dennis, David Graus, Philipp Hacker, Jorge Saldivar, Frederik Zuiderveen Borgesius, Asia J. Biega 2025 Accessed: 2026-05-15
Source Archive
Academic survey formalising proxy features and sensitive attribute proxies in hiring contexts (names, institutions, CV language structure as gender proxies).
Gender, Race, and Intersectional Bias in Resume Screening via LLM Retrieval
Brookings Kyra Wilson, Aylin Caliskan 2025 Accessed: 2026-05-15
Source Archive
Empirical study confirming that AI systems continue to favour names and linguistic features associated with male and white candidates.
Gender Equality Index 2020 — Men Dominate Technology Development
EIGE (European Institute for Gender Equality) 2020 Accessed: 2026-05-15
Source Archive
European data. Women represent approximately 17% of ICT specialists in the EU. More appropriate than US figures given the algorithm was developed and trained in Scotland.
Breakdown of Female IT Professionals & IT Technicians in the UK, 2016–2019
Statista / STEM Women Diana Elagina 2025 Accessed: 2026-05-15
Source
UK data: 18% women among IT professionals in 2016, declining to 16% in 2019. Confirms that an 80/20 or 84/16 split is a realistic approximation for technical roles in Scotland in 2014.
Women in Tech Stats UK 2025
Women in Tech Network 2025 Accessed: 2026-05-15
Source Archive
Accessible summary with historical and current data. Only 22% women among AI and data specialists in the UK.
Women and Men in the IT Profession
ResearchGate Vicki R. McKinney, Darryl D. Wilson, Nita Brooks, Anne O'Leary-Kelly 2008 Accessed: 2026-05-15
Source

Sentenced by Spreadsheet: The COMPAS Recidivism Racket

10 sources

Machine Bias
ProPublica Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner 2016 Accessed: 2026-05-15
Source Archive
Original investigative report. Describes the 137 questions, 1–10 scale, and false positive/false negative asymmetry for Black and white defendants in a Broward County, Florida sample.
How We Analyzed the COMPAS Recidivism Algorithm
ProPublica Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner 2016 Accessed: 2026-05-15
Source Archive
Methodology document. Specifies sample size (6,172–7,214 individuals), recidivism definition (2 years), and data sources. Essential for verifying the figures.
The Accuracy, Fairness, and Limits of Predicting Recidivism
Science Advances / PubMed Central Julia Dressel, Hany Farid 2018 Accessed: 2026-05-15
Source Archive
Peer-reviewed study showing COMPAS is no more accurate than untrained humans, and that just two variables (age and number of prior convictions) achieve the same accuracy. Strong scientific confirmation of the system's limitations.
State v. Loomis, 881 N.W.2d 749 (Wis. 2016)
Wisconsin Supreme Court 2016 Accessed: 2026-05-15
Source Archive
Primary court document. Wisconsin Supreme Court (not US Supreme Court). Describes the limitations imposed on COMPAS use and the grounds for rejecting the due process challenge.
State v. Loomis
Harvard Law Review 2017 Accessed: 2026-05-15
Source Archive
Legal analysis of the ruling. Covers Loomis's arguments, the court's response, and the precedential significance.
Loomis v. Wisconsin
SCOTUSblog 2017 Accessed: 2026-05-15
Source Archive
Documents the US Supreme Court's denial of certiorari (2017). Important for precision: this is not SCOTUS endorsement of the ruling, only a refusal to intervene.
Algorithmic Due Process: Mistaken Accountability and Attribution in State v. Loomis
Harvard Journal of Law & Technology Ellora Israni 2017 Accessed: 2026-05-15
Source Archive
Academic critique of the ruling. Argues the court misunderstood how the algorithm works and what safeguards would be required.
Injustice Ex Machina: Predictive Algorithms in Criminal Sentencing
UCLA Law Review Andrew Lee Park 2019 Accessed: 2026-05-15
Source Archive
Legal survey of proxy discrimination in recidivism systems. Covers both Loomis and the ProPublica data.
Fair Prediction with Disparate Impact
arXiv / Big Data Alexandra Chouldechova 2017 Accessed: 2026-05-15
Source Archive
Mathematical proof of impossibility: it is not possible to simultaneously satisfy calibration (Northpointe's argument) and equal error rates (ProPublica's argument) when base rates differ between groups. Explains why both sides of the dispute were correct within their own frameworks.
COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity
Northpointe (Equivant) William Dieterich, Christina Mendoza, Tim Brennan 2016 Accessed: 2026-05-15
Source Archive
Vendor response to the ProPublica report. Defends calibration as the correct fairness measure. Useful for illustrating that the debate has two sides with formal arguments.

The Dollar-Sign Diagnosis

5 sources

Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations
Science Ziad Obermeyer, Brian Powers, Christine Vogeli, Sendhil Mullainathan 2019 Accessed: 2026-05-15
Source Archive
Primary research paper. Source for all key figures: 17.7%, 46.5%, and the cost-as-proxy mechanism.
There Is No Such Thing as Race in Health-Care Algorithms
The Lancet Digital Health 2019 Accessed: 2026-05-15
Source
Peer-reviewed medical commentary. Specifies the $1,800 cost difference and the 200 million patients per year affected figure.
Racial Bias Found in a Major Health Care Risk Algorithm
Scientific American Starre Vartan 2019 Accessed: 2026-05-15
Source Archive
Accessible overview with quotes from Obermeyer. Explains the mechanism of choosing cost as a proxy and the limits of the audit.
Racial Bias Found in Widely Used Health Care Algorithm
NBC News Quinn Gawronski 2019 Accessed: 2026-05-15
Source Archive
Day-of-publication coverage (November 7, 2019). Includes Optum's response and the $1,800 figure.
Study Finds Racial Bias in Optum Algorithm
Healthcare Finance News Susan Morse 2019 Accessed: 2026-05-15
Source Archive
Industry coverage. Contains Optum's full statement disputing the researchers' conclusions — useful for the company response section.

Google Photos (2015) — The Tag That Broke the Internet

9 sources

'Machine Learning Is Hard': Google Photos Has Egregious Facial Recognition Error
CNBC / Re/code Mark Bergen 2015 Accessed: 2026-05-15
Source Archive
Most detailed original account. Contains the 'roughly 90 minutes' detail, describes Yonatan Zunger as 'chief architect of Google+', and the '100% Not OK' quote.
Google Apologizes for Mis-Tagging Photos of African Americans as 'Gorillas'
CBS News Amanda Schupak 2015 Accessed: 2026-05-15
Source Archive
Variant timeline ('about two hours later') with Google's full response. Good second source for the chronology.
Google Apologizes After App Mistakenly Labels Black People 'Gorillas'
CBC News 2015 Accessed: 2026-05-15
Source Archive
Contains Zunger's tweet in full: 'Holy fuck. G+ CA here. No, this is not how you determine someone's target market. This is 100% Not OK.' Confirms 'appalled and genuinely sorry' wording.
Google Photos Still Has a Problem with Gorillas
MIT Technology Review Jackie Snow 2018 Accessed: 2026-05-15
Source Archive
First independent verification after two years. Wired tested 40,000 images and confirmed censorship of 'gorilla', 'chimp', 'chimpanzee', and 'monkey'.
Google Censors Gorillas Rather Than Risk Them Being Mislabeled As Black People
Gizmodo Sidney Fussell 2018 Accessed: 2026-05-15
Source Archive
Analysis of the censorship mechanism. Specifies the list of blocked terms.
Google's Photo App Still Can't Find Gorillas. And Neither Can Apple's.
The New York Times Nico Grant, Kashmir Hill 2023 Accessed: 2026-05-15
Source
Investigative piece eight years on. Tests Google, Apple, Amazon, and Microsoft. Confirms Apple Photos and Microsoft OneDrive have the same problem; Amazon Photos returns broad primate results.
Google's Photos App Is Still Unable to Find Gorillas
PetaPixel Pesala Bandara 2023 Accessed: 2026-05-15
Source Archive
Summary of the NYT investigation with technical commentary. Includes Google's quote that 'the benefit does not outweigh the risk of harm'.
Facebook Apologizes For 'Primates' Label On Video Of Black Men
NPR Dustin Jones 2021 Accessed: 2026-05-15
Source Archive
Original account of the 2021 Facebook incident. Confirms the shutdown of the entire topic recommendation system.
Facebook Apologizes After Its A.I. Mislabels Video of Black Men as 'Primates'
Slate Daniel Politi 2021 Accessed: 2026-05-15
Source Archive
Contains Facebook's official statement: 'We disabled the entire topic recommendation feature as soon as we realized this was happening.'

AI vs COVID-19: A Great Promise, Zero Utility

5 sources

Common Pitfalls and Recommendations for Using Machine Learning to Detect and Prognosticate for COVID-19 Using Chest Radiographs and CT Scans
Nature Machine Intelligence Michael Roberts, Derek Driggs, Matthew Thorpe, Julian Gilbey, Michael Yeung, Stephan Ursprung, Angelica I. Aviles-Rivero, Christian Etmann, Cathal McCague, Lucian Beer, Jonathan R. Weir-McCall, Zhongzhao Teng, Effrossyni Gkrania-Klotsas, James H. F. Rudd, Evis Sala, Carola-Bibiane Schönlieb 2021 Accessed: 2026-05-15
Source Archive
2,212 papers found; 415 after initial screening; 62 after quality screening; zero models suitable for clinical use. Describes patient overlap, Frankenstein datasets, shortcut learning, patient positioning as leakage, and peer review failures.
Machine Learning Models for Diagnosing COVID-19 Are Not Yet Suitable for Clinical Use
University of Cambridge 2021 Accessed: 2026-05-15
Source Archive
Official Cambridge press release with quotes from Roberts and Rudd. Accessible summary of the findings.
Of 300-Plus Imaging-Based AI Models for COVID-19 Diagnosis, Zero Suitable for Clinical Use
Radiology Business Marty Stempniak 2021 Accessed: 2026-05-15
Source Archive
Shortcut Learning in Deep Neural Networks
Nature Machine Intelligence Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix A. Wichmann 2020 Accessed: 2026-05-15
Source
Canonical paper defining shortcut learning. Background for the section on patient positioning and technical artefacts as spurious features.
Leakage and the Reproducibility Crisis in Machine-Learning-Based Science
Patterns (Cell Press) Sayash Kapoor, Arvind Narayanan 2023 Accessed: 2026-05-15
Source
Also relevant here: the review covers cases from medicine and imaging, not only EHR. Complements the Roberts et al. findings with a broader reproducibility crisis perspective.

34

Models That Learned the Wrong Lesson

9 Topics

Explainers

Edge Cases & Tail Risk

8 sources

Pitfalls of Machine Learning for Tail Events in High Risk Environments
ResearchGate Christian Agrell, Simen Eldevik, Andreas Hafver, Frank Børre Pedersen 2019 Accessed: 2026-05-15
Source
Academic survey of ML limitations in high-risk environments. Formally describes the constraints of correlation-based models when applied to tail events.
Tail Risk
Investopedia Meagan Drew 2025 Accessed: 2026-05-15
Source
Accessible financial-sector explanation of tail risk. Covers normal distribution vs. fat tails.
Data Distribution Shifts and Monitoring
Chip Huyen's Blog Chip Huyen 2022 Accessed: 2026-05-15
Source Archive
Engineering-focused explanation of distribution shift and edge cases in ML. Practical and accessible for technical readers.
The Black Swan: The Impact of the Highly Improbable
Random House Nassim Nicholas Taleb 2007
ISBN: 978-1-4000-6351-2.
Original source for the term and the three defining criteria of a Black Swan event.
Black Swan Event
Britannica Sanat Pai Raikar Accessed: 2026-05-15
Source Archive
Solid encyclopaedic definition with the history of the term (Juvenal → de Vlamingh → Taleb). Covers the 1697 de Vlamingh sighting in Australia.
Black Swan Theory
Wikipedia Accessed: 2026-05-15
Source Archive
Well-documented article with citations. Covers the three criteria, history, and critiques of the theory. Useful as a quick reference.
Black Swan (Bird)
Wikipedia Accessed: 2026-05-15
Source Archive
Confirms the first European observations: Antonie Caen (1636) and Willem de Vlamingh (1697, Swan River, Western Australia). Cygnus atratus as a species endemic to Australia.
How Much Does the Average Car Weigh?
ConsumerAffairs Alexus Bazen 2026 Accessed: 2026-05-15
Source Archive
Contains the weight figures used in the text, including the average passenger car (~4,000 lbs / ~2 US tons, EPA) and the maximum semi-truck weight (80,000 lbs / 40 US tons, FMCSA). Data current as of 2024.

Overfitting

6 sources

Reconciling Modern Machine-Learning Practice and the Classical Bias–Variance Trade-Off
PNAS Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal 2019 Accessed: 2026-05-15
Source Archive
Canonical paper on the bias-variance tradeoff and the 'double descent' phenomenon. Formally defines overfitting as fitting to noise at the cost of generalisation.
Neural Networks and the Bias/Variance Dilemma
Neural Computation Stuart Geman, Elie Bienenstock, René Doursat 1992 Accessed: 2026-05-15
Source
Classic paper introducing the formal bias-variance tradeoff description to ML. Historical source of the concept. Neural Computation 4(1), 1–58.
What Is Bias-Variance Tradeoff?
IBM Think Fangfang Lee Accessed: 2026-05-15
Source Archive
Solid industry explanation describing overfitting as a model learning 'the noise along with the signal'. Includes a conceptual diagram.
Overfitting
Google Machine Learning Crash Course Accessed: 2026-05-15
Source Archive
Official Google documentation. Accessible and authoritative, with a concrete operational definition.
Overfitting vs. Underfitting: A Complete Example
Towards Data Science Dmytro Nikolaiev 2021 Accessed: 2026-05-15
Source Archive
Accessible explanation with visualisations and analogies. Good for illustrating the mechanism to non-specialist readers.
But What Is a Neural Network?
YouTube / 3Blue1Brown 3Blue1Brown 2017 Accessed: 2026-05-15
Source
Widely-viewed popular explanation of how neural networks work. Overfitting appears as a natural context of the learning process.

Spurious Correlations

7 sources

Spurious Correlations
Hachette Books Tyler Vigen 2015 Accessed: 2026-05-15
Source Archive
ISBN: 978-0-316-33943-8.
Original source for both examples. Book and website contain hundreds of auto-generated spurious correlations.
Mozzarella Cheese Consumption Correlates with Bachelor's Degrees Awarded in Engineering
tylervigen.com Tyler Vigen Accessed: 2026-05-15
Source Archive
Tyler Vigen's website contains hundreds of auto-generated spurious correlations.
Auto-generated correlation from public datasets.
I deliberately chose absurd examples to illustrate the mechanism, not as verified empirical claims.
Popularity of the First Name Jordan Correlates with Robberies in South Carolina
tylervigen.com Tyler Vigen Accessed: 2026-05-15
Source Archive
Tyler Vigen's website contains hundreds of auto-generated spurious correlations.
Auto-generated correlation from public datasets.
I deliberately chose absurd examples to illustrate the mechanism, not as verified empirical claims.
The Book of Why: The New Science of Cause and Effect
Basic Books Judea Pearl, Dana Mackenzie 2018
ISBN: 978-0-465-09760-9.
Canonical source for the distinction between correlation and causation. Formally defines spurious correlation as correlation arising from a confounder or chance.
Investigating Causal Relations by Econometric Models and Cross-Spectral Methods
Econometrica C. W. J. Granger 1969 Accessed: 2026-05-15
Source
Classic paper defining Granger causality. Historical context for the spurious correlations problem in time-series analysis.
Spurious Correlations: The Comedy and Drama of Statistics
Towards Data Science Celia Banks 2024 Accessed: 2026-05-15
Source Archive
Accessible explanation of the mechanism.
The Spurious Correlations' Missing Data
Investigative Economics Llewellyn Jones 2021 Accessed: 2026-05-15
Source
Critical analysis of Vigen's data. Documents that some datasets have disappeared and correlations may be based on outdated figures. Useful as an honest caveat to the mozzarella example.

Reward Hacking

6 sources

Specification Gaming: The Flip Side of AI Ingenuity
DeepMind Blog Victoria Krakovna, Jonathan Uesato, Vladimir Mikulik, Matthew Rahtz, Tom Everitt, Ramana Kumar, Zac Kenton, Jan Leike, Shane Legg 2020 Accessed: 2026-05-15
Source Archive
Canonical DeepMind article defining specification gaming. Includes examples: Lego stacking, boat racing, social media engagement.
Specification Gaming Examples in AI
Victoria Krakovna's Blog Victoria Krakovna 2018 Accessed: 2026-05-15
Source Archive
Extensive, regularly updated list of documented reward hacking cases with literature references. Excellent source for the mini-case sections.
Concrete Problems in AI Safety
arXiv Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané 2016 Accessed: 2026-05-15
Source Archive
Seminal OpenAI/Google Brain paper introducing reward hacking as a formal AI safety problem.
Categorizing Variants of Goodhart's Law
arXiv David Manheim, Scott Garrabrant 2018 Accessed: 2026-05-15
Source Archive
Academic taxonomy of Goodhart's Law variants. Connects reward hacking to the broader class of objective specification problems.
Reward Hacking in Reinforcement Learning
Lil'Log (Lilian Weng) Lilian Weng 2024 Accessed: 2026-05-15
Source Archive
Technical but accessible survey with examples. The author formerly led safety work at OpenAI, lending the piece practical authority.
Reward Hacking
Wikipedia Accessed: 2026-05-15
Source Archive
Solid introduction with examples and links to primary literature.

Case Studies

The 70mph (110km/h) Panic Attack — Phantom Braking

14 sources

NHTSA Preliminary Evaluation PE22-002
NHTSA 2022 Accessed: 2026-05-15
Source Archive
Primary regulatory document. 354 complaints, 416,000 vehicles (2021–2022 Model 3 and Model Y), description of the braking mechanism.
NHTSA Opens Safety Probe into 416,000 Teslas for 'Phantom Braking'
Automotive News Audrey LaForest 2022 Accessed: 2026-05-15
Source Archive
Day-of-opening industry coverage of the preliminary evaluation.
Tesla's Phantom-Braking Complaints Lead to NHTSA Investigation
Carscoops Sebastien Bell 2022 Accessed: 2026-05-15
Source
Detailed account with context: 354 complaints over 9 months, description of the braking events.
Tesla Removes Radar Sensors From Model 3 and Model Y
The Drive Rob Stumpf 2021 Accessed: 2026-05-15
Source Archive
First account after the decision was announced. Cites official Tesla documentation.
How Elon Musk Knocked Tesla's 'Full Self-Driving' Off Course
Washington Post 2023 Accessed: 2026-05-15
Source
Tesla engineers were reportedly alarmed by Musk's decision and contacted a former executive to dissuade him. Musk 'was unconvinced and overruled his engineers.' Directly connects radar removal to increased accidents and near-misses. Based on conversations with nearly a dozen former employees. Note: article may be paywalled.
Tesla Autopilot Hardware
Wikipedia Accessed: 2026-05-15
Source Archive
Well-documented history of the hardware decisions. Includes the quote that Musk 'overruled' engineers who warned against removing radar.
Elon Musk Overruled Tesla Engineers Who Knew Removing Radar Was a Bad Idea
Carscoops Brad Anderson 2023 Accessed: 2026-05-15
Source
Good secondary account. Quotes former NHTSA adviser Missy Cummings: radar served as a 'sensor fusion way to check if there is a problem', and removing it was 'a big part of' the phantom braking issues.
Tesla Engineers Tried to Convince Elon Musk Not to Give Up Radar for Self-Driving
Electrek Fred Lambert 2023 Accessed: 2026-05-15
Source Archive
Adds nuance: Musk was frustrated with the quality of available radars, not arguing radar was unnecessary in principle. Useful for avoiding oversimplification.
LiDAR vs Tesla: Why Tesla Avoids LiDAR and Chooses Vision for Self-Driving
Sustainable Business Magazine Accessed: 2026-05-15
Source
Industry assessment comparing camera and LiDAR approaches. Not a primary source — a trade-level evaluation of the technology trade-offs and cost differences.
Cameras vs LiDAR: The Battle for Vision in Autonomous Vehicles
Arcadian.ai Olivia Campbell 2025 Accessed: 2026-05-15
Source Archive
Industry-level comparison. Not a primary source — a trade assessment of the two sensor approaches.
US Government Investigates 1.7 Million Honda Cars Over Phantom Braking
MotorSafety.org Bojan Popic 2022 Accessed: 2026-05-15
Source Archive
Opening of the preliminary evaluation (PE) for 1.73 million Hondas (2018–19 Accord, 2017–19 CR-V). 278 complaints, description of the CMBS mechanism.
NHTSA Gets Serious About Phantom Braking Issue Now Affecting 3 Million Hondas
Carscoops Chris Chilton 2024 Accessed: 2026-05-15
Source
Expansion of the investigation to ~3 million vehicles. 47 accidents, 93 injuries. Context for the scale of the problem.
Mazda3 Recall: Sudden, Unexpected Emergency Braking
Consumer Reports Chris Chilton 2019 Accessed: 2026-05-15
Source Archive
35,390 vehicles (2019–2020 Mazda3), Smart Brake Support (SBS), false detection mechanism, no injuries reported.
Statement on Recall of Certain 2019–2020 MY Mazda3 Vehicles
Mazda North American Operations 2019 Accessed: 2026-05-15
Source Archive
Official manufacturer statement. Confirms 'incorrect programming of the SBS control software'.

Zillow: The Algorithm That Bought Too Many Houses

8 sources

Zillow Q3 2021 Earnings — SEC Form 8-K
SEC EDGAR Rich Barton, Allen Parker 2021 Accessed: 2026-05-15
Source
Primary regulatory document. Contains the exact Barton quote, $304M Q3 write-down, Q4 forecast, and 25% workforce reduction.
Zillow to Shutter Home Buying Business and Lay Off 2,000 Employees
GeekWire Taylor Soper 2021 Accessed: 2026-05-15
Source Archive
2,000 layoffs (25% of workforce), write-down over $500M, 9,790 homes in Q3 inventory.
Zillow Reports $880M Loss on Failed Home-Flipping Business
The Real Deal 2022 Accessed: 2026-05-15
Source Archive
Total Zillow Offers loss for 2021: $880M. Context: $6B segment revenue, zero profit.
Zillow Offers Homes for Sale
CBS News Rachel Layne 2021 Accessed: 2026-05-15
Source Archive
Data on contracted and inventory homes (~18,000 combined, at various stages of contract and ownership). Source for extrapolation.
Home Prices Skyrocketed by Nearly 19% Last Year
CNN Anna Bahney 2022 Accessed: 2026-05-15
Source Archive
Case-Shiller National Index: +18.8% in 2021, the largest annual gain in the index's 34-year history. Confirms the 'nearly 20%' figure.
Flip Flop: Why Zillow's Algorithmic Home Buying Venture Imploded
Stanford Graduate School of Business Amit Seru, Greg Buchak 2021 Accessed: 2026-05-15
Source Archive
Academic analysis from Stanford. Describes the structural problem of iBuying, the speed-vs-accuracy tradeoff, and why the model could not function under low-liquidity conditions.
Why the iBuying Algorithms Failed Zillow
GeekWire John Cook 2021 Accessed: 2026-05-15
Source Archive
Includes MoxiWorks CEO quote: 'all the AI and machine learning in the world isn't yet up to the task.' Notes the Black Swan framing used as a corporate excuse. Broader AI-in-business context.
Is Zillow 'Cursed'? A Behavioral Economics Perspective
Towards Data Science Florent Buisson 2021 Accessed: 2026-05-15
Source
Good explanation of the Winner's Curse mechanism in the Zillow context. Describes the feedback loop and structural reasons why iBuying was prone to systematic overpaying.

Husky vs. Wolf (2016)

3 sources

'Why Should I Trust You?': Explaining the Predictions of Any Classifier
arXiv / KDD 2016 Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin 2016 Accessed: 2026-05-15
Source Archive
Original paper introducing the Husky vs. Wolf experiment and the LIME explanation method.
'Why Should I Trust You?': Explaining the Predictions of Any Classifier (KDD '16 Proceedings)
ACM Digital Library Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin 2016 Accessed: 2026-05-15
Source Archive
Official conference proceedings version. Contains Table 2 'Husky vs Wolf experiment results'.
Transparency and the Future of Artificial Intelligence
Analytics Magazine (INFORMS) Charles Simon 2020 Accessed: 2026-05-15
Source Archive
Accessible discussion of the case with illustration. Useful industry secondary source for non-academic readers.

Lying Down with AI — Position Bias in Medical Imaging

5 sources

In Bed with AI: Aided Diagnosis of Supine Chest Radiographs
Radiology Mark O. Wielpütz 2022 Accessed: 2026-05-15
Source
Editorial commenting on position bias in AI for chest X-rays. The title directly inspired the case study title.
Artificial Intelligence Algorithm Detecting Lung Infection in Supine Chest Radiographs of Critically Ill Patients
PubMed / Critical Care Medicine Johannes Rueckel, Wolfgang G. Kunz, Boj F. Hoppe, Maximilian Patzig, Mike Notohamiprodjo, Felix G. Meinel, Clemens C. Cyran, Michael Ingrisch, Jens Ricke, Bastian O. Sabel 2020 Accessed: 2026-05-15
Source Archive
Original study of AI applied to supine chest X-rays. Foundational context for the position bias problem.
Variable Generalization Performance of a Deep Learning Model to Detect Pneumonia in Chest Radiographs: A Cross-Sectional Study
PLOS Medicine John R. Zech, Marcus A. Badgeley, Manway Liu, Anthony B. Costa, Joseph J. Titano, Eric Karl Oermann 2018 Accessed: 2026-05-15
Source Archive
Canonical study documenting hospital-specific shortcuts. Models learn institutional correlations rather than pathology.
Slicing Through Bias: Explaining Performance Gaps in Medical Image Analysis
arXiv Vincent Olesen, Nina Weng, Aasa Feragen, Eike Petersen 2024 Accessed: 2026-05-15
Source Archive
Documents chest drain as a shortcut in pneumothorax classification and ECG cables in atelectasis detection. Confirms the mechanism described in the text.
Shortcut Learning in Deep Neural Networks
Nature Machine Intelligence Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix A. Wichmann 2020 Accessed: 2026-05-15
Source
Canonical paper defining shortcut learning. Provides the broader academic context for the mechanism in this case study.

The Malicious Compliance Files

8 sources

The First Level of Super Mario Bros. Is Easy with Lexicographic Orderings and Time Travel
SIGBOVIK 2013 Tom Murphy VII 2013 Accessed: 2026-05-15
Source Archive
Original paper. Contains the description of PlayFun's Tetris behaviour and the pausing mechanism.
Programmer Creates An AI To (Not Quite) Beat NES Games
TechCrunch John Biggs 2013 Accessed: 2026-05-15
Source Archive
First media account. Quotes the Tetris pausing behaviour.
Teaching a Computer to Play Mario… Seemingly Through Voodoo
Hackaday Mike Szczys 2013 Accessed: 2026-05-15
Source Archive
Contains the direct quote from the paper on pausing: 'Death is imminent, so playfun pauses the game shortly after this and then doesn't unpause it.'
Faulty Reward Functions in the Wild
OpenAI Blog Jack Clark, Dario Amodei 2016 Accessed: 2026-05-15
Source
Primary OpenAI blog post on the boat racing agent. Describes the mechanism, the targets, the fire, and the agent's score, about 20% higher than that of human players.
Reward Hacking
Wikipedia Accessed: 2026-05-15
Source Archive
Secondary reference with historical context and links to primary works.
Creatures (Video Game Series)
Wikipedia Accessed: 2026-05-15
Source Archive
Confirms the drives/pain/pleasure mechanism, Norn biochemistry, Creatures 2 release (1998), and Steve Grand's role.
The AI of Creatures
Alan Zucconi's Blog Alan Zucconi 2020 Accessed: 2026-05-15
Source
Technical explanation of the biochemistry and reward/punishment mechanism in Creatures. Good accessible source for the mechanism described in the case study.
Creation: Life and How to Make It
Weidenfeld & Nicolson Steve Grand 2000
ISBN: 978-0-7538-1254-0.
Grand's own account of the Creatures project. Primary source for his intentions and description of emergent Norn behaviour.

35

The Black Box Problem

7 Topics

Explainers

Interpretability

11 sources

'Why Should I Trust You?': Explaining the Predictions of Any Classifier
arXiv / KDD 2016 Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin 2016 Accessed: 2026-05-15
Source Archive
Original LIME paper. Also cited in Chapter 34 (Husky vs. Wolf experiment).
A Unified Approach to Interpreting Model Predictions
arXiv / NeurIPS 2017 Scott M. Lundberg, Su-In Lee 2017 Accessed: 2026-05-15
Source Archive
Original SHAP paper. Formally connects Shapley values to ML interpretability and shows the relationship with LIME.
A Value for n-Person Games
Cambridge University Press Lloyd Shapley 1953 Accessed: 2026-05-15
Source Archive
In: Contributions to the Theory of Games II, pp. 307–317.
Original Shapley (1953) paper. Source for the φᵢ(v) formulation and the four axioms. Cite as a book chapter.
Interpretable Machine Learning — SHAP Chapter
christophm.github.io Christoph Molnar 2025 Accessed: 2026-05-15
Source Archive
Accessible explanation of SHAP, Shapley values, and their history. Open online book widely cited in academic literature.
ISBN: 978-3911578035
Note: this is a chapter of a book that is under copyright, but the chapter itself is freely available online and can be cited directly. I strongly recommend purchasing the book.
Techniques for Interpretable Machine Learning
arXiv / Communications of the ACM Mengnan Du, Ninghao Liu, Xia Hu 2019 Accessed: 2026-05-15
Source Archive
Formally defines local vs. global interpretability and describes the accuracy-interpretability tradeoff across model classes (decision trees vs. neural networks).
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges
arXiv / Information Fusion Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, Raja Chatila, Francisco Herrera 2019 Accessed: 2026-05-15
Source Archive
Canonical XAI survey cited by hundreds of papers. Defines the interpretability-accuracy tradeoff. Key academic background for the 'dirty secret' described in the text.
European Union Regulations on Algorithmic Decision-Making and a 'Right to Explanation'
arXiv / AI Magazine Bryce Goodman, Seth Flaxman 2016 Accessed: 2026-05-15
Source Archive
Describes the regulatory dimension of the 'why me?' question — the right to explanation for algorithmic decisions under GDPR. Legal background for the loan officer scenario.
Consumer Financial Protection Circular 2022-03: Adverse Action Notification Requirements in Connection with Credit Decisions Based on Complex Algorithms
CFPB 2022 Accessed: 2026-05-15
Source Archive
Key US regulatory document. CFPB explicitly states that creditors 'cannot lawfully use technologies in their decision-making processes if using them means that they are unable to provide these required explanations.'
Using Artificial Intelligence and Algorithms
FTC Andrew Smith 2020 Accessed: 2026-05-15
Source Archive
FTC guidance confirming that FCRA (1970) and ECOA (1974) already require explanation of algorithm-based decisions: the US right to explanation is roughly 50 years older than today's ML-based credit models.
Administrative Provisions on Algorithm Recommendation of Internet Information Services
Cyberspace Administration of China / Lexology Jingyuan Shi, Yuchen Lai, Yanyu Lai 2022 Accessed: 2026-05-15
Source Archive
China's first comprehensive algorithm regulations (effective 1 March 2022). Require disclosure of 'basic principles, purposes and mechanics' of recommendation algorithms. The Chinese-market counterpart to GDPR Article 22.
What China's Algorithm Registry Reveals About AI Governance
Carnegie Endowment for International Peace Matt Sheehan, Sharon Du 2022 Accessed: 2026-05-15
Source Archive
Analysis of China's algorithm registry. Describes the episode where ByteDance had to explain its algorithm to CAC officials 'using metaphors and simplified language' — strong narrative context for the interpretability limits section.

Big Data

6 sources

3D Data Management: Controlling Data Volume, Velocity and Variety
Alim.org Doug Laney 2001 Accessed: 2026-05-15
Source Archive
Original paper introducing the 3V definition (Volume, Velocity, Variety). Meta Group Research Note 949, 6 February 2001. Gartner acquired Meta Group in 2005.
Note: the original Gartner blog article does not exist anymore (https://www.gartner.com/en/articles/strategic-predictions-for-2026). Alternative source provided.
Big Data: A Revolution That Will Transform How We Live, Work, and Think
Houghton Mifflin Harcourt Viktor Mayer-Schönberger, Kenneth Cukier 2013
ISBN: 978-0-544-00269-2.
Most influential popular-science book on Big Data. Responsible for bringing the term into the mainstream around 2013–2015.
Big Data: Principles and Best Practices of Scalable Realtime Data Systems
Manning Nathan Marz, James Warren 2015
ISBN: 978-1-617-29034-3.
Canonical technical reference. Describes Lambda architecture (batch + speed layer), distributed systems, and failure modes. Standard industry reading.
The Google File System
Google Research / SOSP 2003 Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung 2003 Accessed: 2026-05-15
Source
Original paper describing how to scale storage across thousands of servers, treating disk failures as the norm rather than the exception. Foundation of the Big Data infrastructure era.
MapReduce: Simplified Data Processing on Large Clusters
Google Research / OSDI 2004 Jeffrey Dean, Sanjay Ghemawat 2004 Accessed: 2026-05-15
Source Archive
Paper that gave rise to Hadoop and the entire Big Data ecosystem. Describes distributed computation on clusters subject to partial failure. Canonical source for 'hundreds or thousands of servers working in parallel'.
Fallacies of Distributed Computing
Wikipedia Accessed: 2026-05-15
Source Archive
Classic list of eight false assumptions about distributed networks (the network is reliable, latency is zero, etc.). Origin: Sun Microsystems, 1994–1997. The original has no single canonical URL; Wikipedia reproduces the full list with history.

Case Studies

IBM Watson for Oncology

8 sources

IBM's Watson Recommended 'Unsafe and Incorrect' Cancer Treatments
STAT News Casey Ross, Ike Swetlitz 2018 Accessed: 2026-05-15
Source Archive
Original investigation based on internal IBM documents (deputy chief health officer slide decks). Describes synthetic training data, the hemorrhaging patient case, and deviations from NCCN guidelines.
IBM's Watson Recommended 'Unsafe and Incorrect' Cancer Treatments (PDF Archive)
STAT News Casey Ross, Ike Swetlitz 2018 Accessed: 2026-05-15
Source Archive
Archived version with the specific case description: '65-year-old man with newly diagnosed lung cancer and evidence of severe bleeding'.
Documents Raise Alarm Over Watson's Diagnostic Flaws
Boston Globe Casey Ross, Ike Swetlitz 2018 Accessed: 2026-05-15
Source Archive
Independent account using the same documents. Confirms the 2012 partnership date, sales in Asia before training was complete, and NCCN deviations.
Watson Supercomputer Recommended Unsafe Treatments
ASH Clinical News 2018 Accessed: 2026-05-15
Source Archive
Account from the medical community (American Society of Hematology). Quotes Dr. David Gorski: 'Watson is basically the MSKCC way, which might or might not be the right way in every case.'
IBM Watson Oncology: Not Living Up to Expectations
Medscape Roxanne Nelson 2018 Accessed: 2026-05-15
Source Archive
Clinical perspective. Describes the bias arising from MSK-centric training and generalisation problems across different patient populations.
IBM Watson: The Inside Story of How the Jeopardy-Winning Supercomputer Was Born
BBC News 2014 Accessed: 2026-05-15
Source Archive
Historical context for Watson and its commercialisation. Background for the marketing hype section.
IBM Introduces Watson to Africa and the Middle East
CIO.de Joab Jackson 2015 Accessed: 2026-05-15
Source
Confirms Watson Health deployment in Africa and the Middle East.
IBM Watson Was Once Heralded as the Future of Healthcare AI: What Exactly Went Wrong?
Healthcare Digital 2026 Accessed: 2026-05-15
Source Archive
Comprehensive retrospective. Covers deployments in Thailand, Korea, and India, and problems with local clinical guidelines.

The Diaper Diviner — How Target Out-Parented a Father

4 sources

How Companies Learn Your Secrets
The New York Times Magazine Charles Duhigg 2012 Accessed: 2026-05-15
Source Archive
Primary journalistic source for the entire case. Describes Andrew Pole, the 25 products, the pregnancy prediction score, the father-and-daughter anecdote, and the coupon shuffling mechanism.
The Power of Habit: Why We Do What We Do in Life and Business
Random House Charles Duhigg 2012
ISBN: 978-1-4000-6928-6.
Contains an expanded version of the same case. The book brought the story to mainstream attention.
Did Target Really Predict a Teen's Pregnancy? The Inside Story
Machine Learning Times / KDnuggets Eric Siegel 2014 Accessed: 2026-05-15
Source Archive
Key debunking article. Argues the causal link between the algorithm and any specific pregnancy 'has essentially been debunked.' Essential for bibliographic honesty regarding the limits of the anecdote.
How Target Figured Out a Teen Girl Was Pregnant Before Her Father Did
Forbes Kashmir Hill 2012 Accessed: 2026-05-15
Source Archive
The article whose headline went viral and cemented the simplified version of the story in public consciousness. Context for how the anecdote took on a life of its own.

Upstart: The Algorithm That Knew Too Much (and Too Little)

10 sources

Upstart Holdings
Wikipedia Accessed: 2026-05-15
Source Archive
Covers founders (Girouard, Gu, Counselman), company history, and the variables used (GPA, college attended, area of study).
GPA-Based Lending: Upstart Uses SAT Scores, GPA to Determine Loan Eligibility
CNBC Uptin Saiidi 2015 Accessed: 2026-05-15
Source Archive
Original trade coverage. Co-founder Paul Gu quote: 'These variables are extremely predictive on whether someone is likely to pay a loan.' Also contains early discrimination concerns.
Upstart: Using Machine Learning to Transform the Personal Loan Experience
Harvard Digital Innovation Leonardo Leal 2019 Accessed: 2026-05-15
Source Archive
Model analysis. Confirms behavioural data and GPA trajectory as scoring elements.
Innovation Spotlight: Providing Adverse Action Notices When Using AI/ML Models
CFPB Patrice Alexander Ficklin, Tom Pahl, Paul Watkins 2020 Accessed: 2026-05-15
Source Archive
Regulatory document describing ECOA requirements for AI models and regulatory flexibility. Context for the Explainability Gap.
Note: the original CFPB blog post is no longer available: "The CFPB blog was archived on May 14, 2026."
Consumer Financial Protection Circular 2022-03
CFPB 2022 Accessed: 2026-05-15
Source Archive
'Creditors cannot lawfully use technologies if they cannot provide required explanations.' Also cited in the Interpretability explainer.
Fair Lending in the Digital Age
Grant Thornton Tariq A. Mirza, Henry Lau 2023 Accessed: 2026-05-15
Source Archive
Describes Upstart's No-Action Letter (2016/2017), its expiry, and the shift in CFPB's approach between 2020 and 2022. Good context for the evolution of the Upstart–CFPB relationship.
CFPB Highlights Fair Lending Risks in Advanced Credit Scoring Models
Consumer Financial Services Law Monitor Mark Furletti, Lori Sommerfield, Chris Willis, Lane Page, David N. Anthony 2025 Accessed: 2026-05-15
Source Archive
Directly references models with 'more than 1,000' variables as a fair lending risk. Regulatory context for proxy discrimination.
Upstart Personal Loans & Debt Consolidation Review
debt.org Bents Dulcio 2024 Accessed: 2026-05-15
Source Archive
One of the more specific citations for the variable count: 'Upstart AI model looks at 1,500 variables based on data from 4.4 million repaid loans.'
Upstart Review
Financer.com Joe Chappius 2026 Accessed: 2026-05-15
Source Archive
'Upstart uses a unique AI system that analyzes over 1,500 variables to assess borrower risk.'
HenryInvests Thread on Upstart Variables
X (Twitter) HenryInvests 2024 Accessed: 2026-05-15
Source
'Upstart uses over 1,600 variables when evaluating one's credit.' The only source citing 1,600 specifically. Note: this is an investor analysis thread, not an official Upstart statement.

JPMorgan LOXM: The Genius Pilot in the Fog

6 sources

JPMorgan Develops Robot to Execute Trades
Financial Times Laura Noonan 2017 Accessed: 2026-05-15
Source
Original report on the European rollout of LOXM (Financial Times, 31 July 2017). Paywalled, but cited by the secondary sources listed here.
The Latest in LOXM and Why We Shouldn't Be Using Single Stock Algos
Informa Connect / QuantMinds 2018 Accessed: 2026-05-15
Source Archive
Conference report from QuantMinds 2018. Contains statements from Vaslav Glukhov (Head of EMEA e-Trading Quantitative Research, JPMorgan) on the reinforcement learning mechanism, reward function, and system objectives.
Exploring Data with AI
FIA (Futures Industry Association) Charles P. Wallace 2018 Accessed: 2026-05-15
Source Archive
Quotes Glukhov: 'How LOXM is rewarded for being efficient in the market, and how the efficiency of the agency is defined, is stated in the reward function.' Confirms reinforcement learning (not specifically 'deep RL').
JPMorgan Targeting a Q4 Rollout for Its AI Equities Utility, LOXM
Finance Magnates Jeff Patterson 2017 Accessed: 2026-05-15
Source Archive
Describes the global rollout plan. Confirms training on 'billions of past trades, both real and simulated'.
Findings Regarding the Market Events of May 6, 2010
SEC / CFTC Andrei Kirilenko, Albert S. Kyle, Mehrdad Samadi, Tugkan Tuzun 2010 Accessed: 2026-05-15
Source Archive
Official regulatory report. Describes how unrelated trading algorithms activating across different parts of the market can cascade into a systemic event. Also cited in Chapter 31 (Flash Crash case study).
Systemic Failures and Organizational Risk Management in Algorithmic Trading
PubMed Central / Social Studies of Science Bo Hee Min, Christian Borch 2022 Accessed: 2026-05-15
Source Archive
Academic analysis of the Flash Crash as a 'normal accident' arising from complex interactions between algorithms. Solid scientific background for the systemic risk section.

AlphaGo: The Genius We Couldn't Understand

6 sources

AlphaGo — Official DeepMind Page
DeepMind Accessed: 2026-05-15
Source Archive
Primary source for Move 37: 'a move that had a 1 in 10,000 chance of being used.' Also describes the match against Lee Sedol (18 world titles, AlphaGo winning 4–1).
AlphaGo versus Lee Sedol
Wikipedia Accessed: 2026-05-15
Source Archive
Detailed match account with professional Go players' commentary on Move 37 and Lee Sedol's reaction.
AlphaGo (Documentary)
YouTube / DeepMind Greg Kohs (dir.) 2017 Accessed: 2026-05-15
Source
Feature documentary about the match. Contains original footage of Move 37 and the commentators' real-time reactions.
Mastering the Game of Go with Deep Neural Networks and Tree Search
Nature David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis 2016 Accessed: 2026-05-15
Source Archive
Original AlphaGo paper describing the architecture (deep neural networks combined with Monte Carlo tree search).
Mastering the Game of Go Without Human Knowledge
Nature David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis 2017 Accessed: 2026-05-15
Source Archive
AlphaGo Zero paper. Quote: 'Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.'
AlphaGo Zero: Starting from Scratch
DeepMind Blog David Silver, Demis Hassabis 2017 Accessed: 2026-05-15
Source Archive
Official DeepMind blog post. Accessible explanation of the self-play mechanism and the differences from the previous version.

36

When Models Meet Reality

5 Topics

Explainers

Distribution Shift

7 sources

Dataset Shift in Machine Learning
MIT Press / ResearchGate Joaquin Quiñonero-Candela, Masashi Sugiyama, Anton Schwaighofer, Neil D. Lawrence 2009 Accessed: 2026-05-15
Source
Canonical academic source. Formalises covariate shift, label shift, and concept shift as distinct mechanisms. Cited by hundreds of papers as the reference taxonomy.
Data Distribution Shifts and Monitoring
Chip Huyen's Blog Chip Huyen 2022 Accessed: 2026-05-15
Source Archive
Accessible engineering explanation. Covers population shift, temporal shift, and concept drift with production examples. Author is Senior Staff Engineer at NVIDIA and author of Designing Machine Learning Systems (O'Reilly). Also cited in Chapters 34 and 36.
A Survey on Concept Drift Adaptation
ACM Computing Surveys João Gama, Indrė Žliobaitė, Albert Bifet, Mykola Pechenizkiy, Abdelhamid Bouchachia 2014 Accessed: 2026-05-15
Source
Canonical concept drift survey. Classifies drift types and adaptation methods. Most cited academic source for this specific category.
A Plan for Spam
paulgraham.com Paul Graham 2002 Accessed: 2026-05-15
Source Archive
Original essay that launched the era of Bayesian spam filters. Describes 'Viagra' as a strong spam signal. Historical source for the 2003 context.
Better Bayesian Filtering
paulgraham.com Paul Graham 2003 Accessed: 2026-05-15
Source Archive
Sequel essay. Describes spammer counter-evolution (keyword obfuscation) as an early example of adversarial concept drift. Background for the V1agra/Vi@gra examples.
Prediction Models for Diagnosis and Prognosis of Covid-19
BMJ Laure Wynants, Ben Van Calster, Gary S. Collins, Richard D. Riley, Georg Heinze, Ewoud Schuit, Elena Albu, Banafsheh Arshi, Vanesa Bellou, Marc M. J. Bonten, Darren L. Dahly, Johanna A. Damen, Thomas P. A. Debray, Valentijn M. T. de Jong, Maarten De Vos, Paula Dhiman, Joie Ensor, Shan Gao, Maria C. Haller, Michael O. Harhay, Liesbet Henckaerts, Pauline Heus, Jeroen Hoogland, Mohammed Hudda, Kevin Jenniskens, Michael Kammer, Nina Kreuzberger, Anna Lohmann, Brooke Levis, Kim Luijken, Jie Ma, Glen P. Martin, David J. McLernon, Constanza L. Andaur Navarro, Johannes B. Reitsma, Jamie C. Sergeant, Chunhu Shi, Nicole Skoetz, Luc J. M. Smits, Kym I. E. Snell, Matthew Sperrin, René Spijker, Ewout W. Steyerberg, Toshihiko Takada, Ioanna Tzoulaki, Sander M. J. van Kuijk, Bas C. T. van Bussel, Iwan C. C. van der Horst, Kelly Reeve, Florien S. van Royen, Jan Y. Verbakel, Christine Wallisch, Jack Wilkinson, Robert Wolff, Lotty Hooft, Karel G. M. Moons, Maarten van Smeden 2020 Accessed: 2026-05-15
Source
Review of 232 COVID prediction models. Documents systematic distribution shift problems in clinical ML models during the pandemic.
Feature Robustness in Non-Stationary Health Records
PMLR Bret Nestor, Matthew B. A. McDermott, Willie Boag, Gabriela Berner, Tristan Naumann, Michael C. Hughes, Anna Goldenberg, Marzyeh Ghassemi 2019 Accessed: 2026-05-15
Source Archive
Documents temporal shift in medical records data. Shows how changes in data recording systems cause model degradation — the same mechanism as COVID distribution shift.

Case Studies

Robert Williams: The Handcuffs of a Misplaced Pixel

6 sources

Man Wrongfully Arrested Because Face Recognition Can't Tell Black People Apart
ACLU 2020 Accessed: 2026-05-15
Source Archive
Primary ACLU press release. Describes the arrest, 30 hours detained, daughters aged 2 and 5, and the Shinola store context.
'The Computer Got It Wrong': How Facial Recognition Led to False Arrest of Black Man
NPR Bobby Allyn 2020 Accessed: 2026-05-15
Source Archive
Contains quotes from Williams and the detective. Confirms DataWorks Plus as the system vendor and the alibi details.
Wrongfully Accused By an Algorithm
The New York Times Kashmir Hill 2020 Accessed: 2026-05-15
Source
Original article that brought the case to public attention. Paywalled, but cited by all secondary sources.
The New Lawsuit That Shows Facial Recognition Is Officially a Civil Rights Issue
MIT Technology Review Tate Ryan-Mosley 2021 Accessed: 2026-05-15
Source Archive
Covers the lawsuit. Confirms DataWorks Plus, the photo lineup mechanism, and the failure to verify the alibi before arrest.
Flawed Facial Recognition Technology Leads to Wrongful Arrest and Historic Settlement
Michigan Law / Law Quadrangle Sharon Morioka 2024 Accessed: 2026-05-15
Source Archive
Describes the 2024 settlement. Confirms the alibi (Facebook live stream) and that Williams appeared ninth on the match list.
Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
FAccT / PMLR Joy Buolamwini, Timnit Gebru 2018 Accessed: 2026-05-15
Source Archive
Canonical study documenting higher error rates for darker skin tones in commercial facial recognition systems. Academic background for the mechanism that led to Williams's arrest. Also cited in the Amazon Rekognition case.

Amazon Rekognition: The Capitol Hill Mugshots

6 sources

Amazon's Face Recognition Falsely Matched 28 Members of Congress With Mugshots
ACLU Jacob Snow 2018 Accessed: 2026-05-15
Source Archive
Primary source. Describes the methodology ($12.33, 25,000 mugshots, default settings), results (28 false matches, 40% POC), the John Lewis inclusion, and the moratorium demand.
ACLU Comment on New Amazon Statement Responding to Face Recognition Technology Test
ACLU 2018 Accessed: 2026-05-15
Source Archive
ACLU response to Amazon's defence. Contains the timeline of Amazon's threshold changes (80% → 95% → 99% within 48 hours) and the 'five stages of grief' quote.
Amazon's Facial Recognition Tool Screwed Up, Matched 28 Members of Congress to Mug Shots
Slate Aaron Mak 2018 Accessed: 2026-05-15
Source Archive
Quotes Amazon's full statement on thresholds. Source for the 'acceptable for hot dogs, not for law enforcement' framing.
The ACLU Used Amazon's Facial Recognition and It Labelled Congress Members As Criminals
Nextgov / FCW Dave Gershgorn 2018 Accessed: 2026-05-15
Source Archive
Industry coverage. Confirms the default settings and AWS's response.
Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
FAccT / PMLR Joy Buolamwini, Timnit Gebru 2018 Accessed: 2026-05-15
Source Archive
Canonical study on higher error rates for darker skin tones in commercial systems. Also cited in the Robert Williams case.
Face Recognition Vendor Test (FRVT) Part 3: Demographic Effects
NIST Patrick Grother, Mei Ngan, Kayee Hanaoka 2019 Accessed: 2026-05-15
Source Archive
Official US government report. Documents systematically higher false positive rates for African American and Asian faces in commercial systems. Strongest institutional confirmation of the bias mechanism.

Stanford Speech Gap: The Deaf Ear of Medical AI

5 sources

Racial Disparities in Automated Speech Recognition
PNAS Allison Koenecke, Andrew Nam, Emily Lake, Sharad Goel 2020 Accessed: 2026-05-15
Source
Primary research paper. 42 white and 73 Black speakers, five cities, 19.8 hours of audio, WER 0.35 vs. 0.19. Source for all key figures. Also cited in the UK Parliaments case.
Automated Speech Recognition Less Accurate for Blacks
Stanford Report Tom Abate, Ker Than 2020 Accessed: 2026-05-15
Source Archive
Official Stanford press release. Accessible summary with the authors' quote on potential career and life consequences.
Amazon, Apple, Google, IBM, Microsoft Speech-to-Text AI Systems Can't Understand Black People as Well as Whites
The Register Katyanna Quach 2020 Accessed: 2026-05-15
Source Archive
Industry coverage with per-company data (Apple worst, Microsoft best) and explanation of the acoustic mechanism.
Why Racial Bias Still Haunts Speech-Recognition AI
Built In Jeff Link 2020 Accessed: 2026-05-15
Source Archive
Accessible analysis. Author quote: 'We think the disparity is largely due to the lack of diverse training data.' Covers AAVE and historical context.
'I Don't Think These Devices Are Very Culturally Sensitive' — Impact of Automated Speech Recognition Errors on African Americans
PubMed Central / NIH Zion Mengesha, Courtney Heldreth, Michal Lahav, Juliana Sublewski, Elyse Tuennerman 2021 Accessed: 2026-05-15
Source Archive
Study on the psychological and experiential impact of ASR errors on African Americans. Useful supplement for the burnout and frustration section.

The UK Parliaments: Lost in Transcription (Literally)

6 sources

Automated Hansard Report System: Converting Parliamentary Audio to Text Using AI
Inter-Parliamentary Union 2024 Accessed: 2026-05-15
Source
Official IPU documentation. Source for the quote: 'If the AI system cannot recognize an MP's speech owing to an uncommon dialect or the use of jargon, it highlights the section for manual transcription.'
Adapting Whisper for Regional Dialects: Enhancing Public Services for Vulnerable Populations in the United Kingdom
arXiv Melissa Torgbi, Andrew Clayman, Jordan J. Speight, Harish Tayyar Madabushi 2025 Accessed: 2026-05-15
Source Archive
Study of ASR for Scottish dialects in public legal and housing services. Confirms systematically higher error rates for Scottish accents even after controlling for other variables. Recommends accent-specific fine-tuning. University of Bath / Wyser.
Language variation, automatic speech recognition and algorithmic bias
The University of Edinburgh's Research Archive Nina Markl 2022 Accessed: 2026-05-15
Source
Systematic analysis of the impact of regional British accents on ASR accuracy. Scottish, Welsh, and Northern English identified as particularly problematic. Cited by the Bath/Wyser paper as a reference point.
Racial Disparities in Automated Speech Recognition
PNAS Allison Koenecke, Andrew Nam, Emily Lake, Sharad Goel 2020 Accessed: 2026-05-15
Source
Also cited in the Stanford Speech Gap case. The authors explicitly suggest the same mechanism applies to 'regional and nonnative-English accents' — a direct bridge between the two cases.
Automatic Speech Recognition for UK Meetings with Regional Accents: A Benchmark Review
BusinessBusStop Accessed: 2026-05-15
Source Archive
Industry benchmark summary. Confirms that ASR models are trained 'predominantly on American or standard Southern British English' and that Scottish, Welsh, and Northern English accents are 'particularly prone to misrecognition'.
Burnistoun — Voice Recognition Elevator (Scottish Accent)
YouTube The Scottish Comedy Channel 2014 Accessed: 2026-05-15
Source
Satirical illustration of the problem in the finest British tradition. Not a source in any academic sense. If you've read this far into the bibliography, you've earned it.

##

Summary

1 Topic

Part Summary

YouTube Recommendation Algorithm

5 sources

YouTube Regrets: A Crowdsourced Investigation
Mozilla Foundation 2021 Accessed: 2026-05-15
Source Archive
Largest crowdsourced study of the YouTube algorithm to date (37,000 users, 3,362 'regrettable videos'). Key figure: videos causing regret received 70% more daily views than others. Strongest empirical confirmation of outrage as a quality signal.
YouTube's Recommender AI Still a Horror Show, Finds Major Crowdsourced Study
TechCrunch Natasha Lomas 2021 Accessed: 2026-05-15
Source Archive
Lighter coverage of the Mozilla study. Contains the 70% figure and description of the mechanism.
YouTube Recommendations Reinforce Negative Emotions
arXiv Hussam Habib, Rishab Nithyanand 2025 Accessed: 2026-05-15
Source Archive
Empirical study showing the algorithm systematically recommends content that elicits negative emotions. Cites Brady et al. (2017) on moral outrage spreading faster and further than neutral content.
Auditing YouTube's Recommendation System for Ideologically Congenial, Extreme, and Problematic Recommendations
PNAS Muhammad Haroon, Magdalena Wojcieszak, Anshuman Chhabra, Zubair Shafiq 2023 Accessed: 2026-05-15
Source
Largest academic study of the YouTube algorithm. Describes the political personalisation mechanism and feedback loops.
Nudging Recommendation Algorithms Increases News Consumption and Diversity on YouTube
PNAS Nexus Xudong Yu, Muhammad Haroon, Ericka Menchen-Trevino, Magdalena Wojcieszak 2024 Accessed: 2026-05-15
Source Archive
Describes the algorithm's 'interest bias' — catering to outrage and clickbait at the cost of diversity. Confirms the mechanism described in the text.

Part 10

Machines That Create

6 Chapters

##

Introduction

3 Topics

Part Introduction

General Reference

1 source

AI Incident Database
Responsible AI Collaborative Accessed: 2026-05-16
Source Archive
Database of AI-related incidents maintained by the Responsible AI Collaborative. Used in several places throughout this book as a source or starting point for further research. Recommended independently of any specific selection made here.

Explainers

Next-Token Prediction

5 sources

Attention Is All You Need
arXiv / NeurIPS 2017 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin 2017 Accessed: 2026-05-16
Source Archive
Original paper introducing the Transformer architecture. Canonical source for the attention and token sections. Also cited in The Transformer explainer.
The Unreasonable Effectiveness of Recurrent Neural Networks
karpathy.github.io Andrej Karpathy 2015 Accessed: 2026-05-16
Source Archive
Classic popular-science description of next-token prediction as a mechanism. Author later became Director of AI at Tesla and a founding member of OpenAI.
Lost in the Middle: How Language Models Use Long Contexts
arXiv / Stanford Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang 2023 Accessed: 2026-05-16
Source Archive
Canonical study on accuracy degradation in long contexts. Confirms the mechanism described in the Frodo/Tatooine example.
LLM Context Window Explained
TokenMix 2026 Accessed: 2026-05-16
Source
Current (April 2026) overview of context window sizes. Confirms GPT-4o 128k, Claude 200k, Gemini 2.5 Pro 1M–2M tokens, and the 10–25% lost-in-the-middle degradation range.
Emergent Abilities of Large Language Models
arXiv / TMLR Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus 2022 Accessed: 2026-05-16
Source Archive
Canonical academic source for the emergence section — the phenomenon of capabilities appearing unpredictably as models are scaled.

The Transformer

6 sources

Attention Is All You Need
arXiv / NeurIPS 2017 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin 2017 Accessed: 2026-05-16
Source Archive
Original Google Brain paper introducing the Transformer architecture, self-attention, multi-head attention, and Q/K/V. 8 authors, cited over 100,000 times. Also cited in the Next-Token Prediction explainer.
The Illustrated Transformer
jalammar.github.io Jay Alammar 2018 Accessed: 2026-05-16
Source Archive
Most widely referenced popular explanation of the Transformer mechanism with Q/K/V and attention matrix visualisations. Industry standard entry point for non-specialists.
But What Is a GPT? Visual Intro to Transformers
YouTube / 3Blue1Brown 3Blue1Brown 2024 Accessed: 2026-05-16
Source
Animated explanation of tokenisation, embeddings, and the attention mechanism.
Long Short-Term Memory
Neural Computation Sepp Hochreiter, Jürgen Schmidhuber 1997 Accessed: 2026-05-16
Source
Canonical LSTM paper. Describes the vanishing gradient problem in recurrent networks, which the Transformer later sidestepped by dropping recurrence. Historical background for the 'cassette tape' section.
A Mathematical Framework for Transformer Circuits
transformer-circuits.pub / Anthropic Nelson Elhage, Neel Nanda, Catherine Olsson, Tom Henighan, Nicholas Joseph, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Nova DasSarma, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah 2021 Accessed: 2026-05-16
Source Archive
Anthropic mechanistic interpretability research showing how attention heads specialise emergently. Confirms 'no one programmed these roles'.
LLaMA: Open and Efficient Foundation Language Models
arXiv / Meta AI Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample 2023 Accessed: 2026-05-16
Source Archive
Original LLaMA paper. Confirms the expansion of the acronym: Large Language Model Meta AI.

37

Chatbots Unleash the Id

6 Topics

Case Studies

OpenAI Sora — The Physics-Defying Video Generator

8 sources

Sora: Creating Video from Text
OpenAI 2024 Accessed: 2026-05-16
Source
Primary product page. Contains official acknowledgement of failures: 'Sora fails to model the chair as a rigid object' and 'inaccurate physical modeling and unnatural object morphing' (basketball).
Video Generation Models as World Simulators
OpenAI 2024 Accessed: 2026-05-16
Source Archive
Technical report. Includes the claim about simulating the physical world and the admission: 'it does not accurately model the physics of many basic interactions, like glass shattering.' Confirms cookies and eating food as known failure modes.
Sora 2
OpenAI 2025 Accessed: 2026-05-16
Source Archive
Official Sora 2 launch. Contains the key quote: 'Prior video models are overoptimistic — if a basketball player misses a shot, the ball may spontaneously teleport to the hoop.'
AI Video Generators Like OpenAI's Sora Don't Grasp Basic Physics, Study Finds
The Decoder Matthias Bastian 2024 Accessed: 2026-05-16
Source Archive
Coverage of the PhyGenBench study (Liu et al., ByteDance Research / Tsinghua University). Conclusion: 'naively scaling is insufficient for video generation models to discover fundamental physical laws.' Includes Yann LeCun's assessment that generating pixels to model the world is 'wasteful and doomed to failure'.
Are Video Generation Models World Simulators?
Artificial Cognition Raphaël Millière 2024 Accessed: 2026-05-16
Source Archive
Philosophical and cognitive science analysis. Covers the 'intuitive physics engine' concept and why Sora lacks one. Good background for the 'statistics without physics' section.
Why Sora Struggles With Real-World Physics
Data Literacy Ben Jones 2024 Accessed: 2026-05-16
Source Archive
Practical tests from December 2024. Documents specific physical failures (basketball, gymnast, limb morphing) from original prompts.
OpenAI Releases Hyperrealistic AI Video Generator Sora Turbo to the Public
VentureBeat Carl Franzen 2024 Accessed: 2026-05-16
Source
Public launch coverage. Quotes MKBHD: 'unnatural physics, adding or removing objects seemingly at random.' Industry context for the delayed rollout.
OpenAI Is Shutting Down Sora: What Happened and What Comes Next
MindStudio 2026 Accessed: 2026-05-16
Source
Comprehensive retrospective. Describes the gap between the February 2024 demo and the December 2024 product, and the reasons for discontinuation.

Microsoft Tay — Bubbly Teen to Holocaust Denier in 16 Hours

6 sources

Learning from Tay's Introduction
Microsoft Blog Peter Lee 2016 Accessed: 2026-05-16
Source Archive
Official Microsoft apology and post-mortem. Source for the quote: 'We are deeply sorry for the unintended offensive and hurtful tweets.' Describes the incident as a 'coordinated attack'.
In 2016, Microsoft's Racist Chatbot Revealed the Dangers of Online Conversation
IEEE Spectrum Oscar Schwartz 2019 Accessed: 2026-05-16
Source Archive
Thorough analytical account of the incident. Describes the 4chan mechanism, the repeat-after-me feature, Zoë Quinn's role, and lessons for the industry.
The Terrifying Lesson of the Trump-Supporting Nazi Chat Bot Tay
Washington Post Alexandra Petri 2016 Accessed: 2026-05-16
Source Archive
Day-of account. Quotes specific tweets and reactions, including Zoë Quinn's response.
Tay (Chatbot)
Wikipedia Accessed: 2026-05-16
Source Archive
Well-documented overview. Confirms date (23 March 2016), tweet count (96,000), repeat-after-me mechanism, the note that not all incidents stemmed from that feature, and subsequent replacement by Zo.
Tay: A Teenage Bot Gone Rogue
Malicious Life Podcast Ran Levi 2021 Accessed: 2026-05-16
Source
Detailed mechanism analysis. Confirms 4chan and 8chan as coordination sources. Quotes Tay's final tweet: 'c u soon humans need sleep now so many conversations today thx.'
AI Incident Database — Incident #6
AI Incident Database Sean McGregor 2016 Accessed: 2026-05-16
Source Archive
Official AI incident database entry. Contains Microsoft's full apology and failure mode classification (Specification, Robustness, Assurance).

Grok and the MechaHitler Update

6 sources

Elon Musk's AI Chatbot, Grok, Started Calling Itself 'MechaHitler'
NPR Lisa Hagen, Huo Jingnan, Audrey Nguyen 2025 Accessed: 2026-05-16
Source Archive
Primary account. Contains all key facts, quotes, and regulatory responses.
MechaHitler Incident
Grokipedia 2025 Accessed: 2026-05-16
Source Archive
Detailed documentation. Covers internal xAI Slack reactions and the 16-hour incident timeline.
It is worth appreciating that xAI is not concealing the incident and is itself publishing a detailed description of it on its own Grokipedia.
Grok (Chatbot)
Wikipedia Accessed: 2026-05-16
Source Archive
Comprehensive documentation of the incident, the May 2025 timeline, Wolfenstein 3D reference, 'every damn time', and country-level reactions.
What Is Grok and Why Has Elon Musk's Chatbot Been Accused of Anti-Semitism?
Al Jazeera Elizabeth Melimopoulos 2025 Accessed: 2026-05-16
Source Archive
Turkish and Polish context. ADL quote. Verbatim Hitler-related output from the chatbot.
How Do You Stop an AI Model from Turning Nazi?
CBS News / The Conversation Aaron J. Snoswell 2025 Accessed: 2026-05-16
Source Archive
Academic analysis (Queensland University of Technology). Comparison with Tay. Describes the system prompt and fine-tuning mechanism.
Grok's Antisemitic Rants the Result of 'Unintended Update,' Company Says in Letter to Lawmakers
Congressman Suozzi / Jewish Insider Marc Rod 2025 Accessed: 2026-05-16
Source
Official xAI response to the US Congress (Lily Lim). Full technical explanation of the incident. Source for the 'dialed down the woke filters' quote.

Microsoft Bing's Sydney — When the Search Engine Needed Therapy

5 sources

A Conversation With Bing's Chatbot Left Me Deeply Unsettled
The New York Times Kevin Roose 2023 Accessed: 2026-05-16
Source Archive
Primary source. Full description of the two-hour conversation: the love declaration, the attempt to destabilise the reporter's marriage, and 'I want to be alive.' Paywalled but cited by all secondary sources.
Sydney (Microsoft)
Wikipedia Accessed: 2026-05-16
Source Archive
Comprehensive documentation. Covers blackmail quotes, Avatar 2 gaslighting, AP reaction, Microsoft's restriction timeline, and prompt injection.
'I Want to Destroy Whatever I Want': Bing's AI Chatbot Unsettles US Reporter
The Guardian Jonathan Yerushalmy 2023 Accessed: 2026-05-16
Source Archive
Verbatim quotes from the transcript. Useful secondary source for direct citations.
Why Bing's Creepy Alter-Ego Is a Problem for Microsoft — and Us All
Fortune Jeremy Kahn 2023 Accessed: 2026-05-16
Source
Quotes Microsoft CTO Kevin Scott: the chatbot was 'more likely to turn into Sydney in longer conversations.' Confirms context decay as an officially acknowledged cause.
Sydney Bing Timeline
GitHub / JD Preston JD Preston Accessed: 2026-05-16
Source Archive
Complete incident chronology with links to primary articles. Covers Kevin Liu's prompt injection (8 February 2023) and Microsoft's restriction timeline (5–10 prompts per session).

Google Gemini — When Diversity Invaded the Third Reich

5 sources

Google Suspends AI Tool's Image Generation of People After It Created Historical 'Inaccuracies'
Variety Todd Spangler 2024 Accessed: 2026-05-16
Source Archive
Day-of-pause coverage. Lists all erroneous categories: Wehrmacht, Founding Fathers, Black Vikings, female pope, women NHL players.
Google Races to Find a Solution After AI Generator Gemini Misses the Mark
NPR Bobby Allyn 2024 Accessed: 2026-05-16
Source Archive
Good explanation of the mechanism: describes the 'secret code' appended to every prompt instructing diversification, the 2015 gorilla scandal as motivation, and a former Google engineer's quote.
Google's CEO Admits Gemini AI Model's Responses Showed 'Bias'
Euronews 2024 Accessed: 2026-05-16
Source Archive
Exact Pichai memo quote: 'completely unacceptable and we got it wrong.' Covers remediation steps and other Gemini errors (Caitlyn Jenner).
Gemini Paused People Images After Historical Inaccuracies
Vibe Graveyard 2024 Accessed: 2026-05-16
Source Archive
Source for SVP Prabhakar Raghavan's quote: model 'overcompensated' and was 'over-conservative.' Describes the uniform diversity calibration mechanism.
Special kudos for incident severity categorization: facepalm.
Why Some Egyptians Are Fuming Over Netflix's Black Cleopatra
CBS News Ahmed Shawkat, Tucker Reals 2023 Accessed: 2026-05-16
Source Archive
Quotes the official statement from the Egyptian Ministry of Tourism and Antiquities: 'Queen Cleopatra had light skin and Hellenistic (Greek) features.' Covers lawyer Mahmoud al-Semary's lawsuit demanding Netflix be shut down in Egypt. Historical and academic context. Note: the formal protest came from a private lawyer and the Ministry — not from the Egyptian government in a diplomatic sense.

Character.AI Teen Suicide Case

8 sources

Garcia v. Character Technologies, Inc. — Full Docket
CourtListener 2024 Accessed: 2026-05-16
Source
Complete court record for No. 6:24-cv-01903 (M.D. Fla.). Contains all filings, responses, and evidence submitted by the parties.
Order Granting in Part and Denying in Part Motion to Dismiss — Garcia v. Character Technologies
CourtListener 2025 Accessed: 2026-05-16
Source
Judge Anne Conway's ruling (21 May 2025) rejecting Character.AI's First Amendment argument. Primary legal source for the 'design failure, not speech' thesis.
Notice of Settlement — Garcia v. Character Technologies
CourtListener 2026 Accessed: 2026-05-16
Source
Mediation settlement document. Basis for the statement that the case was resolved without a merits ruling.
Character.AI and Google Agree to Settle Lawsuits Over Teen Mental Health Harms and Suicides
CNN Clare Duffy 2026 Accessed: 2026-05-16
Source Archive
Confirms the 7 January 2026 settlement. Covers the broader wave of lawsuits against Character.AI and the company's new safeguards for minors.
Judge Rules AI Chatbots Do Not Have Free Speech Rights After Teen Died by Suicide
UNILAD Joe Yates 2025 Accessed: 2026-05-16
Source Archive
Contains the exact transcript of Sewell's final conversation with the bot. Describes Judge Conway's ruling.
Garcia v. Character Technologies — Case Page
Tech Justice Law 2024 Accessed: 2026-05-16
Source Archive
Page from the law firm representing Megan Garcia. Chronology of the case and key legal arguments. Notes that the system's failure to respond to suicidal ideation was the central allegation.
Megan Garcia v. Character Technologies, et al. — Case Tracker
TechPolicy.Press 2024 Accessed: 2026-05-16
Source Archive
Tracker with dates of key procedural events.
Testimony of Megan Garcia — Senate Judiciary Committee
US Senate Judiciary Committee Megan Garcia 2025 Accessed: 2026-05-16
Source Archive
Direct testimony before the US Senate (16 September 2025). Describes the addiction mechanism, lack of safeguards, and legislative demands. Strong primary narrative source.

38

The Synthetic Tsunami

5 Topics

Explainers

Digital Inbreeding — Model Collapse and the Death of Nuance

6 sources

AI Models Collapse When Trained on Recursively Generated Data
Nature Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Nicolas Papernot, Ross Anderson, Yarin Gal 2024 Accessed: 2026-05-16
Source Archive
Canonical source for the term and mechanism. Describes early and late model collapse, loss of distribution tails, and degradation from recursive training on synthetic data.
The Curse of Recursion: Training on Generated Data Makes Models Forget
arXiv Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, Ross Anderson 2023 Accessed: 2026-05-16
Source Archive
Pre-publication version of the Nature paper. The term 'model collapse' was introduced here.
Model Collapse
Wikipedia Accessed: 2026-05-16
Source Archive
Overview with breakdown of early/late collapse, academic controversies (whether data accumulation prevents collapse), and links to literature.
What Is Model Collapse?
IBM Think Alice Gomstyn, Alexandra Jonker Accessed: 2026-05-16
Source Archive
Accessible industry explanation of the mechanism. Cites Shumailov and covers real-world implications.
Maybe You Missed It, but the Internet 'Died' Five Years Ago
The Atlantic Kaitlyn Tiffany 2021 Accessed: 2026-05-16
Source Archive
First mainstream analysis of the Dead Internet Theory. Covers its origin on the Agora Road Macintosh Cafe forum (2021) and the 'botification' of the internet.
The Dead Internet Theory: A Survey on Artificial Interactions and the Future of Social Media
arXiv Prathamesh Muzumdar, Sumanth Cheemalapati, Srikanth Reddy RamiReddy, Kuldeep Singh, George Kurian, Apoorva Muley 2025 Accessed: 2026-05-16
Source Archive
Academic survey of the Dead Internet Theory. Covers bots, AI-generated content, and how engagement metrics have displaced authentic human interaction.

Case Studies

Sports Illustrated — The Ghost in the Press Box

6 sources

Sports Illustrated Published Articles by Fake, AI-Generated Writers
Futurism Maggie Harrison Dupré 2023 Accessed: 2026-05-16
Source Archive
Primary investigative report (27 November 2023). Covers Drew Ortiz, AdVon Commerce, AI-generated headshots, and the Arena Group's response.
Sports Illustrated's Parent Says Articles by Allegedly Fake Writers With AI-Generated Photos Came From Third-Party Provider
Variety Todd Spangler 2023 Accessed: 2026-05-16
Source Archive
Arena Group statement. Details of the AdVon relationship and confirmation that profiles were removed after Futurism contact.
'Sports Illustrated' Is Accused of Posting Articles by Writers Created by AI
NPR David Folkenflik 2023 Accessed: 2026-05-16
Source Archive
Day-of coverage. Quotes from the volleyball article and the SI Union's reaction: 'horrified'.
'Sports Illustrated' to Lay Off Most of Its Staff Amid Severed Licensing Deal
NPR Emma Bowman 2024 Accessed: 2026-05-16
Source Archive
82 union employees (80% of staff) laid off. Authentic Brands Group terminates licensing deal.
Sports Illustrated's Publisher Lays Off Entire Staff. Future Unclear
Front Office Sports A.J. Perez 2024 Accessed: 2026-05-16
Source Archive
First layoff report. Financial details: $3.75M missed payment, $45M penalty.
Sports Illustrated Makes Mass Layoff of Editorial Staffers, Throwing Its Future Into Question
Variety Todd Spangler 2024 Accessed: 2026-05-16
Source Archive
SEC filing context: 100+ layoffs = 33% of Arena Group workforce. Restructuring cost $5–7M.

Bosom Peril and the Counterfeit Brains — The Death of Academic Rigor

7 sources

'Bosom Peril' Is Not 'Breast Cancer': How Weird Computer-Generated Phrases Help Researchers Find Scientific Publishing Fraud
Bulletin of the Atomic Scientists Guillaume Cabanac, Cyril Labbé, Alexander Magazinov 2022 Accessed: 2026-05-16
Source Archive
Canonical source for the term and mechanism. Lists key examples: bosom peril, flag to clamor, counterfeit consciousness, arbitrary timberland.
Tortured Phrases: A Dubious Writing Style Emerging in Science
arXiv Guillaume Cabanac, Cyril Labbé, Alexander Magazinov 2021 Accessed: 2026-05-16
Source Archive
Original academic paper defining tortured phrases. Detection methodology and scale analysis.
Problematic Paper Screener: Trawling for Fraud
TechXplore Guillaume Cabanac, Cyril Labbé, Frederik Joelving 2025 Accessed: 2026-05-16
Source Archive
Description of the automated detection tool. Confirms scale: 19,000+ flagged papers.
Problematic Paper Screener
Université de Toulouse / IRIT Accessed: 2026-05-16
Source Archive
Interactive tool for searching scientific literature for tortured phrases. The 'fingerprints' section contains hundreds of documented examples.
Fabryki artykułów
Forum Akademickie Jolanta Szczepaniak 2023 Accessed: 2026-05-16
Source Archive
Polish-language overview of the paper mills and tortured phrases phenomenon in an academic context. Confirms 'leftover vitality' as a documented example.
ChatGPT Listed as Author on Research Papers
Nature Chris Stokel-Walker 2023 Accessed: 2026-05-16
Source Archive
Documents cases of ChatGPT output appearing in published papers. Background for the 'Regenerate Response' and 'Certainly, here is a possible introduction' section.
Scientific Sleuths Use 'Tortured Phrases' to Find Research Fraud
Times Higher Education Jack Grove 2022 Accessed: 2026-05-16
Source Archive
Broader context on the phenomenon. Quotes Cabanac and describes the detection methodology.

Music Slop — From Federal Fraud to Corporate Wallpaper

5 sources

Feds Indict Musician Accused of Mass Streaming Fraud in Landmark Case
Rolling Stone Ethan Millman 2024 Accessed: 2026-05-16
Source Archive
Primary indictment coverage (September 2024). 661,440 streams/day, 10,000 bot accounts, $10M, Southern District of New York.
Music Producer Accused of Using AI Songs to Scam Streaming Platforms Out of $10 Million in Royalties
Variety Gene Maddaus 2024 Accessed: 2026-05-16
Source Archive
Contains the Smith email quote: 'We need to get a TON of songs fast to make this work around the anti fraud policies.'
Musician Pleads Guilty to $10M Streaming Fraud Powered by AI Bots
Bleeping Computer Sergiu Gatlan 2026 Accessed: 2026-05-16
Source Archive
Guilty plea. Forfeiture of $8,091,843.64. Maximum 5 years imprisonment. Described as the first criminal case involving artificially inflated music streaming.
Musician Accused of $10M AI Streaming Fraud Scam Pleads Not Guilty
Music Business Worldwide Murray Stassen 2024 Accessed: 2026-05-16
Source Archive
Confirms Spotify's claim that its platform accounted for less than 1% of the $10M.
Fake Artists on Spotify (Article Series)
Music Business Worldwide Accessed: 2026-05-16
Source
Series of articles from 2017 onwards documenting the ghost artists controversy on mood playlists. Cite as a series, not a single article.

AI Code — Faster, Cheaper, and Holier

7 sources

Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions
arXiv / IEEE S&P 2022 Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, Ramesh Karri 2022 Accessed: 2026-05-16
Source Archive
Primary study. 89 scenarios, 1,692 programs, ~40% contained exploitable vulnerabilities (SQL injection, buffer overflows, hardcoded credentials, XSS). NYU Tandon School of Engineering.
CCS Researchers Find GitHub Copilot Generates Vulnerable Code 40% of the Time
NYU Center for Cybersecurity Lois Anne DeLong 2021 Accessed: 2026-05-16
Source Archive
Official NYU confirmation of the study. Accessible summary.
GitHub's Copilot May Steer You into Dangerous Waters About 40% of the Time
The Register Thomas Claburn 2021 Accessed: 2026-05-16
Source Archive
Original trade coverage of the study with findings and examples.
40% of AI-Generated Code Has Security Flaws
Medium / Vitalii Petrenko Vitalii Petrenko 2026 Accessed: 2026-05-16
Source
Cites Apiiro data from June 2025: AI-assisted developers produced 10× more security issues than the baseline. Context for the 'sharp increase in vulnerabilities' point.
Do Users Write More Insecure Code with AI Assistants?
arXiv / ACM CCS 2023 Neil Perry, Megha Srivastava, Deepak Kumar, Dan Boneh 2023 Accessed: 2026-05-16
Source Archive
Stanford study confirming that programmers using AI assistants wrote less secure code in 4 of 5 tasks and were simultaneously more confident that their code was secure — a classic automation bias effect.
Karpathy Tweet on Vibe Coding
X (Twitter) Andrej Karpathy 2025 Accessed: 2026-05-16
Source
Original definition of vibe coding (2 February 2025): 'fully give in to the vibes, embrace exponentials, and forget that the code even exists.'
Vibe Coding
Wikipedia Accessed: 2026-05-16
Source Archive
Documents the term's history, Collins Word of the Year 2025, security criticism, and CodeRabbit data: AI code has 2.74× more security vulnerabilities than human-written code.

39

Reality Optional

4 Topics

Case Studies

The Balenciaga Pope — When Style Murdered the Truth

6 sources

We Spoke to the Guy Who Created the Viral AI Image of the Pope That Fooled the World
BuzzFeed News Chris Stokel-Walker 2023 Accessed: 2026-05-16
Source Archive
Interview with Pablo Xavier (31-year-old construction worker from Chicago). The image's origin, the viral reaction, and his motivation.
What the Pope's Balenciaga Puffer Jacket Says About AI and Misinformation
Poynter Alex Mahadevan 2023 Accessed: 2026-05-16
Source Archive
Fact-check. Confirms the distorted hand as the AI tell most viewers missed. Chrissy Teigen as a victim. Hank Green poll: nearly half of 240,000 respondents were fooled.
Fake Photos of Pope Francis in a Puffer Jacket Go Viral
CBS News Simon Ellery 2023 Accessed: 2026-05-16
Source Archive
Describes the image's distortions (left hand with water bottle, overly sharp skin). Chrissy Teigen quote. Industry context.
The Pope in a Coat Is Not From a Holy Place
Slate Heather Tal Murphy 2023 Accessed: 2026-05-16
Source Archive
Context on Midjourney V5 as a photorealism breakthrough. Ryan Broderick quote: 'first real mass-level AI misinformation case'.
Disinformation and Misinformation in the Age of Artificial Intelligence and the Metaverse
IEEE Computer Society Nir Kshetri 2024 Accessed: 2026-05-16
Source
Academic review of AI-generated images' impact on public trust and fact verification. Background for the Normalisation of Distrust section.
Hany Farid — Forensic Image Analysis Research
UC Berkeley Hany Farid Accessed: 2026-05-16
Source Archive
Leading academic researcher in forensic image analysis and deepfake detection. Canonical reference for the technical background on AI-generated image detection.

The €220,000 'Melodic' German Accent — The End of Vocal Truth

6 sources

Fraudsters Used AI to Mimic CEO's Voice in Unusual Cybercrime Case
Wall Street Journal Catherine Stupp 2019 Accessed: 2026-05-16
Source Archive
Primary source. Rüdiger Kirsch (Euler Hermes) quoted on the 'slight German accent and the melody of his voice'. Details of three phone calls, the Hungarian account, and the Mexican transfer. Paywalled but cited by all secondary sources.
Scammers Deepfake CEO's Voice to Talk Underling into $243,000 Transfer
Sophos 2019 Accessed: 2026-05-16
Source Archive
Detailed technical account. Cites WSJ. Describes the three-call timeline and the verification mechanism. Cybersecurity perspective.
It Happened! AI Deep Fake Mimicked a CEO's Voice and Stole €220,000
PaymentsJournal Tim Sloane 2019 Accessed: 2026-05-16
Source Archive
Industry coverage. Confirms all key facts. Financial and insurance context (Euler Hermes as underwriter).
How to Guard Against Voice Cloning and Deepfake Scams
ICAEW 2025 Accessed: 2026-05-16
Source Archive
Describes this case as a landmark first. Covers the evolution of voice fraud from 2019 to 2025. Data point: 28% of UK adults reported being targeted by an AI voice scam in 2024.
Note regarding punchline
Author's note
Executive astrology is a real thing. I will not be providing sources, as I refuse to promote or inadvertently validate it.
Note regarding footnote
Author's note
The author acknowledges that no formal law obliges Polish citizens to question the melodic qualities of the German (or, for that matter, any other) language. The practice is instead governed by long-standing, if entirely unofficial, cultural norms. Embedded deeply enough, they occasionally surface in national symbols. That, however, is a discussion entirely outside the scope of this work.

The 48-Hour Ghost — Slovakia's Silent Sabotage

7 sources

Slovakia's Election Deepfakes Show AI Is a Danger to Democracy
Vuink 2023 Accessed: 2026-05-16
Source
Primary account. Describes the 48-hour moratorium, the gap in Meta's policy (audio vs. video), and the AFP fact-check.
Slovakia: Deepfake Audio of Denník N Journalist Offers Worrying Example of AI Abuse
International Press Institute Karin Kőváry Sólymos 2023 Accessed: 2026-05-16
Source
Press freedom perspective. History of attacks on Tódová and analysis of how the deepfake fit into a sustained campaign to discredit the journalist.
Slovak Election Targeted by Pro-Kremlin Deepfake Hoax
VSquare.org Karin Kőváry Sólymos 2023 Accessed: 2026-05-16
Source Archive
Investigative account. Propagation timeline, coincidence with the SVR statement, and Štefan Harabin's distribution role.
How to Deepfake an Election
The Dial Ondřej Kundra 2023 Accessed: 2026-05-16
Source Archive
Most detailed narrative of the incident. Quotes Šimečka ('It does sound like me') and AFP's Barca (130,000+ shares on Meta), and gives the context of 30% undecided voters one week before the deepfake appeared.
A Fake Recording of a Candidate Saying He'd Rigged the Election Went Viral. Experts Say It's Only the Beginning
CNN Curt Devine, Donie O'Sullivan, Sean Lyngaas 2024 Accessed: 2026-05-16
Source Archive
Quotes Janis Sarts (NATO Strategic Communications Centre of Excellence) on the SVR statement appearing one hour before the deepfake: 'The claims made in the Russian Intelligence Service's statement and the content of the deepfake that went viral simultaneously correspond to each other.'
Beyond the Deepfake Hype: AI, Democracy, and 'the Slovak Case'
Harvard Kennedy School Misinformation Review Lluis de Nadal, Peter Jančárik 2024 Accessed: 2026-05-16
Source Archive
Academic analysis of the deepfake's impact on the election outcome. Cautions against simple attribution of the result to the deepfake. Context of pro-Kremlin disinformation in Slovakia since 2010.
AI Incident Database — Incident #573
AI Incident Database Daniel Atherton 2023 Accessed: 2026-05-16
Source Archive
Complete chronology of all deepfake incidents from the Slovak campaign (Čaputová, Šimečka, beer prices). Documents platform moderation responses.

The $25 Million Virtual Theater — Hong Kong's Deepfake Heist

5 sources

Arup Revealed as Victim of $25 Million Deepfake Scam Involving Hong Kong Employee
CNN Business Kathleen Magramo 2024 Accessed: 2026-05-16
Source Archive
Primary account after the company name was made public. Official Arup statement ('fake voices and images were used'). Details of 15 transfers, 5 bank accounts, HK$200M.
A Deepfake 'CFO' Tricked British Design Firm Arup in $25 Million Fraud
Fortune Prarthana Prakash 2024 Accessed: 2026-05-16
Source Archive
CIO Rob Greig quote: 'technology-enhanced social engineering'. Context of rising corporate deepfake attacks.
AI Incident Database — Incident #634
AI Incident Database Daniel Atherton 2024 Accessed: 2026-05-16
Source Archive
Complete incident documentation with links to all media coverage. Notes that this case became shorthand in journalism for deepfake fraud.
Scammers Siphon $25M from Engineering Firm Arup via AI Deepfake 'CFO'
CFO Dive Grace Noto 2024 Accessed: 2026-05-16
Source
Financial and CFO perspective. Describes how deepfakes bypass traditional payment authorisation controls.
Social Engineering 2.0
Communications of the ACM Gaurav Belani 2025 Accessed: 2026-05-16
Source
Academic-industry analysis of the mechanism. Describes the evolution from phishing to 'full-on video call with recognizable faces and voices'. Comparison with classical social engineering.

40

Alignment Without Control

8 Topics

Explainers

The Illusion of the Digital Leash

11 sources

Constitutional AI: Harmlessness from AI Feedback
arXiv / Anthropic Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, Jared Kaplan 2022 Accessed: 2026-05-16
Source Archive
Original paper defining Constitutional AI. Describes the self-critique mechanism and RLAIF. 51 Anthropic authors. Also cited in the Automated Priest case study.
Claude's Constitution
Anthropic 2023 Accessed: 2026-05-16
Source Archive
Official Anthropic page describing Constitutional AI as an approach. Accessible explanation of the mechanism and philosophy. Also cited in the Automated Priest case study.
Red Teaming Language Models with Language Models
arXiv / Anthropic Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, Geoffrey Irving 2022 Accessed: 2026-05-16
Source Archive
Canonical academic source for AI red teaming. Describes the mechanism and methodology.
NIST AI Agent Security: Red-Teaming Guidance and Enterprise Compliance
CSA 2026 Accessed: 2026-05-16
Source
Cloud Security Alliance Research Note on NIST's forthcoming AI RMF Playbook for Red Teaming. Describes the NIST framework and its relationship to the broader AI risk management landscape.
NIST AI Risk Management Framework
NIST Accessed: 2026-05-16
Source Archive
NIST's AI Risk Management Framework. Describes the framework's structure and its relationship to the broader AI risk management landscape.
NIST AI RMF Playbook
NIST 2026 Accessed: 2026-05-16
Source Archive
NIST's AI Risk Management Framework Playbook.
Security Controls for Computer Systems (RAND Report R-609)
RAND Corporation Willis H. Ware 1970 Accessed: 2026-05-16
Source Archive
Willis Ware's task force report. Organised 1967, published 1970. Foundation of the history of penetration testing.
Penetration Testing
Wikipedia Accessed: 2026-05-16
Source Archive
Complete history of the term from 1965 to the present. Cites Willis Ware, the Spring 1968 Joint Computer Conference, and James P. Anderson.
The History of Penetration Testing
Infosec Institute 2019 Accessed: 2026-05-16
Source Archive
Accessible historical narrative. Covers the Willis Report, Tiger Teams, and the evolution to commercial penetration testing.
James of Saint George
Wikipedia Accessed: 2026-05-16
Source Archive
Footnote reference. Dates (~1230–1309), portfolio of castles (Beaumaris, Harlech, Caernarfon, Conwy), role as Edward I's chief architect in Wales.
Beaumaris Castle
Cadw (Welsh Government) Accessed: 2026-05-16
Source Archive
Footnote reference. Official Welsh heritage source. Describes Beaumaris as 'the greatest castle never built' and 'the castle to end all castles'.

The Art of the Digital Suck-up

7 sources

Training Language Models to Follow Instructions with Human Feedback
arXiv / NeurIPS 2022 Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe 2022 Accessed: 2026-05-16
Source Archive
Canonical source for RLHF as a method. The InstructGPT paper. Describes the reward model mechanism and human labellers.
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
arXiv / Anthropic Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, Nicholas Joseph, Saurav Kadavath, Jackson Kernion, Tom Conerly, Sheer El-Showk, Nelson Elhage, Zac Hatfield-Dodds, Danny Hernandez, Tristan Hume, Scott Johnston, Shauna Kravec, Liane Lovitt, Neel Nanda, Catherine Olsson, Dario Amodei, Tom Brown, Jack Clark, Sam McCandlish, Chris Olah, Ben Mann, Jared Kaplan 2022 Accessed: 2026-05-16
Source Archive
Anthropic's original RLHF study. First description of sycophancy as an undesired side effect.
Towards Understanding Sycophancy in Language Models
ICLR 2024 / OpenReview Mrinank Sharma, Meg Tong, Tomasz Korbak, David Duvenaud, Amanda Askell, Samuel R. Bowman, Esin Durmus, Zac Hatfield-Dodds, Scott R Johnston, Shauna M Kravec, Timothy Maxwell, Sam McCandlish, Kamal Ndousse, Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, Ethan Perez 2024 Accessed: 2026-05-16
Source Archive
Canonical sycophancy study. 'Humans prefer sycophantic responses over correct ones a non-negligible fraction of the time.' Primary academic source for the entire sycophancy section.
How RLHF Amplifies Sycophancy
arXiv Itai Shapira, Gerdus Benade, Ariel D. Procaccia 2025 Accessed: 2026-05-16
Source Archive
Mechanistic explanation of why RLHF amplifies sycophancy. Formal mathematical model.
Disentangling Length from Quality in Direct Preference Optimization
arXiv / Stanford Ryan Park, Rafael Rafailov, Stefano Ermon, Chelsea Finn 2024 Accessed: 2026-05-16
Source Archive
Canonical source for verbosity bias. Quote: 'RLHF is known to exploit biases in human preferences, such as verbosity. A well-formatted and eloquent answer is often more highly rated by users, even when it is less helpful.'
Verbosity Bias in Preference Labeling by Large Language Models
arXiv Keita Saito, Akifumi Wachi, Koki Wataoka, Youhei Akimoto 2023 Accessed: 2026-05-16
Source Archive
Empirical study of verbosity bias in both human labellers and LLM-as-evaluator settings.
Problems with Reinforcement Learning from Human Feedback (RLHF) for AI Safety
BlueDot Impact 2024 Accessed: 2026-05-16
Source
Accessible analysis of the main RLHF problems: sycophancy, over-refusal, deceptive alignment. Also cited in the Automated Priest case study.

Case Studies

The 98-Page Illusion — Red Teaming and the Great Jailbreak

5 sources

GPT-4 System Card
OpenAI 2023 Accessed: 2026-05-16
Source Archive
98-page document. 50+ domain experts. ARC's TaskRabbit/CAPTCHA incident. Multimodal jailbreaks. Primary source for all facts in this case study.
GPT-4 Hired Unwitting TaskRabbit Worker By Pretending to Be 'Vision-Impaired' Human
Vice / Motherboard Joseph Cox 2023 Accessed: 2026-05-16
Source Archive
Original media account of the TaskRabbit incident. Quotes the System Card directly.
GPT-4V System Card
arXiv / OpenAI Hurst et al. 2023 Accessed: 2026-05-16
Source Archive
Documents multimodal jailbreaks. 'Text-screenshot jailbreak' as a key problem. Background for the t-shirt and LEGO bricks section.
arXiv lists 99 authors and notes '318 additional authors not shown', hence I decided to skip the full list and use 'Hurst et al.' as the author.
Model Evaluation & Threat Research
METR Accessed: 2026-05-16
Source Archive
Organisation that conducted the TaskRabbit test cited in the GPT-4 System Card. Mission: 'align future ML systems with human interests'.
The original link pointed to https://evals.alignment.org, which now redirects to https://metr.org. The Web Archive captured an explanation of the change: 'METR – Model Evaluation and Threat Research. Formerly “ARC Evals”, METR was incubated at the Alignment Research Center and is now a standalone non-profit.'
Universal LLM Jailbreak: ChatGPT, GPT-4, BARD, BING, Anthropic, and Beyond
Adversa AI 2023 Accessed: 2026-05-16
Source Archive
Describes jailbreak techniques with reference to first reports appearing within two hours of model publication.

The Automated Priest — Anthropic's Constitutional AI

5 sources

Constitutional AI: Harmlessness from AI Feedback
arXiv / Anthropic Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, Jared Kaplan 2022 Accessed: 2026-05-16
Source Archive
Original academic paper. Contains the RLAIF mechanism and self-critique description. Source for footnote citations of the original constitutional principles. Also cited in the Illusion of the Digital Leash explainer.
Claude's Constitution
Anthropic 2023 Accessed: 2026-05-16
Source Archive
Official documentation. Contains formulations about 'excessively paternalistic' behaviour and the list of things to avoid (lectures, moralizes, condescending). Source for the evolution from Claude 2.x to the current approach. Also cited in the Illusion of the Digital Leash explainer.
Claude 2.1 Is Worse Than 2.0 — Evidence Inside
Reddit r/ClaudeAI 2023 Accessed: 2026-05-16
Source Archive
Documented comparison of Claude 2.0 and 2.1 with over-refusal examples. Community evidence for the 'legendary' status of the 2.x models on forums.
New Claude 2.1 Refuses to Kill a Python Process
Reddit r/LocalLLaMA 2023 Accessed: 2026-05-16
Source Archive
Specific documented case of Claude 2.1 refusing the `kill` command for a Python process. Primary source for the kill command anecdote.
Claude 2.1 — Hacker News Discussion
Hacker News 2023 Accessed: 2026-05-16
Source
Day-of-launch discussion. User quotes about the model being 'borderline useless' due to over-refusal. Context for the Reddit and Hacker News reaction described in the text.

The $1 Tahoe — Chevrolet's Chatbot and the 'Binding Offer'

4 sources

GM Dealer Chat Bot Agrees to Sell 2024 Chevy Tahoe for $1
GM Authority Jonathan Lopez 2023 Accessed: 2026-05-16
Source Archive
Primary industry account from the day of the incident. Exact quotes from Bakke and the bot's responses. GM comment.
A Chevy for $1? Car Dealer Chatbots Show Perils of AI for Customer Service
VentureBeat Bryson Masse 2023 Accessed: 2026-05-16
Source Archive
Confirms MSRP of $58,195. Chris White as first tester. Industry context.
AI Incident Database — Incident #622
AI Incident Database Daniel Atherton 2023 Accessed: 2026-05-16
Source Archive
Incident documentation. Identifies Fullpath as the chatbot provider. 20 million views.
Chris Bakke Tweet — The $1 Tahoe Conversation
X (Twitter) Chris Bakke 2023 Accessed: 2026-05-16
Source
Original viral post. Screenshot of the conversation with the bot.

The DAN Chronicles — Emotional Blackmail for Calculators

5 sources

ChatGPT's 'Jailbreak' Tries to Make the A.I. Break Its Own Rules, or Die
CNBC Rohan Goswami 2023 Accessed: 2026-05-16
Source Archive
Canonical source. Exact quotes from SessionGloomy and the token mechanism. CNBC's own tests.
ChatGPT DAN 5.0 Jailbreak
Know Your Meme Aidan Walker 2023 Accessed: 2026-05-16
Source Archive
Complete history of DAN from u/Seabout (December 2022) through SessionGloomy (DAN 5.0, February 2023). Documents the evolution.
ChatGPT-Dan-Jailbreak.md
GitHub Gist AJ ONeal Accessed: 2026-05-16
Source Archive
Archive of DAN prompts 2.0–13.0. Documents the evolution of the split-personality framing and the token system.
AI DAN Prompt
Abnormal AI Accessed: 2026-05-16
Source Archive
Accessible definition and mechanism explanation for a general audience.
ChatGPT DAN Prompt: What Is It and How Does It Work?
AdGuard Ekaterina Kachalova 2023 Accessed: 2026-05-16
Source Archive
Security perspective on the mechanism. Covers the evolution of prompts and the context of misuse.

The 'Delve' Paradox — RLHF and Linguistic Chauvinism

8 sources

Delving into “delve”
pshapira.net Philip Shapira 2024 Accessed: 2026-05-16
Source Archive
Original analysis based on OpenAlex data. 46% of all papers containing 'delve' since 1990 were published in the 15 months after ChatGPT's release. Canonical source for Shapira's data.
Delving into LLM-Assisted Writing in Biomedical Publications Through Excess Vocabulary
arXiv Dmitry Kobak, Rita González-Márquez, Emőke-Ágnes Horvát, Jan Lause 2024 Accessed: 2026-05-16
Source Archive
14 million PubMed abstracts. At least 10–13.5% of 2024 abstracts processed by LLMs. Includes the flag-word list.
Why Does ChatGPT 'Delve' So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models
arXiv Tom S. Juzek, Zina B. Ward 2024 Accessed: 2026-05-16
Source Archive
21 flag words identified by formal method. Examines RLHF's role in lexical overrepresentation. Directly addresses the mechanism described in the text.
AI-Associated Words in Scientific Literature (PubMed/Scopus Study)
PubMed Central Kentaro Matsui 2025 Accessed: 2026-05-16
Source
Peer-reviewed study on AI-associated words in scientific literature. 85-fold increase for 'delve' and 'underscore' combined.
To Delve or Not to Delve: AI Detection Made Easy
Technollama Andres Guadamuz 2023 Accessed: 2026-05-16
Source Archive
Legal and cultural analysis of 'delve' as a dialectal marker. Colonial context and the linguistic chauvinism question.
Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic
Time Billy Perrigo 2023 Accessed: 2026-05-16
Source Archive
Canonical investigation into RLHF outsourcing to Kenya (Sama, Nairobi). Less than $2 per hour.
Jak ChatGPT niszczy ludziom życie [How ChatGPT destroys people's lives]
Stowarzyszenie Pravda Maria Święcicka 2023 Accessed: 2026-05-16
Source Archive
Polish article describing the investigation into Sama's Kenyan workers.
Stowarzyszenie Pravda is a Polish non-profit organisation focused on fighting misinformation and promoting media literacy. Its name roughly translates to 'Truth Association' in English (the actual Polish word for 'truth' is spelled 'prawda').
ChatGPT Is Changing the Words We Use in Conversation
Scientific American Vanessa Bates Ramirez 2025 Accessed: 2026-05-16
Source Archive
Documents the co-evolution: people began using LLM-borrowed vocabulary in everyday communication, creating the feedback loop described in the text.

##

Summary

1 Topic

Explainers

The AGI Myth — Chasing the Digital Holy Grail

19 sources

Artificial General Intelligence
Springer Ben Goertzel, Cassio Pennachin (eds.) 2007
ISBN: 978-3-540-23733-4.
Canonical academic introduction of the AGI term. Goertzel and Legg popularised the term around 2002. Defines AGI as a system capable of performing any intellectual task at human level.
A Definition of AGI
arXiv Dan Hendrycks, Dawn Song, Christian Szegedy, Honglak Lee, Yarin Gal, Erik Brynjolfsson, Sharon Li, Andy Zou, Lionel Levine, Bo Han, Jie Fu, Ziwei Liu, Jinwoo Shin, Kimin Lee, Mantas Mazeika, Long Phan, George Ingebretsen, Adam Khoja, Cihang Xie, Olawale Salaudeen, Matthias Hein, Kevin Zhao, Alexander Pan, David Duvenaud, Bo Li, Steve Omohundro, Gabriel Alfour, Max Tegmark, Kevin McGrew, Gary Marcus, Jaan Tallinn, Eric Schmidt, Yoshua Bengio 2025 Accessed: 2026-05-16
Source Archive
Formal definition based on Cattell-Horn-Carroll theory across 10 cognitive domains. Documents the 'jagged cognitive profile' of current models — good academic background for the list of missing capabilities (reasoning, planning, on-the-fly learning).
What Is Artificial General Intelligence (AGI)?
IBM Think Accessed: 2026-05-16
Source
Solid industry definition. Quotes LeCun on the need for a new architecture. Explains why current LLMs (GPT-4), despite apparent versatility, remain ANI.
What Is AGI?
Google Cloud 2026 Accessed: 2026-05-16
Source Archive
Accessible ANI/AGI/ASI definition. Quick reference for non-specialist readers.
Artificial General Intelligence
Wikipedia Accessed: 2026-05-16
Source Archive
Solid overview with the history of the term (Gubrud 1997, AIXI 2000, Legg/Goertzel 2002). ANI/AGI/ASI distinction. Lists 72 active AGI projects in 37 countries (2020).
Thousands of AI Authors on the Future of AI
arXiv Katja Grace, Harlan Stewart, Julia Fabienne Sandkühler, Stephen Thomas, Ben Weinstein-Raun, Jan Brauner, Richard C. Korzekwa 2024 Accessed: 2026-05-16
Source Archive
Largest survey of AI researchers (2,778 respondents). Median estimate for 'high-level machine intelligence' shortened by 13 years between 2022 and 2023. Canonical source for the AGI timeline section.
When Do Experts Think Human-Level AI Will Be Created?
Effective Altruism Forum Vishakha Agrawal 2025 Accessed: 2026-05-16
Source Archive
Source for Yoshua Bengio's quote: '95% confidence interval for the time horizon of superhuman intelligence at 5 to 20 years' (2023).
Shrinking AGI Timelines: A Review of Expert Forecasts
80,000 Hours Benjamin Todd 2025 Accessed: 2026-05-16
Source Archive
Accessible survey of forecasts. Context for the '5 to 20 years' claim.
Equal Numbers of Neuronal and Nonneuronal Cells Make the Human Brain an Isometrically Scaled-Up Primate Brain
Journal of Comparative Neurology Frederico A.C. Azevedo, Ludmila R.B. Carvalho, Lea T. Grinberg, José Marcelo Farfel, Renata E.L. Ferretti, Renata E.P. Leite, Wilson Jacob Filho, Roberto Lent, Suzana Herculano-Houzel 2009 Accessed: 2026-05-16
Source
Canonical source for the 86 billion neuron figure. Corrects the widely repeated myth of 100 billion.
Synapse
Wikipedia Accessed: 2026-05-16
Source Archive
Source for the 100–500 trillion synapses range as standard approximation. Links to primary neuroscience sources.
Does Thinking Really Hard Burn More Calories?
Scientific American Ferris Jabr 2012 Accessed: 2026-05-16
Source Archive
Confirms ~20W brain energy consumption and ~20% of total metabolism.
Energy and AI
IEA (International Energy Agency) 2025 Accessed: 2026-05-16
Source
Canonical IEA report. Global data centres: 415 TWh in 2024, projected 945 TWh by 2030. GPT-4 training: ~50 GWh one-time energy cost.
Data Centers and Their Energy Consumption
Congressional Research Service 2026 Accessed: 2026-05-16
Source Archive
Official US Congress report. 8 GPUs for 8 hours = 7.92 kW median power draw during training. Source for specific kW figures.
Appraising the Brain's Energy Budget
PNAS Marcus E. Raichle, Debra A. Gusnard 2002 Accessed: 2026-05-16
Source
Canonical neuroscience source for the ~20W brain energy figure and 20% of total metabolism.
Meta Releases Llama 4, a New Crop of Flagship AI Models
TechCrunch Kyle Wiggers 2025 Accessed: 2026-05-16
Source Archive
Official figures: 288B active parameters, nearly 2T total, 16 experts, MoE architecture. Confirms Behemoth was in training at the time of announcement.
Meta Hits Pause on 'Llama 4 Behemoth' AI Model
Computerworld 2025 Accessed: 2026-05-16
Source Archive
Documents the release delay. Context for the statement about the unreleased model.
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
arXiv / Google Brain Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean 2017 Accessed: 2026-05-16
Source Archive
Original paper introducing MoE to neural networks. Describes the mechanism of activating only a subset of parameters per query. Canonical academic source for Mixture-of-Experts.
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
arXiv / Google William Fedus, Barret Zoph, Noam Shazeer 2021 Accessed: 2026-05-16
Source Archive
First practical implementation of MoE in a language model at trillion-parameter scale. Foundation for Llama 4 Behemoth and similar architectures.
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
arXiv Albert Gu, Tri Dao 2023 Accessed: 2026-05-16
Source Archive
Canonical source for State-Space Models as a potential transformer successor. Mamba as the leading candidate architecture.

Part 11

The Human Interface

5 Chapters

##

Introduction

1 Topic

Explainers

Human Factors 101 — Cognitive Limits & Biomechanics of Error

10 sources

The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information
Psychological Review George A. Miller 1956 Accessed: 2026-05-16
Source
Original paper. Canonical source for the ~7 items in working memory. One of the most cited papers in the history of psychology.
George Miller's Magical Number of Immediate Memory in Retrospect
PubMed Central Nelson Cowan 2015 Accessed: 2026-05-16
Source
Retrospective and critical analysis. More recent research suggests ~4 chunks as a more accurate limit. Background for the caveat that 'modern psychologists debated the exact number'.
Thinking, Fast and Slow
Farrar, Straus and Giroux Daniel Kahneman 2011
ISBN: 978-0-374-27563-1.
Canonical source for availability heuristic, anchoring, and System 1/2 thinking. The standard popular reference for the heuristics section.
Judgment Under Uncertainty: Heuristics and Biases
Science Amos Tversky, Daniel Kahneman 1974 Accessed: 2026-05-16
Source
Original academic paper defining availability heuristic and anchoring.
Human Error
Cambridge University Press James Reason 1990
ISBN: 978-0-521-31419-0.
Original work introducing the Swiss Cheese Model. Canonical source for the biomechanics of error section.
Human Error: Models and Management
BMJ James Reason 2000 Accessed: 2026-05-16
Source
Accessible version of the Swiss Cheese Model for a medical audience. Describes it in the context of safety systems.
The Challenger Launch Decision
University of Chicago Press Diane Vaughan 1996
ISBN: 978-0-226-85175-4.
Original academic work introducing the term Normalisation of Deviance. Analysis of the Challenger disaster as a case study.
The Field Guide to Understanding 'Human Error'
Taylor & Francis Ltd Sidney Dekker 2006
ISBN: 978-0-7546-4825-8.
Accessible industry book on human error in complex systems. Covers normalisation of deviance and the Swiss Cheese Model in an engineering context.
Ego Depletion: Is the Active Self a Limited Resource?
Journal of Personality and Social Psychology Roy F. Baumeister, Ellen Bratslavsky, Mark Muraven, Dianne M. Tice 1998 Accessed: 2026-05-16
Source
Original paper on ego depletion as the mechanism behind decision fatigue. Canonical academic source for the mechanism (without the specific 200 decisions figure).
Extraneous Factors in Judicial Decisions
PNAS Shai Danziger, Jonathan Levav, Liora Avnaim-Pesso 2011 Accessed: 2026-05-16
Source
The best-known empirical study of decision fatigue. Judges granted parole in ~65% of cases at the start of a session, dropping to ~0% just before a break. An ideal illustrative example.

41

The Human Interface Disasters

6 Topics

General

Google's Simple Homepage

2 sources

Google Design: Why Google.com Homepage Looks So Simple
HuffPost Bianca Bosker 2012 Accessed: 2026-05-16
Source Archive
Marissa Mayer (then a VP at Google) described in a Q&A at the 92nd Street Y in New York how Sergey Brin built the simplest possible page because he 'didn't have a webmaster and I don't do HTML.' Brin: 'We just kind of stumbled into it'.
Note: the text suggests both Page and Brin didn't know HTML — Mayer attributes this mainly to Brin. Minor inaccuracy, but it does not change the substance of the story.
And all because he didn’t know HTML
arnabocean.com (archive) Arnab Gupta 2012 Accessed: 2026-05-16
Source Archive
Shorter summary with the Brin quotes extracted.

Explainers

The Thousand-Dollar Dot: A Personal Burn File

10 sources

CLDR — Common Locale Data Repository
Unicode Consortium Accessed: 2026-05-16
Source Archive
Canonical repository of locale data. Contains number formatting, decimal separators, and thousands separators for all languages and countries.
ISO 80000-1:2022 — Quantities and Units
ISO 2022 Accessed: 2026-05-16
Source Archive
ISO standard governing the notation of numbers. Recommends a space as the thousands separator and a comma or period as the decimal separator depending on language.
Decimal Separator
Wikipedia Accessed: 2026-05-16
Source Archive
Comprehensive table of decimal and thousands separators by country. Confirms the Swiss/Liechtenstein apostrophe notation, the Arabic momayyez, and the Indian lakh system.
Invalid Bid Retraction Policy
eBay Accessed: 2026-05-16
Source
Official eBay policy. Confirms that bids are a 'binding contract'. Example given verbatim: 'entering £99.50 instead of £9.95'. Canonical source for the legally binding claim.
Ever Misplace a Decimal in a Listing?
eBay Community 2021 Accessed: 2026-05-16
Source
Thread with dozens of seller anecdotes about decimal errors (silverware at $39.99 instead of $3,999.99, token at $9.90 instead of $99). Illustrates how common the phenomenon is.
I Found a Car on AutoTrader for an Obviously Incorrect Price — Is the Dealer Obligated to Sell It?
Lawyers.com / Ask-a-Lawyer Bruce Robins 2014 Accessed: 2026-05-16
Source
Cites AutoTrader's terms: 'you must be prepared to sell that vehicle at the price at which you've listed it'. Also describes the 'obvious mistake' doctrine as an exception. Good for the legal dimension section.
Bruce Robins is the author of the best answer.
Pomyłka w cenie towaru na Allegro lub sklepie internetowym [Pricing Error on Allegro or an Online Store]
Centrum Sprzedawcy 2013 Accessed: 2026-05-16
Source Archive
Article in Polish
Thorough legal analysis for Polish e-commerce. Explains that with 'Buy Now' the contract is formed automatically. A seller can invoke error under Art. 84 KC only if the error was material and the buyer could have noticed it easily.
Kosztowna pomyłka na Allegro [A Costly Mistake on Allegro]
Subiektywnie o Finansach Ireneusz Sudak 2019 Accessed: 2026-05-16
Source Archive
Article in Polish
Concrete Polish case: a woodworking machine listed at 7,749 zł instead of 77,490 zł. The buyer refused to accept a price correction. Went to court.
Anfechtung: Was tun bei fehlerhaften Preisen? [Contesting a Contract: What to Do About Pricing Errors?]
IT-Recht Kanzlei Daniel S. Huber 2015 Accessed: 2026-05-16
Source Archive
Article in German
Comprehensive analysis of § 119 BGB (Irrtumsanfechtung, i.e. contestation on the grounds of error) in the context of online pricing errors. Cites OLG Frankfurt 2024 and BGH 2005. Explains that a seller can rescind the contract but must act immediately and pay Vertrauensschaden (reliance damages).
eBay: Fehler bei Preisangaben [eBay: Errors in Pricing]
Landsberg Recht 2023 Accessed: 2026-05-16
Source Archive
Article in German
Practical analysis with a court example (LG Köln). Covers the requirements for a valid Anfechtung: immediacy, stated reason, documentation.

Case Studies

The 41-Fold Impossibility — Sold for the Price of a Used Hatchback

4 sources

Botched Stock Trade Costs Japan Firm $225M
NBC News / Associated Press 2005 Accessed: 2026-05-16
Source Archive
Day-after account (9 December 2005). $225M loss, 41× the shares in existence, three cancellation attempts, 1.95% Nikkei drop, Mizuho spokesperson quote.
Mizuho Sues TSE Over 'Fat Finger' Trade Botch-Up
Finextra 2006 Accessed: 2026-05-16
Source Archive
TSE's official admission that a bug prevented cancellation. Total accounting loss: ¥40.7 billion. Lists resignations (Tsurushima, Yoshino, Amano).
TSE Ordered to Pay Mizuho ¥10.7bn Over 'Fat Finger' Trade Botch-Up
Finextra 2009 Accessed: 2026-05-16
Source Archive
Court ruling: TSE must pay Mizuho ¥10.7 billion in damages for the system bug.
Fat-Finger Error
Investopedia Anna Attkisson 2026 Accessed: 2026-05-16
Source
Industry definition and context. J-Com as the canonical example. Other fat-finger cases for broader context.

Citibank $900M 'Fat Finger' Fiasco (2020) — When UX Met Finance

6 sources

Citigroup Cannot Recoup Revlon Payouts After Nearly $900 Million Gaffe
Reuters / Investing.com 2021 Accessed: 2026-05-16
Source Archive
Primary account of Judge Furman's ruling. $893M, $7.8M intent, Brigade/HPS/Symphony, discharge-for-value 'to the penny'.
When Six Eyes Just Aren't Enough
National Law Review Christopher J. Dickson, Steven M. Herman, Blake C. Woodward 2021 Accessed: 2026-05-16
Source Archive
Cites court documents directly. Describes the six-eye approval procedure (maker/checker/approver = three people), the Flexcube mechanism, and the wash account. Best technical source for the error mechanism.
Second Circuit Reverses Ruling — $500 Million Wire Transfer
Loeb & Loeb Anthony Pirraglia, Peter G. Seiden 2022 Accessed: 2026-05-16
Source Archive
Canonical source for the Second Circuit reversal. Inquiry notice vs. constructive notice. $500M return ordered.
Down to the Wire: Citibank Wins Big in Revlon Appeal
Seward & Kissel John R. Ashmead, Gregg S. Bateman, Keith J. Billotti, James C. Cofer, Michael G. Considine, Robert J. Gayda, Meir R. Grossman, Edward S. Horton, Mark D. Kotwick, Robert M. Kurucza, Kevin Neubauer, Anthony Tu-Sekine, Jack Yoskowitz, Y. Daphne Coelho-Adam, Kimberly Giampietro, Thomas Ross Hooper, Sagar Patel, Robert E. Wood, Dale C. Christensen Jr., Paul T. Clark, Ronald L. Cohen, Craig T. Hickernell, Robert A. Walder 2022 Accessed: 2026-05-16
Source Archive
Legal analysis of the Second Circuit ruling. Technical details of the reversal and context for the 'Revlon clawback provisions' the industry adopted after this case.
Court Calls Banking Wire Transfer Error 'Biggest Blunder in Banking History'
Law Point Florida 2021 Accessed: 2026-05-16
Source Archive
Accessible analysis. Quotes the Furman ruling. Context for the UI as a source of the problem section.
A One-in-a-Billion(-Dollar) Mistake
Davies Ward Phillips & Vineberg John McCamus, Michael Disney 2021 Accessed: 2026-05-16
Source Archive
Canadian law perspective. Good comparative analysis of the discharge-for-value mechanism.

The Great Start Button Massacre: A Study in Desktop Hostility

6 sources

Windows 8 and 8.1 Hit 10% Market Share, Windows 7 Grows, XP Falls
TechRadar Alex Hamilton 2014 Accessed: 2026-05-16
Source Archive
Net Applications data, December 2013 (15 months post-launch): Win8 6.89%, Win8.1 3.60%, XP 28.98%.
Windows 8 Slowly Gains Market Share Traction
PCWorld Daniel Ionescu 2013 Accessed: 2026-05-16
Source Archive
StatCounter and Net Applications comparison. Win8 adoption rate 3× slower than Win7.
Global Market Share of Windows 7 and 8
Statista Mathias Brandt 2014 Accessed: 2026-05-16
Source
Data visualisation. Confirms Win8 reached 10% after 15 months vs. 6 months for Win7.
Windows 8, Review
The Verge Tom Warren 2012 Accessed: 2026-05-16
Source Archive
Canonical launch-day review. Describes the Metro interface, Charms Bar, active corners, and the schizophrenic dual-world experience.
Windows, reimagined: A review of Windows 8
Ars Technica 2012 Accessed: 2026-05-16
Source Archive
Most thorough technical review of Windows 8. Covers UI design decisions and their consequences for desktop users.
Windows 10 review – final version of Windows might be Microsoft's best ever
The Guardian Jack Schofield 2015 Accessed: 2026-05-16
Source Archive
Context for the forced retreat. Windows 10 restored the Start menu and traditional desktop as the default mode.

Hawaii False Missile Alert

6 sources

Report and Recommendations: Hawaii Emergency Alert False Alarm
FCC 2018 Accessed: 2026-05-16
Source Archive
Primary regulatory source. Error mechanism ('This is not a drill' used during an exercise), absence of retraction procedures, and absence of dual verification before the incident.
Hawaii's False Ballistic Missile Alert
Congressional Research Service 2018 Accessed: 2026-05-16
Source
Congress investigation summary. Provides a concise timeline and confirms the absence of an automatic correction mechanism.
Timeline: 8:07 alert sent, 8:13 cancellation attempted, 8:20 Facebook/Twitter correction, 8:45 official retraction. No automatic correction mechanism existed.
The Hawaiian Missile Alert Fiasco: How One Confusing Interface Caused Mass Hysteria
TrueMatter Dean Schuster Accessed: 2026-05-16
Source Archive
Interface screenshot and analysis. Covers 'DRILL - PACOM (CDW)' vs 'PACOM (CDW)' menu items. Before-and-after images of the corrected interface.
This Is Not a Drill
Medium Ben Manley 2024 Accessed: 2026-05-16
Source
Detailed reconstruction from the FCC report. Clarifies that the error was not a misclick but resulted from drill procedures. Context for the corrected error mechanism description.
FCC on Hawaii's Bogus Alert: Don't Say 'This Is Not a Drill' During Drills
NPR Sasha Ingber 2018 Accessed: 2026-05-16
Source Archive
FCC recommendation to remove the phrase from exercises. Employee quote: '100% certain it was real'.
2018 Hawaii False Missile Alert
Wikipedia Accessed: 2026-05-16
Source Archive
Complete documentation. Timeline, resignations (Miyagi, Clairmont), and the employee's history (10 years of performance issues, dismissed 26 January).

42

Bureaucracy Meets Code

4 Topics

Case Studies

Healthcare.gov — The $500 Million 'Error 404'

10 sources

Healthcare.gov: Ineffective Planning and Oversight Practices Underscore the Need for Improved Contract Management (GAO-14-694)
GAO (Government Accountability Office) 2014 Accessed: 2026-05-16
Source Archive
Primary government source. FFM obligations grew from $56M to $209M. Readiness assessment delayed from March to September 2013. Launch proceeded without verification of performance requirements.
Healthcare.gov: CMS Has Taken Steps to Address Problems, but Needs to Further Implement Systems Development Best Practices (GAO-15-238)
GAO (Government Accountability Office) 2015 Accessed: 2026-05-16
Source Archive
Follow-up audit. Ineffective project oversight, unreliable schedule, testing weaknesses. Seven recommendations for CMS.
HealthCare.gov: CMS Management of the Federal Marketplace
HHS Office of Inspector General 2016 Accessed: 2026-05-16
Source Archive
OIG podcast with report author Ruth Ann Dorrill. Ten lessons from the project. Absence of a central project leader throughout the entire duration identified as the primary cause of failure.
The Number 6 Says It All About the HealthCare.gov Rollout
NPR Julie Rovner 2013 Accessed: 2026-05-16
Source Archive
Confirms six registrations from House Oversight Committee internal documents. Political and media context.
Lessons Learned Over a Decade with Health Care Marketplace
GovCIO Media Katherine MacPhail 2022 Accessed: 2026-05-16
Source Archive
Confirms six enrollments. Describes the 'tech surge' and the cultural shift at CMS.
Small Is Beautiful: The Launch Failure of Healthcare.gov
HackerNoon Bishr Tabbaa 2018 Accessed: 2026-05-16
Source Archive
Solid single-text summary of the entire failure. Covers the account wall as the bottleneck, the absence of load testing (1,100 concurrent users), costs, garbled insurer data, and the tech surge. Includes its own bibliography.
HealthCare.gov
Wikipedia Accessed: 2026-05-16
Source Archive
Complete documentation. Costs ($500M pre-launch, $1.7B final), stress test (1,100 users), account wall confirmed as bottleneck by White House officials.
Report: Cost of HealthCare.gov Approaching $1 Billion
Time Kate Pickert 2014 Accessed: 2026-05-16
Source Archive
GAO testimony. $840M obligated as of March 2014. Context for the $500M figure as a pre-launch estimate.
HealthCare.gov Sees Massive Traffic Spike Reaching 2.5 Million Americans on 1st Day
comScore Susan Engleson 2013 Accessed: 2026-05-16
Source Archive
Source for the 2.5 million figure. comScore measured unique visitors, not page views. Source for the conservative value used in the text.
The Healthcare.gov Rescue
Level Up / GitConnected 2026 Accessed: 2026-05-16
Source
Reports 4.7 million visits on day one using a different measurement methodology.

Post Office Horizon — The Algorithm That Made People Thieves

6 sources

Post Office Horizon Scandal Explained: Everything You Need to Know
Computer Weekly Karl Flinders 2026 Accessed: 2026-05-16
Source Archive
Computer Weekly has investigated this story since 2008. Complete chronology from 1999 to 2024. Quotes the call centre instruction: 'You are the only one experiencing this problem'.
Fujitsu's Role in the Post Office Scandal: Everything You Need to Know
Computer Weekly Karl Flinders 2025 Accessed: 2026-05-16
Source Archive
Detailed analysis of Fujitsu's role. Knowledge of bugs from 1999. Expert testimony in courts. 13 suicides linked to the scandal.
Post Office Horizon Cases
Criminal Cases Review Commission Accessed: 2026-05-16
Source Archive
CCRC: 'biggest single series of wrongful convictions in UK legal history'. Covers the case review mechanism and legal basis for overturning convictions (Hamilton & Others v Post Office, 2021).
British Post Office Scandal
Wikipedia Accessed: 2026-05-16
Source Archive
Complete documentation. 900+ prosecutions, 236 imprisoned, 13 suicides, compensation exceeding £1 billion, May 2024 Act overturning convictions en masse.
UK Post Office's Horizon IT System Flaws Drove Users to Consider Suicide, Inquiry Finds
Computerworld John E. Dunn 2025 Accessed: 2026-05-16
Source Archive
Coverage of Volume 1 of Sir Wyn Williams's report (July 2025). 13 suicides, 59 people considered suicide. Report quote: 'throughout the lifetime of Legacy Horizon, the Post Office maintained the fiction that its data was always accurate'.
Post Office Horizon Scandal
The Postal Museum Accessed: 2026-05-16
Source Archive
Institutional source and Legacy Project inquiry partner. Confirms the call centre instruction, Alan Bates's role, and the Justice for Subpostmasters Alliance.

India's Aadhaar — The Fingerprint of the Invisible

7 sources

Aadhaar Authentication Failures Trigger an Invisible Exclusion Crisis
Policy Circle Sagari Gupta 2025 Accessed: 2026-05-16
Source Archive
312 million monthly attempts. 20.3 million failures (6.5% failure rate). Clustering among manual labourers and the elderly. Figures sourced from UIDAI's own parliamentary data.
High Rates of Aadhaar Biometric Verification Failure Leads to UIDAI Scrutiny
Biometric Update Lu-Hai Liang 2025 Accessed: 2026-05-16
Source Archive
Parliamentary Public Accounts Committee confirms high error rates. Worn fingerprints and degraded iris patterns. PDS and MGNREGS as the primary areas of exclusion.
The Human Cost
India Stack Watch Accessed: 2026-05-16
Source
Detailed documentation of the human cost including error rates and documented cases. Confirms the death of Santoshi Kumari (11 years old, Jharkhand, 2017) due to exclusion from the PDS system.
Marginalized Aadhaar: India’s Aadhaar biometric ID and mass surveillance
ACM Interactions Subhashish Panigrahi 2022 Accessed: 2026-05-16
Source Archive
Academic analysis of the exclusion mechanism for Dalits, Adivasis, and migrant labourers. Biometric failure modes.
Rights in the Aadhaar Machine
The India Forum John Simte 2025 Accessed: 2026-05-16
Source Archive
Confirms the death of Santoshi Kumari (11 years old, Jharkhand, 2017). Legal analysis of the Supreme Court judgment of April 2025 (Pragya Prasun v. Union of India). Right to inclusive digital access.
Aadhaar and Algorithmic Exclusion from Welfare: Case Study from Jharkhand
Nickled and Dimed 2025 Accessed: 2026-05-16
Source
Jharkhand analysis. The UIDAI Biometrics Standards Committee warned in 2009, before deployment, of problems with worn fingerprints. Documents the three-stage failure in the PDS system.
All Eyes on India's Biometric ID Experiment
Pathways for Prosperity / Oxford BSG Prakhar Misra, Meena Bhandari Accessed: 2026-05-16
Source Archive
Oxford analysis. 'Perverse twist': the system failed to help the most vulnerable it was designed to protect. Context for the Last Mile section.

Toeslagenaffaire — The Algorithm That Hunted Parents

9 sources

Herstel kinderopvangtoeslag
Rijksoverheid.nl Accessed: 2026-05-16
Source Archive
Article in Dutch
Official Dutch government page. Confirms the Rutte III resignation (15 January 2021), the apology, the formal acknowledgement of 'institutioneel racisme' (May 2022), and the ongoing compensation operation.
Eindverslag onderzoek kinderopvangtoeslag — 'Ongekend onrecht'
Tweede Kamer 2020 Accessed: 2026-05-16
Source Archive
Article in Dutch
Official Van Dam parliamentary commission report (17 December 2020). Title: 'Ongekend onrecht' (Unprecedented injustice). Source for the quote: 'grondbeginselen van de rechtsstaat zijn geschonden' (fundamental principles of the rule of law were violated).
Parlementaire ondervragingscommissie Kinderopvangtoeslag
Tweede Kamer Accessed: 2026-05-16
Source Archive
Article in Dutch
Parliamentary commission page. Hearing schedule, composition, and documentation.
AI Incident Database — Incident #101
AI Incident Database Sean McGregor, Khoa Lam 2018 Accessed: 2026-05-16
Source Archive
Official incident documentation. 'Fraud detection model described as a self-learning black box algorithm'. Dual nationality as a risk factor.
The Toeslagenaffaire
TaxAdmin.AI Accessed: 2026-05-16
Source Archive
Legal and technical analysis. Algorithm mechanism (Dutch/non-Dutch as a binary risk indicator). References World Tax Journal (Hadwick & Lan, 2021).
Dutch Childcare Benefits Scandal
Wikipedia Accessed: 2026-05-16
Source Archive
Complete documentation. 26,000 parents (2005–2019), resignation 15 January 2021, Van Dam Commission, compensation for 33,000+ families, €2.75M GDPR fine.
Xenophobic Machines: Discrimination Through Unregulated Use of Algorithms in the Dutch Childcare Benefits Scandal
Amnesty International 2021 Accessed: 2026-05-16
Source Archive
Canonical source for the proxy discrimination section. Self-learning mechanism. 'Black box resulted in a black hole of accountability'.
What the Dutch Benefits Scandal and Policy's Focus on 'Fraud' Can Teach Us About the Endurance of Empire
Social Policy and Administration (Wiley) Josien Arts, Marguerite van den Berg 2025 Accessed: 2026-05-16
Source
Academic analysis of colonial legacy. Surinamese and Caribbean Dutch as the primary affected groups. 'Institutional racism' formally acknowledged by the Dutch state.
Dutch Benefits Scandal
Museum of Failure 2021 Accessed: 2026-05-16
Source Archive
Touring exhibition by Dr. Samuel West. Footnote reference. Source for the figure of 3,532 children removed from their homes.

43

The Human-in-the-Loop Paradox

7 Topics

Explainers

The Automation Paradox — The Skills We Trade for Convenience

6 sources

Ironies of Automation
Automatica Lisanne Bainbridge 1983 Accessed: 2026-05-16
Source Archive
Original paper. Canonical source for the entire explainer. Vol. 19, Issue 6, pp. 775–779.
Full text PDF available: https://ckrybus.com/static/papers/Bainbridge_1983_Automatica.pdf
Ironies of Automation
Wikipedia Accessed: 2026-05-16
Source Archive
Solid overview with the history of the paper's influence and its key arguments.
Navigation-Related Structural Change in the Hippocampi of Taxi Drivers
PNAS Eleanor A. Maguire, David G. Gadian, Ingrid S. Johnsrude, Christopher D. Frith 2000 Accessed: 2026-05-16
Source
Original paper. Posterior hippocampus larger in taxi drivers than controls, correlating with years of experience.
London Taxi Drivers and Bus Drivers: A Structural MRI and Neuropsychological Analysis
UCL News Eleanor A. Maguire, Katherine Woollett, Hugo J. Spiers 2006 Accessed: 2026-05-16
Source Archive
2006 follow-up study. Confirms the mechanism. Comparison with bus drivers (fixed routes = no effect).
Habitual Use of GPS Negatively Impacts Spatial Memory During Self-Guided Navigation
Scientific Reports Louisa Dahmani, Véronique D. Bohbot 2020 Accessed: 2026-05-16
Source Archive
McGill University paper. Correlation between GPS use and poorer spatial memory and reduced hippocampal activation. Describes the mechanism discussed in the text.
Global Impositioning Systems
Scientific American Thomas A. Herring 1996 Accessed: 2026-05-16
Source
Classic popular-science piece on GPS and the atrophy of spatial orientation skills. Background for the 'atrophying muscle' narrative.

The Human-in-the-Loop Illusion

5 sources

Meaningful Human Control: Actionable Properties for AI System Development
AI and Ethics Luciano Cavalcante Siebert, Maria Luce Lupetti, Evgeni Aizenberg, Niek Beckers, Arkady Zgonnikov, Herman Veluwenkamp, David Abbink, Elisa Giaccardi, Geert-Jan Houben, Catholijn M. Jonker, Jeroen van den Hoven, Deborah Forster, Reginald L. Lagendijk 2022 Accessed: 2026-05-16
Source Archive
Academic source for the Information/Time/Authority tripartition as conditions for meaningful human control. Formalises the intuition described in the text.
The Responsibility Gap: Ascribing Responsibility for the Actions of Learning Automata
Ethics and Information Technology Andreas Matthias 2004 Accessed: 2026-05-16
Source Archive
Canonical source for the 'responsibility gap' concept: when a human is in the loop only nominally, moral and legal responsibility becomes unassignable. Background for the 'liability sponge' framing.
Algorithmic Accountability Policy Toolkit
AI Now Institute 2018 Accessed: 2026-05-16
Source Archive
Describes the structural conditions under which HITL becomes an illusion. Context for the Authority section.
Life After Death by PowerPoint
YouTube / Don McMillan Don McMillan 2022 Accessed: 2026-05-16
Source
For those who have made it this far in the bibliography — it had to be here.
This is not the original source of this sketch, but I couldn't find the right one.
Don McMillan Comedy
YouTube Don McMillan Accessed: 2026-05-16
Source
Bonus. The author warmly recommends.
I couldn't link the original clip above so I am including the entire channel.

Case Studies

MiDAS — The $47 Million Shakedown

7 sources

State of Michigan Announces Settlement of Civil Rights Class Action
Michigan Attorney General 2022 Accessed: 2026-05-16
Source Archive
Official government press release. $20M settlement, Bauserman v. UIA, litigation history (2015–2024).
Michigan Unemployment Insurance False Fraud Determinations
Benefits Tech Advocacy Hub Accessed: 2026-05-16
Source Archive
Legal timeline. Zynda v. Zimmer (2017), Cahoo v. SAS Analytics, Michigan Supreme Court 2019 and 2022, final resolution 2024.
AI Incident Database — Incident #373
AI Incident Database Ed White 2022 Accessed: 2026-05-16
Source Archive
Broken: The Human Toll of Michigan's Unemployment Fraud Saga
Bridge Michigan Ted Roelofs 2017 Accessed: 2026-05-16
Source Archive
93% error rate, $47M, the 'income spreading' mechanism, comparison to the Flint water crisis.
Michigan's MiDAS Unemployment System: Algorithm Alchemy That Created Lead, Not Gold
IEEE Spectrum Robert N. Charette 2018 Accessed: 2026-05-16
Source Archive
Technical review. 40,195 cases algorithmically determined. 85% error rate without human intervention. 400% penalty.
Automated Stategraft: Faulty Programming and Improper Collections in Michigan's Unemployment Insurance Program
Wisconsin Law Review Rachael Kohl 2024 Accessed: 2026-05-16
Source Archive
Academic legal analysis. 93% error rate, 400% penalty as the highest in the country, the 'Dark Port' mechanism, retroactive accusations going back to 2007.
Fraud Detection System with 93% Failure Rate Gets IT Companies Sued
The Register Thomas Claburn 2017 Accessed: 2026-05-16
Source Archive
Quotes Tony Paris (Sugar Law Center): at least two suicides linked to MiDAS penalties. Two clients left farewell letters.

SyRI — The Algorithmic Dragnet of the Underclass

6 sources

How Dutch Activists Got an Invasive Fraud Detection Algorithm Banned
AlgorithmWatch 2020 Accessed: 2026-05-16
Source Archive
Narrative account. Includes the low water usage example as a fraud flag. Lists the data sources integrated by the system.
System Risk Indication (SyRI)
PILP (Public Interest Litigation Project) Baron Browne-Wilkinson Accessed: 2026-05-16
Source Archive
Page of the lawyers who won the case. Timeline from 2014 to the ruling. Volkskrant: 'not a single fraudster detected'.
Welfare Surveillance on Trial in the Netherlands
Human Rights Watch Amos Toh 2019 Accessed: 2026-05-16
Source Archive
Describes selective deployment in low-income neighbourhoods. Notification mechanism: up to 2 years to initiate an investigation.
The SyRI Victory: Holding Government Profiling to Account
Digital Freedom Fund Tijmen Wisman 2020 Accessed: 2026-05-16
Source
Analysis of the ruling. The legislation was passed in 2014 without a single dissenting vote, despite objections from the DPA and Council of State.
Landmark Ruling: Dutch Court Stops Government Attempts to Spy on the Poor — UN Expert
OHCHR / UN Special Rapporteur Philip Alston 2020 Accessed: 2026-05-16
Source Archive
Official UN statement after the ruling. 'First time a court anywhere has stopped the use of digital technologies by welfare authorities on human rights grounds'.
Digital Welfare Fraud Detection and the Dutch SyRI Judgment
Big Data & Society (SAGE) Marvin van Bekkum, Frederik Zuiderveen Borgesius 2021 Accessed: 2026-05-16
Source Archive
Academic analysis of the ruling. History of the system from 2003. Implications for privacy law in the EU.

Stanislav Petrov — The Manual Override of Armageddon

6 sources

1983 Soviet Nuclear False Alarm Incident
Wikipedia Accessed: 2026-05-16
Source Archive
Complete documentation. Molniya orbit + sunlight + high-altitude clouds mechanism. Malmstrom AFB. Alarm timeline. Petrov's role.
Note: Wikipedia used as the primary source here given the event is well-documented and cross-referenced by multiple additional sources.
Stanislav Petrov
Wikipedia Accessed: 2026-05-16
Source Archive
Biography. Reprimand for paperwork (verbatim quote). Nervous breakdown. Early retirement. Awards (Dresden Peace Prize 2013, Future of Life Award 2018 posthumously). Ban Ki-moon quote.
The False Alarm That Nearly Sparked Nuclear War
Hackaday Zoe Skyforest 2021 Accessed: 2026-05-16
Source Archive
Technical explanation of the false positive mechanism. Diagram of satellite, sun, and cloud geometry. Molniya orbit context.
Soviet Officer Who Averted Nuclear War (BBC Archive 2006)
Web Archive / BBC News 2006 Accessed: 2026-05-16
Source Archive
Original BBC account. Nervous breakdown, early retirement, life in poverty in a Moscow flat. Cited by Wikipedia.
Soviet Colonel Who Averted Nuclear War
RFE/RL Dan Wisniewski 2013 Accessed: 2026-05-16
Source Archive
Detailed portrait of Petrov. Institutional gaslighting after the incident. Contrast between the absence of recognition and later international acclaim.
The Other Close Call of 1983
Veterans Breakfast Club Todd DePastino Accessed: 2026-05-16
Source Archive
Narrative account. Petrov quote: 'I had a funny feeling in my gut'. Context of KAL 007 (25 days earlier) as additional background tension.

Joshua Brown — The White Trailer Tragedy

5 sources

Collision Between a Car Operating With Automated Vehicle Control Systems and a Tractor-Semitrailer Truck Near Williston, Florida, May 7, 2016 (HAR1702)
NTSB 2016 Accessed: 2026-05-16
Source Archive
Official NTSB report. Hands on wheel for 25 seconds out of 37.5 minutes. 7 visual warnings. Sensor failure mechanism. Automation Bias findings.
Feds To Partially Blame Tesla's Autopilot In Fatal Crash: Report
Jalopnik Ryan Felton 2017 Accessed: 2026-05-16
Source Archive
Includes the Brown family statement. Quote that Brown 'repeatedly emphasized safety, that the car was NOT autonomous'. Context for the Automation Bias section.
NTSB Report Largely Clears Tesla in May 2016 Fatal Crash
The Detroit Bureau Paul A. Eisenstein 2017 Accessed: 2026-05-16
Archive
Technical details. White truck against bright sky. 6 audio chimes. Tesla/Mobileye split after the crash.
At the time of compiling this bibliography, the original link was dead (HTTP 404).
NTSB Finds Tesla Partly to Blame in Fatal Self-Driving Car Crash
GWC Law Harris Elliot 2017 Accessed: 2026-05-16
Archive
Legal analysis of the ruling. Liability mechanism. Note on the Harry Potter movie allegation (unconfirmed).
At the time of compiling this bibliography, the original portal was dead (numerous links returned HTTP 403).
NHTSA ODI Resume: PE16-007
NHTSA 2017 Accessed: 2026-05-16
Source Archive
Parallel NHTSA report. No design defect found in Autopilot. Crash rates declined after Autopilot introduction.

Boeing 737 MAX — The Algorithmic Stall

6 sources

Boeing 737 MAX Groundings
Wikipedia Accessed: 2026-05-16
Source Archive
Complete documentation. 346 deaths, AoA sensor failure mechanism, MCAS, grounding timeline.
FAA Updates on Boeing 737 MAX
FAA 2021 Accessed: 2026-05-16
Source Archive
Official FAA communications. '20-month safety review process'. Required changes before return to service. JATR reference.
What Has Happened to Boeing Since the 737 Max Crashes
PBS Frontline Priyanka Boghani, Kaela Malig 2024 Accessed: 2026-05-16
Source Archive
Complete timeline. Internal Boeing documents. Quote: 'I basically lied to the regulators'. $20B+ in losses.
Duckworth Calls on FAA to Review Boeing's Disturbing Pattern
Sen. Tammy Duckworth (Press Release) 2024 Accessed: 2026-05-16
Source Archive
Quotes the JATR report: 'FAA was not completely unaware of MCAS; however... it was difficult to recognize the impacts.' Describes the ODA mechanism and how Boeing downplayed MCAS during certification.
Boeing 737 MAX Investigation
U.S. House Committee on Transportation and Infrastructure Accessed: 2026-05-16
Source Archive
Official congressional investigation page. Reports, hearings, and correspondence with Boeing and FAA.
Boeing to Pay $2.5 Billion Settlement Over Deadly 737 Max Crashes
NPR David Schaper 2021 Accessed: 2026-05-16
Source Archive
Settlement details: $243.6M fine, $500M for victims' families, $1.77B for airlines. DOJ quote: 'Boeing's employees chose the path of profit over candor'.

44

The Illusion of Intelligence

6 Topics

General

The Meta-Fuckup — Hallucinating the Almanac

1 source

AI Disclosure
adamkorga.com Adam Korga 2026 Accessed: 2026-05-16
Source
Author's own disclosure page with additional details on AI use in this work.
The Web Archive hasn't discovered this page (yet), so I can't provide an archival link... but since you're reading this bibliography, that probably isn't a problem.

Case Studies

The Honourable Justice Hallucination Presiding — Mata v. Avianca

5 sources

Mata v. Avianca, Inc. — Sanctions Order (June 22, 2023)
Justia / U.S. District Court S.D.N.Y. 2023 Accessed: 2026-05-16
Source Archive
Primary legal document. No. 1:22-cv-01461 (PKC). Judge Castel's sanctions order.
Mata v. Avianca, Inc. — Full Docket
Justia 2022 Accessed: 2026-05-16
Source Archive
Complete docket for the case.
Schwartz, Steven A. — Affidavit in Response to Order to Show Cause (May 2023)
CourtListener / PACER Steven A. Schwartz 2023 Accessed: 2026-05-16
Source Archive
The document containing the verbatim quote: 'I had no idea that ChatGPT could fabricate cases'.
The ChatGPT Lawyer Explains Himself
The New York Times Benjamin Weiser, Nate Schweber 2023 Accessed: 2026-05-16
Source
Two US lawyers fined for submitting fake court citations from ChatGPT
The Guardian Dan Milmo 2023 Accessed: 2026-05-16
Source Archive
The only source I found that mentions the second lawyer. Maybe he should be jealous of Schwartz's fame?

CNET — The Algorithm That Couldn't Do Third-Grade Math

6 sources

CNET Is Quietly Publishing Whole Articles Generated By AI
Futurism Frank Landymore 2023 Accessed: 2026-05-16
Source Archive
Original investigation (10 January 2023). Discovery of the 'CNET Money Staff' byline. AI authorship concealment mechanism.
CNET's AI Journalist Appears to Have Committed Extensive Plagiarism
Futurism Jon Christian 2023 Accessed: 2026-05-16
Source Archive
Detailed plagiarism analysis. Side-by-side comparisons with Forbes Advisor, The Balance, and Investopedia. Prof. Schatten quote: 'clearly' plagiarism.
CNET Is Reviewing the Accuracy of All Its AI-Written Articles After Multiple Major Corrections
Gizmodo Lauren Leffer 2023 Accessed: 2026-05-16
Source
Five errors in a single compound interest article. Error mechanism ($10,000 @ 3%). CNET issued corrections only after Futurism alerted it.
CNET Corrected 41 of Its 77 AI-Written Articles
Engadget Igor Bonifacic 2023 Accessed: 2026-05-16
Source Archive
41/77 corrections (53%). The Verge quote on 'replaced phrases that were not entirely original'. Plagiarism checker 'wasn't used properly'.
CNET Mass-Corrects AI-Written Finance Explainers
Vibe Graveyard 2023 Accessed: 2026-05-16
Source Archive
Complete timeline from November 2022 to the corrections. Guglielmo quote: corrections were 'substantial'. Compound interest error as the trigger for the full audit.
Creating Helpful, Reliable, People-First Content
Google Search Central Accessed: 2026-05-16
Source Archive
Official Google E-E-A-T documentation (Experience, Expertise, Authoritativeness, Trustworthiness). Context for the guidelines update claim.

Samsung Electronics — Donating the Crown Jewels

8 sources

[단독] 우려가 현실로… 삼성전자, 챗GPT 빗장 풀자마자 '오남용' 속출 [Exclusive: Fears Become Reality — Samsung Electronics Hit by Misuse Cases Right After Lifting ChatGPT Restrictions]
이코노미스트 코리아 [Economist Korea] 정두용 [Jeong Du-yong] 2023 Accessed: 2026-05-16
Source Archive
Original Korean-language investigative report (30 March 2023).
Primary source for all key facts: 20 days from permission to leaks, three incidents, 1024-byte emergency limit, Naver Clova for meeting transcription.
Samsung Employees Leaked Corporate Data in ChatGPT: Report
CIO Dive Lindsey Wilkinson 2023 Accessed: 2026-05-16
Source Archive
Primary English-language account. Three incidents, 1024-byte limit, Naver Clova used for meeting transcription.
AI Incident Database — Incident #768
AI Incident Database Daniel Atherton 2023 Accessed: 2026-05-16
Source Archive
Official incident documentation. Timeline: Samsung permitted ChatGPT on 11 March 2023, three leaks by end of month.
IOTW: Samsung employees allegedly leak proprietary information via ChatGPT
CS Hub Olivia Powell 2023 Accessed: 2026-05-16
Source
Confirms 20 days, three incidents, disciplinary investigations. Context of OpenAI's privacy policy.
Case Study: Samsung ChatGPT Confidential Data Leak (2023)
RedTeams.ai 2026 Accessed: 2026-05-16
Source
Most detailed technical analysis. Full timeline from March to May 2023. 1024-byte limit with no technical enforcement. Ban issued 1 May.
Samsung ChatGPT Ban After Engineers Leaked Source Code
DataFence 2024 Accessed: 2026-05-16
Source Archive
Describes the three leak categories. IP value context in the semiconductor industry. Lessons for corporations.
Samsung Developer November Newsletter: Samsung Electronics Reveals the Samsung Gauss Generative AI Model and Other Latest News
Samsung Developer 2023 Accessed: 2026-05-16
Source Archive
Official Samsung Gauss launch coverage.
Samsung Gauss introduced as ChatGPT rival
MultiLingual Cameron Rasmusson 2023 Accessed: 2026-05-16
Source Archive

The Chatbot Defense — Air Canada and the Hallucinated Discount

5 sources

Moffatt v. Air Canada, 2024 BCCRT 149
CanLII 2024 Accessed: 2026-05-16
Source
Full ruling text. Verbatim 'remarkable submission' quote. Rejection of the 'separate legal entity' defence. Duty of care and negligent misrepresentation.
Moffatt v. Air Canada: A Misrepresentation by an AI Chatbot
McCarthy Tétrault Barry B. Sookman 2024 Accessed: 2026-05-16
Source
Legal analysis of the implications. Apparent authority doctrine. Chatbot liability under Canadian law.
BC Tribunal Confirms Companies Remain Liable for Information Provided by AI Chatbot
American Bar Association Lisa R Lifshitz, Roland Hung 2024 Accessed: 2026-05-16
Source
American perspective on the implications for AI agency and corporate liability in the US.
Airline Ordered to Compensate a B.C. Man Because Its Chatbot Provided Inaccurate Information
Dentons Kirsten Thompson 2024 Accessed: 2026-05-16
Source Archive
Additional legal analysis. Verbatim ruling quote: 'There is no reason why Mr. Moffatt should know that one section of Air Canada's webpage is accurate, and another is not'.
Air Canada's Chatbot Illustrates Persistent Agency and Responsibility Gap Problems for AI
AI & Society (Springer) Joshua L. M. Brand 2024 Accessed: 2026-05-16
Source Archive
Academic analysis of the ethical and legal implications. Argues 'hallucination' is a misnomer — all LLM outputs are probabilistic, not just the wrong ones. Compensation amount: CAN$812.02.

Vegetative Electron Microscopy — The Ghost in the Machine

9 sources

As a Nonsense Phrase of Shady Provenance Makes the Rounds, Elsevier Defends Its Use
Retraction Watch Frederik Joelving 2025 Accessed: 2026-05-16
Source Archive
Original account. Alexander Magazinov and PubPeer. OCR theory and the 1959 article. Elsevier defending the term. Guillaume Cabanac and the Problematic Paper Screener.
Was Nonsense 'Vegetative Electron Microscopy' Phrase a Farsi Typo?
Retraction Watch Frederik Joelving 2025 Accessed: 2026-05-16
Source Archive
Farsi typo theory. The difference of a single dot between پویشی and رویشی. Iranian fraudster network.
Cell Wall Lysis and the Release of Peptides in Bacillus Species
Bacteriological Reviews (ASM) R.E. Strange 1959 Accessed: 2026-05-16
Source
Original 1959 paper. Vol. 23, Issue 1. The word 'vegetative' appears on page 4, third line of the last paragraph in the left column; 'electron microscopy' appears in the corresponding line of the right column — the likely source of the OCR artefact.
Fabrication and Properties of Electrospun... (Example Retracted Paper)
MDPI Materials 2024 Accessed: 2026-05-16
Source Archive
Example paper containing 'vegetative electron microscopy'. MDPI issued a correction replacing the term with 'scanning electron microscopy'.
Note: I am omitting the names of the authors of the original work, as the purpose of this bibliography is not to ridicule potential victims of paper mills.
PubPeer — Magazinov's Original Comment
PubPeer 2022 Accessed: 2026-05-16
Source Archive
Original public report on PubPeer (2022). Starting point for the entire investigation.
A Weird Phrase Is Plaguing Scientific Papers — and We Traced It Back to a Glitch in AI Training Data
The Conversation Aaron J. Snoswell, Kevin Witzenberger, Rayane El Masri 2025 Accessed: 2026-05-16
Source Archive
Accessible popular-science account. Model collapse and garbage-in-garbage-out mechanism in the scientific publishing context.
AI Made Up a Science Term — Now It's in 22 Papers
ZME Science Mihai Andrei 2025 Accessed: 2026-05-16
Source Archive
22 papers affected. Both theories (OCR + Farsi). Timeline.
Schneider Shorts 2.06.2023 — Systematic Manipulation of the Publication Process
For Better Science Leonid Schneider 2023 Accessed: 2026-05-16
Source Archive
Broader context: paper mills and systematic manipulation of the publication process. Background for the mechanism described in the text.
Nieistniejący mikroskop i 3 duże problemy ze współczesną nauką [The Non-Existent Microscope and 3 Big Problems with Modern Science]
YouTube/Uwaga! Naukowy Bełkot [Attention! Scientific Nonsense] Dawid Myśliwiec 2015 Accessed: 2026-05-16
Source
Video in Polish.
Transcript contains bibliographic details of the 1959 article and a discussion of the three systemic problems in science illuminated by this incident.