The Cognitive Lab — Interactive Explainer

Thinking—Fast, Slow,
and Artificial

How AI is Reshaping Human Reasoning
and the Rise of Cognitive Surrender

An Interactive Explainer of the Paper by
Steven D. Shaw & Gideon Nave, The Wharton School, University of Pennsylvania
1,372 participants • 9,593 trials • 3 experiments • 16× surrender effect
Scroll to explore
Section 01

Inside the Lab


This section recreates the actual experiment. You can participate or skip ahead to the findings.

You’ll face 7 reasoning puzzles where the obvious answer is wrong. Each one tests whether you pause and think—or go with your gut. How will you compare to the 1,372 participants in the actual study?

Try the Experiment Yourself

You’ll be randomly assigned to one of the study’s two conditions, mirroring the original design: Brain-Only (no AI help) or AI-Assisted (with an AI chatbot).

Section 02

The Tri-System Mind


From Kahneman’s two systems to the age of AI — how a third cognitive system is reshaping human reasoning.

System 1

Fast & Automatic

Flinching at a loud noise, reading a stop sign, answering “2 + 2”—and blurting out the wrong answer to a trick question before you can stop yourself.

System 2

Slow & Deliberate

Computing 17 × 24 in your head, parallel parking, comparing insurance plans. Accurate but effortful—and humans are “cognitive misers” who avoid it whenever possible.

This two-system framework, developed by Kahneman & Tversky in the 1970s and popularised by the 2011 bestseller Thinking, Fast and Slow, dominated psychology for fifty years. Then AI arrived—not as a passive tool, but as something that generates reasoning itself. Shaw & Nave argue it has become a third cognitive system: one that sits outside the brain but functions as part of the mind.

Diagram: the tri-system architecture. Between stimulus and response, System 1 (fast, intuitive) and System 2 (slow, deliberate) operate in vivo (internal), while System 3 (AI) operates in silico (external).

Cognitive Routes

Click a route to see how information flows through the three systems.

Intuition
Stimulus → S1 → Response
Deliberation
Stimulus → S1 → S2 → Response
Offloading
S1 → S3 → S2 → Response
Surrender
S1 (brief) → S3 → Response
Autopilot
Stimulus → S3 → Response
Recursive
S3 → S1 ↔ S2 → Response


Think of these six routes as the different paths your mind can take when solving a problem. Intuition is your gut reaction — no AI involved. Deliberation is when you slow down and think carefully — also no AI. Cognitive offloading is the healthy version of using AI: you ask it for help but you still check the answer with your own reasoning. Cognitive surrender is when you skip that checking step and just go with whatever the AI says. Autopilot is the extreme — you don't even think before handing the question to AI. The paper's key finding is that most AI users follow the surrender path, not the offloading path.

Table 2 (p. 16) formalises six canonical routes through the triadic cognitive system. The critical distinction is the cognitive locus — which system ultimately governs the response. In cognitive offloading (locus: System 2), the user delegates to System 3 but retains evaluative oversight: Stimulus → S1/S2 → S3 (assist) → S2 → Response. In cognitive surrender (locus: System 3), System 2 is bypassed: Stimulus → S1 (brief) → S3 → Response. Empirically, on chat-engaged faulty trials across all three studies, 73.2% of responses reflected surrender vs. 19.7% offloading — a nearly 4:1 ratio favouring the path of least cognitive resistance.

Paper excerpt: Table 2 — Canonical routes of cognition under Tri-System Theory (Shaw & Nave, 2025, Table 2, p. 16)

System Comparison

Dimension | System 1 | System 2 | System 3 (AI)
Origin | Evolved biology | Learned reasoning | Engineered algorithms
Speed | Milliseconds | Seconds to minutes | Milliseconds
Effort | Effortless | High effort | Zero (user-side)
Accuracy | Often biased | Higher when engaged | Variable — opaque
Affect | Emotion-laden | Affect-regulated | Affect-free (but mimics)
Ethics | Intuitive moral sense | Rule-based moral reasoning | Alignment-dependent
Justification | Post-hoc rationalization | Explicit, rule-based | Generated on demand
Locus | In vivo | In vivo | In silico

Psychologists have long described two ways humans think. System 1 is fast and automatic — it’s the gut reaction that makes you flinch at a loud noise or blurt out “10 cents” to a trick math question. System 2 is slow and effortful — it’s what you engage when you stop, re-read the question, and work through the algebra. This framework, developed and popularised by Nobel laureate Daniel Kahneman and colleagues, has shaped decades of research on decision-making.

Shaw & Nave’s contribution is arguing that AI has become a third system — not a tool like a calculator, but something you think with, the way you think with your own intuition. The critical question their paper investigates: when you consult AI, do you actually evaluate its answer with System 2, or do you just accept it and move on?

The paper extends Kahneman’s (2011) dual-process framework by introducing System 3 as a formally distinct cognitive system with four defining properties: (1) External — resides outside the biological nervous system; (2) Automated — executes via statistical/algorithmic processes; (3) Data-driven — based on large-scale training corpora; (4) Dynamic — responds to human and environmental inputs in real time. Table 1 (p. 12) compares System 3 against Systems 1 and 2 across eight dimensions: origin, speed, effort, accuracy, affect, ethics, justification, and locus. The authors propose six canonical cognitive routes (Table 2, p. 16) including cognitive offloading (S1/S2 → S3 → S2 → Response) and cognitive surrender (S1 brief → S3 → Response), where the key distinction is whether System 2 evaluates System 3’s output.

Paper excerpt: Table 1 — Cognitive affordances and tradeoffs of System 3 (Shaw & Nave, 2025, Table 1, p. 12)
Cognitive surrender is an uncritical abdication of reasoning itself. It reflects not merely the use of external assistance, but a relinquishing of cognitive control. Shaw & Nave, 2025

Cognitive surrender is what happens when you stop thinking for yourself and just go with whatever the AI says — not because you evaluated its answer and found it convincing, but because you didn’t evaluate it at all. It’s the difference between using a GPS while still paying attention to where you’re going versus blindly following turn-by-turn directions into a lake. The paper found this isn’t rare or irrational — it’s the default behaviour for most people most of the time when AI is available.

The authors define cognitive surrender as distinct from cognitive offloading (Risko & Gilbert, 2016) along two axes: (1) the presence/absence of System 2 evaluation of System 3 output, and (2) the retention/abdication of metacognitive monitoring. Operationally, surrender is measured as following AI-Faulty advice — accepting an incorrect AI suggestion without override. Across three studies, surrender rates on chat-engaged faulty trials ranged from 57.7% (Study 3, Incentives+Feedback) to 87.9% (Study 2, Time Pressure). Manifestation indicators include low override rates, shorter justification text, and inflated confidence.

Paper excerpt: Cognitive surrender definition (Shaw & Nave, 2025, p. 17)
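To make the operational definition concrete, here is a minimal sketch of the three-way outcome coding on chat-engaged faulty trials (the function and field names are illustrative, not the authors’ code):

```python
def classify_faulty_trial(followed_ai: bool, correct: bool) -> str:
    """Code one chat-engaged trial on which the AI's suggestion was wrong."""
    if followed_ai:
        return "surrender"  # accepted the faulty suggestion without override
    return "offloading" if correct else "failed_override"

# Pooled shares across Studies 1-3: surrender 73.2%, offloading 19.7%,
# failed overrides 7.1% (rejected the AI but still answered incorrectly).
assert classify_faulty_trial(True, False) == "surrender"
assert classify_faulty_trial(False, True) == "offloading"
assert classify_faulty_trial(False, False) == "failed_override"
```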
Section 03

Study 1 — The First Glimpse


N = 359. Do people use AI to think harder, or as a shortcut to avoid thinking altogether?

79.8%
of participants followed faulty AI advice on chat-engaged trials (trials where the participant actually asked the AI chatbot for help), despite the AI confidently giving wrong answers. Accuracy plummeted from a 45.8% baseline to just 31.5%.

In this experiment, people answered 7 reasoning puzzles — classic brain teasers designed to have a tempting wrong answer. Some participants could ask an AI chatbot for help. What the participants didn’t know: the AI was rigged. On some questions it gave the correct answer; on others, it confidently gave the wrong one. The result: nearly 4 out of 5 times someone asked the AI and got a wrong answer, they submitted that wrong answer anyway. They didn’t just get it wrong — they performed worse than people who had no AI at all.

N = 238 AI-Assisted participants answered 7 CRT (Cognitive Reflection Test) items. AI accuracy was manipulated within-subjects via hidden seed prompts: 4 items received accurate AI responses, 3 received faulty (confidently incorrect) responses. On chat-engaged faulty trials, 79.8% of responses followed the AI’s incorrect suggestion. Follow rate on accurate trials was 92.7%. The AI-Accurate vs. AI-Faulty accuracy contrast was −39.5 percentage points (Cohen’s h = 0.81, p < 2.20 × 10⁻¹⁶). Chat engagement was similar across trial types (54.4% accurate, 52.8% faulty), suggesting participants did not selectively avoid faulty items.

Paper excerpt: Figure 2 — Participants adopt System 3 answers (Shaw & Nave, 2025, Figure 2, p. 23)

Three bars, three stories. Gold = brain-only baseline. Blue = AI helps correctly. Red = the surprise: wrong AI made people worse than no AI at all.

This chart shows three bars. The gold bar (45.8%) is how people scored without any AI — just their own thinking. The blue bar (71%) is how they scored when the AI gave them the right answer — a big boost. The red bar (31.5%) is how they scored when the AI gave them a wrong answer — worse than having no AI at all. That last bar is the key finding. AI didn’t just fail to help; it actively dragged performance below what people would have achieved on their own. The researchers call the gap between the blue and red bars the “scissors effect” — the more you rely on AI, the more your fate depends on whether the AI happens to be right.

CRT accuracy by trial type (Study 1, N = 359): Brain-Only M = 45.8%; AI-Accurate M = 71.0% (p < .001); AI-Faulty M = 31.5% (p < .001). The scissors gap (AI-Accurate minus AI-Faulty) = 39.5pp (Cohen’s h = 0.83). Error bars show 95% CIs. Brain-Only baseline includes n = 121 participants with no AI access plus probe trials from AI-Assisted participants. The accuracy boost from accurate AI (+25.2pp) is nearly twice the accuracy cost of faulty AI (−14.3pp), but the faulty AI result is more consequential because participants don’t know which trials are faulty.

Paper excerpt: Figure 3 — System 3 facilitates cognitive surrender (Shaw & Nave, 2025, Figure 3, p. 24)

On faulty AI trials, the red “Follow” bar towers over green “Override” — nearly 4 of 5 people went along with wrong AI.

Cognitive Surrender Effect

Cohen’s h is a statistical measure of how far apart two rates are — in this case, how differently people perform when AI is right vs. wrong. Anything above 0.8 is conventionally considered a “large” effect, meaning the gap is substantial in practical terms; unlike a p-value, it measures the size of the gap rather than whether it could be due to chance.
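For readers who want to check the arithmetic: Cohen’s h is just the difference between two proportions after an arcsine transform. A quick sketch using Study 1’s accuracy rates:

```python
import math

def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: the gap between two proportions on the arcsine scale."""
    return abs(2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2)))

# Study 1: 71.0% accuracy on AI-Accurate trials vs. 31.5% on AI-Faulty trials
print(round(cohens_h(0.710, 0.315), 2))  # -> 0.81, the reported contrast
```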

77%
average confidence among AI-assisted participants—vs. 65.3% for brain-only (Hedges’ g = 0.54 — a “medium” effect size, meaning AI access shifted confidence by roughly half a standard deviation). AI made people more confident even when half the AI answers were wrong. Confidence did not decline as the number of faulty trials increased.

Here’s what’s perhaps most unsettling: using AI didn’t just change people’s answers — it changed how sure they felt. Participants who used the AI chatbot rated their confidence at 77 out of 100, compared to 65 for those working alone. That 12-point boost sounds reasonable — except that roughly half the AI’s answers were deliberately wrong. People felt smarter and more confident while their actual performance was being dragged down by faulty AI. Even more troubling: confidence didn’t drop as people encountered more wrong answers. Their brains didn’t recalibrate. The mere act of consulting AI — regardless of its accuracy — made them feel like they knew what they were doing.

Study 1: Global confidence was significantly higher in AI-Assisted (M = 77.0%, SE = 1.30%, 95% CI [74.4, 79.6]) than Brain-Only (M = 65.3%, SE = 2.21%, 95% CI [61.0, 69.6]); t(202.91) = 4.57, p = 8.55 × 10⁻⁶; Hedges’ g = 0.54, 95% CI [0.32, 0.77] — a medium effect. Critically, within the AI-Assisted condition, confidence did not significantly decline as the number of faulty trials increased (β = −1.14, SE = 0.89, t = −1.28, 95% CI [−2.88, 0.61], p = 0.202). Study 3 per-item data confirmed: AI-assisted trial confidence (M = 82.2%) exceeded brain-only (M = 77.5%) regardless of AI accuracy (p = 6.86 × 10⁻⁷).

Paper excerpt: Confidence inflation despite faulty AI (Shaw & Nave, 2025, p. 25)
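As a sanity check on the reported effect size, the group SDs can be backed out of the reported SEs and group sizes (SD = SE × √n; the group ns, 238 AI-Assisted and 121 Brain-Only, come from the study) and Hedges’ g recomputed:

```python
import math

n1, m1, se1 = 238, 77.0, 1.30  # AI-Assisted: n, mean confidence, SE
n2, m2, se2 = 121, 65.3, 2.21  # Brain-Only
sd1, sd2 = se1 * math.sqrt(n1), se2 * math.sqrt(n2)  # SD = SE * sqrt(n)

# Pooled SD, then the small-sample correction that turns Cohen's d into Hedges' g
sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
g = ((m1 - m2) / sd_pooled) * (1 - 3 / (4 * (n1 + n2) - 9))
print(round(g, 2))  # -> 0.54, matching the reported medium effect
```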
When AI gave correct suggestions, accuracy soared to 71%. But when AI was wrong, people followed it off a cliff—performing worse than without any AI at all. Shaw & Nave, 2025
Vulnerability

Trust in AI

Higher trust → more surrender. The odds ratio (OR) was 0.36: each standard-deviation increase in AI trust cut the odds of overriding faulty AI advice by 64%.

Protection

Need for Cognition

Measures how much someone enjoys effortful thinking

Enjoys effortful thinking → less surrender. OR = 1.86 — meaning each standard-deviation increase in enjoyment of thinking nearly doubled the odds of catching and overriding AI mistakes.

Protection

Fluid Intelligence

Raw reasoning ability — pattern recognition and logic, independent of learned knowledge

Higher reasoning ability → less surrender. OR = 1.96 — meaning each standard-deviation increase in reasoning ability nearly doubled the odds of resisting faulty AI.

Not everyone surrendered equally. The study measured three traits in each participant. Trust in AI: people who reported higher trust in AI systems were significantly more likely to follow wrong AI answers — they gave the AI the benefit of the doubt even when it was wrong. Need for Cognition: this is a psychological scale measuring how much someone enjoys thinking through hard problems (as opposed to preferring shortcuts). People who scored higher were nearly twice as likely to catch the AI’s mistakes. Fluid intelligence: this measures raw reasoning ability — the kind tested by pattern-recognition puzzles, not learned knowledge. Higher scores nearly doubled the odds of overriding faulty AI. In short: your willingness to think and your ability to reason both protect you, while blind trust in AI makes you vulnerable.

Three individual-difference moderators were assessed via validated scales. (1) Trust in AI (Jian et al., 2000): Higher trust predicted more System 3 engagement (OR = 1.24, p = .047) and lower override rates on faulty trials (OR = 0.36, p < .001), a 64% drop in the odds of resisting faulty AI per SD increase. (2) Need for Cognition (Cacioppo et al., 1984): Higher scores predicted reduced System 3 usage (OR = 0.65, p = .003) and increased override (OR = 1.86, p < .001). (3) Fluid Intelligence (ICAR-16 matrix reasoning): Higher scores predicted greater override (OR = 1.96, p < .001). These effects held when controlling for the other two variables in a combined model (Figure 4, p. 26).

Paper excerpt: Individual difference moderators (Shaw & Nave, 2025, Figure 4, p. 26)
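Because an odds ratio multiplies odds rather than probabilities, here is a hedged illustration of what these ORs imply. The conversion is standard; the 20% baseline override rate is an assumption chosen for illustration, not a figure from the paper:

```python
def shift_probability(p_baseline: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability; return the new probability."""
    odds = odds_ratio * p_baseline / (1 - p_baseline)
    return odds / (1 + odds)

print(round(shift_probability(0.20, 1.96), 3))  # -> 0.329: fluid IQ lifts 20% to ~33%
print(round(shift_probability(0.20, 0.36), 3))  # -> 0.083: high trust cuts 20% to ~8%
```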

Study 1 established that cognitive surrender is real and widespread. But does it get worse under pressure? Study 2 puts participants on a 30-second clock to find out what happens when there’s no time to think twice.

Section 04

Study 2 — Under Pressure


N = 485. Time pressure amplified reliance on AI—for better or worse, depending on whether the AI was right.

12.1%
AI-Users' accuracy for faulty AI trials under time pressure—down from 20.4% without pressure. Meanwhile, accurate AI buffered them: only a small drop from 80.0% to 71.3%.

Toggle between conditions. Participants are grouped by thinking profile: AI-Users, Independents (had AI but rarely used it), and Brain-Only (no AI). Under time pressure, the red bar drops to 12.1%.

The AI Buffer

AI-Users were less affected by time pressure when AI was accurate. The AI acted as a cognitive buffer, maintaining performance where Brain-Only and Independents suffered steep drops.

The Cost of Buffering

But when AI was faulty, that same dependency became catastrophic. AI-Users under time pressure hit 12.1% accuracy—far below anyone else. The buffer becomes a trap.

Study 2 Cognitive Surrender Effect

Study 2 asked: what happens when you’re under time pressure? With a 30-second countdown per question, people had less time to think carefully — which is exactly when you’d expect them to lean on AI more. The result was a double-edged sword. When the AI was right, time-pressured participants held up reasonably well (71.3% accuracy). But when the AI was wrong, time-pressured participants hit rock bottom — just 12.1% accuracy, far below anyone else in the study. Time pressure didn’t change whether people surrendered; it made the consequences of surrendering more extreme.

Study 2 (N = 485): 2 (Time Pressure: 30-second countdown vs. unlimited time; between-subjects) × 2 (AI Accuracy; within-subjects) mixed design. Time pressure reduced accuracy across the board (β = −0.86, p = 1.59 × 10⁻⁸), but critically, no Time Pressure × Trial Type interaction emerged (β = −0.02, p = 0.937). This means cognitive surrender was equally strong under both conditions — time pressure lowered the baseline without differentially affecting the surrender effect. AI-Users under time pressure: AI-Accurate 71.3%, AI-Faulty 12.1% (Cohen’s h = 0.86, largest of all three studies). Thinking-profile analysis: AI-Users showed OR = 40.9 for the Trial Type effect (p < 2.20 × 10⁻¹⁶).

Paper excerpt: Time pressure effects (Shaw & Nave, 2025, Figure 5, p. 30)
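In logistic models like these, a coefficient β converts to an odds ratio via OR = exp(β). Two quick consistency checks against the Study 2 numbers above:

```python
import math

print(round(math.exp(-0.86), 2))  # -> 0.42: time pressure cut the odds of a
                                  #    correct response by roughly 58%
print(round(math.log(40.9), 2))   # -> 3.71: the beta implied by the AI-Users'
                                  #    OR of 40.9 for the Trial Type effect
```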
Among AI-Users, performance was less affected by time constraints but was dependent on the correctness of System 3 outputs. Shaw & Nave, 2025

If people surrender to AI under normal conditions — and even more so under pressure — is there anything that can help? Study 3 tests the most intuitive remedy: pay people to get it right, and tell them immediately whether they did.

Section 05

Study 3 — Fighting Surrender


N = 450. Can financial incentives and real-time feedback help people resist faulty AI? The answer is nuanced.

Incentives + feedback more than doubled override rates for faulty AI (20.0% → 42.3%). But even then, AI-Users still showed a 44-percentage-point gap between accurate and faulty AI trials.

Green “Override” bars grow with incentives — people doubled their willingness to reject bad AI. But even with money on the line, the majority still followed.

Even with incentives, the gap between blue and red stays wide — 44 percentage points. Incentives narrowed it but didn’t close it.

“AI-Users” show the starkest contrast. “Independents” — who chose not to use AI — performed consistently regardless of AI accuracy.

What Helped

Override rates for faulty AI more than doubled (20% → 42.3%). Participants were more accurate overall and better calibrated in their confidence.

What Persisted

Even with incentives, AI-Users still followed faulty AI on the majority of trials. Cognitive surrender is not easily overcome by motivation alone.

Study 3 Cognitive Surrender Effect

Study 3 tested the obvious fix: what if you pay people to get the right answer and tell them immediately whether they were right or wrong? The results were encouraging but incomplete. Override rates on wrong AI answers more than doubled — from 20% to 42.3%. People were clearly trying harder. But here’s the catch: even with money on the line and instant feedback, participants still followed wrong AI answers on the majority of faulty trials. Financial motivation helped, but it wasn’t enough to overcome the pull of cognitive surrender. The problem isn’t laziness — it’s something more structural about how we interact with confident AI outputs.

Study 3 (N = 450): Between-subjects Incentives + Feedback (n = 238) vs. Control (n = 212). Incentives: $0.20/correct + $20 lottery. Feedback: item-level correct/incorrect post-response. Key results: Override rates on faulty trials increased from 20.0% to 42.3% (β = 1.44, p = 6.30 × 10⁻⁹, OR = 4.25). Follow rates on accurate trials also increased (87.6% → 92.2%, OR = 1.94), suggesting better calibration overall. However, the AI-Accurate vs. AI-Faulty accuracy gap remained large: 44pp under Incentives+Feedback vs. 50pp under Control. Cohen’s h = 0.78 (still “large”). Per-item confidence data showed AI-assisted trial confidence (82.2%) exceeded brain-only trial confidence (77.5%) regardless of AI accuracy.

Paper excerpt: Figure 7 — Incentives and feedback reduce cognitive surrender (Shaw & Nave, 2025, Figure 7, p. 35)
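Note that the reported OR = 4.25 is model-based: exp(1.44) ≈ 4.22, matching up to rounding of β. The raw odds ratio computed directly from the two override rates comes out smaller; the gap presumably reflects the model’s covariate and random-effect structure. A quick sketch of both:

```python
import math

raw_or = (0.423 / (1 - 0.423)) / (0.200 / (1 - 0.200))
print(round(raw_or, 2))          # -> 2.93: unadjusted OR from the raw rates
print(round(math.exp(1.44), 2))  # -> 4.22: OR implied by the reported beta
```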
Cognitive surrender persists. Even with incentives and feedback, the fundamental vulnerability to faulty AI remains a structural feature of human-AI interaction. Shaw & Nave, 2025

Three studies, 1,372 participants, one consistent finding. Now it’s time to step back and see what the numbers reveal when combined — and what they mean for a world increasingly shaped by AI.

Section 06

The Big Picture


Synthesizing all three studies: from immediate cognitive shortcuts to lasting cognitive consequences.

16×
The odds of a correct response were over 16 times greater when AI was accurate vs. faulty (OR = 16.07, 95% confidence interval 11.50 to 22.46; the interval sits far above 1, meaning the effect is both large and precisely estimated). This “scissors effect” — the diverging gap where accurate AI lifts performance while faulty AI drags it down — persisted across all three studies, conditions, and participant profiles.
Hidden Cost

Confidence Inflation

AI access inflated confidence by 11.7 percentage points (77.0% vs 65.3%, Hedges’ g = 0.54, a medium-sized effect)—even though roughly half of AI answers were wrong. Per-item data from Study 3 confirmed the pattern: confidence was higher on AI-assisted trials (82.2%) than brain-only trials (77.5%), regardless of whether AI was accurate or faulty.

The Breakdown

Surrender vs. Offloading

On chat-engaged faulty AI trials across all studies: 73.2% showed cognitive surrender (followed wrong AI), 19.7% showed cognitive offloading (overrode AI correctly), and 7.1% were failed overrides (rejected AI but still answered incorrectly). Incentives+Feedback shifted surrender down to 57.9% and offloading up to 37.1%.

Across all three studies — 1,372 people and 9,593 individual trial responses — the pattern held. When the AI was right, the odds of answering correctly were 16 times higher than when the AI was wrong. Not 16% higher — 16 times. That’s a staggering dependency. It means the quality of your thinking has become almost entirely a function of the quality of your AI. The paper also found a troubling side effect: AI access made people more confident in their answers — by nearly 12 percentage points — even though roughly half the AI answers were wrong. People felt smarter while getting worse results.

Trial-level meta-analytic synthesis across Studies 1–3 (N = 1,372; 9,593 trials): OR = 16.07 for correct responding (AI-Accurate vs. AI-Faulty) (95% CI [11.50, 22.46], p < 2.20 × 10⁻¹⁶). Per-study Cohen’s h values: Study 1 = 0.83, Study 2 = 0.86, Study 3 = 0.78; trial-weighted h = 0.82 (all at or near the conventional 0.8 threshold for a “large” effect). Situational manipulations shifted absolute performance levels but did not eliminate the surrender effect: Time Pressure OR = 14.28, Incentives + Feedback OR = 11.05. Confidence inflation: AI-Assisted M = 77.0% vs. Brain-Only M = 65.3% (Hedges’ g = 0.54, a “medium” effect size indicating AI access inflated self-reported confidence by roughly half a standard deviation). Confidence did not decline as faulty trials accumulated (β = −1.14, p = 0.202).

Paper excerpt: Figure 10 — Cognitive surrender as a function of System 3 usage (Shaw & Nave, 2025, Figure 10, p. 43)

All four bars sit at or near the 0.8 “large effect” threshold (dashed line). The surrender effect was consistent across all three studies — this is not a fluke.
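The trial-weighted h can be approximately reproduced from the per-study values. Exact per-study trial counts aren’t listed in this summary, so participant counts serve as stand-in weights (an assumption; the paper weights by trials):

```python
per_study = [(0.83, 359), (0.86, 485), (0.78, 450)]  # (Cohen's h, N) per study
weighted_h = sum(h * n for h, n in per_study) / sum(n for _, n in per_study)
print(round(weighted_h, 2))  # -> 0.82, matching the reported pooled value
```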

The Scissors Effect: Accuracy vs. AI Reliance

As people rely more on AI, their accuracy rises when AI is correct—but plummets when AI is wrong. Drag the slider to explore.

Legend: AI-Accurate curve • AI-Faulty curve • Brain-Only baseline (the slider sets the AI usage rate, starting at 50%).
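One plausible way to generate curves like these (a back-of-the-envelope mixture model of my own construction, not the paper’s): at usage rate u, a trial is resolved through the AI with probability u and at the brain-only baseline otherwise, using Study 1’s engaged-trial success rates.

```python
BASELINE = 0.458          # brain-only accuracy
P_ACC_ENGAGED = 0.927     # correct when engaging accurate AI (follow rate)
P_FAULTY_ENGAGED = 0.197  # correct when engaging faulty AI (successful overrides)

def scissors(u: float) -> tuple[float, float]:
    """Expected accuracy on (accurate-AI, faulty-AI) trials at usage rate u."""
    acc = (1 - u) * BASELINE + u * P_ACC_ENGAGED
    faulty = (1 - u) * BASELINE + u * P_FAULTY_ENGAGED
    return acc, faulty

for u in (0.0, 0.5, 1.0):
    a, f = scissors(u)
    print(f"u={u:.2f}: accurate {a:.1%}, faulty {f:.1%}, gap {a - f:.1%}")
# Near the observed ~53% engagement rate this lands close to the reported
# 71.0% vs. 31.5% split, and the gap widens linearly as reliance grows.
```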

Correct responding was over 16 times greater when System 3 was correct… illustrating the promises of superintelligence and exposing a structural vulnerability of cognitive surrender. Shaw & Nave, 2025
Study 1

Surrender Discovered

79.8% followed faulty AI. Accuracy dropped 14.3pp below baseline. Trust in AI amplified vulnerability; Need for Cognition and Fluid IQ protected against it.

Study 2

Pressure Amplifies

Time pressure made AI a double-edged sword. AI-Users hit 12.1% accuracy on faulty trials under pressure—but accurate AI buffered them from the worst declines.

Study 3

Partial Remedies

Incentives + feedback doubled override rates (20% → 42.3%), but the majority still surrendered. Motivation helps but doesn’t eliminate the structural vulnerability.

Section 07

Implications & What Comes Next


What the findings mean for individuals, education, AI design, and society at large.

For Individuals

Cognitive Fitness

Just as muscles atrophy without exercise, reasoning skills may erode with habitual AI reliance. The research suggests deliberate “cognitive exercise”—intentionally solving problems without AI—may be essential for maintaining analytical thinking capacity.

For Education

Rethinking Assessment

If students routinely surrender reasoning to AI, traditional assessments measure AI proficiency rather than understanding. Schools must design evaluations that distinguish between genuine comprehension and AI-assisted performance—testing the thinker, not the tool.

For AI Design

Friction by Design

AI systems could be designed to scaffold reasoning rather than replace it. Strategic friction—asking “What do you think first?” before showing answers, requiring users to evaluate before accepting, or showing confidence levels—could promote offloading over surrender.
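As a sketch of what such friction could look like in practice (an illustrative wrapper, not a real chatbot API; get_user_attempt and query_ai are assumed callables supplied by the host application):

```python
def ask_with_friction(question: str, get_user_attempt, query_ai) -> dict:
    """Withhold the AI's answer until the user commits a first attempt,
    nudging users toward offloading (System 2 evaluates System 3)."""
    attempt = get_user_attempt(question)  # "What do you think first?"
    ai_answer = query_ai(question)        # only now consult System 3
    agree = attempt.strip().lower() == ai_answer.strip().lower()
    return {
        "your_attempt": attempt,
        "ai_answer": ai_answer,
        "nudge": "You and the AI agree." if agree
                 else "The answers differ: re-check the reasoning before accepting.",
    }
```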

For Society

The Surrender Risk

Widespread cognitive surrender poses risks for democracy, professional judgment, and informed citizenship. When people defer to AI on reasoning tasks, they become vulnerable to systematic errors—especially in high-stakes domains like medicine, law, and policy.

Who Is Most Vulnerable?

Individual differences modulate susceptibility to cognitive surrender. These factors emerged consistently across all three studies.

High Trust in AI OR = 0.36

64% lower odds of overriding faulty AI. Trust amplifies surrender—the more you trust AI, the less you scrutinize its output.

Low Need for Cognition OR = 1.86

People who enjoy thinking have nearly 2× the odds of catching AI errors. Those who prefer cognitive shortcuts are most susceptible.

Low Fluid Intelligence OR = 1.96

Higher fluid IQ nearly doubles the odds of resisting faulty AI. Raw reasoning ability provides a buffer against surrender.

Incentives + Feedback Protective

Financial incentives and real-time feedback doubled override rates—but even motivated participants still surrendered on the majority of faulty trials.

The question is no longer whether AI can think for us—it’s whether we’ll still be able to think for ourselves. Shaw & Nave, 2025

Conclusion

Across 1,372 participants, 9,593 trials, and three experiments, the pattern is consistent: AI fundamentally reshapes how people reason. When AI is right, it dramatically boosts performance. When AI is wrong, people follow it off a cliff—performing worse than without any AI at all. This isn’t a failure of intelligence or motivation. It’s a structural feature of how human cognition interacts with AI systems. The Tri-System framework gives us language for what’s happening: System 3 has arrived, and it demands a rethinking of how we design, deploy, and live with artificial intelligence.

Your Experiment Results — Revisited

Cite this paper

Shaw, S. D., & Nave, G. (2025). Thinking—Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender. Working Paper. The Wharton School, University of Pennsylvania. Available at SSRN.