Thinking about Issues
Motivated Risk Assessments
Marco Islam & Christoph Drobner
Economic Journal, forthcoming
Abstract:
Do people form risk assessments to justify their actions? We investigate this question in a field experiment studying the dynamics of risk assessments for visiting a café during the Covid-19 pandemic. By randomly varying the incentive for a visit, we find that participants with a high incentive visit cafés more often and downplay the risk relative to participants with a low incentive. Importantly, the downplaying happens in anticipation of the visit and without new information, suggesting that the assessment update serves to justify engagement in risky behaviour. This finding is inconsistent with Bayesian updating but consistent with the notion of motivated reasoning.
General Social Agents
Benjamin Manning & John Horton
NBER Working Paper, March 2026
Abstract:
Useful social science theories predict behavior across settings. However, applying a theory to make predictions in new settings is challenging: rarely can it be done without ad hoc modifications to account for setting-specific factors. We argue that AI agents placed in simulations of those novel settings offer an alternative for applying theory, requiring minimal or no modifications. We present an approach for building such "general" agents that use theory-grounded natural language instructions, existing empirical data, and knowledge acquired by the underlying AI during training. To demonstrate the approach in settings where no data from the relevant data-generating process exists -- as is often the case in applied prediction problems -- we design a heterogeneous population of 883,320 novel games. AI agents are constructed using human data from a small set of conceptually related but structurally distinct "seed" games. In preregistered experiments, on average, agents predict initial human play in a random sample of 1,500 games from the population better than (i) a cognitive hierarchy model, (ii) game-theoretic equilibria, and (iii) out-of-the-box agents. For a small set of separate novel games, these simulations predict responses from a new sample of human subjects even better than the most plausibly relevant published human data.
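One of the baselines the agents are compared against, a cognitive hierarchy model, can be sketched for a classic game. The snippet below applies a Poisson cognitive hierarchy model to the p-beauty contest; the parameter values (tau = 1.5, p = 2/3, uniform level-0 play on [0, 100], six levels) are illustrative assumptions, not taken from the paper.

```python
import math

def poisson_ch_beauty_contest(tau=1.5, p=2/3, max_level=6):
    """Poisson cognitive hierarchy predictions for the p-beauty contest.

    Level-0 plays uniformly on [0, 100] (expected guess 50); each level-k
    player best responds to a Poisson(tau)-weighted mix of levels 0..k-1.
    """
    # Poisson frequencies of thinking levels in the population.
    freq = [math.exp(-tau) * tau**k / math.factorial(k)
            for k in range(max_level + 1)]
    guesses = [50.0]  # level-0: mean of uniform play on [0, 100]
    for k in range(1, max_level + 1):
        w = freq[:k]
        # Perceived average guess among the (renormalized) lower levels.
        mix = sum(wi * gi for wi, gi in zip(w, guesses)) / sum(w)
        guesses.append(p * mix)  # best response: p times the expected average
    return guesses

for level, guess in enumerate(poisson_ch_beauty_contest()):
    print(level, round(guess, 2))
```

Each additional level of reasoning pushes the predicted guess further toward the Nash equilibrium of 0, which is the qualitative pattern cognitive hierarchy models are built to capture.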
Choice Set Size Neglect in Predicting Others’ Preferences
Beidi Hu, Alice Moon & Eric VanEpps
Psychological Science, January 2026, Pages 30-42
Abstract:
An inherent feature of any choice is the set size from which that choice is made (i.e., the number of available options in a choice set). Choice set size impacts the likelihood of landing on a more preferred option: Larger sets are more likely to contain an option matching one’s preferences. Nevertheless, in six preregistered experiments with 10,092 U.S. adults, we demonstrated that people consistently underestimated the effect of set size when predicting others’ liking for a chosen option. We propose this effect arises because, although people recognize that set size predicts liking of a chosen option, they typically fail to attend to it when considering others’ choices. Accordingly, this effect was attenuated when attention was drawn to set size, specifically (a) when participants considered multiple set sizes simultaneously, (b) when the decision process was framed as ranking rather than choosing, or (c) when participants were prompted to recall set size before predicting others’ preferences.
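The mechanism in the abstract's second sentence (larger sets are more likely to contain an option matching one's preferences) can be illustrated with a toy simulation. The uniform utilities and utility-maximizing chooser below are assumptions for illustration, not the paper's design; under these assumptions the expected liking of the best of k options is k/(k+1).

```python
import random

random.seed(1)

def mean_liking(set_size, trials=20000):
    """Average utility of the chosen (best) option in a random choice set."""
    total = 0.0
    for _ in range(trials):
        # Each option's fit with the chooser's taste, drawn uniformly on [0, 1].
        options = [random.random() for _ in range(set_size)]
        total += max(options)  # the chooser picks their favourite
    return total / trials

for k in (2, 5, 10, 30):
    print(k, round(mean_liking(k), 3))
```

Liking of the chosen option rises steadily with set size, which is exactly the regularity participants in the studies failed to account for when predicting others' preferences.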
Assessing personality using zero-shot generative AI scoring of brief open-ended text
Aidan Wright et al.
Nature Human Behaviour, forthcoming
Abstract:
Contemporary personality assessment relies heavily on psychometric scales, which offer efficiency but risk oversimplifying the rich and contextual nature of personality. Recognizing these limitations, this study explores the use of commercially available generative large language models (LLMs), such as ChatGPT and Claude, to assess personality traits from open-ended qualitative narratives. Across two distinct samples and methodologies (spontaneous streams of thought and daily video diaries), we used seven commercial, generative LLMs to score Big-Five personality traits, achieving convergence with self-report measures comparable to or exceeding established benchmarks (for example, self–other agreement, ecological momentary assessment, and bespoke machine learning models). Although results differed across LLMs, we found that using the average LLM score across models provided the strongest agreement with self-report. Further, LLM-generated trait scores also demonstrated predictive validity regarding daily behaviours and mental health outcomes. This LLM-based approach achieved quantitative rigour based on qualitative data and is easily accessible without specialized training. Importantly, our findings also reaffirm that personality is expressed ubiquitously, in that it is carried in the stream of our thoughts and is woven into the fabric of our daily lives. These results encourage broader adoption of generative LLMs for psychological assessment and -- given the new generation of tools -- stress the value of idiographic narratives as reliable sources of psychological insight.
Bayesians Commit the Gambler's Fallacy
Kevin Dorst
Cognitive Science, January 2026
Abstract:
The gambler's fallacy is the tendency to expect random processes to switch more often than they actually do -- for example, to assign a higher probability to heads after a streak of tails. It's often taken to be evidence for irrationality. It isn't. Rather, it's to be expected from a group of Bayesians who begin with causal uncertainty, and then observe unbiased data from an (in fact) statistically independent process. Although they increase their confidence that the outcomes are independent, they do so in an asymmetric way -- ruling out “streaky” hypotheses more quickly than “switchy” ones. Their expectations depend on this balance of uncertainty; as a result, the majority (and the average) exhibit the gambler's fallacy, expecting a heads after a string of tails. If they have limited memory, this tendency persists even with arbitrarily large amounts of data. In fact, such Bayesians exhibit a variety of the empirical trends found in studies of the gambler's fallacy. They expect switches after short streaks but continuations after long ones; these nonlinear expectations vary with their familiarity with the causal system; their predictions depend on the sequence they've just seen; they produce sequences that are too switchy; and they exhibit greater rates of the gambler's fallacy in binary predictions than in probability estimates. In short: what's been thought to be evidence for irrationality may instead be rational responses to limited data and memory.
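Part of the updating dynamic the abstract describes can be sketched in code. The toy model below is not Dorst's full model (which involves richer hypothesis spaces, limited memory, and population averaging); it assumes just three hypotheses about a binary process's switch rate and shows two ingredients the abstract names: confidence in independence rising as unbiased data accumulate, and the predicted switch rate being a posterior-weighted average over the competing hypotheses.

```python
import math
import random

random.seed(0)

# Three hypotheses about the process, parameterized by the probability that
# the next outcome differs from the last (the "switch rate"). The specific
# values 0.7 / 0.5 / 0.3 are illustrative assumptions.
hypotheses = {"switchy": 0.7, "independent": 0.5, "streaky": 0.3}
prior = {h: 1 / 3 for h in hypotheses}

# Observe 1,000 flips of a genuinely fair, independent coin.
flips = [random.randint(0, 1) for _ in range(1000)]
switches = sum(a != b for a, b in zip(flips, flips[1:]))
stays = len(flips) - 1 - switches

# Bayesian update on the observed switch/stay counts (log space for stability).
log_post = {h: math.log(prior[h]) + switches * math.log(s)
            + stays * math.log(1 - s)
            for h, s in hypotheses.items()}
top = max(log_post.values())
unnorm = {h: math.exp(lp - top) for h, lp in log_post.items()}
posterior = {h: u / sum(unnorm.values()) for h, u in unnorm.items()}

# The agent's predicted probability that the next flip switches.
p_switch = sum(posterior[h] * s for h, s in hypotheses.items())

print(posterior)  # mass concentrates on "independent"
print(p_switch)   # converges toward 0.5
```

In this coarse three-hypothesis version the prediction simply converges to 0.5; the asymmetric elimination of streaky versus switchy hypotheses, and hence the fallacy itself, emerges only in the finer-grained, memory-limited setting the paper analyzes.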
Detection of Idiosyncratic Gaze-Fingerprint Signatures in Humans
Sarah Crockford et al.
Psychological Science, February 2026, Pages 83-105
Abstract:
Do individuals possess a “gaze fingerprint” that reveals how they uniquely look at the world? We tested this possibility by examining intra- and intersubject gaze similarity across 700 static pictures of complex natural scenes. Independent discovery (n = 105) and replication data sets (n = 46) of adults aged 18 to 50 years (sampled from Italy and Germany) revealed that gaze fingerprinting is possible at relatively high rates (e.g., 52%–63%) compared with chance (e.g., 1%–2%). We also identify gaze-fingerprint barcodes, which reveal a unique individualized code describing which stimuli an individual can be gaze-fingerprinted on. Preregistered longitudinal follow-up experiments showed that gaze-fingerprint barcodes are nonrandom within an individual over short and long time frames. Finally, we find that increased gaze fingerprintability for social stimuli is associated with decreased levels of autistic traits. To summarize, this work showcases the potential of gaze fingerprinting for isolating trait-like factors that may be of high neurodevelopmental and biological significance.
A Priori Knowledge in an Era of Computational Opacity: The Role of Artificial Intelligence in Mathematical Discovery
Eamon Duede & Kevin Davey
Philosophy of Science, forthcoming
Abstract:
Can we acquire a priori mathematical knowledge from the outputs of computer programs? We argue that Appel and Haken acquired a priori knowledge of the four-color theorem from their computer program insofar as it merely automated human forms of mathematical reasoning; the opacity of modern large language models (LLMs) and deep neural networks (DNNs), however, creates obstacles to obtaining a priori mathematical knowledge in analogous ways. If a proof-checker automating human forms of proof-checking is attached to such machines, we can nonetheless obtain a priori mathematical knowledge from them, even though the original machines are entirely opaque to us and the outputted proofs cannot be surveyed by humans.
Collective intelligence for AI-assisted chemical synthesis
Haote Li et al.
Nature, 5 March 2026, Pages 107-115
Abstract:
The exponential growth of scientific literature presents an increasingly acute challenge across disciplines. Hundreds of thousands of new chemical reactions are reported annually, yet translating them into actionable experiments remains a major obstacle. Recent applications of large language models (LLMs) have shown promise, but systems that reliably work for diverse transformations across de novo compounds have remained elusive. Here we introduce MOSAIC (Multiple Optimized Specialists for AI-assisted Chemical Prediction), a computational framework that enables chemists to harness the collective knowledge of millions of reaction protocols. MOSAIC is built upon the Llama-3.1-8B-instruct architecture, training 2,498 specialized chemical experts within Voronoi-clustered spaces. This approach delivers reproducible and executable experimental protocols with confidence metrics for complex syntheses. Experimental validation, with an overall 71% success rate, demonstrates the realization of more than 35 novel compounds, spanning pharmaceuticals, materials, agrochemicals, and cosmetics. Notably, MOSAIC also enables the discovery of new reaction methodologies that are absent from the experts’ training, a cornerstone for advancing chemical synthesis. This scalable paradigm of partitioning vast domains into searchable expert regions enables a generalizable strategy for AI-assisted discovery wherever accelerating information growth outpaces efficient knowledge access and application.
The Bots Ruining Social Science Are Not Bots at All
Shalom Jaffe et al.
Perspectives on Psychological Science, March 2026, Pages 127-137
Abstract:
Researchers who employ online data collection from human subjects currently face a conundrum: It is both essential to how behavioral science functions and threatened by low-quality data. It is often assumed that random, inconsistent, and otherwise incomprehensible data in online surveys comes mainly from bots. Despite this assumption, few studies have directly examined where problematic data comes from, even though identifying the source has important implications for creating the right solutions. We examined this issue on several popular participant-recruitment platforms, including Mechanical Turk (MTurk) and Lucid. Across four studies spanning 5 years using multiple methods, we provide evidence that most of the data-quality problems affecting research using online panels can be tied to fraudulent users from outside of the United States -- not bots. We identify many of the telltale signs that humans leave behind and describe the most effective ways of blocking problematic human responses to address the online data-quality problem.