More Medicine
Learning the natural history of human disease with generative transformers
Artem Shmatko et al.
Nature, forthcoming
Abstract:
Decision-making in healthcare relies on understanding patients’ past and current health states to predict and, ultimately, change their future course. Artificial intelligence (AI) methods promise to aid this task by learning patterns of disease progression from large corpora of health records. However, their potential has not been fully investigated at scale. Here we modify the GPT (generative pretrained transformer) architecture to model the progression and competing nature of human diseases. We train this model, Delphi-2M, on data from 0.4 million UK Biobank participants and validate it using external data from 1.9 million Danish individuals with no change in parameters. Delphi-2M predicts the rates of more than 1,000 diseases, conditional on each individual’s past disease history, with accuracy comparable to that of existing single-disease models. Delphi-2M’s generative nature also enables sampling of synthetic future health trajectories, providing meaningful estimates of potential disease burden for up to 20 years, and enabling the training of AI models that have never seen actual data. Explainable AI methods provide insights into Delphi-2M’s predictions, revealing clusters of co-morbidities within and across disease chapters and their time-dependent consequences on future health, but also highlight biases learnt from training data. In summary, transformer-based models appear to be well suited for predictive and generative health-related tasks, are applicable to population-scale datasets and provide insights into temporal dependencies between disease events, potentially improving the understanding of personalized health risks and informing precision medicine approaches.
Algorithmic decision-making in health care: Evidence from post-acute care in Medicare Advantage
Jeffrey Marr
Journal of Health Economics, December 2025
Abstract:
Health insurers use predictive algorithms to determine the necessary level of care and deny services they deem unnecessary. Using a difference-in-differences design, I study the partnership of a large Medicare Advantage insurer with a firm that uses a predictive algorithm to aid post-acute care coverage decisions. This partnership led to an immediate and sustained 13 percent decline in the length of skilled nursing facility stays. This effect was partially driven by large declines in longer skilled nursing facility stays (over 30 days). Despite reductions in health care use, I don’t observe changes in health outcomes following the adoption of the predictive algorithm.
Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: A multicentre, observational study
Krzysztof Budzyń et al.
Lancet Gastroenterology & Hepatology, October 2025, Pages 896-903
Methods: We conducted a retrospective, observational study at four endoscopy centres in Poland taking part in the ACCEPT (Artificial Intelligence in Colonoscopy for Cancer Prevention) trial. These centres introduced AI tools for polyp detection at the end of 2021, after which colonoscopies were randomly assigned to be conducted with or without AI assistance according to the date of examination. We evaluated the quality of colonoscopy by comparing two different phases: 3 months before and 3 months after AI implementation. We included all diagnostic colonoscopies, excluding those involving intensive anticoagulant use, pregnancy, or a history of colorectal resection or inflammatory bowel disease. The primary outcome was change in adenoma detection rate (ADR) of standard, non-AI assisted colonoscopy before and after AI exposure. Multivariable logistic regression was done to identify independent factors affecting ADR.
Findings: Between Sept 8, 2021, and March 9, 2022, 1443 patients underwent non-AI assisted colonoscopy before (n=795) and after (n=648) the introduction of AI (median age 61 years [IQR 45–70], 847 [58·7%] female, 596 [41·3%] male). The ADR of standard colonoscopy decreased significantly from 28·4% (226 of 795) before to 22·4% (145 of 648) after exposure to AI, corresponding with an absolute difference of –6·0% (95% CI –10·5 to –1·6; p=0·0089). In multivariable logistic regression analysis, exposure to AI (odds ratio 0·69 [95% CI 0·53–0·89]), male versus female patient sex (1·78 [1·38–2·30]), and patient age ≥60 years versus <60 years (3·60 [2·74–4·72]) were the independent factors significantly associated with ADR.
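The headline ADR comparison above can be checked from the reported counts alone. The sketch below is illustrative and not taken from the paper: it assumes a simple unadjusted Wald (normal-approximation) interval for a difference in proportions, and the function name `risk_difference_ci` is hypothetical. The abstract does not say which interval method the authors used.

```python
import math


def risk_difference_ci(events_before, n_before, events_after, n_after, z=1.96):
    """Return (difference, lower, upper) for p_after - p_before.

    Assumption: a Wald (normal-approximation) interval; the study's own
    method is not stated in the abstract.
    """
    p_before = events_before / n_before
    p_after = events_after / n_after
    diff = p_after - p_before
    # Standard error of a difference between two independent proportions
    se = math.sqrt(
        p_before * (1 - p_before) / n_before
        + p_after * (1 - p_after) / n_after
    )
    return diff, diff - z * se, diff + z * se


# Counts reported in the Findings: 226 of 795 before AI, 145 of 648 after
diff, lower, upper = risk_difference_ci(226, 795, 145, 648)
print(f"difference: {diff * 100:.2f} percentage points")
print(f"95% CI: {lower * 100:.1f} to {upper * 100:.1f}")
```

Under this assumption, the interval rounds to –10.5 to –1.6 percentage points, in line with the reported one. The unrounded difference is about –6.05 points; the reported –6.0 is the difference between the two rounded rates (28.4 and 22.4).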
Rules vs. Discretion: Treatment of Mental Illness in U.S. Adolescents
Emily Cuddy & Janet Currie
Journal of Political Economy, forthcoming
Abstract:
Many mental health disorders start in adolescence and appropriate initial treatment may improve trajectories. But what is appropriate treatment? We use a large national database of insurance claims to examine the impact of initial mental health treatment on the outcomes of adolescent children over the next two years, where treatment is either consistent with FDA guidelines, consistent with looser guidelines published by professional societies (“grey-area” prescribing), or inconsistent with any guidelines (“red-flag” prescribing). We find that red-flag prescribing increases self-harm, use of emergency rooms, and health care costs, suggesting that treatment guidelines effectively scale up good treatment in practice.
Place-Based Variation in Health Care: Evidence from Mandatory Movers in the U.S. Military Health System
William Luan et al.
NBER Working Paper, September 2025
Abstract:
There is increasing evidence on regional variations in U.S. Medicare utilization based on older patients who move. Yet evidence is limited for younger ages in the U.S., and movers may differ systematically from those who don’t move. In this paper, we harness the mandatory migration of military personnel and dependents (age 5 to 64) to estimate supply and demand factors in a system of care in which military physicians are salaried and copayments and deductibles are negligible. In our sample of 3 million enrollees, we find that place or supply effects explain as much as 80 percent of the overall regional variation for both the entire sample and for active-duty personnel. These regional place effects are correlated across age groups, with correlations as high as 0.84 between middle-aged and older military enrollees. These regional supply-side variations cannot be explained by differences in health, financial incentives, or quality of care, but appear consistent with location-specific differences in physician beliefs.
Missing Markets for Innovation: Evidence from New Uses of Existing Drugs
Eric Budish et al.
NBER Working Paper, September 2025
Abstract:
For large classes of potential inventions, intellectual property rights that are available on paper are either not possible or not profitable for firms to enforce in practice. In this paper, we show that these missing incentives yield quantitatively significant underinvestment in research and development. We develop a simple model that formalizes the conditions under which such missing markets for innovation arise. We identify an empirical setting -- research into new uses for existing drugs -- in which there is sharp variation in the enforceability of intellectual property rights on otherwise comparable inventions over time. We show that when intellectual property rights become unenforceable, research investment and commercialization nearly cease. In doing so, we test two claims central both to our model and the innovation literature more generally -- that stronger intellectual property protection does, in fact, induce investment, and that heterogeneity in the availability of these rights distorts investment. The welfare consequences of inadequate incentives in our empirical context are large. Our estimates suggest that 200-800 new uses for existing drugs would have been developed under counterfactual policies. Measures of the value of these uses drawn from existing literature suggest that the social cost of this particular missing market is on the order of several trillion dollars.
The Repeal of Noneconomic Damage Caps and Medical Malpractice Insurance Premiums
Yuji Mizushima, Christopher Whaley & Hao Yu
Health Economics, forthcoming
Abstract:
Noneconomic damage caps are controversial because they seek to balance uncertain benefits through reductions in physician precautionary costs, against uncertain harms to patient welfare. Opposing policy actions at the state-level reflect this controversy as some states have enacted noneconomic damage caps over the past few decades while others repealed their caps. Our difference-in-differences analyses suggest that repeals increase premiums. These increases are larger after State Supreme Court decisions, affecting all cases in a state, compared with State Circuit Court decisions affecting only specific cases. Magnitudes differ by physician specialty, with larger effects observed in obstetrics/gynecology and general surgery, compared with internal medicine. Our estimates of these repeals are larger than estimates on enactments reported in the literature, suggesting a potential asymmetry between enacting and repealing damage caps.
Medicaid Enrollees With Opioid Use Disorder Were More Likely To Receive Medication Treatment Than Commercial Enrollees
Karen Shen et al.
Health Affairs, September 2025, Pages 1092-1101
Abstract:
Medications for opioid use disorder (MOUD) are underused in the treatment of people with OUD; insurance-related barriers are a possible driving factor. We used Wisconsin’s all-payer claims database for 2021–22 to examine differences in MOUD treatment probabilities among nonelderly adults with Medicaid insurance compared with those with commercial insurance. We found that Medicaid enrollees with OUD were 9.8 percentage points more likely to receive MOUD than commercial enrollees with OUD. Some of this gap can be explained by lower treatment rates among older enrollees and the older age distribution of commercial enrollees, as well as the fact that Medicaid enrollees are more likely to see primary care providers with more OUD experience. However, after we controlled for patient age, sex, comorbidities, and provider fixed effects, a substantial unexplained difference of 7.1 percentage points remained, potentially indicating a role for direct plan-level effects or unobserved patient characteristics. Policies aimed at improving commercial insurance coverage, training more providers, and targeting older populations may be needed to increase overall MOUD treatment rates.
Expecting Harm? The Impact of Rural Hospital Acquisitions on Maternal Health Care
David Dranove, Martin Gaynor & Eilidh Geddes
NBER Working Paper, August 2025
Abstract:
While numerous papers document the effects of mergers on cost and quality, the effects of hospital mergers on access to care are less certain. Merging hospitals may limit access by closing one of the affected hospitals or eliminating individual service lines. However, hospital systems may have more resources to improve care delivery. We study the impact of hospital mergers on obstetric care in rural markets, where there may be heightened concern about the availability of local care options. Using a differences-in-differences approach, we find that when rural hospitals are acquired, there are substantial increases in the probability of obstetric unit closures, with resulting large reductions in the number of births at the hospital. We find mixed effects on health outcomes: there are small increases in maternal morbidity, but no changes in newborn outcomes on average. However, there are improvements in outcomes for newborns with Medicaid coverage. Additionally, we find decreases in maternal transfers and increases in procedures consistent with women delivering in more resourced hospitals.
Labor and Product Market Power, Endogenous Quality, and the Consolidation of the US Hospital Industry
Bradley Setzler
NBER Working Paper, August 2025
Abstract:
Existing structural analyses of the harmful effects of market consolidation focus on either product or labor markets in isolation, ignoring that product market competitors often compete for workers as well. This paper develops a unified framework for merger evaluation, finding that firms' simultaneous exercise of oligopoly power in the product market and oligopsony power in the labor market amplifies the harm from mergers to both consumers and workers. The model also demonstrates how merger-induced gains in labor market power incentivize firms to reduce product quality, highlighting an additional channel for consumer harm. The model's predictions are tested and quantified in the context of the recent consolidation of the US hospital industry. Linking panel data from several sources on all US hospitals from 1996-2022, a difference-in-differences design is estimated for nearly 150 high-concentration within-market mergers. Hospital mergers significantly reduce patient volume, increase prices, reduce employment, lower wages, and deteriorate quality of care, resulting in higher patient mortality. After recovering the structural parameters, the estimated model replicates observed merger impacts. Counterfactual exercises reveal that ignoring increased labor (product) concentration would lead one to under-predict the harm of mergers to consumers (workers).
Is Managed Care Effective in Long-term Care Settings? Evidence from Medicare Institutional Special Needs Plans
Momotazur Rahman et al.
NBER Working Paper, September 2025
Abstract:
Nursing homes face unique financial incentives that encourage under-investment in onsite clinical capabilities and overreliance on hospitals to triage and care for residents with dementia, contributing to high levels of health care spending for this population. One proposed solution to align incentives is Institutional Special Needs Plans (I-SNPs), which combine capitated financing with plan-provided onsite clinician presence. Using 12 million resident-quarters of data from 2016-2022, we exploit the timing of nursing homes’ I-SNP contracting to instrument for plan enrollment and estimate causal effects on hospitalization and other health outcomes. We find that I-SNP enrollment reduced quarterly hospitalization rates by 3 to 4 percentage points, which equates to one third of hospitalizations relative to the sample mean. We do not find consistent evidence of an impact on other health outcomes and quality of care indicators.