Madness in the method

Kevin Lewis

October 14, 2012

A peculiar prevalence of p values just below .05

E.J. Masicampo & Daniel Lalande
Quarterly Journal of Experimental Psychology, forthcoming

Abstract:
In null hypothesis significance testing (NHST), p values are judged relative to an arbitrary threshold for significance (.05). The present work examined whether that standard influences the distribution of p values reported in the psychology literature. We examined a large subset of papers from three highly regarded journals. Distributions of p were found to be similar across the different journals. Moreover, p values were much more common immediately below .05 than would be expected based on the number of p values occurring in other ranges. This prevalence of p values just below the arbitrary criterion for significance was observed in all three journals. We discuss potential sources of this pattern, including publication bias and researcher degrees of freedom.

----------------------

The Mathematical Turn in Economics: Walras, the French Mathematicians, and the Road Not Taken

Michael Turk
Journal of the History of Economic Thought, June 2012, Pages 149-167

Abstract:
One of the pivotal moments in the move toward mathematizing economics occurred at the turn of the twentieth century, with Léon Walras as perhaps its most ardent champion. Yet, there is no small irony here, in that the leading French mathematicians to whom Walras turned to buttress and defend the case for a mathematical economics, especially Henri Poincaré and Émile Picard, laid out reservations to the scope of this mathematizing program. They even pointed to matters, including the hold of the past on future events and hysteresis, a subject already in the discourse of mathematical physicists, which might have fashioned economics differently from the neoclassical mold being formed. This alternate pathway, though, was not pursued at the time.

----------------------

Much Ado About Deception: Consequences of Deceiving Research Participants in the Social Sciences

Davide Barrera & Brent Simpson
Sociological Methods & Research, August 2012, Pages 383-413

Abstract:
Social scientists have intensely debated the use of deception in experimental research, and conflicting norms governing the use of deception are now firmly entrenched along disciplinary lines. Deception is typically allowed in sociology and social psychology but proscribed in economics. Notably, disagreements about the use of deception are generally not based on ethical considerations but on pragmatic grounds: the anti-deception camp argues that deceiving participants leads to invalid results, while the other side argues that deception has little negative impact and, under certain conditions, can even enhance validity. These divergent norms governing the use of deception are important because they stifle interdisciplinary research and discovery, create hostilities between disciplines and researchers, and can negatively impact the careers of scientists who may be sanctioned for following the norms of their home discipline. We present two experimental studies aimed at addressing the issue empirically. Study 1 addresses the effects of direct exposure to deception, while Study 2 addresses the effects of indirect exposure to deception. Results from both studies suggest that deception does not significantly affect the validity of experimental results.

----------------------

Why Most Biomedical Findings Echoed by Newspapers Turn Out to be False: The Case of Attention Deficit Hyperactivity Disorder

François Gonon et al.
PLoS ONE, September 2012

Context: Because positive biomedical observations are more often published than those reporting no effect, initial observations are often refuted or attenuated by subsequent studies.

Objective: To determine whether newspapers preferentially report on initial findings and whether they also report on subsequent studies.

Methods: We focused on attention deficit hyperactivity disorder (ADHD). Using Factiva and PubMed databases, we identified 47 scientific publications on ADHD published in the 1990s and soon echoed by 347 newspaper articles. We selected the ten most echoed publications and collected all their relevant subsequent studies until 2011. We checked whether findings reported in each "top 10" publication were consistent with previous and subsequent observations. We also compared the newspaper coverage of the "top 10" publications to that of their related scientific studies.

Results: Seven of the "top 10" publications were initial studies and the conclusions in six of them were either refuted or strongly attenuated subsequently. The seventh was not confirmed or refuted, but its main conclusion appears unlikely. Among the three "top 10" that were not initial studies, two were confirmed subsequently and the third was attenuated. The newspaper coverage of the "top 10" publications (223 articles) was much larger than that of the 67 related studies (57 articles). Moreover, only one of the latter newspaper articles reported that the corresponding "top 10" finding had been attenuated. The average impact factor of the scientific journals publishing studies echoed by newspapers (17.1, n = 56) was higher (p<0.0001) than that corresponding to related publications that were not echoed (6.4, n = 56).

Conclusion: Because newspapers preferentially echo initial ADHD findings appearing in prominent journals, they report on uncertain findings that are often refuted or attenuated by subsequent studies. If this media reporting bias generalizes to health sciences, it represents a major cause of distortion in health science communication.

----------------------

Scientific inbreeding and same-team replication: Type D personality as an example

John Ioannidis
Journal of Psychosomatic Research, forthcoming

Abstract:
Replication is essential for validating correct results, sorting out false-positive early discoveries, and improving the accuracy and precision of estimated effects. However, some types of seemingly successful replication may foster a spurious notion of increased credibility, if they are performed by the same team and propagate or extend the same errors made by the original discoveries. Besides same-team replication, replication by other teams may also succumb to inbreeding, if it cannot fiercely maintain its independence. These patterns include obedient replication and obliged replication. I discuss these replication patterns in the context of associations and effects in the psychological sciences, drawing from the criticism of Coyne and de Voogd of the proposed association between type D personality and cardiovascular mortality and other empirical examples.

----------------------

Experimenter Philosophy: The Problem of Experimenter Bias in Experimental Philosophy

Brent Strickland & Aysu Suben
Review of Philosophy and Psychology, September 2012, Pages 457-467

Abstract:
It has long been known that scientists have a tendency to conduct experiments in a way that brings about the expected outcome. Here, we provide the first direct demonstration of this type of experimenter bias in experimental philosophy. In contrast to previously discovered types of experimenter bias mediated by face-to-face interactions between experimenters and participants, here we show that experimenters also have a tendency to create stimuli in a way that brings about expected outcomes. We randomly assigned undergraduate experimenters to receive two different hypotheses about folk intuitions of consciousness, and then asked them to design experiments based on their hypothesis. Specifically, experimenters generated sentences ascribing intentional and phenomenal mental states to groups, which were later rated by online participants for naturalness. We found a significant interaction between experimenter hypothesis and participant ratings indicating a general tendency for experimenters to obtain the result that they expected. These results indicate that experimenter bias is a real problem in experimental philosophy since the methods and design employed here mirror the predominant survey methods of the field as a whole. The bearing of the current results on Knobe and Prinz's (Phenomenology and Cognitive Science 7(1):67-83, 2008) group mind hypothesis is discussed, and new methods for avoiding experimenter bias are proposed.

----------------------

Can We Depend on Investigators to Identify and Register Randomized Controlled Trials?

Roberta Scherer et al.
PLoS ONE, September 2012

Purpose: To reduce publication bias, systematic reviewers are advised to search conference abstracts to identify randomized controlled trials (RCTs) conducted in humans and not published in full. We assessed the information provided by authors to aid identification of RCTs for reviews.

Methods: We handsearched the Association for Research in Vision and Ophthalmology (ARVO) meeting abstracts for 2004 to 2009 to identify reports of RCTs. We compared our classification with that of authors (requested by ARVO 2004-2006), and authors' report of trial registration (required by ARVO 2007-2009).

Results: Authors identified their study as a clinical trial for 169/191 (88%; 95% CI, 84-93) RCTs we identified for 2004, 174/212 (82%; 95% CI, 77-87) for 2005 and 162/215 (75%; 95% CI, 70-81) for 2006. Authors provided registration information for 107/172 (62%; 95% CI, 55-69) RCTs for 2007, 103/153 (67%; 95% CI, 60-75) for 2008, and 126/171 (74%; 95% CI, 67-80) for 2009. Most RCT authors providing a trial register name specified ClinicalTrials.gov (276/312; 88%; 95% CI, 85-92) and provided a valid ClinicalTrials.gov registration number (261/276; 95%; 95% CI, 92-97). Based on information provided by authors, trial registration information would be accessible for 48% (83/172) (95% CI, 41-56) of all ARVO abstracts describing RCTs in 2007, 63% (96/153) (95% CI, 55-70) in 2008, and 70% (118/171) (95% CI, 62-76) in 2009.

Conclusions: Authors of abstracts describing RCTs frequently did not classify them as clinical trials nor comply with reporting trial registration information, as required by the conference organizers. Systematic reviewers cannot rely on authors to identify relevant unpublished trials or report trial registration, if present.
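
For readers who want to check the percentages and intervals quoted in the Results above, here is a minimal sketch. It assumes a normal-approximation (Wald) 95% interval with z = 1.96; the abstract does not say which interval method the authors used, so this is an illustration, not their procedure. The function name wald_ci_percent is made up for this example.

```python
import math


def wald_ci_percent(successes, total, z=1.96):
    """Return (percent, lower, upper), each rounded to a whole percent.

    Assumption: a normal-approximation (Wald) interval. The abstract
    does not state which method the authors actually used.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return tuple(round(100 * x) for x in (p, p - half_width, p + half_width))


# Counts quoted in the Results: RCTs whose authors classified them as clinical trials.
for year, k, n in [(2004, 169, 191), (2005, 174, 212), (2006, 162, 215)]:
    pct, lower, upper = wald_ci_percent(k, n)
    print(f"{year}: {k}/{n} = {pct}% (95% CI, {lower}-{upper})")
```

Under this assumption the three 2004-2006 lines come out the same as the figures quoted in the abstract. For proportions close to 0 or 1, or for small samples, other interval methods (Wilson, for example) would likely be a safer choice than the Wald interval.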

----------------------

Industry or Academia, Basic or Applied? Career Choices and Earnings Trajectories of Scientists

Rajshree Agarwal & Atsushi Ohyama
Management Science, forthcoming

Abstract:
We extend life cycle models of human capital investments by incorporating matching theory to examine the sorting pattern of heterogeneous scientists into different career trajectories. We link differences in physical capital investments and complementarities between basic and applied scientists across industry and academic settings to individual differences in scientist ability and preferences to predict an equilibrium matching of scientists to careers and to their earnings evolution. Our empirical analysis, using the National Science Foundation's Scientists and Engineers Statistical Data System database, is consistent with theoretical predictions of (i) sorting by ability into basic versus applied science among academic scientists, but not among industry scientists; and (ii) sorting by higher taste for nonmonetary returns into academia over industry. The evolution of an earnings profile is consistent with these sorting patterns: the earnings trajectories of basic and applied scientists are distinct from each other in academia but are similar in industry.

----------------------

The Effects of Publication Lags on Life-Cycle Research Productivity in Economics

John Conley et al.
Economic Inquiry, forthcoming

Abstract:
We investigate how increases in publication delays have affected the life cycle of publications of recent Ph.D. graduates in economics. We construct a panel dataset of 14,271 individuals who were awarded Ph.D.s between 1986 and 2000 in U.S. and Canadian economics departments. For this population of scholars, we amass complete records of publications in peer-reviewed journals listed in the JEL (a total of 368,672 observations). We find evidence of significantly diminished productivity in recent relative to earlier cohorts when productivity of an individual is measured by the number of AER-equivalent publications. Diminished productivity is less evident when the number of AER-equivalent pages is used instead. Our findings are consistent with earlier empirical findings of increasing editorial delays, decreasing acceptance rates at journals, and a trend toward longer manuscripts. This decline in productivity is evident in both graduates of top 30 and non-top 30 ranked economics departments and may have important implications for what should constitute a tenurable record. We also find that the research rankings of top economics departments are a surprisingly poor predictor of the subsequent research rankings of their Ph.D. graduates.

----------------------

Experts in experiments: How selection matters for estimated distributions of risk preferences

Hans-Martin von Gaudecker, Arthur van Soest & Erik Wengström
Journal of Risk and Uncertainty, October 2012, Pages 159-190

Abstract:
An ever-increasing number of experiments attempts to elicit risk preferences of a population of interest with the aim of calibrating parameters used in economic models. We are concerned with two types of selection effects, which may affect the external validity of standard experiments: Sampling from a narrowly defined population of students ("experimenter-induced selection") and self-selection due to non-response or incomplete response of participants in a random sample from a broad population. We find that both types of selection lead to a sample of experts: Participants perform significantly better than the general population, in the sense of fewer violations of revealed preference conditions. Self-selection within a broad population does not seem to matter for average preferences. In contrast, sampling from a student population leads to lower estimates of average risk aversion and loss aversion parameters. Furthermore, it dramatically reduces the amount of heterogeneity in all parameters.
