Abstract
We appreciate Krefeld-Schwalb et al.'s (KHJ henceforth) (1) interest in our study (2) and the critical discourse spurred by their commentary. We largely agree with KHJ's theoretical arguments, which closely relate to the caveats discussed in our manuscript. In particular, we acknowledge that the reviewed multilab studies are typically based on samples from WEIRD countries, which may entail lower heterogeneity than in other settings (3, 4). "Put differently, our comparatively low estimates of population heterogeneity might be subject to population heterogeneity itself" (2).

However, we express reservations about KHJ's empirical claims about the magnitude of population heterogeneity, which draw on Krefeld-Schwalb et al. (KSJ henceforth) (5). It seems that KSJ intentionally studied paradigms expected to yield large meta-analytic effect sizes and "employed purposive variation of the sampling frame" (1) to enhance heterogeneity. Olsson-Collentine et al. (6) provide evidence for a correlation between effect sizes and heterogeneity.

Study 1 in KSJ documents effect size estimates of four paradigms across ten online samples and one laboratory sample. KHJ's estimates of H, ranging from 1.7 to 9.6, suggest that population heterogeneity is markedly larger than the average level observed in our sample. However, both KSJ and KHJ fail to report estimates for a fifth preregistered paradigm embedded in KSJ's study—the "local warming" effect—and omit preregistered analyses excluding inattentive participants. Table 1 summarizes heterogeneity estimates for analyses mimicking KSJ's preregistration. Revisiting KSJ's data on the local warming effect indicates that effect size estimates are homogeneous, with H = 1. Moreover, population heterogeneity estimates for the five paradigms turn out to be lower after excluding inattentive participants.
KHJ also report heterogeneity estimates for Studies 2 and 3 in KSJ, which are, however, based on only two samples each. Quantifying heterogeneity from very small numbers of studies (k) has been shown to be inappropriate and can be misleading; for k = 2, H is uninformative, since H² = Q ÷ (k − 1) = Q (7). In sum, heterogeneity in KSJ appears to be much smaller than suggested by KHJ, with H estimates ranging from 1.0 to 3.2 when including inattentive participants and from 1.0 to 1.6 when excluding them.

How heterogeneous are the populations in KSJ? All online samples in Study 1 were drawn from "anglophone participants in three highly developed countries" (1) and turned out to be relatively similar in terms of demographic characteristics. When the crowdsourced marketplace is held constant across three samples (Prolific, Prolific US, and Prolific UK), effect size estimates are remarkably similar. This suggests that collecting data via different online platforms may exacerbate heterogeneity due to, for instance, differing procedures for screening and compensating participants. These differences, in turn, may introduce variability in the number of bots, attrition rates, attention, experience, and comprehension: moderating factors for which KSJ provides supporting evidence. Consequently, the generalizability of empirical claims across and beyond varying online marketplaces may be lower than for studies based on laboratory or observational data. The extent of population heterogeneity is ultimately an empirical question that requires further investigation and evidence.
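To illustrate why H is uninformative at k = 2, a minimal sketch using the standard Higgins–Thompson definitions may help; the effect sizes and sampling variances below are purely hypothetical and do not come from KSJ's data.

```python
import math

def cochran_Q(effects, variances):
    """Cochran's Q: inverse-variance-weighted squared deviations
    from the fixed-effect mean."""
    w = [1.0 / v for v in variances]
    mean = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
    return sum(wi * (ei - mean) ** 2 for wi, ei in zip(w, effects))

def H_statistic(effects, variances):
    """Higgins-Thompson H = sqrt(Q / (k - 1))."""
    k = len(effects)
    return math.sqrt(cochran_Q(effects, variances) / (k - 1))

# Hypothetical example with k = 2 samples: the degrees of freedom
# equal 1, so H^2 collapses to Q itself.
effects = [0.30, 0.10]    # hypothetical standardized effect sizes
variances = [0.02, 0.02]  # hypothetical sampling variances
Q = cochran_Q(effects, variances)
H = H_statistic(effects, variances)
# With k = 2, H**2 equals Q exactly, so H carries no information
# beyond the single between-sample contrast.
```

Because a single contrast drives both Q and H when k = 2, the statistic cannot separate genuine between-population variation from sampling noise, which is the sense in which it is uninformative here.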