LSE Research Online LSE Library Services

Data cleaners for pristine datasets: visibility and invisibility of data processors in social science

Plantin, Jean-Christophe ORCID: 0000-0001-8041-6679 (2018) Data cleaners for pristine datasets: visibility and invisibility of data processors in social science. Science, Technology and Human Values. ISSN 0162-2439

Text - Accepted Version
Identification Number: 10.1177/0162243918781268


This article investigates the work of processors who curate and “clean” the data sets that researchers submit to data archives for archiving and further dissemination. Based on ethnographic fieldwork conducted at the data processing unit of a major US social science data archive, I investigate how these data processors work, under which status, and how they contribute to data sharing. This article presents two main results. First, it contributes to the study of invisible technicians in science by showing that the same procedures can make technical work invisible outside and visible inside the archive, to allow peer review and quality control. Second, this article contributes to the social study of scientific data sharing, by showing that the organization of data processing directly stems from the conception that the archive promotes of a valid data set—that is, a data set that must look “pristine” at the end of its processing. After critically interrogating this notion of pristineness, I show how it perpetuates a misleading conception of data as “raw” instead of acknowledging the important contribution of data processors to data sharing and social science.

Item Type: Article
Additional Information: © 2018 the Author(s)
Divisions: Media and Communications
Subjects: H Social Sciences > H Social Sciences (General)
Date Deposited: 17 Jul 2018 15:31
Last Modified: 13 Jul 2024 06:30
Funders: University of Michigan
