Applying the Kübler-Ross model [1] to researchers and data sharing, based on various attitudes and comments we have encountered over the years. Don’t take the presentation seriously, but take the content seriously. Part one in a series of…uh, five.
1. Denial
Symptomatic statements: “No way! My data on public attitudes towards the weather is incredibly sensitive and potentially disclosive”; “Why? Why would anyone want to use my data anyway?”
Denial is usually only a temporary defence for the researcher. This feeling is generally replaced with heightened awareness of data and research assistants that will be left behind at the end of a project and inevitable data loss on a memory stick somewhere in the future, or “Shit! A hard drive shouldn’t be making a noise like that”, or “did I copy it off my computer before IT came and replaced it?”, or “ERROR: The format of the file cannot be read.” Denial can be conscious or unconscious refusal to accept facts, information, or the reality of the situation. Denial is a defence mechanism and some people can become locked in this stage.
Data sharing is here, and it’s not going away. This means managing data to meet disciplinary norms and legal, technical, and funding requirements for sharing, and to ensure data can be preserved for the long term.
Denial isn’t just a hoary old quote about a river in Egypt, nor is it a viable option for researchers when it comes to data sharing. While reusing jokes and tired old quotes is – rightly – looked upon with disapproval, reusing data or making data available for reuse is supported to the point where refusal to share with peers is no longer acceptable unless a compelling case against sharing is made. Furthermore, the case against sharing is expected to be presented prospectively rather than retrospectively. Of course, such cases do exist: research in highly sensitive commercial or political fields, with exceptionally vulnerable participants, or cases where complicated intellectual property rights make it difficult to license data for reuse. But these are exceptions, not the norm. The expectation is to share data with the widest possible community.
Public bodies and funders of academic research have already adopted some form of data sharing requirement or strongly encourage data reuse. The OECD[2], the White House[3] and various U.S. federal agencies and funders[4], the European Commission[5], and academic research funders in the UK[6], Germany[7], and Australia[8] have adopted statements and/or requirements that research they fund should be available to others in usable formats, with contextual information that makes the data comprehensible. In addition, academic journals are adopting policies that the data underlying articles be made available to potential users, either as a condition of publication or within a period following publication[9]. Universities themselves are also adopting data management policies[10]. The head in the sand is not an excuse.
The data sharing phenomenon is partly politically motivated, predicated on the idea that data is a non-rival and (up to a point – when open to the fullest possible extent) non-excludable public good. There is also a related efficiency argument that taxpayer investment in research should not fund additional data collection when suitable data is already available[11], so that the value of existing data is fully exploited – an attractive idea in an age of significant pressures on public spending. Partly it is a normative argument: good science should be transparent and replicable.[12] Data sharing stimulates discussion about the quality of data, the reliability of findings, and research methodologies; assists in teaching[13]; seeds further research[14]; and, in the worst cases, polices unintentional errors[15] or fraudulent research findings[16].
The days when the use and value of data expired with the arrival of that acceptance letter from the Journal of Tenure Securing Research are over. Now data can be used for different types of research. It can be used for replicating previous studies[17] or re-purposed and integrated with other data[18]. Data can yield insights into phenomena long after its original collection, and can feed future research questions and innovative collaborations[19].
Why would people use it? Well, like old quotes or comedy sketches, a good publication is expected to be cited. Creating a good, well documented, accessible, data set will also lead to reuse and citation, giving the data and the researcher’s name an active life-span beyond the original research project. And not to be too morbid about it, maybe beyond the researcher’s active life span too, before they go to the great archive in the sky. For as a professor once said, “In the long run we are all dead”[20], but there’s no reason why your data should also pass on, be no more, cease to be, expire, go to meet its maker, stiff, be bereft of life, rest in peace, push up daisies, kick the bucket, shuffle off its mortal coil, run down the curtain and join the choir invisible.[21]
[1] Adapted from http://en.wikipedia.org/wiki/K%C3%BCbler-Ross_model under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
[2] Organisation for Economic Co-operation and Development (2007) OECD Principles and Guidelines for Access to Research Data from Public Funding (Paris: OECD Publications) http://www.oecd.org/sti/sci-tech/38500813.pdf
[3] Obama, B. “Making Open and Machine Readable the New Default for Government Information” Executive Order 13642 of May 09, 2013. http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-
[4] Dietrich, D. et al. (2012) “De-Mystifying the Data Management Requirements of Research Funders” Issues in Science and Technology Librarianship, 70, 1–15. http://dx.doi.org/10.5062/F44M92G2
[5] European Commission (2012) “Commission Recommendation of 17.7.2012 on access to and preservation of scientific information” C(2012) 4890 final http://ec.europa.eu/research/science-society/document_library/pdf_06/recommendation-access-and-preservation-scientific-information_en.pdf p.6
[6] Digital Curation Centre “Overview of funders’ data policies” http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
[7] Deutsche Forschungsgemeinschaft “Proposal Preparation Instructions Project Proposals” DFG form 54.01 – 04/13 http://www.dfg.de/formulare/54_01/54_01_en.pdf
[8] Australian Research Council (2013) “Discovery Projects Funding Rules for funding commencing in 2014” http://www.arc.gov.au/pdf/DP14/DP14%20Funding%20Rules.pdf
[9] Savage, C.J., Vickers, A.J. (2009) “Empirical study of data sharing by authors publishing in PLoS journals” PLoS ONE, 4(9), e7078. http://dx.doi.org/10.1371/journal.pone.0007078
[10] Digital Curation Centre “UK Institutional data policies” http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies/uk-institutional-data-policies
[11] Fry, J. et al. (2008) Identifying benefits arising from the curation and open sharing of research data produced by UK Higher Education and research institutes http://repository.jisc.ac.uk/279/2/JISC_data_sharing_finalreport.pdf p.69
[12] King, G. (1995) “Replication, Replication” PS: Political Science and Politics, 28(3), 443–499. http://gking.harvard.edu/files/abs/replication-abs.shtml
[13] UK Data Service “Teaching datasets” http://ukdataservice.ac.uk/use-data/teaching/datasets.aspx
[14] Dale, A., Arber, S., Procter, M. (1988) Doing Secondary Analysis (Contemporary Social Research Series No. 17) (London: Unwin Hyman Ltd)
[15] Dimitrova, V. (2013, April) “Reinhart-Rogoff revisited: Coding errors happen – key problem was in not making the data openly available from the start” LSE Impact of Social Sciences http://blogs.lse.ac.uk/impactofsocialsciences/2013/04/24/reinhart-rogoff-revisited-why-we-need-open-data-in-economics/
[16] Doorn, P., Dillo, I., Van Horik, R. (2013) “Lies, Damned Lies and Research Data: Can Data Sharing Prevent Data Fraud?” International Journal of Digital Curation, 8(1), 229–243. http://dx.doi.org/10.2218/ijdc.v8i1.256
[17] Inter-university Consortium for Political and Social Research “Replication Datasets” http://www.icpsr.umich.edu/icpsrweb/deposit/pra/index.jsp
[18] King, G. (2011) “Ensuring the Data Rich Future of the Social Sciences” Science, 331, 719–721. p.719 http://dx.doi.org/10.1126/science.1197872
[19] UK Data Service “Impact of Our Data” http://ukdataservice.ac.uk/about-us/impact.aspx
[20] Keynes, J.M. (1924) A Tract on Monetary Reform (London: Macmillan) p.80
[21] Monty Python (1969) “The Parrot Sketch” http://www.youtube.com/watch?v=npjOSLCR2hE