Applying the Kübler-Ross model[1] to researchers and data sharing, based on various attitudes and comments we have encountered over the years. Don’t take the presentation seriously, but take the content seriously. Part five in a series of…uh, five.
5. Acceptance
In this last stage, individuals begin to come to terms with sharing data. This stage varies according to the person’s situation. Researchers can enter this stage a long time before the data they leave behind – especially when it comes to metadata and documentation.
Acceptance. Ashes to ashes, funk to funky etc.
We have mentioned this before[2], but the big way you can help is by thinking about contextual documentation and providing metadata. It cannot be emphasised in a big enough font or in bold enough type that you, the researcher, know your data better than anyone — much better than windowless room dwelling archivists who never see sunlight and are incapable of meaningful relationships with anything other than information packages.
The point is, the contextual and descriptive material you provide enriches and enhances data, ensuring it is comprehensible to you, us, other researchers, and possibly as yet unknown monsters from outer space. Yes, we can preserve it, but what good is preserving something you can’t understand or use? Without that guide for interpretation and meaning, it’s not data, it’s just stuff (lots and lots of stuff), and these days who wants more stuff? So, how can you help us preserve data and avoid hoarding stuff? As a guideline, ask yourself the question: if this data wasn’t mine, what would I need in order to understand it? Then ask, do I have it written down? If not, write it down. Then save it. In an open format[3]. Thanks.
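For the "write it down, then save it in an open format" step, here is a minimal sketch in Python. It assumes your data lives in a single file and that a plain JSON codebook sitting next to it is enough. The function name `write_codebook`, the `_codebook.json` suffix and the codebook fields are all made up for illustration; they are not a standard and not something any particular archive requires.

```python
import json
from pathlib import Path


def write_codebook(data_path, title, description, variables):
    """Write a plain-JSON codebook next to a data file (hypothetical layout).

    `variables` maps each column name to a dict describing it,
    e.g. its label, units and missing-value codes.
    """
    data_path = Path(data_path)
    codebook = {
        "title": title,
        "description": description,
        "data_file": data_path.name,
        "variables": variables,
    }
    # survey.csv -> survey_codebook.json, in the same folder as the data.
    out_path = data_path.with_name(data_path.stem + "_codebook.json")
    # ensure_ascii=False keeps non-ASCII text readable in the saved file.
    out_path.write_text(
        json.dumps(codebook, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return out_path
```

Calling `write_codebook("survey.csv", "My survey", "Who was asked what, when and how", {"age": {"label": "Age at interview", "units": "years"}})` would leave a `survey_codebook.json` beside the data. JSON is used here only because it is open, text-based and readable without special software; a well-written plain-text README does the same job.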
You can help us to help you to help others help themselves by using data storage formats that are open and free (i.e. we can see how they work, rather than some big commercial organisation locking away the operating manual and intellectual property as a trade secret and supporting the format for only as long as they see fit). This really helps the long-term preservation aspect of archiving because storage formats quickly become obsolete and inaccessible. (Remember the days when media content was almost always in Real Audio/Media files, a format that is not technically obsolete but might as well be? Yes? Don’t worry, you’re not that old, though you probably remember when Princess Diana died. Go on, try to play one of these[4] video files from her funeral.) If the formats are open, we can transfer files to contemporary formats (“migration”) or engineer a way to access them (“emulation”) much more easily and, critically, without the Intellectual Property Rights of closed formats getting in the way. Those rights are often harder to negotiate than the technological obstacles.
As a social science archive we have responsibilities beyond storage: we have to ensure data is used responsibly. Data in our collection is almost always based on human subjects (though there is scope for researching aliens with feet like water wings and purple hair). These subjects, like you and me, have legal rights to privacy[5], and ethical rights to anonymity. You can help us make sure those rights are respected by devising a sensible anonymisation strategy[6] that protects the identity of participants while still retaining the research value of the data. Through advising on anonymisation strategies, applying restrictive re-use licence agreements, or supporting virtual and physical secure data enclaves[7], we can offer an infrastructure for protecting and safely sharing confidential data.
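To make "protects identity while retaining research value" a little more concrete, here is a rough Python sketch of three common moves: dropping direct identifiers, replacing a respondent ID with a keyed hash, and coarsening exact age into bands. The column names (`name`, `email`, `phone`, `respondent_id`, `age`) and the function names are hypothetical. This is an illustration only, not the procedure described in [6], and a real strategy also has to deal with indirect identifiers and combinations of variables.

```python
import hashlib
import hmac

# Hypothetical column names that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymise(value, secret_key):
    """Replace an identifier with a keyed hash so records can still be linked."""
    digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def age_band(age, width=10):
    """Coarsen an exact age into a band such as '30-39'."""
    lower = (int(age) // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymise_record(record, secret_key):
    """Return a copy of `record` with direct identifiers removed or coarsened."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "respondent_id" in cleaned:
        cleaned["respondent_id"] = pseudonymise(cleaned["respondent_id"], secret_key)
    if "age" in cleaned:
        # Keep the analytically useful band, discard the exact value.
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned
```

The keyed hash (rather than a plain one) means nobody can rebuild the mapping by hashing a list of likely IDs unless they also hold `secret_key`, which must be bytes and should stay with the data creator.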
How? At this stage it’s worth knowing a little bit about the infrastructure behind making digital data available and re-usable over time. A lot of what we do isn’t widely known* but is essential to support digital preservation and data re-use.
*But we’d love to talk about what we do[8] if you ask. No, wait, come back! Please!
Data archiving isn’t as simple as pressing “Save” every now and again. Naturally, we have to make sure digital objects are saved but we have to make sure they are saved in a strategic manner to ensure data isn’t lost or compromised through accidents, malicious intent, natural disasters, acts of a Supreme Being and divine intervention, or invasions by extra-terrestrial beings with big bug eyes and the death-ray glare. We also have to make sure that the way we save data means it can be accessed in the future – even by alien beings whose skin is jelly, whose teeth are green.
Most archives and repositories are based on a standard model called the Open Archival Information System[9] (OAIS). OAIS (please don’t confuse it with a similarly named mid-1990s Britpop act, or you’ll end up basing an archive on something as catastrophic as Be Here Now) is a reference model for long-term preservation and making data available to a designated community. This means we work with recognised open international standards for archiving digital materials, which is another way of saying we don’t just make it up as we go along. The consequence is that if you place your data in an archive, it will last (except in the case of extra-terrestrial invasion by blobs of slime or other credible potential sources of existential risk to life on earth; even OAIS functionality can’t plan for the heat death of the universe).
By adhering to these standards we archives build relationships amongst ourselves (archive to archive, funk to funky) but, critically, with the research community as well. For that reason, archives have worked on establishing recognised standards of trustworthiness. The Data Seal of Approval Assessment[10] (DSA) sees a self-policing community approach at work, with archives assessing themselves against the DSA set of guidelines and then subjecting themselves to peer review by a previous recipient of the DSA. The ISO 16363 standard for the Audit and Certification of Trustworthy Digital Repositories[11] and the nestor Seal for Trustworthy Digital Archives[12] (DIN 31644) are awarded following formalised independent audits of policies and procedures.
Trust is the currency of data archiving. Data creators need to trust us when we say that what we do will not damage the integrity of their data, while researchers using archives need to trust us to give, and continue to give, them the data we claim we are giving them. Any breach, be it wilful or accidental, devalues that currency.
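One routine way to back up a claim like "this is the data we say it is" is to record a checksum for every file on deposit and recompute it later. Below is a minimal Python sketch of that idea using SHA-256. The function names and the in-memory manifest (a dict of relative path to checksum) are assumptions for illustration, not a description of what any particular archive actually runs.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(folder):
    """Map every file under `folder` (by relative path) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files that were added, removed or altered since `manifest` was made."""
    current = build_manifest(folder)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

Build the manifest when the data arrives, keep it somewhere safe, and an empty list from `changed_files` later on means every file still matches. A checksum only tells you that something changed, not who or what changed it (accident, malice, or death-ray glare).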
Hopefully we’ve reached acceptance. What we do is provide the means: discoverable, responsibly accessible, lasting, easily referenced, documented data. What you do is the ends: create that data, re-use that data, re-purpose, enhance, and extract as much knowledge as possible from it. And, with that, dear reader, we will all live happily ever after.
The end.
Acknowledgment
To John Cooper Clarke’s (I Married A) Monster From Outer Space for helping overcome a bit of writer’s block.
References
[1] Adapted from http://en.wikipedia.org/wiki/K%C3%BCbler-Ross_model under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
[2] Archive and Data Management Training Center (2013) “The Five Stages to Data Sharing: Anger” (Accessed October 21, 2013) from https://admtic.wordpress.com/2013/09/23/the-five-stages-to-data-sharing-anger/
[3] Open Formats (n.d.) “Why Use Open Formats?” openformats.org (Accessed October 21, 2013) from http://www.openformats.org/enShowAll
[4] BBC News (1997) “Special Report: Diana Remembered” (Accessed October 21, 2013) from http://www.bbc.co.uk/news/special/politics97/diana/
[5] Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data [1995] OJ L 281 http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:EN:NOT
[6] UK Data Service (n.d.) “Anonymisation” (Accessed October 21, 2013) from http://ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation.aspx#/tab-identifiers
[7] GESIS – Leibniz Institute for the Social Sciences (2013) “The Secure Data Center” (Accessed October 21, 2013) from http://www.gesis.org/unser-angebot/daten-analysieren/datenservice/secure-data-center-sdc/
[8] GESIS – Leibniz Institute for the Social Sciences (2013) “Welcome to the Archive and Data Management Training Center” (Accessed October 21, 2013) from http://www.gesis.org/en/archive-and-data-management-training-and-information-centre/training-center-home/
[9] Personal Archives Accessible in Digital Media Project (2008) “Introduction to OAIS” (Accessed October 21, 2013) from http://www.paradigm.ac.uk/workbook/introduction/oais.html
[10] Data Seal of Approval (Accessed October 21, 2013) from http://www.datasealofapproval.org/en/assessment/
[11] Primary Trustworthy Digital Repository Authorisation Body (ISO-PTAB) (2011) “Preparing for an ISO 16363 Audit” (Accessed October 21, 2013) from http://www.iso16363.org/preparing-for-an-audit/
[12] nestor (2013) “nestor Seal for Trustworthy Digital Archives” (Accessed October 21, 2013) from http://www.langzeitarchivierung.de/Subsites/nestor/EN/nestor-Siegel/siegel_node.html