Archiving and/or publishing data

This page is an extension of the FSS RDM Guidelines.

When publishing research outputs, researchers at the Faculty of Social Sciences are expected to apply the FAIR principles and to archive “all data that can be reasonably deemed necessary to verify the findings of the research”. This page aims to provide practical advice on how to achieve this. The first section discusses the constituent principles of FAIR data (Findability, Accessibility, Inter-operability and Re-usability); the second provides practical advice on what to archive, based on the Archiving Guidelines from the Deans of Social Sciences in the Netherlands.

How to apply the FAIR principles

The FAIR principles guide researchers in making their data (or other research outputs) more valuable by increasing visibility, fostering collaboration, facilitating co-creation, and promoting transparency. To implement them effectively, it helps to envision how your data will be reused, and then apply each principle accordingly. This involves ensuring that end users can discover your data, enabling easy access, facilitating integration with existing knowledge and workflows, and promoting reuse.

Given the diverse research traditions and approaches to data reuse within the faculty, there isn’t a universally prescribed approach for applying the FAIR principles. However, as a minimum requirement, all data utilized within FSS should be usable for verifying the findings presented in publications resulting from the data. This section provides practical pointers, aligned with the FAIR principles, on how you can make verification possible. Additionally, it presents further steps you can take to increase the impact of your research outputs.

Findable

Findable means people can find out about the existence of your dataset, and know where to find more information about it.

Actions required to allow for verification:

  • Register your data set in a registry like the VU Research Portal. (NB: you don’t need to upload your data set for this!)
  • Ensure your data set is assigned a DOI (for example by adding it to a Vault on Yoda and publishing the metadata).

More actions you can take to make your data set more Findable:

  • Cite your data sets in your paper.

Accessible

Accessible means that there should be an established way to access your data. This does not mean you should publish your data, just that there is a clearly communicated and transparent procedure put in place to access the data.

Actions required to allow for verification:

  • Add a data availability statement in your publication that explains how, and under what conditions, the data can be accessed.
  • Archive your data in a restricted-access repository like Yoda.

More actions you can take to make your data set more Accessible:

  • Make your data publicly available on Yoda, the OSF or DataVerseNL.

Inter-operable

Inter-operable means that someone else can use your data and combine it with their existing knowledge and workflows.

Actions required to allow for verification:

  • Use common, preferably open, file formats.
  • Include documentation that can help people make sense of your data, such as codebooks, interviewer manuals, and topic lists.
  • Wherever possible, use a language that can be understood by anyone who may reasonably be expected to use your research outputs.

More actions you can take to make your data set more Inter-operable:

  • Use standardized variables, coding schemes and vocabularies.
  • Make sure the metadata of your datasets link to related datasets, publications and other relevant research outputs.
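
As a rough illustration of the file-format advice above, the following Python sketch lists files whose extension is not on a short list of common open formats, so they can be reviewed before archiving. The list PREFERRED_EXTENSIONS and the function name files_to_review are assumptions made for this example; they are not an official FSS or VU standard, and a flagged file may still be perfectly acceptable.

```python
from pathlib import PurePath

# Assumption: an illustrative selection of common open formats,
# not an official FSS or VU list. Adjust it to your own field.
PREFERRED_EXTENSIONS = {".csv", ".tsv", ".txt", ".md", ".json", ".xml", ".pdf"}


def files_to_review(filenames):
    """Return the file names whose extension is not in PREFERRED_EXTENSIONS."""
    return [
        name
        for name in filenames
        if PurePath(name).suffix.lower() not in PREFERRED_EXTENSIONS
    ]


if __name__ == "__main__":
    # Hypothetical file names, used only to demonstrate the check.
    example = [
        "survey_wave1.sav",
        "codebook.pdf",
        "interviews/topic_list.txt",
        "scores.CSV",
    ]
    for name in files_to_review(example):
        print(f"Review format before archiving: {name}")
```

A flagged file is only a prompt to consider exporting an additional copy in an open format (for example, a CSV export next to an SPSS file); it does not mean the original has to be removed.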

Re-usable

For your data to be re-usable, it needs to be clear what can and cannot happen with your data.

Actions required to allow for verification:

  • Include the informed consent sheets or privacy statements you provided your respondents, so that it is clear what they allow the data to be used for.
  • Make sure that when you register your dataset, or deposit it into a repository, you include detailed information like author, topic, keywords, etc. See the VU’s minimal metadata standards. Note that Yoda requires you to include this information before submission.

More actions you can take to make your data set more Re-usable:

  • Include detailed information, for example in a readme, on the provenance of your data: where it comes from, how it was collected and how it was processed.
  • Include a license, like CC-BY or the DANS license, with your data; or set out a Data Sharing Agreement that exactly states what a recipient can do with the data.

What to archive?

The following materials should be archived, within one month after the publication date:

  • A copy of the publication that uses the data. Most publications allow you to upload the submitted version (i.e. without the journal’s layout etc.).
  • Raw data: unedited data files providing the most direct registration of the behaviour or reactions of test subjects/respondents. If the raw data files have been accessibly stored in an external archive (such as storage facilities at DANS), or if the data cannot be archived on university servers (for example due to IP restrictions), a reference to the location of the files will suffice. The VU researcher must ensure that such externally stored raw data will be available for verification purposes. Raw data may not be changed once they have been made digitally available.
  • Analysed data: the data files that were eventually analysed when preparing the article (e.g. an SPSS data file after transforming variables, after applying selections, etc.). This is not necessary if the raw data file was directly analysed, or if the analysed data can be constructed without excessive effort from the raw data (for example by running a script).
  • A description of the procedures to transform the raw data into analysed data. This could be computer code (for example Atlas.ti, SPSS/JASP syntax file, MATLAB analysis scripts, R code) or a description of the steps taken in the qualitative analysis of primary research data, i.e. themes, domains, taxonomies, components.
  • A description of the steps taken to process the analysed data into results in the manuscript. This could be computer code or a description of the steps taken in the qualitative analysis of primary research data.
  • Any documentation that can reasonably be deemed necessary in order for other researchers to understand the data and/or verify the research’s findings. The precise documentation depends on the methods used, but examples include: study design documents, interview guides, questionnaires, surveys, and topic lists. The materials must be available in the language in which the research was conducted.
  • A readme file (metadata) describing which documents and files can be found where and how they should be interpreted. The readme file must be sufficiently clear, so that a relevant fellow researcher can verify the results discussed in the publication. The readme file must also contain the following information:
    • Name of the person who stored the documents or files
    • Division of roles among authors, indicating at least who analysed the data
    • Date on which the manuscript was accepted, including reference
    • Date/period of data collection
    • Names of people who collected the data
    • If relevant: addresses of field locations where data were collected and contact persons (if any)
    • Whether the data is made open or not and if not, a valid reason for not opening up the data
  • Documents received from the Research Ethics Review Committee: at least the result of the self-check and, if applicable, the result of a full review.
  • If using personal data: information about the informed consent procedure, such as a privacy statement, or a blank informed consent form.
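
The required readme information listed above can be captured in a small template. The Python sketch below is one possible way to do this, not a prescribed FSS tool: the field keys, the function name build_readme and all example values are hypothetical. It only covers the listed information items; the description of which files can be found where, and how to interpret them, still has to be written by hand.

```python
# Labels follow the readme items listed above; the dictionary keys are
# hypothetical names chosen for this sketch.
REQUIRED_FIELDS = {
    "stored_by": "Name of the person who stored the documents or files",
    "author_roles": "Division of roles among authors (including who analysed the data)",
    "acceptance": "Date on which the manuscript was accepted, including reference",
    "collection_period": "Date/period of data collection",
    "collectors": "Names of people who collected the data",
    "open_access": "Whether the data is open, and if not, the reason",
}
# Only needed "if relevant", so it is not enforced.
OPTIONAL_FIELDS = {
    "field_locations": "Addresses of field locations and contact persons",
}


def build_readme(info):
    """Return readme text, or raise ValueError if a required item is empty."""
    missing = [key for key in REQUIRED_FIELDS if not str(info.get(key, "")).strip()]
    if missing:
        raise ValueError("Missing readme fields: " + ", ".join(missing))
    lines = ["README", ""]
    for key, label in {**REQUIRED_FIELDS, **OPTIONAL_FIELDS}.items():
        value = str(info.get(key, "")).strip()
        if value:
            lines.append(f"{label}: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_readme({
        "stored_by": "A. Researcher",
        "author_roles": "A. Researcher analysed the data; B. Colleague designed the study",
        "acceptance": "1 February 2024, Journal of Examples (placeholder reference)",
        "collection_period": "March-June 2023",
        "collectors": "A. Researcher, C. Assistant",
        "open_access": "No: interview transcripts contain personal data",
    }))
```

Raising an error for an empty required item makes it harder to archive an incomplete readme by accident; the optional field is simply left out when it does not apply.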