Data organization

This page is an extension of the FSS RDM Guidelines.

Introduction

Researchers of the Faculty of Social Sciences (FSS) are responsible for organizing their data in such a way that they can be archived without excessive effort. In general terms, the aim is to ensure that a fellow researcher can use the data without asking too many questions. This ensures that the results of the research can be verified if the need arises. Furthermore, should additional researchers be added to a project - or a researcher gets back from a long hiatus - they can get started quickly.

This page provides several use cases (quantitative and qualitative) that serve as inspriration for organizing research data. Each use case gives an outline of a folder organization that is used during the research. These use cases serve as examples, researchers are free to use any organization that fits the needs of their research.

Use cases

Simple quantitative research project using Research Drive and Yoda

Get the sample DMP for this use case here.

This use case describes a fictional project using survey data. It uses Yoda for archiving only; for storage of data the project uses Research Drive. While Yoda is suitable for storage as well, Research Drive offers more fine-grained access control, which this fictional project needs to make sure student-assistants can’t access all information stored on the research drive.

Research drive is used for day-to-day storage and synced to researcher’s devices using the OwnCloud software. The folder is organized as follows:

  1. Data Pseudonymized research data (access is granted as needed)
  2. Documentation Questionnaires, proposals, data management plan etc. (everyone in project can access)
  3. Papers One sub-folder per paper containing text, analysis scripts etc. for each paper. (access is granted as needed)
  4. Admin Project admin information, such as budgets; not accessible to students

Yoda is used in this project for archiving, and is thus not synced to any devices. Directly following data collection, the raw data was pseudonymized. The pseudonymized data was stored on Research Drive. The raw, unpseudonymized data was archived on Yoda, assigned to the Vault and permanently deleted everywhere else. This ensures that a copy of the original data is always available (with a DOI), and minimizes the risk of leaking unpseudonymized data.

Once a paper based on the data is published, a folder is created for that paper with all things that can be shared publicly. This folder copies the Yoda metadata of the root folder, making complying with the FAIR principles very easy. The folder is made public on Yoda and looks as follows:

  • Raw data (Vault)
    • Data files
    • Documentation -Yoda metadata
  • Replication files for each paper (Public):
    • Author manuscript of paper
    • Analysis and pseudonymization scripts
    • Documentation
    • Yoda metadata

Note that the data itself is not made publicly available because of privacy concerns. The raw data is only archived for verification purposes. In case of doubts about the research integrity, the Raw Data’s DOI (listed in each paper’s data statement and replication files’ metadata) can be used to quickly identify the data set for verification purposes.

Simple qualitative research project using Yoda

Get the sample DMP for this use case here.

The following is a basic qualitative project. All data is stored on Yoda. Once data collection is complete, the data folder is added to a non-public Yoda vault, a DOI of which is included in every paper. The documentation folder is made publicly available to allow re-use of topic lists etc.

  • Yoda
    • Raw Data (Vault, non-public)
      • Interview recordings
      • List of names with interviewees
    • Data
      • Pseudonymized interview transcripts
    • Documentation (Vault, public)
      • Sampling information
      • Topic lists
      • Ethics review information
      • Blank informed consent form
    • Papers: One sub-folder per paper containing text, figures, etc. for each paper.