Do's and Don'ts in Research Data Management

Christian Hillen

I have a background in history. As an archivist I have been working with historical (research) data - both analogue and digital - for the last 25 years.

Research Data Management Do’s and Don’ts - Step up your RDM skills!

1. Structuring and naming your folders
There is an easy way to make your data findable for you and your team: establish a folder structure which makes sense for you and your working group as well as naming conventions for your folders.

Don’t:

Paul and Suzie
»Guideline
>application
»version2_final
»v.3
»review
»3rd.version
>JD
»qn
»0-1

Instead do:

000_int_orga
»01_application
»02_review
120_questionnaires
»01_qualitative
»02_quantitative
130_data
»01_qualitative
»02_quantitative

Also Do:

000_int_orga
100_planning
»01_application
»02_review
120_qualitative
»01_guideline
»02_data
130_quantitative
»01_questionnaire
»02_data
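
A folder structure like this can also be created by a small script, so that every project in a working group starts from the same skeleton. The following is a minimal Python sketch, not a prescribed tool: the folder names are modelled on the example above, and the function name `create_structure` is made up for illustration.

```python
from pathlib import Path

# Layout modelled on the example above; adapt the names to your own project.
STRUCTURE = {
    "000_int_orga": [],
    "100_planning": ["01_application", "02_review"],
    "120_qualitative": ["01_guideline", "02_data"],
    "130_quantitative": ["01_questionnaire", "02_data"],
}


def create_structure(root: Path, structure: dict[str, list[str]] = STRUCTURE) -> list[Path]:
    """Create the folders below root and return all folder paths, sorted."""
    created = []
    for top, subfolders in structure.items():
        top_path = root / top
        # exist_ok=True makes the script safe to run again on an existing project.
        top_path.mkdir(parents=True, exist_ok=True)
        created.append(top_path)
        for sub in subfolders:
            sub_path = top_path / sub
            sub_path.mkdir(exist_ok=True)
            created.append(sub_path)
    return sorted(created)
```

Calling `create_structure(Path("my_project"))` (with `my_project` as a placeholder name) creates the skeleton. Because every folder starts with a number, a file browser lists the folders in the intended order.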

Want to learn more about organizing your data?
Take part in our Data Challenge on November 7th in Cologne and learn more about metadata and data structuring (sign up here!) or visit the University of Cologne EduLabs for more information on how to structure your data in useful ways.

2. Storing your data
Storing your data properly is important not only to make them accessible to the right people but also to make them findable: if you store them on a USB stick, no other member of your working group will be able to access or find them. They will not even know that the data exist.

Don’t:

measuring device (local, remote)
laptop (local, remote)
Dropbox (local, remote)
flash drive (archive)
external H(ard)D(isk)D(rive) (archive)

Do this instead:

S(olid)S(tate)D(rive) (local, remote)
H(ard)D(isk)D(rive) (local, remote)
N(etwork)A(ttached)S(torage) (local, remote)
Sciebo (local, remote)
DataStorageNRW (archive)
Repositories (archive)

Want to learn more about storing your data? Visit the University of Cologne EduLabs or the UDE Speichermatrix.

3. Naming your data
Naming your data in an understandable and consistent manner makes it much easier for you and your team to find the data you are looking for. Therefore you should take some time to develop naming conventions.

Don’t:

Really_long_file_names_because_windows_is not_able_to_process_more_than_255_characters_and_that_includes_the_name_of_the_folders
Using abbreviations that are not generally understood in your community
Using special characters like * % [ ] > / : ä ö ü ß space
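
Problems like these can be spotted automatically before files are shared. The sketch below is one possible check, written under two assumptions that are ours and not a standard: only ASCII letters, digits, underscore, hyphen, and dot are treated as safe, and the 255-character limit named above is applied to the whole path. The function name `name_problems` is hypothetical.

```python
import re

# Assumption: only these characters count as "safe" in a file name.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")
# Limit taken from the list above; it is applied to the path including folders.
MAX_PATH_LENGTH = 255


def name_problems(path: str) -> list[str]:
    """Return a list of human-readable problems found in a file path."""
    problems = []
    if len(path) > MAX_PATH_LENGTH:
        problems.append("path longer than 255 characters")
    # Take the part after the last slash or backslash as the file name.
    filename = path.replace("\\", "/").rsplit("/", 1)[-1]
    if not SAFE_NAME.fullmatch(filename):
        problems.append("file name contains spaces or special characters")
    return problems
```

An empty list means that no problem was found. Whether an abbreviation is understood in your community cannot be checked this way and still needs a human reader.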

Instead do:

Readme file documenting conventions
Use inverted date format for sorting (YYYYMMDD)
If necessary add hour, minute and second
Initial numbers for sorting (01_title)
Use interoperable set of characters
A good filename could be: 20250901_sample01_H2O_v2_original.tiff.
The readme should explain the structure of your naming convention: [SamplingDate][SampleID][SampleType][VersionNumber][description]
Abbreviations should be explained as well.
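
A naming convention is easier to keep when file names are generated rather than typed by hand. The following Python sketch builds and parses names that follow the pattern above. It assumes that the fields are joined by underscores and that only the description may itself contain an underscore; the function names `build_filename` and `parse_filename` are illustrative.

```python
from datetime import date


def build_filename(sampling_date: date, sample_id: str, sample_type: str,
                   version: int, description: str, extension: str) -> str:
    """Join the fields of the naming convention with underscores."""
    parts = [
        sampling_date.strftime("%Y%m%d"),  # inverted date format, sorts chronologically
        sample_id,
        sample_type,
        f"v{version}",
        description,
    ]
    return "_".join(parts) + "." + extension


def parse_filename(filename: str) -> dict[str, str]:
    """Split a file name built by build_filename back into its fields."""
    stem, extension = filename.rsplit(".", 1)
    # Split at most four times so that the description may contain underscores.
    sampling_date, sample_id, sample_type, version, description = stem.split("_", 4)
    return {
        "sampling_date": sampling_date,
        "sample_id": sample_id,
        "sample_type": sample_type,
        "version": version,
        "description": description,
        "extension": extension,
    }
```

For example, `build_filename(date(2025, 9, 1), "sample01", "H2O", 2, "original", "tiff")` returns `20250901_sample01_H2O_v2_original.tiff`, the file name used above.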

Want to learn more about naming your data in a way that helps you to stay organised?
Take part in our Data Challenge on November 7th in Cologne and learn more about metadata, data structuring, and file naming (sign up here!).

4. Interoperability
You can enhance the use and reuse of your data by making them interoperable.

Don’t:

Encrypting your data (if not necessary for legal reasons)
Compressing data (like in a ZIP file) or using compressed file formats (e.g. JPEG)
Using proprietary software

Instead do:

Use open standards
Add lots of metadata
Document your processes of gathering, processing, naming and storing your data
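
One simple way to keep metadata in an open format is a plain-text JSON file stored next to each data file. The sketch below shows the idea. The function name `write_sidecar` and the metadata fields in the usage example are assumptions for illustration and do not follow a particular metadata standard; your community may already have an established schema that you should use instead.

```python
import json
from pathlib import Path


def write_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write metadata as UTF-8 JSON next to the data file and return its path."""
    # "scan.tiff" gets the sidecar "scan.tiff.json" in the same folder.
    sidecar = data_file.with_name(data_file.name + ".json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar


if __name__ == "__main__":
    # Hypothetical metadata fields; replace them with what your project documents.
    write_sidecar(
        Path("20250901_sample01_H2O_v2_original.tiff"),
        {
            "sampling_date": "2025-09-01",
            "sample_id": "sample01",
            "processing": "original, unprocessed scan",
        },
    )
```

JSON is an open, text-based format, so the metadata stay readable without any special software.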

5. Write a D(ata)M(anagement)P(lan)
DMPs are required by funding institutions, but they are also useful for you, your team, and your collaborators because they raise awareness of the importance of the whole data life cycle: Which data are gathered, how many, when, and how? How are they processed, stored, archived, and reused?

Don’t:

Starting with the DMP two days before handing in your grant application
Underestimating costs for processing and storing data
Underestimating costs for curating data (human resources)

Instead do:

Start early on so you have time to consider all the different stages of your data in the life cycle.
Think about potential costs for human resources, software, and hardware as well as storage.

Want to learn more about DMPs? Useful resources are offered by, among others, the University of Cologne, the University of Duisburg-Essen, and the Heinrich Heine University.
