Do's and Don'ts in Research Data Management
Christian Hillen
I have a background in history. As an archivist I have been working with historical (research) data - both analogue and digital - for the last 25 years.
Research Data Management Do’s and Don’ts - Step up your RDM skills!
1. Structuring and naming your folders
There is an easy way to make your data findable for you and your team: establish a folder structure which makes sense for you and your working group as well as naming conventions for your folders.
Don’t:
Paul and Suzie
»Guideline
>application
»version2_final
»v.3
»review
»3rd.version
>JD
»qn
»0-1
Instead do:
000_int_orga
»01_application
»02_review
120_questionnaires
»01_qualitative
»02_quantitative
130_data
»01_qualitative
»02_quantitative
Also Do:
000_int_orga
100_planning
»01_application
»02_review
120_qualitative
»01_guideline
»02_data
130_quantitative
»01_questionnaire
»02_data
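A layout like this can be created once by a small script, so that every project in the group starts from the same skeleton. The following is a minimal Python sketch modelled on the "Also Do" example; the function name `create_skeleton`, the `STRUCTURE` dictionary and the consecutive numbering inside each folder are assumptions made for illustration, not a prescribed tool.

```python
from pathlib import Path

# Folder layout modelled on the "Also Do" example.
# Each top-level folder maps to its numbered subfolders (names are placeholders).
STRUCTURE = {
    "000_int_orga": [],
    "100_planning": ["01_application", "02_review"],
    "120_qualitative": ["01_guideline", "02_data"],
    "130_quantitative": ["01_questionnaire", "02_data"],
}


def create_skeleton(root: Path, structure: dict[str, list[str]]) -> list[Path]:
    """Create the folders below `root` and return all folder paths, sorted."""
    created = []
    for top, subfolders in structure.items():
        top_path = root / top
        # exist_ok=True makes the script safe to run more than once.
        top_path.mkdir(parents=True, exist_ok=True)
        created.append(top_path)
        for sub in subfolders:
            sub_path = top_path / sub
            sub_path.mkdir(exist_ok=True)
            created.append(sub_path)
    return sorted(created)
```

Calling `create_skeleton(Path("my_project"), STRUCTURE)` would create the folders under a hypothetical `my_project` directory; because of the numeric prefixes, a file browser then lists them in the intended order.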
Want to learn more about organizing your data?
Take part in our Data Challenge on November 7th in Cologne and learn more about metadata and data structuring (sign up here!), or visit University of Cologne EduLabs for more information on how to structure your data in useful ways.
2. Storing your data
Where you store your data matters, not only for making them accessible to the right people but also for making them findable. If you keep them on a USB stick, no other member of your working group will be able to access or find the data; they won't even know these data exist.
Don’t:
measuring device (local, remote)
laptop (local, remote)
Dropbox (local, remote)
flash drive (archive)
external H(ard)D(isk)D(rive) (archive)
Do this instead:
S(olid)S(tate)D(rive) (local, remote)
H(ard)D(isk)D(rive) (local, remote)
N(etwork)A(ttached)S(torage) (local, remote)
Sciebo (local, remote)
DataStorageNRW (archive)
Repositories (archive)
Want to learn more about storing your data? Visit University of Cologne EduLabs or the UDE Speichermatrix.
3. Naming your data
Naming your data in an understandable and consistent manner makes it much easier for you and your team to find the data you are looking for. Therefore you should take some time to develop naming conventions.
Don’t:
Really_long_file_names_because_windows_is_not_able_to_process_more_than_255_characters_and_that_includes_the_name_of_the_folders
Using abbreviations that are not generally understood in your community
Using special characters like * % [ ] > / : ä ö ü ß space
Instead do:
Write a readme file documenting your conventions
Use inverted date format for sorting (YYYYMMDD)
If necessary add hour, minute and second
Use leading numbers for sorting (01_title)
Use an interoperable set of characters
A good filename could be: 20250901_sample01_H2O_v2_original.tiff.
The readme should explain the structure of your naming convention: [SamplingDate][SampleID][SampleType][VersionNumber][description]
Abbreviations should be explained as well.
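A convention like this is easiest to keep when file names are generated and checked by a script rather than typed by hand. The following Python sketch builds and parses names of the form shown in the example filename. The underscore separator and the "v" prefix are read off that example; the function names and the allowed character set (ASCII letters, digits and hyphen) are assumptions that you would adapt to your own readme.

```python
import re
from datetime import date, datetime

# Assumed interoperable character set for each name part:
# ASCII letters, digits and hyphen. The underscore is the separator.
_PART = re.compile(r"[A-Za-z0-9-]+")

# [SamplingDate]_[SampleID]_[SampleType]_v[VersionNumber]_[description].[extension]
_PATTERN = re.compile(
    r"(?P<date>[0-9]{8})_(?P<sample_id>[A-Za-z0-9-]+)_(?P<sample_type>[A-Za-z0-9-]+)"
    r"_v(?P<version>[0-9]+)_(?P<description>[A-Za-z0-9-]+)\.(?P<extension>[A-Za-z0-9]+)"
)


def build_filename(sampling_date, sample_id, sample_type, version, description, extension):
    """Assemble a file name; raise ValueError if a part has unsafe characters."""
    parts = [
        sampling_date.strftime("%Y%m%d"),  # inverted date, sorts chronologically
        sample_id,
        sample_type,
        f"v{version}",
        description,
    ]
    for part in parts + [extension]:
        if not _PART.fullmatch(part):
            raise ValueError(f"unsafe characters in name part: {part!r}")
    return "_".join(parts) + "." + extension


def parse_filename(name):
    """Return the name parts as a dict, or None if the name breaks the convention."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        return None
    fields = match.groupdict()
    try:
        fields["date"] = datetime.strptime(fields["date"], "%Y%m%d").date()
    except ValueError:  # eight digits that are not a real calendar date
        return None
    fields["version"] = int(fields["version"])
    return fields
```

With these definitions, `build_filename(date(2025, 9, 1), "sample01", "H2O", 2, "original", "tiff")` returns `20250901_sample01_H2O_v2_original.tiff`, and `parse_filename` returns `None` for names that contain spaces or umlauts, which makes it usable as a quick check over a whole folder.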
Want to learn more about naming your data in a way that helps you to stay organised?
Take part in our Data Challenge on November 7th in Cologne and learn more about Metadata, Data structuring, and file naming (sign up here!).
4. Interoperability
You can enhance the use and reuse of your data by making them interoperable.
Don’t:
Encrypting your data (if not necessary for legal reasons)
Compressing data (e.g. in a ZIP file) or using compressed file formats (e.g. JPEG)
Using proprietary software
Instead do:
Use open standards
Add lots of metadata
Document your processes of gathering, processing, naming and storing your data
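One lightweight way to keep metadata and process documentation next to the data is a sidecar file in an open, text-based format. The following Python sketch writes such a file as UTF-8 encoded JSON. The function name `write_sidecar` and every field in `EXAMPLE_METADATA` are hypothetical; a real project would follow the metadata standard of its discipline or repository.

```python
import json
from pathlib import Path

# Hypothetical metadata record; all field names and values are placeholders.
EXAMPLE_METADATA = {
    "title": "Water sample 01, original scan",
    "creator": "Jane Doe",
    "sampling_date": "2025-09-01",
    "file_format": "TIFF",
    "processing_steps": [
        "sample collected",
        "sample scanned",
        "file renamed according to the project readme",
    ],
}


def write_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write `metadata` as JSON next to `data_file` and return the sidecar path."""
    # The sidecar keeps the full data file name and appends ".json",
    # so both files sort next to each other in a file browser.
    sidecar = data_file.with_name(data_file.name + ".json")
    text = json.dumps(metadata, indent=2, ensure_ascii=False, sort_keys=True)
    sidecar.write_text(text, encoding="utf-8")
    return sidecar
```

JSON is used here only because it is an open, widely readable format; the same record could equally be kept in a plain-text readme or in a discipline-specific metadata schema.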
5. Write a D(ata)M(anagement)P(lan)
DMPs are required by funding institutions, but they are also useful for you, your team and your collaborators because they raise awareness of the importance of the whole Data Life Cycle: Which data are gathered, how many, when and how? How are they processed, stored, archived and reused?
Don’t:
Starting with the DMP two days before handing in your grant application
Underestimating costs for processing and storing data
Underestimating costs for curating data (human resources)
Instead do:
Start early on so you have time to consider all the different stages of your data in the life cycle.
Think about potential costs in human resources, software and hardware as well as storage.
Want to learn more about DMPs? Useful resources are offered, among others, by the University of Cologne, the University of Duisburg-Essen, and the Heinrich Heine University.