Why does it matter what names you give your data files and folders? It can be crucial: clear names help you avoid mixing up your data and ruining your analysis. You do not want to risk that! In this video, you will learn the basics of how to organise your data in files and folders.



Transcript of video "How to structure ...: Organizing a data set: files and folders"

Lessons learned:
  • Decide early – before you start collecting data – how you will organise your data into files and folders
  • Organising your data in files and folders should give a clear overview of what your dataset contains and where in the dataset each item can be found.
  • The names of your files and folders should be consistent (follow the same pattern throughout your dataset), and be informative, telling anyone what is held in each file and each folder.
  • The names of the files and folders should be unique and clear, avoiding the risk of mixing up what data are held in which file and folder.
  • The name of a file should indicate which folder it belongs to.
  • Files are commonly listed in alphabetical order, so take this into account when you name your files and folders.
  • If a data file is updated with a new version, the version number should be visible in the file name.
  • Be sure to describe your file and folder naming convention in your Readme file.
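
To make the points above concrete, here is a minimal sketch of one possible naming pattern. The pattern itself (folder, description, date, version, joined by underscores), the function name build_file_name and the example values are all assumptions chosen for illustration; the course does not prescribe them. The sketch shows how a date written as year-month-day and a zero-padded version number keep alphabetical order in line with the order you actually want.

```python
from datetime import date


def build_file_name(folder: str, description: str, collected: date,
                    version: int, extension: str = "csv") -> str:
    """Compose a file name from the same parts, in the same order, every time.

    Hypothetical pattern: <folder>_<description>_<YYYYMMDD>_v<NN>.<ext>
    """
    parts = [
        folder,                         # tells which folder the file belongs to
        description,                    # tells what the file holds
        collected.strftime("%Y%m%d"),   # year-month-day sorts chronologically
        f"v{version:02d}",              # zero-padded so v02 sorts before v10
    ]
    return "_".join(parts) + "." + extension


# Illustrative values only: three versions of the same (made-up) data file.
names = [
    build_file_name("soil", "ph-readings", date(2022, 12, 8), v)
    for v in (10, 2, 1)
]

# Alphabetical sorting now matches the version order.
for name in sorted(names):
    print(name)
```

Without the zero padding, a name ending in v10 would sort before one ending in v2, which is the kind of mix-up the lessons above warn about. Whatever pattern you settle on, write it down in your Readme file.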

Food for thought 
Think through your own PhD project regarding the data that you have collected or generated, or plan to collect or generate:
  • How are these organised or how are you planning to organise them in files and folders?
  • How did you or do you plan to name your files and folders? 
Think through different alternatives for doing this, and assess and compare them.
Examples and more food for thought:
Here are some example datasets with several files that illustrate how to structure the files. Can you suggest some improvements to the naming of the files?
Note: In the Files tab, set Change View to Tree.


Last modified: Thursday, 8 December 2022, 13:41