4.3 Documenting your data: The Readme file
Besides rich metadata, the Readme file is the other means of documenting your data. So why do you need a Readme file when your dataset is already archived with rich, well-structured metadata? The main difference lies in purpose: metadata provides machine-readable information, while the Readme file is meant to be read by humans. However, the Readme file may still be searchable by text mining tools. So although you should view the Readme file primarily as a human-readable guide to your dataset, text mining can make it easier for people to find your dataset through searching.
When writing your Readme file, you can reuse information from your Data Management Plan, provided you have prepared that plan thoroughly.
In this video you will learn why the Readme file is crucial in order to avoid misinterpretation of the dataset.
Transcript of video "How to structure ...: Documenting your data - The Readme-file"
Lessons learned:
- The Readme file complements the metadata documentation, and is a human readable introduction and explanation of what information the dataset holds.
- Even the Readme file may be searchable through text mining tools.
- Take the needs of an outsider as the starting point, and include in the Readme file all information needed to make sure anyone is able to understand and interpret your dataset correctly, both now and also many years from now.
- You should start entering information into your Readme file early, and update the file as new information is obtained.
- The Readme file should be in a preferred plain format: either plain text encoded in UTF-8, or PDF/A.
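As an illustration of the last two points, here is a minimal Python sketch that writes a Readme skeleton as plain text encoded in UTF-8, so you can start early and fill it in as the project progresses. The section headings, the function names and the file name README.txt are assumptions made for this example, not a prescribed template; check the guidelines of the archive you plan to deposit in.

```python
from pathlib import Path

# Hypothetical section headings, chosen for illustration only.
# Replace them with the sections your archive recommends.
SECTIONS = [
    "Title of dataset",
    "Creator and contact information",
    "Date of data collection",
    "Methods used to collect the data",
    "File overview and naming conventions",
    "Variables, units and abbreviations",
    "Licence and terms of reuse",
]


def build_readme(title: str) -> str:
    """Return the text of a Readme skeleton with placeholder sections."""
    lines = [title, "=" * len(title), ""]
    for heading in SECTIONS:
        # Each section starts as a placeholder to be updated over time.
        lines += [heading.upper(), "[to be completed]", ""]
    return "\n".join(lines)


def write_readme(path: Path, title: str) -> None:
    """Write the skeleton as plain text, explicitly encoded in UTF-8."""
    path.write_text(build_readme(title), encoding="utf-8")


if __name__ == "__main__":
    write_readme(Path("README.txt"), "Readme for dataset: <your dataset title>")
```

Passing the encoding explicitly avoids depending on the operating system's default, which may not be UTF-8 on every machine.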
Think through your own PhD project and the data you have collected, or plan to collect. What do you see as essential to include and explain in a Readme file, to make sure your dataset is understood correctly by outsiders?