7.4 Archiving personal data
In most cases, personal data should not be published openly. However, we should also aim to make data from human studies findable and reusable by depositing them in a dedicated repository. Archiving such data requires precautions, and may be possible using a combination of:
- Controlled access
- Anonymisation/Pseudonymisation
- Informed consent
If your research collects personal information, the General Data Protection Regulation (GDPR) will apply to your data. Note that archiving data indefinitely conflicts with the GDPR's fundamental principles of data minimisation, purpose limitation, and storage limitation. Article 89 allows exceptions for research and statistical purposes. However, national legislation also comes into play and may affect local practice. Make sure you familiarise yourself with the local regulations and guidelines.
Personal data that have been anonymised can be openly published and shared because the GDPR no longer applies to them. Even though you are legally allowed to share anonymous data, good ethical research practice dictates that the participants should be informed of how you intend to use the data.
However, it may be difficult, and in some cases even impossible, to achieve completely anonymous data. A common problem is that removing information from your data reduces their value. You need to make a judgment call for each individual dataset to decide how much information to remove, which can leave a dataset only partially de-identified (pseudonymised). For the data to be truly anonymous, the process must be irreversible. Data that have been de-identified but can later be re-identified are still considered personal data. This means that as long as a code sheet or contact information exists on your computer, the GDPR applies to the data. Also, be aware that data that appear fully anonymised can still lead to the identification of individuals through a combination of information, either within or across datasets.
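The distinction between pseudonymised and anonymous data can be illustrated with a small sketch. It is only an illustration under stated assumptions, not a recommended procedure: the field names (`name`, `age_band`, `postcode`, `score`), the example records and the two helper functions are all hypothetical. The first function swaps a direct identifier for a random code and returns the code sheet, which shows why the result is still personal data as long as that sheet exists. The second counts how many records share the same combination of indirect identifiers; a smallest group of 1 means at least one person is unique in the dataset and could potentially be singled out.

```python
import secrets
from collections import Counter


def pseudonymise(records, id_field):
    """Replace a direct identifier with a random code.

    Returns the cleaned records and the code sheet. The code sheet maps
    original identifiers to codes, so whoever holds it can reverse the
    process: the cleaned records are pseudonymised, not anonymous.
    """
    code_sheet = {}
    cleaned_records = []
    for record in records:
        original = record[id_field]
        if original not in code_sheet:
            code_sheet[original] = "P" + secrets.token_hex(8)
        cleaned = dict(record)  # copy, so the input records stay unchanged
        cleaned[id_field] = code_sheet[original]
        cleaned_records.append(cleaned)
    return cleaned_records, code_sheet


def smallest_group(records, indirect_identifiers):
    """Size of the smallest group sharing the same indirect identifier values."""
    counts = Counter(
        tuple(record[field] for field in indirect_identifiers)
        for record in records
    )
    return min(counts.values())


# Hypothetical example data, invented for this illustration.
participants = [
    {"name": "Alice", "age_band": "30-39", "postcode": "9019", "score": 12},
    {"name": "Bob", "age_band": "30-39", "postcode": "9019", "score": 15},
    {"name": "Carol", "age_band": "60-69", "postcode": "9037", "score": 9},
]

pseudonymised, code_sheet = pseudonymise(participants, "name")
k = smallest_group(pseudonymised, ["age_band", "postcode"])
print(k)
```

In this example the smallest group has size 1, because only one participant has the combination "60-69" and "9037". Even with the names replaced, that record may be identifiable by someone who knows the person's age and postcode, which is the kind of judgment call described above.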
Personal data may be deposited in a repository that provides controlled access. The norm is to deposit the data in pseudonymised form. Your description of the dataset (metadata) and the access conditions will be published, so that interested researchers may discover the dataset and learn how to request access. To access the data, they must prove their credentials, and the purpose of the new research must comply with the terms of the dataset. Note also that the GDPR restricts transferring or storing personal data, even with controlled access, outside the EU/EEA unless specific safeguards are in place. This also applies to online platforms and tools for storage, analysis, or archiving. If you process data on persons, it is important that you and your collaborators abide by this.
Some types of personal data cannot be anonymised. In this video, you will learn how informed consent, in combination with access control, can be used for sharing of audiovisual research data that cannot be anonymised.
Transcript of video "Archiving sensitive data"
If you have already collected personal data and the original consent did not account for data sharing, consider obtaining new consent from the study participants.
The responsibility rests on you to exercise professional judgement. No matter how you choose to deposit (or destroy) the data at the end of the project, when publishing your results you should describe the availability of the data and the conditions for accessing them in a Data Access Statement. The statement should also justify any restrictions on access.
Lessons learned
- Sensitive data can often be shared by using a combination of informed consent, data anonymisation and restricted data access.
- Informed consent to open sharing of data containing personal information should be granular and specific.
Recommended reading if you want to learn more:
For more information on how to share personal data by combining informed consent, data anonymisation and regulated data access, please read Chapter 5 in CESSDA’s Data Management Expert Guide.