8.2 The Attribution condition in Creative Commons licenses
By Aysa Ekanger, licensed under CC BY 4.0
Last updated 2022-12-13
Creative Commons (CC) licenses allow users to freely re-distribute the licensed material, and most of the licenses allow users to modify and build on the material. All of the CC licenses contain the Attribution condition that has to be satisfied when users share the material, including in modified form.
The Attribution (BY) icon from Creative Commons
The Attribution condition in CC licenses seems to be easy to understand: credit the author – and that’s it, right? Wrong
Researchers are used to acknowledging other authors, and there are many disciplinary norms and many reference styles – but it is necessary to remember that the Attribution condition in CC licenses contains a range of obligatory elements that cannot be omitted. When you write an article and refer to a published dataset, you can resort to the reference norms of your choice; note that a dataset citation needs to include elements that a paper citation doesn’t (see information on how to cite datasets in the data citation section of the course). If you re-distribute someone else’s CC licensed dataset – in whole or in part, in original or modified form – additional requirements are imposed by the Attribution condition of CC licenses.
Keep in mind that CC licenses apply to copyright and (from version 4.0) to sui generis database rights, which means that the Attribution condition needs to be observed only in the kinds of reuse restricted by these rights. If your reuse is covered by exceptions and limitations to copyright, or does not involve extraction, copying and re-distribution of (un)modified parts of the licensed material, you are not obliged to satisfy the Attribution condition. If the licensed material has entered the Public Domain, you do not have to adhere to license conditions.
We’ve heard researchers say that they give credit by providing a list of DOIs for the source materials of their dataset. This is not proper attribution as defined by CC licenses and the associated sharing of the derived dataset would constitute a violation of the CC licenses’ Attribution condition.
What must be included in the attribution statement
So, you have found material that is licensed with a CC license and want to include this material in your dataset – how do you satisfy the Attribution condition? In the 4.0 version of CC licenses, Attribution consists of the following:
- Appropriate
credit, which consists of the following elements, if supplied:
- the name of the creator and attribution parties
- a copyright notice
- a license notice
- a disclaimer notice
- a link to the material
- A link to the license
- Indication of whether changes were made
Some CC license versions prior to 4.0 have slightly different content for the Attribution condition. An attribution condition in other licenses (not from CC) may also have different content. In order to comply with the conditions of a license, you must consult the content of the appropriate license or license version.
Licensors cannot modify the elements of the Attribution condition – and still identify the license as a CC license. Any additional credit elements (such as where the licensed dataset was published, or an instruction on where exactly in the derived work to place the attribution) must be formulated by the dataset’s rights holder as a request, not as a requirement. See more at https://creativecommons.org/faq/#alterations-and-additions-to-the-license.
One more important thing to remember is that, when providing attribution, users cannot imply endorsement by the creator of the original work – unless they have received an implicit permission from the creator to do so.
Where to attribute
Where do you put all this information? The Attribution condition in the 4.0 version of the CC licenses can be satisfied in 'any reasonable manner' – and what is considered 'reasonable' will differ in accordance with how, where and in what form you share the licensed material. A txt file (Readme file or License file) that will accompany your derived dataset in a data repository will be able to include all of the information that is part of the CC Attribution condition.
Of course, if you build your dataset based on multiple sources/datasets that all have an attribution condition, it will be a lot of work to include the necessary details of all those sources into your Readme file – but it is work that must be done, otherwise (unless you can satisfy the Attribution condition in any other reasonable manner) you will be violating the license conditions of your source material.
The Attribution condition in CC licenses, community attribution norms and the moral right of attribution
The Attribution condition in CC licenses (and its equivalent in other licenses) is not the only mechanism that ensures acknowledgement of sources – there are also community norms and copyright requirements.
The importance of giving proper credit and the unacceptability of plagiarism is emphasised in numerous documents and guidelines on research integrity/ethics, such as theEuropean Code of Conduct for Research Integrity, Principles of Transparency and Best Practice in Scholarly Publishing, and various country-specific and discipline-specific guidelines. What constitutes ‘proper credit’ is based on ‘good reference practice’ and may vary between disciplines, types of research output and countries.
The Attribution condition in CC licenses is a more rigid mechanism than community norms, as it dictates what elements must be included in the attribution (see previous section).
Attribution may also be required by copyright law: most countries of the world grant creators based in those countries the moral right of attribution – the right to be acknowledged as the author of a work in accordance with good practice. It varies from country to country what types of intellectual output are protected by the moral right of attribution, and various types of research data may or not be protected by this right. In the European Union, databases are generally not considered creative works and are thus not protected by the moral right of attribution (creatively compiled databases are), computer programs are considered creative works, and for text, images and sound, sometimes a legal assessment must be made of whether they can be considered creative works.
The Attribution condition in CC licenses does not depend on the type of research data – reusers of a CC-licensed dataset have to observe the Attribution condition regardless of whether this dataset is protected by the moral right of attribution or not. The Attribution condition in CC licenses must also be adhered to in those jurisdictions that do not recognise the moral right of attribution.
The
Attribution condition in CC licenses does not depend on the type of research
data – re-users of a CC-licensed dataset have to observe the Attribution
condition regardless of whether this dataset is protected by the moral right of
attribution or not. The Attribution condition in CC licenses must also be
adhered to in those jurisdictions that do not recognize the moral right of
attribution.
Attribution stacking
Photo: “Human pyramid formed by members of the Ebenezer Gym Club” by State Library Victoria is free of known copyright restrictions
“Datasets
are particularly prone to attribution stacking, where a derivative work must
acknowledge all contributors to each work from which it is derived, no matter
how distantly. If a dataset is at the end of a long chain of derivations, or if
large teams of contributors were involved, the list of credits might well be
considered too unwieldy. The problem is magnified if different sets of
contributors have to be credited in a different way, especially if automated
methods are used to assemble the dataset – some of the benefits of automation
are lost if attribution conditions have to be inspected manually.” (Ball 2014, p. 4)
The Attribution condition – and the possibility of attribution stacking – is something to think about when you choose a CC license for your own dataset.
You may consider using the CC0 Public Domain Dedication tool, which allows rights holders to share their datasets without any conditions. The CC0 tool is recommended for research data, as it does not impose any conditions and thus allows for maximum reuse of data. Reusers of material that is shared under CC0 are not legally bound by the tool to credit the 'author' (dataset rights holder), but they are expected to do so according to community norms (and possibly copyright requirements, such as the moral right of attribution). Community norms are more flexible than an attribution condition in a license, and they do not result in attribution stacking.Lessons learned
If you reuse CC-licensed sources in your own dataset, remember to attribute properly – following the requirements of the license and retaining the information supplied by the creator of the original source.
When you share your own research data, consider using the CC0 Public Domain Dedication tool.
References
- Ball, A. (2014). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: https://www.dcc.ac.uk/resources/how-guides. Licensed under CC BY 4.0.
- Creative Commons. Frequently Asked Questions. Available at https://creativecommons.org/faq/.