The exponential growth of data and the acceleration of technological advances are causing fundamental shifts in the way decision-makers think about business problems, driving entrepreneurial innovation, opening up new business opportunities and creating new jobs. As organisations and governments continue to leverage data and technology to drive strategic business and social objectives, these benefits and opportunities cannot be effectively realised in the absence of highly trusted, reliable and usable datasets.
A dataset attestation scheme can help to instil trust and confidence in the use of data within a data marketplace.
About the CODA Scheme
The overall objective of the Commercial and Open Dataset Attestation (CODA) scheme is to instil trust and confidence in the use of data by providing a standardised framework to assess a dataset’s metadata and the associated data publishing process.
The scheme aims to improve data users’ level of trust in a dataset by describing the opportunities and restrictions that apply to the dataset, the amount of effort needed to process it, the amount of support that can reasonably be expected, and the availability of enterprise-level support. For data providers, the scheme offers a structure for providing transparency on their datasets and a guide to improving data sharing and monetisation.
The scheme proposes a questionnaire which data providers complete for each dataset that they want attested. Based on a data provider’s responses to the questionnaire, each dataset is assigned an appropriate tier, which data users can use to identify datasets that fit their needs. This process is referred to as the conformance process for data providers.
After a tier has been assigned to a particular dataset, data users of the dataset can provide feedback to testify to or invalidate the declared conformance made by the data provider. This process is referred to as the attestation process for data users.
CODA Scheme Requirements & Tiers
Under CODA, a dataset is assessed for conformance in four broad areas:
- Legal – details the rights, licensing and privacy aspects of a dataset
- Practical – details the searchability, accuracy, quality and availability aspects of a dataset
- Technical – details the location (options for users to access dataset), formats and trust aspects of a dataset
- Social – details the documentation, support and services (accompanying / recommended tools to be used to work on the dataset) aspects of a dataset
Four tiers are achievable in the CODA scheme, with each tier having incremental requirements to fulfil. An overview of the different tiers is provided in the following diagram:
- Tier 1 - Data is licensed, accessible and legally reusable.
- Tier 2 - In addition to meeting the Tier 1 requirements, the data is documented in a machine-readable format, reliable and offers ongoing support from the publisher via a dedicated communication channel.
- Tier 3 - In addition to meeting the Tier 2 requirements, the data is published in an open standard machine-readable format, has guaranteed regular updates, offers greater support and documentation, and includes a machine-readable rights statement. Datasets containing personally identifiable information must also undergo an independent audit of the anonymisation process and privacy risk assessment.
- Tier 4 - In addition to meeting the Tier 3 requirements, the data has machine-readable provenance documentation, uses unique identifiers in the data, and the publisher has a communications team building a data user community.
The conformance process refers to how a dataset is assessed for conformance to the requirements of the scheme, and subsequently assigned an appropriate tier based on the assessment results.
The proposed approach is self-declared conformance, where data providers answer a questionnaire on the datasets that they wish to place onto the CODA scheme. Based on the responses given, an appropriate tier will be assigned for the particular dataset.
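The cumulative nature of the tiers can be illustrated with a minimal sketch. It assumes the questionnaire responses can be reduced to yes/no answers. The requirement keys (such as `licensed` or `unique_identifiers`), the `contains_pii` flag and the function `assign_tier` are hypothetical names paraphrased from the tier descriptions, not items from the actual CODA questionnaire.

```python
from typing import Dict, List

# Hypothetical requirement keys paraphrased from the tier descriptions;
# the real CODA questionnaire items are not reproduced here.
TIER_REQUIREMENTS: Dict[int, List[str]] = {
    1: ["licensed", "accessible", "legally_reusable"],
    2: ["machine_readable_documentation", "reliable",
        "dedicated_support_channel"],
    3: ["open_standard_format", "guaranteed_regular_updates",
        "greater_support_and_documentation",
        "machine_readable_rights_statement"],
    4: ["machine_readable_provenance", "unique_identifiers",
        "community_communications_team"],
}


def tier_met(tier: int, responses: Dict[str, bool]) -> bool:
    """Check the requirements that belong to one tier only."""
    if not all(responses.get(req, False) for req in TIER_REQUIREMENTS[tier]):
        return False
    # Assumed handling of the Tier 3 condition for datasets with
    # personally identifiable information: an independent audit is needed.
    if tier == 3 and responses.get("contains_pii", False):
        return responses.get("pii_independently_audited", False)
    return True


def assign_tier(responses: Dict[str, bool]) -> int:
    """Return the highest tier reached; 0 means Tier 1 is not met.

    Tiers are incremental, so the walk stops at the first tier whose
    requirements are not all satisfied.
    """
    achieved = 0
    for tier in sorted(TIER_REQUIREMENTS):
        if not tier_met(tier, responses):
            break
        achieved = tier
    return achieved
```

Because each tier builds on the one below it, a dataset that satisfies the Tier 3 items but misses a Tier 2 item would still be placed in Tier 1 under this sketch.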
The attestation process refers to the process where data users verify data providers’ declared conformance of their datasets to the CODA scheme.
Eligible attesters (data users) should be allowed to verify the declared conformance made by the data providers. These attesters help to create a community around the dataset. An attester authentication process needs to be in place to ensure review accountability and to prevent parties with malicious intent from submitting inaccurate ratings and feedback. This can be done by requiring attesters to register with the CODA administrators before submitting reviews, and by limiting review submission to parties that have used the data.
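The two eligibility conditions described above (registration with the CODA administrators, and prior use of the data) could be checked along the following lines. This is only a sketch: the `Attester` record, its fields and the `may_submit_review` function are hypothetical, and how registration or data usage would actually be recorded is not specified by the scheme description.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Attester:
    """Hypothetical record of a data user who may review datasets."""
    attester_id: str
    registered: bool = False  # registered with the CODA administrators
    datasets_used: Set[str] = field(default_factory=set)


def may_submit_review(attester: Attester, dataset_id: str) -> bool:
    """Allow a review only from a registered attester who has used the dataset."""
    return attester.registered and dataset_id in attester.datasets_used
```

For example, a registered attester whose `datasets_used` contains a given dataset identifier would be allowed to review that dataset, while an unregistered user, or a registered user who has never used it, would be refused.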
All enquiries regarding CODA can be addressed to firstname.lastname@example.org