Lesson 3.5: Developing strategies for implementing open data plans
This lesson aims to provide insight into how to develop and implement clear, sustainable strategies for an open data policy.
After studying this lesson, you should be able to:
acquire knowledge on how to develop and formulate a policy for open data
describe methods and considerations to implement an open data policy
engage stakeholders in policy implementation
identify sustainability issues in an open data policy and be aware of how to overcome these issues.
Before starting to publish any open data, it is important to have a clear strategy in place that defines the key goals and sets the ambition. This unit addresses these key ingredients for a successful open data initiative. Before taking action, define what you want to achieve with your open data strategy: where do you want to be, and by when? Will all data be available by default? Is all data stored centrally? These questions should be answered before any data is published.
In Europe, open data has been a focus for policymakers for over a decade. Revisions to the European Union Directive on Reuse of Public Sector Information (PSI) in 2013 made reusable and open public sector data the presumptive norm for Member States. The updated Directive also encouraged the adoption of standard licences for public sector data and strengthened mechanisms for people to challenge decisions that prevent information being available for reuse. Today, almost all European countries have an open data portal, and across the continent these portals are becoming more advanced, being used more frequently and creating more benefits for citizens. As open data moves from being a new initiative to business as usual for governments, ensuring open data infrastructure and policies are fit for purpose and sustainable in the long term is a top priority.
The European Data Portal (EDP) harvests metadata from the publication of open datasets in national, regional and local portals across the European Union, and seeks to improve the accessibility and usability of EU public sector information. As well as operating as a portal, the EDP provides training materials and guidance for open data publishers and reusers. To date, the EDP references just over 600,000 datasets from 34 countries and translates metadata into 18 languages. In addition to increasing the accessibility of open data, the EDP also supports public administrations in their endeavour to publish high quality datasets. Training material has been designed, along with a full suite of examples of open data reuse, reports and resources to inspire data publishers. Data reuse is also promoted and showcased on the portal, as a tangible illustration of the benefits of open data.
Open data is now a worldwide movement. More than half of the countries surveyed by the Open Data Barometer (ODB) in 2015 have an open data portal. In 93% of countries surveyed, even in countries where that data is not yet fully open, civil society and the technology community are using government data. OpenDataSoft estimates that there are more than 2,600 open data portals worldwide.
A data infrastructure consists of data assets, the organisations that operate and maintain them and processes, policies and guides describing how to use and manage the data. A data infrastructure can be seen as an ecosystem of technology, processes and actors/organisations needed for the collection, storage, maintenance, distribution and (re)use of data by the different end users in the agricultural sector.
One of the challenges in fostering open data for agriculture in government is that datasets are often distributed across different ministries and agencies, including sometimes (semi-) privatised bodies. Government structures across the globe vary, but in general the relevant information for agriculture can be found in:
the ministry of agriculture, including associated extension, research or subsidy bodies;
other government agencies (which may be semi-privatised) including a meteorological agency for weather and climate data, a mapping agency providing geographical data, and statistical offices conducting population surveys and monitoring; and
ministries dealing with water, natural resources, infrastructure, spatial planning, trade and finance.
Developing a data infrastructure for agriculture in any country is therefore not a matter for a single ministry. Success depends on collaboration between actors and organisations and aligning shared interests. However, the need for open collaboration may also increase the potential for innovation across multiple sectors. For instance, open weather data will be used by everyone from farmers to the transport industry to individual citizens.
A strong agriculture data infrastructure also requires that different datasets can communicate with each other. Adherence to common open data standards can help. A data standard is a guideline or series of guidelines that defines the way in which data should be collected or structured. By following the standard, similar data can be easily compared over time, across locations, and within and between organisations, as well as being easily manipulated to produce visualisations and identify trends. In other words, standards help to make reuse simple.
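As a minimal sketch of how a shared standard makes datasets comparable, the Python snippet below checks records from two imaginary publishers against a small, hypothetical standard. The field names (`region_code`, `crop`, `year`, `yield_t_per_ha`) and the rules are assumptions made for illustration; they are not taken from any published agricultural data standard.

```python
# Hypothetical mini-standard: field name -> expected Python type.
STANDARD = {
    "region_code": str,
    "crop": str,
    "year": int,
    "yield_t_per_ha": float,
}


def check_record(record):
    """Return a list of problems found in one record (empty if it conforms)."""
    problems = []
    for field, expected_type in STANDARD.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    # Fields that the standard does not define are reported as well.
    for field in record:
        if field not in STANDARD:
            problems.append(f"unexpected field: {field}")
    return problems


# Illustrative records from two imaginary publishers.
ministry_record = {
    "region_code": "R01", "crop": "maize", "year": 2020, "yield_t_per_ha": 2.4,
}
agency_record = {
    "region": "R01", "crop": "maize", "year": "2020", "yield_t_per_ha": 2.4,
}

print(check_record(ministry_record))
print(check_record(agency_record))
```

The first record conforms, while the second uses a different field name and stores the year as text, so the two could not be combined without manual clean-up. A real standard would usually also fix units, code lists and identifiers.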
At national level an open data strategy for agricultural transformation should be carefully embedded in the agricultural, societal and political context of a country. Relevant questions that need answering are:
What is the role of agriculture in national and rural development and what are the likely directions of agricultural development, including the challenges and opportunities?
What is the current agricultural policy and how will this policy enforce certain agricultural development?
Who are the actors that need to be engaged and why? What will their roles be?
What data can be published to support these developments?
What are the current governmental ICT policies, and can they be used to support the publication of open data for development?
Which international policy frameworks can be used to leverage the open data for agriculture agenda?
An open data strategy needs to provide a vision, mission and an action plan clearly interlinking the answers to these questions. More on developing a vision and mission can be found in the e-Agricultural Strategy Guide, which also sets out guidelines on how to build an action plan.
Lessons learned and basic principles used in the development of government strategy can have broader relevance in non-governmental and private sector contexts. As you develop the vision and mission (either at national or organisational level) for your strategy, ask yourself:
1. What do you want to achieve? This should be based on the objectives and strategy of the organisation as a whole. What is the mandate? What is the organisation trying to achieve, and what are the driving factors behind it?
Aligning the open data strategy to the overall organisational strategy is a good way to get buy-in from the relevant stakeholders as well as ensuring sustainability. This process will inform what data you will prioritise for release and enable you to measure the impact of your initiative. Discuss this with your group of representatives and create a clear picture of that ‘to-be situation’. In some cases, the mandate to create an open data policy or plan might come from donors, while in the case of the private sector it might be corporate social responsibility or the potential to contribute to open innovation. It is important to consider this in advance.
2. What is the current situation? Define a clear picture of how things stand: Which units collect data? Which ones use data? Which ones produce data? What type of data do they gather? What format? Is data centrally organised? Is data currently published?
By defining clear and measurable goals, your organisation will be able to work towards those end goals and measure whether you have achieved them or not. While creating the goals, think of the primary reasons for publishing the data. These could be, for example, becoming more transparent or stimulating the economy. Make sure to be precise. Goals should be described in terms of scope, timing, deliverables and quantities, among others.
A well written open data policy will clearly define the commitment of the organisation to publishing, sharing and consuming data. As was discussed in the earlier lessons, data exists on a spectrum: it can be closed, shared, or open. Open data is data that anyone can access, use and share. A growing number of public and private sector organisations are drafting open data policies that outline how they intend to openly publish data. Increasingly many organisations are also relying on open data published by governments and by other organisations in their sector. An open data policy can also help encourage informed reuse of third-party data.
A well written open data policy will also be used by internal stakeholders to help identify and prioritise releases, and by external stakeholders to understand how an organisation will be releasing its data and ways in which they can be involved.
An official open data policy is one of the most effective ways to obtain organisational support and transformational change with your open data initiative, because it details your ambition and the way you intend to realise it. It will support your implementation and set the standard for the field. It will create the transition, increase the transparency of your organisation and ensure the best use of your data! The translation of your open data strategy into a solid policy is of great importance to ensure its successful implementation.
A good open data policy will include some general context that helps to define its scope, for example:
a definition of open data – why it is important to the organisation and the reasons for defining a policy
a general declaration of principles that should guide the release and reuse of open data
an outline of the types of data collected by the organisation and whether they are covered by the policy
references to any relevant legislation, policies or other guidance which also apply to the management and sharing of information with third-parties.
Clearly stating the scope of a policy will help all stakeholders to reach a common understanding of how, when and where it should be applied. A good open data policy is essential to support the development and success of an open data infrastructure.
Research suggests the following elements should be included in an open data policy:
data: a datasets policy or statement on access to and maintenance of electronic resources;
time limits: set timeframes for making content accessible or preserving research outputs;
data plan: requirement to consider data creation, management or sharing;
access/sharing: promotion of datasets, deposit in repositories, data sharing or reuse;
long-term curation: stipulations on long-term maintenance and preservation of data;
maintenance of the policy: stipulating who is responsible for updating and maintaining the policy as new situations and realisations arise;
monitoring: whether compliance is monitored, or action taken such as withholding funds; the policy should make statements regarding compliance with it and clarify measures for non-compliance;
guidance: provision of FAQs, best practice guides, toolkits, and support staff;
repository: provision of a repository to make published datasets accessible;
costs: a willingness to meet publication fees and data management/sharing costs;
technical specifications to allow reuse: to enable research data reuse and citation, funders should require the policy to include information on metadata, DOIs, interoperability of systems, machine readability and mineability, and software;
licensing: the policy should require that data is accompanied by licensing describing the terms of use;
provisions for long-term availability: policies should include provisions for the long-term availability of data, since reuse and availability are primary reasons for open access to research data.
When formulating a good open data policy, the following elements should be considered:
the approach to identifying and prioritising data for release: how will data be inventoried, reviewed and then released?
privacy considerations: ensuring that personal information is not released by mistake and recommending steps to mitigate, e.g. by undertaking privacy impact assessments or approaches to anonymisation
data licensing and reuse rights: this will include not only the licence under which data will be released, but also the importance of clearing rights during data collection
data publishing standards: ensuring that data is shared in well structured, machine-readable formats, with clear metadata and documentation
engaging with reusers: how the organisation will work with external stakeholders to help guide release of data and ensure it can be easily used
measuring success: what metrics the organisation will use to measure whether the policy is successful and how these measures will be shared
approach to consuming open data: for organisations that are reusing open data, guidance on how to identify high-quality datasets and ensure reuse rights are clear
concrete commitments: what the organisation is committing to do, in concrete terms, over the timespan of the policy
policy transparency: how the policy and the processes it describes will be reviewed based on feedback from stakeholders and lessons learned.
A policy document will not necessarily include detailed information on each of these areas, e.g. specific standards or release processes. It will instead focus on general principles that should be followed and which may inform the drafting of more detailed guidance for practitioners.
The following sections provide checklists of policy elements that can inform the drafting and review of open data policies. The open data maturity model also includes relevant guidance that highlights how a mature organisation will implement a number of the more detailed processes and policies.
Policy context
Is there a clear definition of closed, shared and open data?
Does the policy outline why publishing and consuming open data is of benefit to the organisation?
Does the policy describe the types of data that the organisation collects and stores, with an indication of which types of dataset might be suitable for release?
Does the policy reference relevant legislation or other organisational policies and best practices that are relevant to the application of the policy?
Is there a clear declaration of the principles that underpin the policy? For example, whether the organisation is adopting the Open Data Charter.
Data licensing and reuse rights
Does the policy have a clear recommendation of the default open licence under which data is to be released?
Is there reference to the need to ensure that the rights to publish are properly cleared and understood, starting from when data is collected through to its publication?
Does the policy refer to where open data might be embedded in procurement processes?
Identifying and prioritising data for release
Does the policy highlight if and how data might be prioritised for release? E.g. based on user feedback, FOI requests, etc.
Does the policy note the importance of an inventory of internal data assets to help drive the data release process?
Does the policy outline the process by which data will be released, especially highlighting any decision points, risk assessments, etc?
Privacy considerations
Does the policy clearly indicate that personal data should not and will not be released as open data, unless there is either consent from affected parties or other legitimate basis for its release?
Does the policy indicate the need to anonymise or aggregate data prior to its release?
Does the policy reference relevant data protection laws and standards that relate to the collection and subsequent sharing of data?
Data publishing standards
Does the policy state that data will be published in both human- and machine-readable formats, with a preference for open standards to encourage wide reuse?
Is the creation of good quality metadata and supporting documentation highlighted as an important aspect of publishing high-quality data?
Does the policy suggest measuring quality of publication against industry best practices, e.g. using open data certificates?
Engaging with reusers
Does the policy set out how users can engage with the publisher to request and help prioritise data for release?
Are there channels for users to provide feedback, e.g. on quality issues or to ask for clarifications?
Does the policy outline a wider strategy for engaging with reusers, e.g. through workshops, industry events, etc?
Approach to consuming data
Is there clear guidance on how to identify whether third-party open data is appropriately licensed for reuse?
Are there suggestions for how to find and source reliable, high-quality data, e.g. by reference to government or industry portals, or services like open data certificates?
Concrete commitments
Does the policy state what the organisation will do in terms of improving its own capability, including development of further guidance and training for its staff?
Does the policy make concrete commitments to the publication of particular open data within the timeframe of the policy (e.g. a number of datasets within 1 year)?
Does the policy make commitments about the quality of publication of open datasets (e.g. that a certain percentage will have achieved a specific rating of open data certificate)?
Does the policy commit to datasets that are released being maintained over time, and for how long?
Policy transparency
Does the policy indicate the timespan that the policy covers?
Is it clear how the open data policy will be revised and how feedback can be provided?
Is the responsible party for the policy identified?
Responsible open data publication
Data release can increase the inequality between different groups if one group (generally the more resourceful group) has better access to the data than other groups. In particular, smallholder farmers or indigenous people face barriers of insufficient data skills, language and literacy when working with data. Capacity building programmes or targeted applications may help to overcome this problem.
The proactive recognition of the inequalities at play is important when designing an open data infrastructure for agriculture. Possible inequalities are strongly context-specific and vary from place to place between, but also within, countries.
Governments can prevent these imbalances by including a support programme for smallholder farmers and other vulnerable groups to increase awareness and communication. In some cases, sensitivities may be solved by technical means: through anonymisation, aggregation of data, or by making the data available through interactive visualisation tools working on the raw data but not providing direct access to it, such as the agrimatie.nl portal.
This is fully discussed in Unit 1, Lesson 1.2.
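The aggregation approach mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a complete anonymisation method: the record layout, the field names and the minimum group size of 3 are all hypothetical, and aggregation on its own does not guarantee that individuals cannot be re-identified.

```python
from collections import defaultdict

# Hypothetical threshold: regions with fewer farms than this are withheld.
MIN_GROUP_SIZE = 3


def aggregate_by_region(farm_records, min_group_size=MIN_GROUP_SIZE):
    """Average yield per region, dropping farm identifiers and small groups."""
    yields_by_region = defaultdict(list)
    for record in farm_records:
        # Only region and yield are carried forward; farm_id is never copied.
        yields_by_region[record["region"]].append(record["yield_t_per_ha"])

    published = {}
    for region, yields in yields_by_region.items():
        if len(yields) >= min_group_size:
            published[region] = {
                "farms": len(yields),
                "mean_yield_t_per_ha": round(sum(yields) / len(yields), 2),
            }
    return published


# Illustrative farm-level records (invented values).
farms = [
    {"farm_id": "F1", "region": "North", "yield_t_per_ha": 2.0},
    {"farm_id": "F2", "region": "North", "yield_t_per_ha": 3.0},
    {"farm_id": "F3", "region": "North", "yield_t_per_ha": 4.0},
    {"farm_id": "F4", "region": "South", "yield_t_per_ha": 5.0},
]

print(aggregate_by_region(farms))
```

Here the single farm in the South region is withheld because publishing its regional "average" would reveal that farm's own figure. An appropriate threshold depends on the local context and on applicable data protection rules.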
Building an open data infrastructure goes beyond the hardware, the software and the data assets. It is about building an ecosystem where the whole chain from data collection, data processing, infrastructure maintenance, information services through to end user interactions needs to be considered, and the interests and roles of different actors in the ecosystem need to be clear and aligned.
To better understand local requirements and opportunities, a national consultative process is required to find local answers to questions such as:
What are the objectives to be realised with the open data infrastructure?
Whose data needs will be served and what are the resulting requirements?
Which local developments and initiatives can be linked to the open data infrastructure?
What are the risks?
The answers will come as part of an iterative process: it is unlikely that anyone will find one answer and then stick with it, but rather the consultation will initially focus on tackling a challenge area and then extend. Strategies for stakeholder engagement can be found in the World Bank Open Data Toolkit and the FAO E-Agriculture Guide.
An open data infrastructure for land, nutrition and agriculture can be developed with different levels of ambition. As an end goal, a government may want to share as many of the key datasets as possible. A practical first step is to meet existing data sharing needs within the government by publishing that data openly.
For example, many official records are already collected and shared within and also outside the government in a structured way. These lists of organisations, people or products that are officially registered, permitted or restricted can easily be made available online in a machine-readable format with limited effort. The new way of data sharing saves effort and costs because all information is accessible in an efficient way.
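As a rough sketch of what publishing such a register in a machine-readable format can involve, the Python example below serialises a small list of records as both CSV and JSON using only the standard library. The register contents and field names (`licence_no`, `name`, `status`) are invented for illustration.

```python
import csv
import io
import json

# Hypothetical register of licensed seed suppliers (illustrative data only).
register = [
    {"licence_no": "S-001", "name": "Example Seeds Ltd", "status": "permitted"},
    {"licence_no": "S-002", "name": "Sample Agro Co", "status": "restricted"},
]


def to_csv(rows):
    """Serialise a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same rows as JSON for reusers who prefer that format."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(register)
json_text = to_json(register)
print(csv_text)
print(json_text)
```

Offering the same register in more than one open format costs little once the data is structured, and lets reusers choose whichever format fits their tools.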
Another opportunity may be to publish data that is shared at international level in relation to international agreements. For example, members of the African Union are fostering the Comprehensive Africa Agriculture Development Programme (CAADP) as Africa’s policy framework for agricultural transformation, wealth creation, food security and nutrition, economic growth and prosperity for all. Members report indicators in the Regional Strategic Analysis and Knowledge Support System (ReSAKSS) as a means of monitoring and a measure of success. National Agricultural Investment Plans are also monitored during implementation. Currently, indicator data (e.g. agricultural added value, yield size, fertiliser use, etc.) are only available at the most aggregated level, the national level. However, the original monitoring data provides a much more detailed picture of economic and agricultural development, indicating differences between regions and municipalities. If this data is published as open data, stakeholders in the agricultural sector can anticipate these differences and make the functioning of the value chain and associated services more efficient. NGOs, donors and governments could also use this data to refine their rural development programmes, accelerating the development process.
A third opportunity is to align the development of the agricultural data infrastructure with the Sustainable Development Goals (SDGs) indicators. As in the previous example, the monitoring data needed to evaluate SDG2 is beneficial for the decision making of different actors in the agricultural sector, particularly when the data is published not only in aggregated form but also at the finest possible grain, taking into account privacy and other responsible data issues. The Open Data Charter Resource Centre provides a resource for the development of an SDG2 monitoring roadmap.
The development of a government open data infrastructure does not stand on its own but is part of a global movement for open data for land, nutrition and agriculture. More and more governments and international organisations are publishing their data. It is important to link to these international initiatives to maximise reuse and impact from the data and to avoid duplication.
A number of international organisations provide data sets with a global coverage that are used for agriculture, nutrition and land management. Examples include:
International organisations
Science groups
Having policies and standards in place that set out what best practice data publishing looks like, and how it will be monitored and assessed, is the necessary backbone for any potential hard levers enforcing data quality. These do not need to be based in legislation, but they should be enforceable, through a review panel, or direction from a senior official/government minister, for example.
Practical experience shows that, in many cases, the team coordinating the open data repositories or portals does not have the necessary authority to enforce data-quality standards and seek the publication of key datasets by other public sector bodies. Senior leadership, highlighting best practice for the rest of government and enforcing standards, is essential to continue to drive change.
Open data project implementations need financing, both for the infrastructure and maintenance of the portal and for any outreach, training and support for publishers and reusers of data that is within the scope of the portal’s operations. While the cost of software and hardware continues to fall, the cost is not zero and people operating the portal still need to get paid. There are several factors to consider in a financing model. It needs to:
allow the team operating the service and planning its strategy to work with a known budget, and have confidence as to its longevity;
account for updates and enhancements to the portal, as well as bug fixing;
give users of the portal (both publishers and reusers) confidence that it is to be a sustainable mechanism for accessing open data.
Communities of practice are groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly. An open data community of practice could have one or several reasons for forming:
solving key issues and challenges
mapping knowledge and identifying gaps
documenting projects and discussing developments
seeking experience and building confidence
coordination and strategy
reusing data assets.
Communities of practice can be formed of several actors and players in the agriculture value chain to coordinate a collective and holistic approach to collecting, sharing and consuming data relevant for increased productivity and profit.
You might be interested in GODAN working groups which have been created as spaces for partners to collaborate, share ideas, experiences and ways forward on how open data can be used to solve key issues and challenges in the agriculture and nutrition sectors. Some of the issues under discussion in the communities include:
data rights and responsible data working
data ecosystem
capacity development
Kenya data integration
SDG2 accountability framework
soil data
publication and alignment of authoritative vocabularies for food.
The European Union suggests the following 10 ways (see Figure 1) to make open data portals more sustainable.
To monitor the success of your open data initiative, consider attaching metrics to your publications. With these metrics, you can evaluate several indicators. The most useful evaluation activities are performance of the data, performance of the system, and collection and preparation performance.
It is also important to engage reusers, to monitor various other key aspects of your initiative. This will enable you to constantly improve your work by acting on the feedback of reusers and learning from your key monitoring indicators.
Performance of the data: This evaluation includes checking the number of downloads and page views. These are not the same, but both indicate the popularity of the dataset. They do not indicate the usefulness of the dataset: one cannot conclude whether the data has been reused based on the number of downloads.
Performance of the system: An important metric, especially when the data is available through an API. Here you want to evaluate whether the system can handle the requests, if there has been any downtime, and if there are performance consequences for other systems.
Collection and preparation performance: User feedback is evaluated in terms of the usefulness of datasets. Usefulness combines qualitative usefulness (is it helpful for a particular purpose?) and practical usefulness (is the data described, clean, dense enough, etc.?).
You should consider including metrics that will enable you to measure the success of the publication of data and your metadata. Think of the following metrics:
qualitative feedback
number of downloads per set
click through rate
reuser rating of quality
cost per download.
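The quantitative metrics in this list can be computed from a few counters, as in the minimal Python sketch below. The input values are invented, and the definitions are assumptions made for illustration: click-through rate is taken here to mean downloads divided by page views, and cost per download to mean the annual cost attributed to a dataset divided by its downloads.

```python
def dataset_metrics(page_views, downloads, ratings, annual_cost):
    """Compute simple indicators for one dataset from hypothetical counters."""
    return {
        "downloads": downloads,
        # Assumed definition: share of page views that led to a download.
        "click_through_rate": downloads / page_views if page_views else 0.0,
        # Mean of reuser quality ratings, or None if nobody has rated yet.
        "mean_rating": sum(ratings) / len(ratings) if ratings else None,
        # Assumed definition: attributed annual cost divided by downloads.
        "cost_per_download": annual_cost / downloads if downloads else None,
    }


# Illustrative figures for a single dataset.
metrics = dataset_metrics(
    page_views=2000, downloads=500, ratings=[4, 5, 3], annual_cost=1000.0
)
print(metrics)
```

As noted above, figures like these indicate popularity rather than reuse, so they are best read alongside qualitative feedback from reusers.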
Evaluating the success of your implementation: Your experience is a great source of improvement. After thoroughly evaluating your efforts, metrics and the benefits, revise your policy and your strategy and adapt where necessary. From what you have learned, what can be improved? Formulate next steps and implement them.
In Europe, open data has been a focus for policymakers for over a decade. Revisions to the European Union Directive on Reuse of Public Sector Information (PSI) in 2013 made reusable and open public sector data the presumptive norm for Member States.
A data infrastructure consists of data assets, the organisations that operate and maintain them and processes, policies and guides describing how to use and manage the data. A data infrastructure can be seen as an ecosystem of technology, processes and actors/organisations needed for the collection, storage, maintenance, distribution and (re)use of data by the different end users in the agricultural sector.
Developing a data infrastructure for agriculture in any country is therefore not a matter for a single ministry. Success depends on collaboration between actors and organisations and aligning shared interests. However, the need for open collaboration may also increase the potential for innovation across multiple sectors. At national level, an open data strategy for agricultural transformation should be carefully embedded in the agricultural, societal and political context of a country.
As you develop the vision and mission (either at national or organisational level) for your strategy, ask yourself:
What do you want to achieve? This should be based on the objectives and strategy of the organisation as a whole.
What is the current situation? Define a clear picture of how things stand. Which units collect data? Which ones use data?
A well written open data policy will be used by internal stakeholders to help identify and prioritise releases, and by external stakeholders to understand how an organisation will be releasing its data and ways in which they can be involved.
Policy context
Is there a clear definition of closed, shared and open data?
Does the policy outline why publishing and consuming open data is of benefit to the organisation?
Does the policy describe the types of data that the organisation collects and stores, with an indication of which types of dataset might be suitable for release?
Does the policy reference relevant legislation or other organisational policies and best practices that are relevant to the application of the policy?
To ensure sustainability of the open data policy, you should:
identify local needs and opportunities
think big, start small, harvesting the low-hanging fruit
link to the global ecosystem for open data in agriculture
create a leadership role to champion data publication and respond to issues
create a community of practice.
To monitor the success of your open data initiative, consider attaching metrics to your publications. With these metrics, you can evaluate several indicators. The most useful evaluation activities are performance of the data, performance of the system, and collection and preparation performance.
It is also important to engage reusers, to monitor various other key aspects of your initiative. This will enable you to constantly improve your work by acting on the feedback of reusers and learning from your key monitoring indicators.
Organise for use
Consider the user experience to determine ways of organising datasets such that it improves the user data experience, inclusivity and reach. Reconciling this underlies future success of the repository.
Promote use
Increase the sharing of skills and knowledge to develop wider use of data, e.g. facilitate the creation of curated lists of datasets, which are useful in agriculture – both from within the repository and across other portals.
Be discoverable
Data should be embedded in the user experience both from a functionality and an engagement point of view, not fragmented and inconsistent across the different channels. Also highlight other data portals that may be of use, and possibly share cross-portal facilities.
Publish metadata
Accurate metadata is critical not only for findability but also cataloguing – poor metadata can undermine the repository itself.
Promote standards
Well-defined common standards enable parties to understand the concepts that are relevant in the data domain, the way they are named, their attributes and connections to other concepts as defined and agreed within a community of practice.
Co-locate documentation
Supporting documentation should be accessed immediately from within the dataset and should be context-sensitive so that users can directly access information about a specific item of concern.
Link data
Successful exploitation of datasets depends on the ability within repositories to link to core reference data (e.g. types of places, organisations, products, assets), open address data and open geospatial data (mapping), units of measurement, temporal information, etc. This will allow the cross-referencing and analysis, on a non-personal basis, of multiple datasets that are currently siloed or not interoperable.
Measurable
Kinds of metrics to consider: usage (for publishers) and quality (for users). These include absolute metrics (to point to areas in a dataset that require improvement), fitness for use (which could increase confidence in a dataset) and user reviews.
Co-locate tools
The careful curation and provision of tools in simple categories linked to datasets and their uses can have a huge impact on an individual’s ability to explore a dataset and decide on its relevance. One example is Eurostat’s visualisation tools, covering many themes including demographics, economics and key themes.
Be accessible
Work with data publishers to improve publication formats, and act as a feedback filter between users and publishers.