Lesson 4.4 Interoperability of Farm Data

Learning outcomes

At the end of this lesson, learners will (be able to):

  • Identify relevant data standards and key organizations / projects for farm data sharing

  • (data/service providers, developers) use data formats and vocabularies in their farm data services

  • Understand samples of farm and weather data in some existing data formats

1. Introduction

[Part of the content of this lesson comes from the GODAN Action gap analysis on weather data standardization, which covers the special case of farm management information systems.]

This course addresses the management of data from and for farmers, so primarily farmer profiles and farm data management systems (or Farm Management Information Systems, FMIS). FMIS normally handle more specific, often agronomic or technical information, for farm decision making. Some of this information is in some cases also used in farmer profiling platforms, especially in order to provide data-driven advice to farmers, but not often.

Regarding farmer profiles strictly, there don’t seem to be any initiatives to develop exchange standards for this type of data. However, the methodologies used for agricultural censuses can give some guidance on standardizing the content of elements in a farmer profile. See for instance the FAO methodologies for the World Agricultural Census.

FMIS are a relatively new area. These are added-value services, in which primary data (either from the farm or from external data services) of very different types (crop data, data on nutrients, on pesticides, on soil, on weather) is integrated, processed, sometimes run against models, visualized and made actionable. They’ve been so far mostly the domain of the private sector: big companies as well as smaller start-ups have created all sorts of farm management services or, as they’re called in the community of intermediaries, Farm Management Information Systems (FMIS).

Such systems have to interface with the machinery that collects the data (e.g. soil moisture) and the machinery that executes the operations (e.g. a sprinkler) and possibly other systems that process the data down the line. In the majority of cases, data in these systems is designed to be interoperable within the system, using formats that are tightly coupled with the suppliers’ machinery interface, with no apparent incentive for collaboration and no need to share data widely. However, there is demand for wider interoperability, primarily for making machines and data from different suppliers work with equipment from other suppliers, but also for making farm data reusable by other FMIS and giving the farmer freedom to switch providers.

This lesson illustrates the current situation of data standards for farm data, listing published standards as well as projects that are working on new standards.

At the end of the lesson, some examples of interoperable farm data fragments are provided, with a commentary to explain the different approaches and how they use formats and vocabularies to make data interoperable.

These descriptions and examples illustrate that there is work underway to develop standards for various types of information in this area, but it has to be noted that none of these standards is so widely accepted that its adoption should be considered a best practice: they are illustrated here to make learners familiar with interoperable farm data and to provide inspiration to data managers and developers on new ways to make their data interoperable.

2. Standards for farm observation data

For some types of data normally used in FMIS, like crop basic data, crop growth data, soil data and soil profiles, weather data etc., data standards exist, but they’ve been developed outside the community of intermediaries that develop tools for farmers:

  • Crop basic data (from germplasm descriptors to official names to product classifications) have been standardized by normative bodies (FAO, ITPGRA, EFSA, USDA...) and their core properties have been modelled in ontologies by research institutions (CGIAR, INRA...)

  • Some data standards for crop growth data and crop growth models have been developed by research institutions that wanted to share or reuse models (AgMIP, AgTrials)

  • Data standards for soil observations, soil profiles and soil properties (chemical properties, physical properties) exist (from USDA and FAO classifications to INSPIRE data specifications)

  • Weather data standards, as seen in the previous chapters, have been created by meteorology agencies.

Other data used in FMIS (data about machinery, sensors, agricultural input like fertilizers and pesticides, in some cases sales management data) partly follow industry standards and partly are just encoded in closed proprietary formats.

While developers of FMIS may need to be aware of data standards when they import external data (crop core data, historical observations, climate data, predictive models), they have no reason to apply them in the storage or further re-packaging of data, at least as long as farm management data is meant to be used only locally or within the service network. And in case data is exchanged between pieces of machinery or within a network, it is normally in proprietary and closed formats established by suppliers (suppliers of agricultural input, machinery and software).

The only standards that are normally applied in the interface between FMIS and machinery are ISO standards, especially ISO 11783 “Tractors and machinery for agriculture and forestry -- Serial control and communications data network” (known as ISOBUS), because it is a standard that allows pieces of machinery to communicate.

However, there are a few trends that indicate that the need for more standardization of data in these services is growing:

  • Farmers and farmers associations creating or joining consortia with the intent of sharing data or at least being able to transfer data across software packages. Some intermediaries are intercepting these needs. In particular, we will see that the AgGateway and the Open Ag Data Alliance consortia are leading efforts towards standardizing all farm management data or at least providing crosswalks to be able to transfer and share data.

  • The need for more efficient pluggability of machinery components of different brands and their communication with FMIS. “Each proprietary tool sends pieces of data such as soil moisture, weather, or water measurements in its own way.” Again the AgGateway consortium is intercepting this need and working with machinery providers, data providers and FMIS providers to standardize data formats at least in the crucial parts of the workflow where different pieces of machinery have to communicate between themselves or with the FMIS.

  • While competition and patents lead manufacturers to keep part of the data and messages in their machinery operation in proprietary formats, they have been collaborating for a long time on agreed interface standards, especially ISOBUS. (There is of course the exception of some bigger players that do not adhere to ISOBUS and rely more on their monopoly position.)

  • Intermediaries may want to be able to reuse predictive models coming from research (e.g. crop growth models, climate models…) and therefore need to model their data in a compatible way (“if observed and simulated data are to be compared, it will be helpful if their metadata describes them in the same way”[1]). Examples may be the models available from initiatives like AgMIP or crop management data standards like the ICASA standards.

  • There are international industry standards (ISO, UN…) that were designed mainly for traceability and food security reasons and are normative standards when it comes to trade and they’re recommended in general for information exchange between farms on one side and suppliers, traders and other partners in the agrifood supply chain on the other.

2.1. Existing published data standards

2.1.1. Industry standards

These standards are important for FMIS for regulatory reasons. On the one hand, data related to trade, from product traceability to invoicing, especially when it comes to export, has to comply with these standards. On the other hand, in the case of ISO 11783, it is essential for the operation of machines. Since decision support is not the objective of these standards, weather data is not essential, although there are segments for measurements of different types and for event conditions, including weather.

  • UNECE standards, in particular UN/EDIFACT Data plot sheet. This is a “Data Plot Sheet” (DPLOS) published by the UN “Electronic Data Interchange for Administration, Commerce and Transport” (EDIFACT) as a syntax to “represent data concerning input and techniques used on crops with the intention to facilitate information exchange with suppliers, traders and other partners in the agrifood supply chain”. The Detail section provides the breakdown of 1 to n plots sheets contained in the exchange:

    • General points on the plot sheet (dates, species, variety, area, contracts, etc.)

    • History of the plot (previous crops, enrichment, etc.)

    • Analysis (details of soil analyses carried out on the plot)

    • Events (i.e. all events such as observations, advice, actions taken, etc.)

  • More specialized “messages” compliant with EDIFACT messages have been created, with related XML schemas. In Europe, Agro EDI Europe is leading these efforts, producing standards like the "Agronomic observations" or AgroObs, part of the EPIPHYT project.

  • ISO standards, in particular ISO 11783 “ISOBUS”. ISO 11783 “Tractors and machinery for agriculture and forestry -- Serial control and communications data network” is a communication protocol for the agriculture industry. Particularly interesting concerning data standardization is part 11, the ISOBUS Data Dictionary. The dictionary lists a huge number of “entities” (and related definitions, units of measure and symbols) used in the transmission of data from farms, from all sorts of observed treatments and properties of crops (spray application, tillage, seeding depth, yield, crop loss...) and devices to (a few) properties read from weather stations, like air humidity and temperature.

2.1.2. Research-based agronomic data models

These schemas were designed for crop management and are quite suitable for decision support. They all cover weather data.

  • DSSAT ICASA2 data standards. The Decision Support System for Agrotechnology Transfer (DSSAT) is a software tool developed through collaboration among scientists at the University of Florida, the University of Georgia, University of Guelph, University of Hawaii, the International Center for Soil Fertility and Agricultural Development, USDA-Agricultural Research Service, Universidad Politecnica de Madrid, Washington State University, and other scientists associated with ICASA. It combines crop, soil, and weather data bases with crop models and application programs to simulate multi-year outcomes of crop management strategies. DSSAT also provides for evaluation of crop model outputs with experimental data, thus allowing users to compare simulated outcomes with observed results. It comprises crop simulation models for over 42 crops. It’s supported by data base management programs for soil, weather, and crop management and experimental data. DSSAT uses the ICASA2 data standard. Although the software and standards were created for research purposes, more precisely for field experiments, the experiments are about crop growth and the prescribed minimum set of data largely corresponds to the weather and agronomic data normally used in farm management software. As the authors of the standard say, it is intended “to allow description of essentially any field experiment or commercial crop production situation”. The ICASA standard is used also by AgMIP as a format to import data as input to models (indeed the data needed for building the crop growth model are the same data needed to apply the model to crop growth decision support tools). ICASA is explicitly designed to support implementations in a variety of formats, including plain text, spreadsheets or structured formats (it has an XML schema). 
The core of the ICASA standard is the Master list of variables, a naming convention for agricultural model variables which is also used in AgMIP standards that build on the ICASA standards, like ACMO (see below). There are plans to render ICASA in RDF in the TERRA-REF project and the ICASA data dictionary is also being mapped to various ontologies as part of the Agronomy Ontology project (see below). We saw the DSSAT “minimum data” in chapter 2.1.

  • Agronomy Ontology. As the official documentation says, “AGRO, the AGRonomy Ontology, describes agronomic practices, agronomic techniques, and agronomic variables used in agronomic experiments. AGRO is being built using traits identified by agronomists, the ICASA variables, and other existing ontologies such as ENVO, UO, and PATO, IAO, and CHEBI. Further, AGRO will power an Agronomy Management System and fieldbook modeled on a CGIAR Breeding Management System to capture agronomic data”.

  • Crop Research Ontology (CRO). The Crop Research Ontology is managed by the CGIAR. It describes experimental design, environmental conditions and methods associated with the crop study/experiment/trial and their evaluation. The concepts of the CRO are used to curate agronomic databases and describe the data.

2.1.3. Geospatial and observations data standards

Besides strictly agronomical data standards, standards that support observations and measurements, with the related geospatial dimensions, are very relevant. The most commonly used are:

  • The ISO 19101 “Domain Reference Standard” model for geospatial data infrastructures, defining the relations between dataset, metadata, feature instances, application schemas and services, and all related standards developed by ISO TC211. This model defines the concept of feature (as an “abstraction of real world phenomena”, which can occur as a type or an instance) and “feature types” (“classes of features having common characteristics”).

  • The spatial data models and web services defined by the Open Geospatial Consortium, built along the lines of the above ISO specifications (almost always as joint ISO/OGC standards) and around the main OGC Geography Markup Language (GML): Web Feature Service, Web Coverage Service, Web Map Service. These services, rather than sharing geographic information at the file level using FTP, offer direct fine-grained access to geographic information at different levels: the feature and feature property level, the coverage level, the map level.

  • The ISO 19156:2011 / OGC Observations and Measurements (O&M) model. This is a conceptual model (and a schema) for observations, and for features involved in sampling when making observations (observations commonly involve sampling of an ultimate feature-of-interest); ISO 19156:2011 defines a common set of sampling feature types classified primarily by topological dimension and therefore embedded in geospatial features. Since weather observations are one of the key types of data in meteorology, we’ll see in chapter 3.2.3 that this ISO/OGC approach to geospatial and observational data is very relevant to weather data. The OGC Time series Profile of Observations and Measurements (TSML) and the OGC Sensor Model Language (SensorML) are also part of the same framework and relevant to farm and weather observations.

  • In the direction of a more semantic web of geospatial data, the W3C is now working on/taking stock of geospatial ontologies following the ISO/OGC feature types approach. They’re considering a “Geospatial Features” ontology and a “Feature Types” ontology. For sensors data and observations, they’ve already developed together with OGC the OGC W3C Semantic Sensor Network Ontology.

  • Among researchers, common data formats used for generic observations are very popular, like NetCDF (preferably following the Climate and Forecast Metadata Conventions, see below), HDF5 or in general formats that are understood by widely used services (like OPeNDAP). NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. It was developed by Unidata, one of the University Corporation for Atmospheric Research (UCAR)'s Community Programs (UCP).

  • Weather observations: In formats like NetCDF, variable names are either arbitrary and only locally defined or they are coded with implicit reference to a code list or table which is not machine readable; recommended syntaxes and units of measures are sometimes indicated in the metadata in a human-readable way and sometimes described in attached guidelines or even just assumed (e.g. common scientific practices). It is recommended that “the names of variables and dimensions should be meaningful and conform to any relevant conventions.” For NetCDF, a specific naming “convention” for Climate and Forecast data was developed, called “Climate and Forecast (CF) Metadata Conventions” or simply “CF Conventions”, which “define metadata that provide a definitive description of what the data in each variable represents, and the spatial and temporal properties of the data. This enables users of data from different sources to decide which quantities are comparable, and facilitates building applications with powerful extraction, regridding, and display capabilities”.

  • Dataset metadata: In lesson 4.3, the distinction between data and dataset metadata standardization was introduced. Many of the existing standards for observations described above (ISO 19101, OGC geospatial standards, NetCDF and CF Conventions) prescribe metadata conventions for the dataset. Dataset metadata should also include descriptions of the content and structure of the dataset, e.g. the phenomenon observed, the dimensions of the dataset, the units of measure etc. An RDF vocabulary that aims at providing a semantic web approach to describing the structure of a dataset together with its content is the W3C DataCube vocabulary.
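The role of the CF Conventions mentioned in the list above can be illustrated with a minimal Python sketch. It does not read a real NetCDF file: it only mimics, with plain dictionaries, the attributes that a CF-style variable carries. The dataset contents, the local variable names (tmin_station, t2m_min, rr) and the helper comparable are hypothetical; standard_name and units are the CF attribute names, and air_temperature and precipitation_amount are CF standard names.

```python
# Two hypothetical datasets that name the same quantity differently.
# Only the CF attributes (standard_name, units) are shared between them.
dataset_a = {
    "tmin_station": {"standard_name": "air_temperature", "units": "K"},
}
dataset_b = {
    "t2m_min": {"standard_name": "air_temperature", "units": "K"},
    "rr": {"standard_name": "precipitation_amount", "units": "kg m-2"},
}


def comparable(var_a, var_b):
    """Treat two variables as comparable when their CF metadata match."""
    return (
        var_a["standard_name"] == var_b["standard_name"]
        and var_a["units"] == var_b["units"]
    )


# Pair up variables across the two datasets by metadata, not by local name.
matches = [
    (name_a, name_b)
    for name_a, attrs_a in dataset_a.items()
    for name_b, attrs_b in dataset_b.items()
    if comparable(attrs_a, attrs_b)
]
print(matches)
```

The point of the sketch is that the local variable names never need to agree: the match is made on the shared convention alone.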

2.1.4. Schemas for farm management data

As far as we know, except for the standards under development in initiatives like AgGateway (see later), AgroRDF is the only published data standard for representing and describing farm work.

KTBL’s AGROXML / AGRORDF. agroXML is an XML/RDF schema developed in the iGreen project for representing and describing farm work. agroXML has been developed by a team consisting of members from makers of agricultural software systems, machinery companies, service providers and research organisations. It provides elements and XML data types for representing data on work processes on the farm including accompanying operating supplies like fertilizers, pesticides, crops etc. It can be used within FMIS as a file format for documentation purposes but also within web services and interfaces between the farm and external stakeholders as a means to exchange data in a structured, standardized and easy to use way. It covers topics relevant to on farm activity including “crop”, “cropSpecies”, “chemical substance”, “harvestDate”, “enginePower”.

2.2. Projects that use some data standards and are developing new standards

2.2.1. Projects from public research

These are normally more for field experiments and crop growth models, but with similar data needs.

  • APSIM. The Agricultural Production Systems sIMulator (APSIM) is internationally recognised as a highly advanced simulator of agricultural systems. The APSIM initiative (AI) was established in 2007 to promote the development and use of the science modules and infrastructure software of APSIM. The Foundation Members of the AI are CSIRO, the State of Queensland and The University of Queensland. AgResearch Ltd., New Zealand became a party in 2015 and other organisations may apply to join at any time. It contains a suite of modules which enable the simulation of systems that cover a range of plant, animal, soil, climate and management interactions. In terms of data standards, for the formatting of messages APSIM uses an XML schema called Data Description Markup Language (DDML) for data types, units, scales, and the Simulation Description Markup Language (SDML) for the simulation data. These formats as far as we could see are not meant as exchange standards and are not used outside of the APSIM software. However, data can be exported into a CSV that can be read for instance by AgMIP tools (and there is an R package available for direct importing into the R platform).

  • AgMIP. The Agricultural Model Intercomparison and Improvement Project (AgMIP) aims to utilize intercomparisons of various types of methods to improve crop and economic models and ensemble projections and to produce enhanced assessments by the crop and economic modeling communities researching climate change agricultural impacts and adaptation. AgMIP also collaborates with the CCAFS AgTrials project on metadata standards. AgMIP uses some standards of its own (like ACMO and ACE) but heavily reuses and extends the ICASA2 standards. They provide QuadUI, a simple desktop application for converting crop modeling data to standard AgMIP format (JSON) and then translating to compatible model-ready formats for multiple crop models. Currently, the application reads weather, soil and field management information in either DSSAT format or a harmonized AgMIP CSV format. Output formats currently supported are DSSAT (ICASA2) and APSIM (see above). They also provide the ACMO UI, a desktop utility to help generate ACMO (AgMIP Crop Model Output) files from model outputs. As its internal data model, AgMIP selected a flexible, non-relational database architecture. The AgMIP IT team established the AgMIP Crop Experiment (ACE) harmonized data format to overcome incompatible file organization and structural complexity. Definitions of data elements are based on the ICASA standards, which provide a comprehensive and extensible ontology for the description and definition of agricultural practices. Data are managed in ACE using JSON (JavaScript Object Notation, www.json.org) key-value structures. The key in each key-value pair corresponds to an ICASA parameter definition and units.

  • AgTrials. Managed by the CGIAR Research Program on Climate Change, Agriculture and Food Security (CCAFS), AgTrials is an alpha version of a web application to compile and store information on the performance of agricultural technology. So far, it has made it possible to collect, organize and upload raw data and their associated metadata from more than 800 trials carried out in the last three decades, covering more than 20 countries across Africa, South Asia and Latin America, 16 crops and 7 livestock species. It’s run in collaboration with the crop modeling initiatives AgMIP and Global Futures. The results are available in AgMIP formats as well as modelled with the Crop Ontology.

2.2.2. Projects from/for industry

There are hundreds of FMIS that of course use some internal data model and store data in some format, but in most cases they focus on the end user and don’t make their data formats explicit nor engage with the creation or negotiation of data standards. On the other hand, there are a few initiatives that work at a broader level, involving actors from different types of industry, and focus on the interoperability of FMIS data.

  • AgGateway. AgGateway is a non-profit consortium of businesses serving the agriculture industry. It currently has more than 230 member companies working on eConnectivity activities within eight major segments: agricultural retail; systems and software developers and service providers; crop nutrition; crop protection; grain and feed; precision agriculture; seed; specialty chemical. Their work on standardization covers all aspects of farm management. Currently, AgGateway is working on three major projects: (a) the SPADE (Standardized Precision Ag Data Exchange) project, which aims at establishing a framework of standards to simplify mixed-fleet field operations and regulatory compliance and to allow seamless data exchange between hardware systems and software applications that collect field data across farming operations; (b) the PAIL (Precision Ag Irrigation Language) project to provide an industry-wide format that will enable the exchange and use of data to and from irrigation management systems, which are currently stored in a variety of proprietary formats; and (c) the ADAPT toolkit to enable interoperability between different precision agriculture software and hardware applications. AgGateway publishes their guidelines/standards in different forms: some are public and some are available only to members. These standards are meant to be used by intermediaries who create FMIS but also by manufacturers of machinery. The standards are data models, expressed as XML schemas but translatable to XML and JSON according to specific guidelines, and controlled vocabularies. The AgGateway standard that we will consider more in depth in the next chapter is PAIL, both because its focus is on irrigation and includes weather data and because it’s in an advanced status of recognition as a standard: it’s coordinated with NRES-03/2 US TAG ISO TC23/SC18, has been submitted to ASABE, and it will be submitted to ISO for consideration as a new standard after adoption by ASABE.

  • Open Ag Data Alliance. The Open Ag Data Alliance was formed in early 2014 as an open source project with widespread industry support and headed by the Open Ag Technology and Systems Group (OATS) at Purdue University. Its goal is “to help the industry get data flowing automatically for farmers in agriculture so they can reap the benefits of making data-driven decisions and stop wrangling data and incompatible systems”. The alliance has over 25 commercial partners worldwide. They aim at pursuing their objective through the development of open data sharing standards (APIs) and open source software libraries that will serve as a conduit between data generation and data consumption. They also aim at facilitating easy transferability of a farmer’s data among solution providers and allowing farmers and actors (suppliers, advisors) to collaborate on the same platform based on specific and very granular access permissions: rather than focusing on ownership, they focus on access rights by facilitating the protocols for giving and revoking access to data. Their objectives are similar to AgGateway’s, only the method is different: while AgGateway focuses on common data models, data formats and controlled vocabularies, OADA focuses more on the web service side. As stated on their website, “the alliance sees real-time API connection as complementing other open data exchange projects, such as from AgGateway”. Also OADA’s services seem at the moment to be meant for the US. A practical application of the OADA API approach is the “Real-Time Connections API for Weather, Soil Moisture Data” developed by ServiTech (more in the next chapter).

  • AgroConnect. This is the only initiative focusing on data standards for FMIS that our research revealed in Europe. It has very similar objectives to those of AgGateway and they collaborate closely. The members of the AgroConnect association are companies and organisations that trade goods, services, products and produce with farmers. A special group of stakeholders are the providers of FMIS: “All these companies and organisations share a common goal and that is to enable easy data exchange in the agricultural supply chain between all parties involved.” They promote: standard data models; standard interface definitions (EDI messages, APIs) for data interchange; standards for identifying farms, persons, crop fields, animals, batches; all types of standard code lists, e.g. for crop types, soil types, animal types, etc.; standard protocols for data exchange.

There don’t seem to be other initiatives of this type, focusing on data sharing through FMIS and involving different stakeholders. Apparently, standardization of FMIS in Europe is high around the ISOBUS standard, but it has not extended to the integration of components of the FMIS workflow that are not tractors and typical machinery, and it does not aim at transporting data across different FMIS. There are efforts trying to attract the industry towards a data exchange platform around the FIWARE platform through the SmartAgriFood accelerator and projects like AGICOLUS, but there doesn’t seem to be much work on data standards in addition to ISOBUS.

3. Examples of interoperable observation data

This section doesn’t have the objective of making learners experts in XML, JSON or RDF data serializations. It doesn’t even expect to provide a full understanding of the fragments of data serializations that are presented. The objective of presenting short examples of data expressed in different ways is to make learners familiar with how interoperable farm data may look and to provide inspiration for data managers, service providers and developers who want to make their data more interoperable.

3.1.1. Dataset metadata describing the structure of the observations dataset using DataCube RDF

The example below describes a data structure designed to contain one measurement ("minimum daily air temperature, average", indicated with a conventional URI from the ICASA Master Variables List, not yet published) and three dimensions: area, period and identifier of the field where the measurement is taken. This means that these will be the data in that dataset and the names and attributes used in the data will be those indicated here.

The example uses the RDF Turtle syntax and, among other vocabularies, Dublin Core (identified by the dct: prefix) and DataCube (qb: prefix):

Figure 1 Example of RDF encoding of dataset metadata and data structure using Data Cube
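As a rough illustration of the structure described above (one measure and three dimensions), the following Python sketch assembles a small Turtle document as plain text. It is not the original figure. The qb: and dct: namespaces are the published Data Cube and Dublin Core ones; the ex: namespace, the dataset and structure names, and the property names (ex:area, ex:period, ex:fieldId, ex:tminAverage) are hypothetical placeholders, the last one standing in for the unpublished ICASA URI.

```python
# qb: and dct: are real namespaces; ex: is a hypothetical one for illustration.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "dct": "http://purl.org/dc/terms/",
    "ex": "http://example.org/farm/",
}

# Hypothetical property names for the three dimensions and the single measure.
dimensions = ["ex:area", "ex:period", "ex:fieldId"]
measure = "ex:tminAverage"  # placeholder for the unpublished ICASA URI


def build_turtle():
    """Return a Turtle document declaring a dataset and its structure."""
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items()]
    lines.append("")
    lines.append("ex:weatherDataset a qb:DataSet ;")
    lines.append('    dct:title "Minimum daily air temperature per field" ;')
    lines.append("    qb:structure ex:weatherStructure .")
    lines.append("")
    lines.append("ex:weatherStructure a qb:DataStructureDefinition ;")
    # One qb:component per dimension, plus one for the measure.
    components = [f"    qb:component [ qb:dimension {d} ]" for d in dimensions]
    components.append(f"    qb:component [ qb:measure {measure} ]")
    lines.append(" ;\n".join(components) + " .")
    return "\n".join(lines)


turtle = build_turtle()
print(turtle)
```

A consumer reading this structure definition knows, before seeing any observation, which properties every row of the dataset will carry.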

ISO 19115 “Geographic information -- Metadata”, in the ISO 19101 series mentioned above, is also a suitable vocabulary covering dozens of metadata elements for a dataset. While ISO/TS 19139 provides an XML schema for ISO 19115, an RDF (OWL) representation of ISO 19115 has been developed by CSIRO Australia.

3.1.2. Record of measurement of fruit mass at the temperature of 22.3°C using O&M XML

This is an example from a dataset of observations from agricultural experiments. It uses the XML schema of the ISO data standard for Observations and Measurements (O&M). It shows how to use elements from the O&M XML schema to describe a simple observation: the measurement of fruit mass at the temperature of 22.3 °C.

Figure 2 Example of observation in XML format using the O&M schema
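To give an idea of what such a record may look like, the Python sketch below builds a simplified O&M-style observation with the standard library. It is not the original figure and it is not schema-valid: mandatory elements such as phenomenonTime, resultTime and procedure are left out for brevity. The O&M 2.0 namespace and the element names are those of the O&M XML schema; the example.org URLs and the result value of 0.28 kg are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

OM = "http://www.opengis.net/om/2.0"      # O&M 2.0 XML namespace
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("om", OM)
ET.register_namespace("xlink", XLINK)
HREF = f"{{{XLINK}}}href"


def om(tag):
    """Qualify a tag name with the O&M namespace."""
    return f"{{{OM}}}{tag}"


obs = ET.Element(om("OM_Observation"))

# Context of the measurement: the temperature at which the mass was taken.
param = ET.SubElement(obs, om("parameter"))
named = ET.SubElement(param, om("NamedValue"))
ET.SubElement(named, om("name"), {HREF: "http://example.org/property/temperature"})
value = ET.SubElement(named, om("value"), {"uom": "Cel"})
value.text = "22.3"

# What was observed, on which feature, and with what result (all hypothetical).
ET.SubElement(obs, om("observedProperty"), {HREF: "http://example.org/property/fruitMass"})
ET.SubElement(obs, om("featureOfInterest"), {HREF: "http://example.org/fruit/42"})
result = ET.SubElement(obs, om("result"), {"uom": "kg"})
result.text = "0.28"

xml_text = ET.tostring(obs, encoding="unicode")
print(xml_text)
```

The temperature is carried as a named parameter of the observation, while the fruit mass is the result; both state their unit of measure explicitly.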

3.1.3. Measurement of air temperature at a specific point using O&M JSON

The next example shows how to use the O&M JSON schema to describe a simple observation: the measurement of air temperature at a specific point. Although no prefix is used, the fact that the JSON follows the O&M JSON schema tells us that the properties used are from O&M. The labels used (“observedProperty”, “featureOfInterest”) and the nesting structure (“uom” under “result”) clearly show that even if in a different format, the schema is the same as the one used in the previous XML example.

In the XML file, the prefix om: would be mapped to the namespace of the XML schema; in the JSON file, the @context would be the URL of the JSON schema. In both cases, any software parsing the datasets will interpret the elements/labels in the same way.

Figure 3 Example of observation in JSON format using the O&M model
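The same observation pattern can be sketched in JSON with a few lines of Python. This is an illustration rather than the original figure: the property names observedProperty, featureOfInterest, result and uom are the ones mentioned above, while the @context URL, the identifiers, the timestamp and the temperature value are hypothetical.

```python
import json

# Hypothetical air temperature observation following the O&M property names.
observation = {
    "@context": "http://example.org/om-json-schema",  # hypothetical schema URL
    "id": "obs-001",
    "type": "Measurement",
    "phenomenonTime": {"instant": "2017-06-01T06:00:00Z"},
    "observedProperty": {"href": "http://example.org/property/airTemperature"},
    "featureOfInterest": {"href": "http://example.org/point/station-12"},
    "result": {"value": 14.6, "uom": "Cel"},
}

# Serialize, then parse again as a consuming application would.
text = json.dumps(observation, indent=2)
parsed = json.loads(text)
print(parsed["result"]["value"], parsed["result"]["uom"])
```

Because the labels and the nesting follow the shared model, a consumer can locate the value and its unit without any knowledge of the producing system.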

3.1.4. Field experiment data using ICASA variable names in JSON

The previously mentioned ICASA data standard has been serialized into JSON. The example below shows an experiment encoded in JSON using the ICASA data model and variables. You can see that instead of URIs ICASA uses short coded variable names: all the codes are in the ICASA Master Variables list, which defines the meaning of all variables and constitutes the ICASA semantic resource, where you would find that “fielele” means “field elevation” and “icpcr” means “residue, crop code for previous crop”.

Since variables are identified by codes and not by URIs, and the codes aren’t even associated with definitions in a machine-readable file, software tools can’t look up the meaning and can’t infer the reference semantics behind a code. Therefore, even if the ICASA variables are probably the most complete list for agricultural experiments and are used in other systems, at the moment using them does not ensure full semantic interoperability. There is work under way to express the ICASA variables in an ontology. The example below is from Integrated description of agricultural field experiments and production: the ICASA Version 2.0 data standards.

Figure 4 Example of an experiment described in JSON format using the ICASA standard
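
The limitation discussed above can be sketched in a few lines of Python: software can only interpret ICASA codes if someone supplies the definitions separately. In this sketch the lookup table is typed in by hand from the two definitions quoted in the lesson; the record values and the unknown code "xyzab" are hypothetical.

```python
# Hand-made lookup table holding only the two definitions quoted in the lesson.
# A real application would need the full ICASA Master Variables list.
ICASA_DEFINITIONS = {
    "fielele": "field elevation",
    "icpcr": "residue, crop code for previous crop",
}

# Hypothetical record: the values and the code "xyzab" are invented.
record = {"fielele": "240", "icpcr": "MAZ", "xyzab": "1"}


def label_record(record, definitions):
    """Replace known codes by their definitions and collect unknown codes."""
    labelled = {}
    unknown = []
    for code, value in record.items():
        if code in definitions:
            labelled[definitions[code]] = value
        else:
            unknown.append(code)  # nothing in the data says what this means
    return labelled, unknown


labelled, unknown = label_record(record, ICASA_DEFINITIONS)
print(labelled)
print("codes without a known definition:", unknown)
```

With URIs that resolve to machine-readable definitions, the hand-made table would not be needed, because the software could fetch the definition from the vocabulary itself.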

3.1.5. Weather observations using ICASA variables in tabular format

ICASA variables can be used in tabular format as well, as column names. In the example below, definitions of the variables are provided at the beginning of the file, mainly for human reading. Although the ICASA variables are used as simple strings, the fact that they come from a controlled vocabulary already allows for a good degree of interoperability, for example to aggregate the data with other datasets that use the ICASA variables.

Figure 5 Example of tabular data using the ICASA variables
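
The aggregation mentioned above can be sketched as follows, assuming two weather files that use the same column names. The column names (w_date, tmax, tmin, rain) and all values are illustrative assumptions and should be checked against the ICASA Master Variables list before real use.

```python
import csv
import io

# Two hypothetical weather files; column names and values are illustrative.
FILE_A = "w_date,tmax,tmin\n2015-06-01,28.1,16.4\n2015-06-02,29.0,17.2\n"
FILE_B = "w_date,tmin,tmax,rain\n2015-06-01,15.9,27.5,0.0\n"


def read_table(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def aggregate(tables):
    """Stack rows from all tables, keeping only the columns they share."""
    shared = set(tables[0][0])
    for table in tables[1:]:
        shared &= set(table[0])
    columns = sorted(shared)
    rows = [{c: row[c] for c in columns} for table in tables for row in table]
    return columns, rows


columns, rows = aggregate([read_table(FILE_A), read_table(FILE_B)])
print(columns)
for row in rows:
    print(row)
```

The merge works without any mapping step only because both files use the same controlled names; with free-text headers such as "Max temp" and "T max (C)", a manual mapping would be needed first.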

3.1.6. Observation of tree height using the SSN/SOSA Ontology in RDF

The W3C “Semantic Sensor Network Ontology” (SSN) mentioned above is an ontology built on the basis of the OGC SensorML and O&M standards. The classes and properties that are most relevant for observations are under the namespace http://www.w3.org/ns/sosa/ (Sensor, Observation, Sample, and Actuator, SOSA). You will notice that the vocabulary follows the OGC Observation - FeatureOfInterest - ObservedProperty model. Note that this description also uses other vocabularies, in particular a vocabulary for units of measure: @prefix qudt-1-1: <http://qudt.org/1.1/schema/qudt#> . The example below is from Semantic Sensor Network Ontology.

Figure 6 Example of observation in RDF Turtle using the W3C SSN ontology
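
As a simplified sketch of the Observation - FeatureOfInterest - ObservedProperty pattern, the Python function below writes one observation as Turtle text using SOSA terms. It is not the W3C example: the example.org URIs, the value and the timestamp are hypothetical, and for brevity the result is given with sosa:hasSimpleResult, so the unit of measure (which the W3C example expresses with the QUDT vocabulary) is left out.

```python
# Prefix declarations needed by the Turtle produced below.
PREFIXES = (
    "@prefix sosa: <http://www.w3.org/ns/sosa/> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def observation_to_turtle(obs_uri, feature_uri, property_uri, value, result_time):
    """Serialize one observation as Turtle using SOSA classes and properties."""
    lines = [
        f"<{obs_uri}> a sosa:Observation ;",
        f"    sosa:hasFeatureOfInterest <{feature_uri}> ;",
        f"    sosa:observedProperty <{property_uri}> ;",
        f'    sosa:hasSimpleResult "{value}"^^xsd:double ;',
        f'    sosa:resultTime "{result_time}"^^xsd:dateTime .',
    ]
    return PREFIXES + "\n" + "\n".join(lines) + "\n"


# All URIs and values here are hypothetical.
turtle = observation_to_turtle(
    "http://example.org/observation/1",
    "http://example.org/tree/124",
    "http://example.org/tree/124#height",
    15.3,
    "2017-04-18T17:24:00+00:00",
)
print(turtle)
```

In practice an RDF library would be a safer way to produce Turtle than string formatting; plain strings are used here only to keep the sketch free of dependencies.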

3.1.7. Observation of minimum air temperature using Data Cube and ICASA variables

Observations can also be encoded using Data Cube, although inside the Observation entity other vocabularies are needed. The example below encodes the measurement of daily minimum temperature (following the data structure defined in the Data Cube dataset header in example 3.1.1 above). Besides the namespaces used in the previous example, here we have two additional namespaces for the metadata elements, one for statistical attributes (SDMX) and one for variables (the ICASA variables):

@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#>

@prefix icasa-var: <http://purl.org/icasa/variables#>

The use of the icasa-var: prefix before the tmina property tells the machine that it should look up the tmina variable in the list of variables published at http://purl.org/icasa/variables#.

Figure 7 Observation encoded in RDF (Turtle) using Data Cube and ICASA
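
The prefix mechanism described above can be sketched in Python as a simple string expansion. The two namespace URIs are the ones given in the lesson; the function name expand is just a label chosen for this sketch.

```python
# Prefix declarations as given in the lesson text.
PREFIXES = {
    "sdmx-attribute": "http://purl.org/linked-data/sdmx/2009/attribute#",
    "icasa-var": "http://purl.org/icasa/variables#",
}


def expand(prefixed_name, prefixes):
    """Expand a prefixed name such as 'icasa-var:tmina' into a full URI."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return prefixes[prefix] + local_name


uri = expand("icasa-var:tmina", PREFIXES)
print(uri)  # → http://purl.org/icasa/variables#tmina
```

Two datasets that expand their variable names to the same URI are, by construction, talking about the same variable, regardless of the prefix label each file happens to use.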

3.2. Controlled variable names

It is clear from the examples above that there is a strong need for standardized variable names and that this need is currently being addressed mainly in two ways: either through URIs in published vocabularies (like O&M) or through prescribed strings/codes published as controlled lists (like ICASA). However, the second approach is probably also going to adopt the URI method.

The issue of variable names is not just a matter of URI versus string: a variable name can be a combination of several dimensions, normally feature of interest + observed property + observation methods + parameters (e.g. hourly + average + wind speed + at 10 metres).

The AgGateway PAIL standard is taking an interesting approach to this. Describing it in detail would be too much for this lesson, but in short, the approach in PAIL is to reduce an observation to a key-value pair, with the key expressing all the meaning and the value just the value. There is a controlled vocabulary for each of the aspects of a variable (time window, aggregation level, feature of interest, observed property, observation methods, parameters …) and the observation key is a new concept (in turn making up another vocabulary) which is the orthogonal combination of concepts from these vocabularies. The idea is to have a registry for all the orthogonal keys. The PAIL team is considering mapping the final valid keys in their orthogonal vocabulary to existing standardized variable lists. Those interested in new developments on the PAIL standard can join the project community.
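
The general idea of composing an observation key from several small vocabularies can be sketched as below. This is not the PAIL key syntax or the PAIL vocabularies: the dimension names, the terms, the dot-separated format and the value 4.2 are all hypothetical, loosely based on the "hourly + average + wind speed + at 10 metres" example above.

```python
from itertools import product

# Hypothetical controlled vocabularies, one per dimension of a variable.
VOCABULARIES = {
    "time_window": ["hourly", "daily"],
    "aggregation": ["average", "minimum", "maximum"],
    "observed_property": ["wind_speed", "air_temperature"],
    "parameter": ["at_10_m", "at_2_m"],
}
DIMENSIONS = list(VOCABULARIES)  # fixed order of dimensions inside a key


def make_key(**terms):
    """Build a key from one term per dimension, rejecting unknown terms."""
    parts = []
    for dimension in DIMENSIONS:
        term = terms[dimension]
        if term not in VOCABULARIES[dimension]:
            raise ValueError(f"{term!r} is not in the {dimension} vocabulary")
        parts.append(term)
    return ".".join(parts)


# Every combination of terms: 2 * 3 * 2 * 2 = 24 possible keys.
ALL_KEYS = {".".join(combo) for combo in product(*VOCABULARIES.values())}

key = make_key(
    time_window="hourly",
    aggregation="average",
    observed_property="wind_speed",
    parameter="at_10_m",
)
observation = (key, 4.2)  # key carries the meaning, value is just the value
print(observation)
```

Even these four tiny vocabularies already yield 24 keys, which hints at why a registry of the valid combinations is useful.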

Summary

After the introduction to interoperability given in the previous lesson, this lesson went deeper into specific data standards (vocabularies) that are relevant for farm data, from crop research and field experiment data standards to observation data standards and general geospatial standards. It briefly described some published standards and projects that are working on new standards.

The second part provided examples of interoperable farm data fragments, with a commentary to explain the different approaches and how they use formats and vocabularies to make data interoperable. These examples were meant to make learners familiar with interoperable farm data and to provide inspiration to data managers and developers on new ways to make their data interoperable.