WoSIS GraphQL API Masterclass

This master-class aims to explain and exemplify the use of the WoSIS GraphQL API.


Introduction

WoSIS stands for 'World Soil Information Service', a large PostgreSQL database with accompanying APIs, workflows, dashboards etc., developed and maintained by ISRIC, WDC-Soils. It provides a growing range of quality-assessed and standardised soil profile data for the world. For this, it draws on voluntary contributions from data holders/providers worldwide.

The source data come from different types of surveys ranging from systematic soil surveys (i.e., full profile descriptions) to soil fertility surveys (i.e., mainly top 20 to 30 cm). Further, depending on the nature of the original surveys the range of soil properties can vary greatly (see https://essd.copernicus.org/articles/12/299/2020/).

The quality-assessed and standardised data are made freely available to the international community through several web services, in compliance with the conditions (licences) specified by the various data providers. This means that we can only serve data with a so-called 'free' licence to the international community (https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?any=wosis_latest). A larger complement of geo-referenced data with a more restrictive licence can only be used by ISRIC itself for producing SoilGrids maps and similar products (i.e. output resulting from advanced data processing). These map layers are, in turn, made freely available to the international community (https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?resultType=details&sortBy=relevance&any=soilgrids250m%202.0&fast=index&_content_type=json&from=1&to=20).

[Figure: WoSIS workflow for ingesting, processing and disseminating data.]

During this master-class, you will first learn what GraphQL and an API (application programming interface) are. Next, using guided steps, we will explore the basics of WoSIS and GraphQL via a graphical interface. From that point onwards we will slowly increase complexity and use WoSIS data. Building upon this, we will show how to write code that uses soil data.

The workshop requires no previous knowledge of WoSIS or GraphQL. However, basic coding knowledge of Python or R is advisable.

The aim of this master-class is to provide clear documentation on how to use the WoSIS GraphQL API.

WoSIS public products

WoSIS data can be accessed via OGC services and a GraphQL API.

OGC web services were the first way to download and access WoSIS. You can find more information on how to access WoSIS using the good old OGC web services at https://www.isric.org/explore/wosis/accessing-wosis-derived-datasets.

Recently, we developed a GraphQL API and the aim of this master-class is to show and describe how to use this tool to download and explore WoSIS data.


What is GraphQL?

GraphQL is a query language for APIs. GraphQL isn't tied to any specific database or storage engine and is instead backed by your existing code and data.

If you are new to GraphQL it might be good to check the official documentation: https://graphql.org/learn/.

GraphQL works as an abstraction layer between the application and the database, allowing the database to be queried using web technologies (HTTP requests) and JSON objects. GraphQL can be seen as a sibling of REST: both are ways of exposing data over HTTP.

Other external "easy to follow" general docs on graphQL:


Requirements

To follow along you don't need any tools other than a web browser.

However, if your aim is to use this API in scripting, then it's advisable to have knowledge of at least one of the following languages:

  • Python
  • R

API root endpoint and web interfaces

Root endpoint

The WoSIS GraphQL API root endpoint can be found at:

https://graphql.isric.org/wosis/graphql

This is the main GraphQL root endpoint. If you are an advanced GraphQL user and you use a custom script or a GraphQL client this is what you should use.

Nonetheless if you click on the above link using a browser you'll probably get the following error message:

{"errors":[{"message":"Only `POST` requests are allowed."}]}

This is expected because this GraphQL endpoint expects POST requests and not GET requests.
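
The difference between the two request types can be sketched in Python using only the standard library. The helper name `build_post_request` is ours and not part of any WoSIS tooling, and `{ __typename }` is used only because it is a minimal query accepted by any GraphQL schema. Nothing is sent over the network until the request is passed to `urlopen`.

```python
import json
import urllib.request

WOSIS_URL = "https://graphql.isric.org/wosis/graphql"


def build_post_request(query: str, url: str = WOSIS_URL) -> urllib.request.Request:
    """Wrap a GraphQL query in a POST request with a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Build (but do not send) a request carrying a minimal query.
request = build_post_request("{ __typename }")
print(request.get_method())  # POST

# To actually send it (requires network access):
# with urllib.request.urlopen(request, timeout=60) as response:
#     print(json.load(response))
```

A browser address bar, by contrast, always issues a GET request, which is why the link above shows the error message.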

To simplify use, we provide two web-based IDEs that let you explore and access the data graphically.

Web interface IDEs

We provide the following interactive in-browser GraphQL IDEs

For the exercises in this master-class we'll use GraphiQL, but you are free to use whichever you prefer.

Explore current schema

At the time of writing, the WoSIS GraphQL schema is composed of Profiles that contain Layers, and for each layer several measurementValues can be found per property (e.g. pH). For the same property, each layer can have one or more measurements, e.g. one layer with several samples.

  1. Profile A
    1. Layer X
      1. measurementValues E
      2. measurementValues R
  2. Profile B
    1. Layer Y
      1. measurementValues E
      2. measurementValues R
      3. measurementValues T

Please explore the current schema using the GraphiQL IDE at https://graphql.isric.org/wosis/graphiql

You will find at the root:

  • wosisLatestAttributes - All current attributes (properties) distributed by WoSIS, with the total number of profiles and layers for each.
  • wosisLatestLayers - WoSIS layers; at this level you'll get all layers and their measurements.
  • wosisLatestProfiles - WoSIS profiles; this is probably where you want to start, since it contains all levels of the WoSIS product (profiles, layers, measurements).

Using the GraphiQL interface, please spend some time exploring the WoSIS schema.

Expanding wosisLatestProfiles gives the following:

[Figure: wosisLatestProfiles]

Please note the objects with the right arrow marked in red. Expand these objects and check their contents.

Explore the documentation

One of the advantages of GraphQL is its automatically generated documentation. To access the documentation in GraphiQL, click on the DOCS button marked in red in the image below.

[Figure: wosisLatestProfiles]

Please spend some time exploring the documentation and try to familiarize yourself with the structure.

The image below shows the autogenerated documentation for wosisLatestProfiles.

[Figure: wosisLatestProfiles]

First queries

It's now time to start exploring WoSIS data using queries.

  • Get all WoSIS Latest Attributes
{
  wosisLatestAttributes {
    layers
    profiles
    description
    code
    attribute
  }
}

The query above will return the following error:

{
  "errors": [
    {
      "message": "You must provide a 'first' or 'last' argument to properly paginate the 'wosisLatestAttributes' field.",
      "locations": [
        {
          "line": 2,
          "column": 3
        }
      ]
    }
  ]
}

To avoid overloading the WoSIS API, every query must include a first or last argument.
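
In a script, this kind of failure can be caught before the data are used. The sketch below assumes only what the error shown above suggests: a failed request carries a top-level `errors` list whose items have a `message`. The function name `extract_data` is illustrative.

```python
def extract_data(response: dict) -> dict:
    """Return the `data` part of a GraphQL response, or raise if it has errors."""
    errors = response.get("errors")
    if errors:
        messages = "; ".join(e.get("message", "unknown error") for e in errors)
        raise RuntimeError(f"GraphQL request failed: {messages}")
    return response.get("data", {})


# The error response shown above, written as a Python dict.
failed = {
    "errors": [
        {
            "message": "You must provide a 'first' or 'last' argument to "
                       "properly paginate the 'wosisLatestAttributes' field.",
            "locations": [{"line": 2, "column": 3}],
        }
    ]
}

try:
    extract_data(failed)
except RuntimeError as exc:
    print(exc)
```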

The correct way to write our query is:

  • Get the first 100 records of WoSIS Latest Attributes
{
  wosisLatestAttributes(first: 100) {
    attribute
    description
    code
    layers
    profiles
  }
}

In practice this query will return all WoSIS Latest Attributes, because there are currently fewer than 100 attributes.

  • Get the first 10 profiles from wosisLatestProfiles, without requesting any classification fields.
{
  wosisLatestProfiles(first: 10) {
    continent
    region
    countryName
    datasetCode
    latitude
    longitude
    geomAccuracy
    profileCode
  }
}
  • Get the last 10 profiles from wosisLatestProfiles, including all available classification fields (FAO, USDA, WRB).
{
  wosisLatestProfiles(last: 10) {
    continent
    region
    countryName
    datasetCode
    latitude
    longitude
    geomAccuracy
    profileCode
    faoMajorGroup
    faoMajorGroupCode
    faoPublicationYear
    faoSoilUnit
    faoSoilUnitCode
    usdaGreatGroup
    usdaOrderName
    usdaPublicationYear
    usdaSubgroup
    usdaSuborder
    wrbPrefixQualifiers
    wrbPrincipalQualifiers
    wrbPublicationYear
    wrbReferenceSoilGroup
    wrbReferenceSoilGroupCode
    wrbSuffixQualifiers
    wrbSupplementaryQualifiers
  }
}

Please note that you can use the GraphiQL IDE interface to easily create your queries. If you are a beginner, it's recommended that you generate your queries via the user interface.

  • Get the first 10 profiles and, for each profile, the first 10 layers:
{
  wosisLatestProfiles(first: 10) {
    profileId
    continent
    region
    countryName
    datasetCode
    latitude
    longitude
    geomAccuracy
    profileCode
    layers(first: 10) {
      layerId
      layerNumber
      licence
      lowerDepth
      upperDepth
      organicSurface
    }
  }
}
  • Get the first 10 profiles, for each profile the first 10 layers, and for each layer the first 10 values of silt:
{
  wosisLatestProfiles(first: 10) {
    profileId
    continent
    region
    countryName
    datasetCode
    latitude
    longitude
    geomAccuracy
    profileCode
    layers(first: 10) {
      layerId
      layerNumber
      licence
      lowerDepth
      upperDepth
      organicSurface
      siltValues(first: 10) {
        valueAvg
        value
        date
      }
    }
  }
}
  • Get the first 10 profiles, for each profile the first 10 layers, and for each layer the first 10 values of silt and the first 10 values of organic carbon:
{
  wosisLatestProfiles(first: 10) {
    profileId
    continent
    region
    countryName
    datasetCode
    latitude
    longitude
    geomAccuracy
    profileCode
    layers(first: 10) {
      layerId
      layerNumber
      licence
      lowerDepth
      upperDepth
      organicSurface
      siltValues(first: 10) {
        valueAvg
        value
        date
      }
      orgcValues(first: 10) {
        valueAvg
        value
        date
      }
    }
  }
}

At this point you will probably see some empty results in orgcValues, because some layers do not have any organic carbon measurement.

As shown above, we can request all types of values (silt, sand, organic carbon, pH etc.), but the more we request, the slower the query will be.

Exploratory queries without any filtering can be useful as a first contact with the data, but at some point it's recommended to apply filters.
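
When such nested results are processed in a script, they arrive as a tree of lists and dictionaries, and layers without a given measurement simply carry an empty list. A minimal sketch of flattening that tree into one row per measurement follows. The field names come from the queries above, but the sample values (and the assumption that `value` is a single number) are made up for illustration, and `flatten_values` is a hypothetical helper.

```python
def flatten_values(profiles: list, value_field: str) -> list:
    """Return one row per measurement; layers with no values yield no rows."""
    rows = []
    for profile in profiles:
        for layer in profile.get("layers", []):
            # An empty or missing list means the layer has no such measurement.
            for measurement in layer.get(value_field) or []:
                rows.append({
                    "profileId": profile["profileId"],
                    "layerId": layer["layerId"],
                    "upperDepth": layer["upperDepth"],
                    "lowerDepth": layer["lowerDepth"],
                    "value": measurement["value"],
                })
    return rows


# Made-up sample shaped like the `wosisLatestProfiles` part of a response.
sample = [
    {
        "profileId": 1,
        "layers": [
            {"layerId": 10, "upperDepth": 0, "lowerDepth": 20,
             "siltValues": [{"value": 31.0}], "orgcValues": []},
            {"layerId": 11, "upperDepth": 20, "lowerDepth": 50,
             "siltValues": [{"value": 28.0}], "orgcValues": [{"value": 4.2}]},
        ],
    }
]

print(len(flatten_values(sample, "siltValues")))  # 2
print(len(flatten_values(sample, "orgcValues")))  # 1
```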

Filtering

Perhaps one of the biggest advantages of this GraphQL API is the ability to filter data. In most cases a user will want to extract specific data; for this we'll make use of the filtering capabilities.

  • Get first 10 profiles from continent Europe

  • Get first 10 profiles from continent Europe or Africa

  • Get first 10 profiles with at least one WRB classification record.

  • Get first 10 profiles and respective layers that have at least one Organic Carbon measurement.

Spatial queries

Using variables

Pagination concepts

Scripting

Python examples

Perhaps also on Google Colab.

The simplest way to perform a GraphQL request in Python is to use the requests library.

  • Get the first 10 profiles.
import requests
import json

# GraphQL query
query = """
{
  wosisLatestProfiles(first: 10) {
    continent
    region
    countryName
    datasetCode
    latitude
    longitude
    geomAccuracy
    profileCode
  }
}
"""
# GraphQL endpoint
url = 'https://graphql.isric.org/wosis/graphql'
# Send the query as a POST request with a JSON body
r = requests.post(url, json={'query': query}, timeout=60)
# Stop early on HTTP errors (4xx/5xx)
r.raise_for_status()
# Parse the JSON response
parsed = r.json()
# Pretty-print the result (a separate name avoids shadowing the json module)
pretty = json.dumps(parsed, indent=4, sort_keys=True)
print(pretty)

The response will look similar to the following (only two profiles are shown here):

{
    "data": {
        "wosisLatestProfiles": [
            {
                "continent": "Africa",
                "countryName": "Zambia",
                "datasetCode": "AF-AfSIS-I",
                "geomAccuracy": 1e-06,
                "latitude": -16.044876098632812,
                "longitude": 28.257427215576172,
                "profileCode": "icr056260",
                "region": "Eastern Africa"
            },
            {
                "continent": "Africa",
                "countryName": "Zambia",
                "datasetCode": "AF-AfSIS-I",
                "geomAccuracy": 1e-06,
                "latitude": -16.044876098632812,
                "longitude": 28.257427215576172,
                "profileCode": "icr056261",
                "region": "Eastern Africa"
            }
        ]
    },
    "meta": {
        "graphqlQueryCost": 2
    }
}
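
For further analysis it is often handy to turn such a response into a table. The sketch below writes the two profiles shown above to CSV text with the standard `csv` module. `profiles_to_csv` is an illustrative helper, and in a real script `response` would be the parsed JSON returned by the request.

```python
import csv
import io


def profiles_to_csv(response: dict) -> str:
    """Write the profiles in a parsed response as CSV text."""
    profiles = response["data"]["wosisLatestProfiles"]
    if not profiles:
        return ""
    # Use the keys of the first profile as column names.
    fieldnames = list(profiles[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(profiles)
    return buffer.getvalue()


# The two profiles from the response above (some fields omitted for brevity).
response = {
    "data": {
        "wosisLatestProfiles": [
            {"profileCode": "icr056260", "countryName": "Zambia",
             "latitude": -16.044876098632812, "longitude": 28.257427215576172},
            {"profileCode": "icr056261", "countryName": "Zambia",
             "latitude": -16.044876098632812, "longitude": 28.257427215576172},
        ]
    }
}

print(profiles_to_csv(response))
```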

Using variables in our script
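
GraphQL variables let a script keep the query text fixed and pass values separately, in a `variables` object of the request body. A minimal sketch follows, assuming the `first` argument is declared as `Int` in the schema (worth confirming in the GraphiQL documentation panel). The operation name `FirstProfiles` and the helper `build_payload` are made up for illustration.

```python
import json

# The query declares a variable ($n) instead of hard-coding the number.
QUERY = """
query FirstProfiles($n: Int!) {
  wosisLatestProfiles(first: $n) {
    profileCode
    countryName
    latitude
    longitude
  }
}
"""


def build_payload(n: int) -> dict:
    """Build the JSON body for a GraphQL request with one variable."""
    return {"query": QUERY, "variables": {"n": n}}


payload = build_payload(10)
print(json.dumps(payload["variables"]))  # {"n": 10}

# Sending it works as in the earlier example (requires network access):
# r = requests.post('https://graphql.isric.org/wosis/graphql', json=payload)
```

Changing the number of profiles then only requires a different argument to `build_payload`; the query string itself stays untouched.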

  • Get all profiles from Portugal with Organic Carbon measurements

R examples

Conclusions