Commit d562735c authored by Luís de Sousa

Full text review

parent 81d9fa45
1 merge request: !5 Full text review
......@@ -16,7 +16,7 @@ WoSIS stands for 'World Soil Information Service', a large database based on Pos
The source data come from different types of surveys ranging from systematic soil surveys (i.e., full profile descriptions) to soil fertility surveys (i.e., mainly top 20 to 30 cm). Further, depending on the nature of the original surveys the range of soil properties can vary greatly (see [https://essd.copernicus.org/articles/12/299/2020/](https://essd.copernicus.org/articles/12/299/2020/)).
The quality-assessed and standardised data are made available freely to the international community through several webservices, this in compliance with the conditions (licences) specified by the various data providers. This means that we can only serve data with a so-called 'free' licence to the international community ([https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?any=wosis_latest](https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?any=wosis_latest)). A larger complement of geo-referenced data with a more restrictive licence can only be used by ISRIC itself for producing SoilGrids maps and similar products (i.e. output as a result of advanced data processing). Again, the latter map layers are made freely available to the international community ([https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?resultType=details&sortBy=relevance&any=soilgrids250m%202.0&fast=index&_content_type=json&from=1&to=20](https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?resultType=details&sortBy=relevance&any=soilgrids250m%202.0&fast=index&_content_type=json&from=1&to=20)).
The quality-assessed and standardised data are made available freely to the international community through several web services, this in compliance with the conditions (licences) specified by the various data providers. This means that we can only serve data with a so-called 'free' licence to the international community ([https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?any=wosis_latest](https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?any=wosis_latest)). A larger complement of geo-referenced data with a more restrictive licence can only be used by ISRIC itself for producing SoilGrids maps and similar products (i.e. output as a result of advanced data processing). Again, the latter map layers are made freely available to the international community ([https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?resultType=details&sortBy=relevance&any=soilgrids250m%202.0&fast=index&_content_type=json&from=1&to=20](https://data.isric.org/geonetwork/srv/eng/catalog.search#/search?resultType=details&sortBy=relevance&any=soilgrids250m%202.0&fast=index&_content_type=json&from=1&to=20)).
![WoSIS workflow](https://www.isric.org/sites/default/files/WOSIS_workflow_20221214.png "WoSIS workflow")
_WoSIS workflow for ingesting, processing and disseminating data._
......@@ -29,9 +29,9 @@ The aim of this master-class is to provide clear instructions and documentation
### WoSIS public products
WoSIS data can be accessed via **OGC services** and a **GraphQL API**.
WoSIS data can be accessed via **OGC web services** and a **GraphQL API**.
Until recently, OGC web services provided the main entry point to download and access WoSIS. You can find more information on how to access WoSIS using the good "old OGC web services" at [https://www.isric.org/explore/wosis/accessing-wosis-derived-datasets](https://www.isric.org/explore/wosis/accessing-wosis-derived-datasets).
Until recently, OGC web services provided the main entry point to download and access WoSIS. You can find more information on how to access WoSIS using the SOAP based OGC web services at [https://www.isric.org/explore/wosis/accessing-wosis-derived-datasets](https://www.isric.org/explore/wosis/accessing-wosis-derived-datasets).
In 2023, we developed a GraphQL API tool to easily access the data. The aim of this master-class is to show and describe how this tool can be used to explore and download WoSIS data.
......@@ -42,24 +42,24 @@ In 2023, we developed a GraphQL API tool to easily access the data. The aim of t
If you are new to GraphQL it might be good to check the official documentation: [https://graphql.org/learn/](https://graphql.org/learn/).
GraphQL works as an abstraction layer between application and database, allowing direct queries to the DB using web technologies (HTTP requests) and JSON objects. GraphQL is the brother of REST.
GraphQL works as an abstraction layer between application and database, allowing direct queries to the database using web technologies (HTTP requests) and JSON objects. GraphQL is the brother of REST.
For other external "easy to follow" general documents on graphQL see:
For other good introductory documents on GraphQL, see:
- [Digital Ocean - Introduction to GraphQL](https://www.digitalocean.com/community/tutorials/an-introduction-to-graphql).
- [Kadaster GraphQL](https://labs.kadaster.nl/developer/graphql/).
- [Workshop spatial graphql](https://github.com/lcalisto/workshop-spatial-graphql).
- [Kadaster graphQL](https://labs.kadaster.nl/developer/graphql/).
- [Learn graphql queries](https://graphql.org/learn/queries/).
- [Learn GraphQL queries](https://graphql.org/learn/queries/).
- [GraphQL queries](https://hasura.io/learn/graphql/intro-graphql/graphql-queries/).
- [Digital Ocean - Introduction to Graphql](https://www.digitalocean.com/community/tutorials/an-introduction-to-graphql).
- [GraphQL cheatsheet](https://devhints.io/graphql)
----------
## Requirements
In order to move forwards you do not need to have any extra tools apart from a browser.
In order to move forwards you do not need to have any extra tools apart from a web browser.
However, if your aim is to use this API in scripting then its advisable to have knowledge of at least one of the following languages:
However, if your aim is to use this API in scripting, then it is advisable to have knowledge of at least one of the following languages:
- Python
- R
......@@ -72,30 +72,30 @@ The WoSIS GraphQL API root endpoint can be found at:
https://graphql.isric.org/wosis/graphql
This is the main GraphQL root endpoint. If you are an advanced GraphQL user and you use a custom script or a GraphQL client this is what you should use.
This is the main GraphQL root endpoint. This is the endpoint to be used directly by applications and/or code scripts. If you are an advanced GraphQL user and you use a custom script or a GraphQL client this is what you should use.
Nonetheless, if you click on the above link using a browser you'll probably get the following error message:
Nonetheless, if you click on the above link using a web browser you'll probably get the following error message:
```json
{"errors":[{"message":"Only `POST` requests are allowed."}]}
```
This is expected because this GraphQL endpoint expects POST requests and not GET requests.
This is expected because this GraphQL endpoint expects POST requests and not GET requests, meaning that it cannot be used directly from a web browser.
To simplify use, we provide two [Web interfaces IDE's](#web-interfaces-ides) that can be used in a graphical way to explore and access data.
To allow use from a web browser, we provide two [Web interfaces IDE's](#web-interfaces-ides) that can be used in a graphical way to explore and access data.
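For scripts, the same root endpoint can be reached with a POST request. The sketch below is illustrative only: it uses Python's standard `urllib` module rather than any particular GraphQL client, the helper names `build_request` and `run_query` are hypothetical, and the query `{ __typename }` is a generic GraphQL meta-field request rather than a WoSIS-specific one.

```python
import json
import urllib.request

# Root endpoint quoted in the text above.
WOSIS_ENDPOINT = "https://graphql.isric.org/wosis/graphql"


def build_request(query, url=WOSIS_ENDPOINT):
    """Wrap a GraphQL query string in a JSON POST request (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, url=WOSIS_ENDPOINT, timeout=30):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(query, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `run_query("{ __typename }")` sends the smallest possible query; `__typename` is a meta-field defined by the GraphQL specification, so it does not depend on the WoSIS schema.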
### Web interfaces IDE's
We provide the following interactive in-browser GraphQL IDE's:
- https://graphql.isric.org/wosis/graphiql using [graphiql](https://github.com/graphql/graphiql) web interface _"interactive in-browser IDE"_
- https://graphql.isric.org/wosis/playground using [playground](https://github.com/graphql/graphql-playground) web interface IDE
- https://graphql.isric.org/wosis/graphiql using [**graphiql**](https://github.com/graphql/graphiql) web interface _"interactive in-browser IDE"_
- https://graphql.isric.org/wosis/playground using [**playground**](https://github.com/graphql/graphql-playground) web interface IDE
For the exercises in this master-class we will use graphiql, but you are free to use the one you prefer.
For the exercises in this master-class we will use **graphiql**, but you are free to use the one you prefer.
## Explore current schema
As per the current date the WoSIS graphQL schema is composed of __*Profiles*__ that contain __*Layers*__ and for each layer several __*measurementValues*__ can be found per property (e.g., PH).
The current WoSIS GraphQL schema is composed of __*Profiles*__ that contain __*Layers*__, and for each layer several __*measurementValues*__ can be found per soil observation (e.g., pH assessed in aqueous solution).
__For a given property, each layer can have one or more measurements (e.g., one layer with several samples.)__
......@@ -109,15 +109,15 @@ __For a given property, each layer can have one or more measurements (e.g., one
2) measurementValues R
3) measurementValues T
Please explore the current schema using graphiQL IDE.For this, follow this link https://graphql.isric.org/wosis/graphiql
Please explore the current schema using **graphiql** IDE. For this, follow this link [https://graphql.isric.org/wosis/graphiql](https://graphql.isric.org/wosis/graphiql).
You will be at the root:
- __*wosisLatestAttributes*__ - All current attributes (properties) distributed by WoSIS with the total number profiles and respective layers.
- __*wosisLatestAttributes*__ - All current attributes (observations) distributed by WoSIS with the total number of profiles and respective layers.
- __*wosisLatestLayers*__ - WoSIS layers, at this level you'll get all layers and respective measurements.
- __*wosisLatestProfiles*__ - WoSIS profiles; this is probably where you want to start, since it contains all levels of the WoSIS product (Profiles, Layers, measurements).
Using graphiQL interface please spend some time exploring the WoSIS schema.
Use the **graphiql** interface to spend some time exploring the WoSIS schema.
While expanding __*wosisLatestProfiles*__ we'll get the following:
......@@ -127,21 +127,21 @@ You will be at the root:
## Explore the documentation
One of the advantages of graphQL is the automatically generated documentation. In order to access the documentation in graphiQL click on the __*DOCS*__ button marked in red in image below.
One of the advantages of GraphQL is the automatically generated documentation. To access the documentation in **graphiql**, click on the __*DOCS*__ button marked in red in the image below.
![wosisLatestProfiles](./images/docs_button.jpg "Expanded wosisLatestProfiles")
Please spend some time exploring the documentation and try to familiarize yourself with the structure.
Please spend some time exploring the documentation and try to familiarise yourself with the structure.
The image below shows wosisLatestProfiles autogenerated documentation.
The image below shows documentation auto-generated for *wosisLatestProfiles*.
![wosisLatestProfiles](./images/wosisLatestProfiles_doc.jpg "wosisLatestProfiles Docs"){width=25%}
## First queries
Its now time to start exploring WoSIS data using queries.
It is now time to start exploring WoSIS data using queries.
- Get all WoSIS Latest Attributes
......@@ -156,7 +156,7 @@ query MyQuery {
}
}
```
The following query will return the following error:
The query above returns the following error:
```json
{
......@@ -174,7 +174,7 @@ The following query will return the following error:
}
```
In order to avoid overloading WoSIS API we must always use `first` in all our queries.
In order to avoid overloading the WoSIS API we must always use the parameter `first` in all our queries.
The correct way to write our query is:
......@@ -245,7 +245,7 @@ query MyQuery {
}
```
Please note that you can use the graphiQL IDE interface to easily create your queries. If you are a beginner, its recommended that you generate your queries via the user interface.
Please note that you can use the **graphiql** IDE to easily create your queries. If you are a beginner, it is recommended that you generate your queries via the user interface.
- Get __first 10 profiles__ and for each profile get also the __first 10 layers__:
......@@ -337,15 +337,15 @@ query MyQuery {
}
```
Probably at this point you have some empty results in `orgcValues` due to the fact that some layers do not have any organic carbon measurement.
Probably at this point you have some empty results in the `orgcValues` field due to the fact that some layers do not have any organic carbon measurement.
As exemplified, we can request all types of values (Silt; Sand; Organic carbon; pH etc.) but the more data we request the slower the query will be.
Exploratory queries without any filtering can be important as a first contact with the data, but at some point its recommended to apply filters.
Exploratory queries without any filtering can be important as a first contact with the data, but at some point it is recommended to apply filters.
## Filtering
Perhaps the main advantage of this graphQL API is the ability to filter data. In the majority of cases, a user may want to extract specific data; for this we will make use of Filtering capabilities.
Perhaps the main advantage of this GraphQL API is the ability to filter data. In the majority of cases, a user may want to extract specific data; for this we will make use of Filtering capabilities.
Before we start performing queries please spend some time exploring the filter object inside `wosisLatestProfiles` as shown in the image below:
......@@ -394,7 +394,7 @@ query MyQuery {
__*OR* & *AND*__
In the previous example we used the`in` operator, but the same `query` can be done using the`OR` operator:
In the previous example we used the `in` operator, but the same `query` can be made using the `OR` operator:
```graphql
query MyQuery {
......@@ -417,7 +417,7 @@ query MyQuery {
}
}
```
Please note that some operators (`AND`, `OR` etc.) expect an array as input `[]`
Please note that some operators (`AND`, `OR` etc.) expect an array as input (`[]`).
- Get __first 5 profiles__ with the respective __first 10 layers__ from country __Netherlands__ `AND` __with at least one layer__. In other words, we do not want profiles without layers in this query.
......@@ -545,15 +545,15 @@ query MyQuery {
```
## Using variables
In GraphQL we are able to use variables in our queries. This variables are important for:
In GraphQL we are able to use variables in our queries. Variables are important for:
- Scripting, in order to be able to interact with our script variables
- Ingest complex JSON objects into our query
- Make sure the query is easy to read
When using GraphiQL we have a query variables box. Inside this box we can add our variables in JSON format.
When using **graphiql** we have a query variables box. Inside this box we can add our variables in JSON format.
Lets demonstrate the usage of variable in the following queries:
Let us demonstrate the usage of variables in the following queries:
- Get __first 10 profiles__ from continent __Europe__
......@@ -585,7 +585,7 @@ query MyQuery($first:Int, $continent:String) {
}
```
In your GraphiQL you should have something as bellow image:
In your **graphiql** you should have something like the image below:
![variables](./images/variables.jpg "wosisLatestProfiles variables")
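In a script, the content of the query variables box becomes a plain JSON object. A minimal sketch, assuming the variable names `first` and `continent` from the query signature above and the values from the exercise (first 10 profiles, continent Europe); the helper `build_payload` is hypothetical and takes the query text unchanged:

```python
import json


def build_payload(query, **variables):
    """Return the JSON body for a GraphQL request with variables (hypothetical helper)."""
    return json.dumps({"query": query, "variables": variables})


# Script values that would otherwise be typed into the query variables box.
script_variables = {"first": 10, "continent": "Europe"}

# The same JSON text that goes into the query variables box.
variables_box = json.dumps(script_variables, indent=2)
print(variables_box)
```

Keeping the values in a dictionary means the query string itself never has to be edited when the script needs a different continent or page size.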
......@@ -626,7 +626,7 @@ In the next chapter we'll make use of variables to better provide JSON component
This API has spatial capabilities. It is possible to perform several __spatial queries__ and apply __spatial filters__. Spatial components are GeoJSON-based.
In order to use spatial queries, we'll lets use 2 geometries of Gelderland in GeoJSON format.
In order to use spatial queries, we'll use 2 geometries of Gelderland in GeoJSON format.
You can use https://geojson.io to visualize, create and update GeoJSON geometries.
......@@ -646,7 +646,7 @@ You can use https://geojson.io to visualize, create and update GeoJSON geometrie
2) Points (3 points) in Gelderland in Geojson format:
2) Points (3) in Gelderland in GeoJSON format:
```json
{
......@@ -707,11 +707,13 @@ Inside `Query variables` add the `geomGelderland` variable:
}
```
Example on what you should see in GraphiQL:
Example of what you should see in **graphiql**:
![geomGelderlandExample1](./images/geomGelderland_example1.jpg "geomGelderlandExample1")
The __GEOM object__ corresponds to the geometry. Please spend some time exploring this object in GraphiQL interface. Make sure you explore the `Filter` capabilities.
The __GEOM object__ corresponds to the geometry. Please spend some time exploring this object in the **graphiql** interface. Make sure you explore the `Filter` capabilities.
- Using the previous query change the `query variables` to the points geometry:
......@@ -756,7 +758,7 @@ query MyQuery($geomGelderland: GeoJSON!) {
## Pagination concepts
Depending on the way you create your query it can evolve hight computational resources. Besides, if not using pagination you could easily create a query that returned a huge records with all the problems that brings.
Depending on the way you create your query, it can involve high computational resources. Besides, without pagination you could easily create a query that returns a huge number of records, with all the problems that brings.
To solve this problem __we enforce pagination in this GraphQL API__.
......@@ -772,14 +774,14 @@ __The `Offset:` argument__
`Offset` is an optional argument that indicates *where in the list the server should start when returning items* for a particular query.
This arguments `first` and `offset` are extremely important when you need to extract and download data.
The arguments `first` and `offset` are extremely important when you need to extract and download data.
We'll make use of pagination in our scripts. We'll show how to use pagination to extract a considerable amount of data from WoSIS using this GraphQL API.
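A paging loop over `first` and `offset` can be sketched as follows. This is an illustration under stated assumptions, not the script used in this master-class: `fetch_page` is a hypothetical callable that runs one query with the given `first` and `offset` values and returns the list of records, and the loop stops when a page comes back shorter than requested.

```python
def iter_records(fetch_page, page_size=100, max_records=None):
    """Yield records page by page using `first`/`offset` style pagination."""
    offset = 0
    yielded = 0
    while max_records is None or yielded < max_records:
        page = fetch_page(first=page_size, offset=offset)
        for record in page:
            if max_records is not None and yielded >= max_records:
                return
            yield record
            yielded += 1
        # A short (or empty) page means the server has no more items.
        if len(page) < page_size:
            return
        offset += page_size


# Stand-in for a real request: serves slices of an in-memory list.
def fake_fetch_page(first, offset, _data=tuple(range(250))):
    return list(_data[offset:offset + first])


records = list(iter_records(fake_fetch_page, page_size=100))
print(len(records))  # 250
```

Passing the fetch function in as an argument keeps the loop independent of any particular query, so the same generator can be reused for profiles or layers.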
## Scripting
### Python examples
The simplest way to perform a graphQL request in python is to use requests.
The simplest way to perform a GraphQL request in Python is to use the `requests` package.
- Get the __first 5 profiles__ and add them to a Pandas dataframe:
......@@ -942,13 +944,13 @@ df.to_csv('wosis_gelderland.csv', index=False)
```
The result will be:
`There are 136 WoSIS profiles with layers inside Gelderland region`
`There are 136 WoSIS profiles with layers inside the Gelderland region`
CSV result file can be found [here](./scripts/python/wosis_gelderland.csv)
The CSV result file can be found [here](./scripts/python/wosis_gelderland.csv)
### R examples
The simplest way to perform a graphQL request in r is to use {httr}.
The simplest way to perform a GraphQL request in R is to use {httr}.
- Get the __first 5 profiles__ and add them to a data frame:
......