Commit 25d13bf7 authored by Paul van Genuchten

restore gitlab config

parent 9c9f8f45
Pipeline #70809 passed
# GitHub synchronises to WUR GitLab, where the rest of the CI/CD is managed
default:
  interruptible: false
stages:
  - metadata
# metadata
metadata:
  image: harbor.containers.wurnet.nl/isric/pygeodatacrawler:1.0.12
  stage: metadata
  script:
    # init database if it does not exist
    #- pip install sqlalchemy
    #- pycsw-admin.py setup-db -c ./pycsw.cfg
    # convert yml to iso (only changed?)
    - export pgdc_webdav_url=https://catalogue.ejpsoil.eu/collections/metadata:main/items
    - export pgdc_schema_path=/pyGeoDataCrawler/geodatacrawler/schemas
    - cd /pyGeoDataCrawler
    - poetry run crawl-metadata --dir=$CI_PROJECT_DIR/datasets --mode=export --dir-out=/tmp
    # upload iso to pycsw (remove first, only changed?)
    - pycsw-admin.py delete-records --config=$CI_PROJECT_DIR/pycsw.cfg -y
    - pycsw-admin.py load-records --config=$CI_PROJECT_DIR/pycsw.cfg --path=/tmp
    # also load other xml files
    - pycsw-admin.py load-records --config=$CI_PROJECT_DIR/pycsw.cfg --path=$CI_PROJECT_DIR/datasets -r
  when: on_success
  only:
    - main
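The job above runs four commands in sequence: export the MCF files to ISO XML, clear the catalogue, load the exported records, then recursively load any other XML files from the datasets folder. A minimal Python sketch of that ordering follows, for reasoning about or locally dry-running the job. The function name `metadata_job_commands` and the `/builds/demo` path are hypothetical; the command lines themselves are copied from the script section. In the CI job the first command runs from `/pyGeoDataCrawler`.

```python
from pathlib import PurePosixPath


def metadata_job_commands(project_dir: str, out_dir: str = "/tmp") -> list[list[str]]:
    """Return the job's commands, in order, as argument lists (hypothetical helper)."""
    cfg = str(PurePosixPath(project_dir) / "pycsw.cfg")
    datasets = str(PurePosixPath(project_dir) / "datasets")
    return [
        # 1. convert the MCF (yml) files to ISO XML in out_dir
        ["poetry", "run", "crawl-metadata", f"--dir={datasets}",
         "--mode=export", f"--dir-out={out_dir}"],
        # 2. remove all existing records from the catalogue
        ["pycsw-admin.py", "delete-records", f"--config={cfg}", "-y"],
        # 3. load the freshly exported ISO records
        ["pycsw-admin.py", "load-records", f"--config={cfg}", f"--path={out_dir}"],
        # 4. recursively load any other XML files kept next to the datasets
        ["pycsw-admin.py", "load-records", f"--config={cfg}", f"--path={datasets}", "-r"],
    ]


if __name__ == "__main__":
    for command in metadata_job_commands("/builds/demo"):
        print(" ".join(command))
```

Because step 2 deletes every record before steps 3 and 4 reload them, a failure in a later step would leave the catalogue empty or partial until the next successful run.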
......@@ -4,6 +4,9 @@ This repository is a participative effort to collect and maintain a series of de
If a dataset is already described elsewhere ([INSPIRE](https://inspire-geoportal.ec.europa.eu/overview.html?view=themeOverview&theme=so), [OpenAire](https://explore.openaire.eu/search/find/dataproviders?fv0=soil&f0=q), ...), a reference should be made to the external source, so the metadata can be synchronised automatically. Use the metadata:dataseturi property to capture a reference to the remote document. For now we support either a DOI or an iso19139:2007 document.
The initial format for data submission is CSV, according to a predefined template. Records in the CSVs are converted to [MCF](https://github.com/geopython/pygeometa), an optimal format for content versioning in Git. The MCF is later transformed to iso19139, which can be ingested by common catalog products, such as [pycsw](https://pycsw.org).
The initial content of this repository has been collected as part of the [EJP Soil project](https://ejpsoil.eu/), which has run a number of [stock takes](https://ejpsoil.eu/fileadmin/projects/ejpsoil/WP2/Deliverable_2.2_Stocktaking_on_soil_quality_indicators_and_associated_decision_support_tools__including_ICT_tools.pdf) of available soil datasets in EU member states.
Dataset descriptions are stored in a format called [metadata control file](https://geopython.github.io/pygeometa/reference/mcf/) (MCF), which is a subset of [ISO19115](https://www.iso.org/standard/53798.html) encoded as [YAML](https://yaml.org/), an optimal encoding for content versioning. An online editor for MCF files is [MDME](https://osgeo.github.io/mdme).
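
To make the CSV-to-MCF step described above more concrete, here is a minimal sketch that maps one CSV row to an MCF-shaped dictionary. It is not the project's converter. The column names, the sample row, and the `row_to_mcf` helper are hypothetical, and the `dataseturi` value is a placeholder. Only the top-level MCF sections (`mcf`, `metadata`, `identification`) and the `metadata:dataseturi` property follow the text above and the pygeometa MCF layout.

```python
import csv
import io

# Hypothetical CSV template; the real template's columns may differ.
SAMPLE_CSV = (
    "identifier,title,abstract,dataseturi\n"
    "soil-001,Example soil map,Illustrative record only,https://example.org/remote-record\n"
)


def row_to_mcf(row: dict) -> dict:
    """Map one CSV row onto a small MCF-shaped dictionary (illustrative only)."""
    return {
        "mcf": {"version": 1.0},
        "metadata": {
            "identifier": row["identifier"],
            # reference to the remote description, if the dataset is described elsewhere
            "dataseturi": row["dataseturi"],
        },
        "identification": {
            "title": row["title"],
            "abstract": row["abstract"],
        },
    }


records = [row_to_mcf(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]

if __name__ == "__main__":
    for record in records:
        print(record)
```

Serialising each dictionary as YAML would give one small text file per dataset, which is what makes the format convenient to version in Git.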
@@ -6,8 +6,6 @@ identification:
abstract: A collection of soil datasets in Europe, initially collected by JRC, and extended by the EJP Soil project
robot:
mdUrlPattern: https://dev-ejpsoil-catalog.containers.wurnet.nl/collections/metadata:main/items/{0}
msUrl: https://dev-ejpsoil-mapserver.containers.wurnet.nl/
webdavUrl: https://dev-ejpsoil-webdav.containers.wurnet.nl/
mdUrlPattern: https://catalogue.ejpsoil.eu/collections/metadata:main/items/{0}
\ No newline at end of file
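
The `{0}` in `mdUrlPattern` reads as a positional placeholder. Assuming it is filled with a record identifier using Python-style string formatting (an assumption, as the diff does not show how the crawler expands it), the resulting record URL can be sketched as follows. The `record_url` helper and the identifier `abc-123` are hypothetical.

```python
from urllib.parse import quote

# Pattern taken from the updated configuration above.
MD_URL_PATTERN = "https://catalogue.ejpsoil.eu/collections/metadata:main/items/{0}"


def record_url(identifier: str, pattern: str = MD_URL_PATTERN) -> str:
    """Fill the positional placeholder with a URL-safe record identifier."""
    # quote() percent-encodes characters such as "/" that would otherwise alter the path
    return pattern.format(quote(identifier, safe=""))


if __name__ == "__main__":
    print(record_url("abc-123"))
```

With this change the generated links point at the production catalogue rather than the `dev-ejpsoil-catalog` host.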
[server]
home=/home/pycsw
url=https://catalogue.ejpsoil.eu/
mimetype=application/xml; charset=UTF-8
encoding=UTF-8
language=en-US
maxrecords=50
loglevel=DEBUG
logfile=
#ogc_schemas_base=http://foo
#federatedcatalogues=http://catalog.data.gov/csw
#pretty_print=true
#gzip_compresslevel=8
#domainquerytype=range
#domaincounts=true
#spatial_ranking=true
profiles=apiso
#workers=2
timeout=30
[manager]
transactions=false
allowed_ips=127.0.0.1
#csw_harvest_pagesize=10
[metadata:main]
identification_title=EJPSoil Data Catalogue
identification_abstract=The catalogue presents a series of datasets relevant to the EJPSoil project
identification_keywords=catalogue,discovery,metadata,soil
identification_keywords_type=theme
identification_fees=None
identification_accessconstraints=None
provider_name=ISRIC - World Soil Information
provider_url=https://www.isric.org
contact_name=Genuchten, Paul van
contact_position=SDI Specialist
contact_address=PO Box 353
contact_city=Wageningen
contact_stateorprovince=Gelderland
contact_postalcode=6700 AJ
contact_country=the Netherlands
contact_phone=
contact_fax=
contact_email=info@isric.org
contact_url=https://pub.orcid.org/v3.0/0000-0002-4789-174X
contact_hours=Hours of Service
contact_instructions=During hours of service. Off on weekends.
contact_role=pointOfContact
[repository]
# sqlite
#database=sqlite:////var/www/pycsw/tests/functionaltests/suites/cite/data/cite.db
# postgres
database=postgresql://ejpsoil:ejpsoil@postgres-generic-svc.isric-data-prod:5433/ejpsoil_catalog
# mysql
#database=mysql://username:password@localhost/pycsw?charset=utf8
#mappings=path/to/mappings.py
table=records
#filter=type = 'http://purl.org/dc/dcmitype/Dataset'
#max_retries=5
[metadata:inspire]
enabled=true
languages_supported=eng
default_language=eng
date=2023-06-01
gemet_keywords=Soil
conformity_service=notEvaluated
contact_name=Paul van Genuchten
contact_email=info@isric.org
temp_extent=
\ No newline at end of file
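
The configuration above uses plain INI syntax, so it can be inspected with Python's standard `configparser` before a deployment. The sketch below is a sanity check under stated assumptions: the excerpt is shortened, the database URL uses placeholder credentials and a placeholder host, and `summarize` is a hypothetical helper, not part of pycsw.

```python
import configparser
from urllib.parse import urlsplit

# Shortened excerpt with placeholder database credentials and host.
SAMPLE_CFG = """
[server]
url=https://catalogue.ejpsoil.eu/
maxrecords=50
loglevel=DEBUG
[manager]
transactions=false
[repository]
database=postgresql://user:secret@db.example:5433/catalog
table=records
"""


def summarize(cfg_text: str) -> dict:
    """Extract a few settings worth checking before deployment (illustrative)."""
    parser = configparser.ConfigParser(interpolation=None)  # keep any '%' literal
    parser.read_string(cfg_text)
    db = urlsplit(parser.get("repository", "database"))
    return {
        "url": parser.get("server", "url"),
        "maxrecords": parser.getint("server", "maxrecords"),
        "loglevel": parser.get("server", "loglevel"),
        "transactions": parser.getboolean("manager", "transactions"),
        "db_host": db.hostname,
        "db_port": db.port,
        "db_name": db.path.lstrip("/"),
        "table": parser.get("repository", "table"),
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_CFG).items():
        print(f"{key}: {value}")
```

A check like this makes it easy to spot settings such as `loglevel=DEBUG` or a non-default database port before the file reaches the production catalogue.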