    ed9af2ac
    Squashed commit of the following: · ed9af2ac
    Berg, Anne-Wil van den authored
    commit 46830548
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Mon Sep 23 11:29:52 2024 +0200
    
        scripts for postprocessing of fluxes and to create posterior input
    
    commit 67d41396
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Mon Sep 23 11:08:53 2024 +0200
    
        updates; bug fixes
    
    commit 9b085532
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Mon Sep 23 11:07:20 2024 +0200
    
        no tm5mp clone
    
    commit f43b1e20
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri Jun 21 14:22:27 2024 +0200
    
        added wrappers around flux and obspack output
    
    commit 70df24bf
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri Jun 7 16:27:38 2024 +0200
    
        tm5 datasets instructions added
    
    commit 79823a55
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Tue Jun 4 09:55:43 2024 +0200
    
        division of temporal and regional descriptions
    
    commit 4513addf
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Mon Jun 3 19:16:46 2024 +0200
    
        results had to be a list
    
    commit 8eabba8a
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Mon Jun 3 15:45:07 2024 +0200
    
        chartostring not needed; done in io4.py
    
    commit 6d62c901
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Mon Jun 3 14:44:49 2024 +0200
    
        start of statevector hacking tutorial
    
    commit c035192e
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 31 11:12:15 2024 +0200
    
        np.zeros instead of zeros
    
    commit f54c3fe4
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 31 10:37:55 2024 +0200
    
        transcom data added
    
    commit 094212ab
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 31 10:32:37 2024 +0200
    
        more docu
    
    commit a6734e87
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 31 10:32:23 2024 +0200
    
        addded tools_transcom that doesn't read uppon import
    
    commit 85bdf7f8
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 30 17:23:08 2024 +0200
    
        ctdata tutorial added
    
    commit 91e99f7a
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 30 17:00:31 2024 +0200
    
        better explanation of ctecc settings
    
    commit f1a47e72
    Merge: 06484d5d 18ee746d
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 30 16:14:59 2024 +0200
    
        Tutorial devs from joram local and snellius
        Merge branch 'merge_ctecc_master' of git.wur.nl:ctdas/CTDAS into merge_ctecc_master
    
    commit 06484d5d
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 30 16:14:56 2024 +0200
    
        tutorial update
    
    commit 18ee746d
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 30 13:03:34 2024 +0200
    
        added tutorial.md
    
    commit 2af1610a
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 30 13:00:21 2024 +0200
    
        posterior fluxes for NH with ctecc
    
    commit 3bfa142b
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Tue May 28 08:58:28 2024 +0200
    
        NH gridded/zoom/gvp compatible with ctecc pipeline
    
    commit 8c94ab42
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 24 14:28:50 2024 +0200
    
        updates from snellius test
    
    commit d0ba3c78
    Merge: c8063ed0 27a65324
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 23 11:42:11 2024 +0200
    
        Merge branch 'merge_ctecc_master' of git.wur.nl:ctdas/CTDAS into merge_ctecc_master
    
    commit c8063ed0
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 23 11:42:01 2024 +0200
    
        updated regions with additional function to create latitude banded regions
    
    commit 27a65324
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 23 11:25:26 2024 +0200
    
        obs_gvp_co2 NHgridded compatible with ctecc pipeline
    
    commit c75b58a2
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Tue May 21 12:33:51 2024 +0200
    
        first steps of the tutorial
    
    commit 836d42e1
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Tue May 21 12:28:00 2024 +0200
    
        platform automatically added by start_ctdas script
    
    commit a283671d
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Tue May 21 12:02:43 2024 +0200
    
        added all templates with correct filepermissions
    
    commit 5877564b
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 17 16:49:20 2024 +0200
    
        old file removed
    
    commit 49836dd6
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 17 16:39:30 2024 +0200
    
        working version of both pipelines
    
    commit 1b8399fb
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Fri May 17 10:20:58 2024 +0200
    
        unified template system and start script
    
    commit 0f3ad128
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 16 16:26:37 2024 +0200
    
        ctecc new objects
    
    commit 1adaccfb
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 16 16:16:10 2024 +0200
    
        reset outputdir before invert
    
    commit bfac11a1
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 16 15:06:45 2024 +0200
    
        Tools from ctecc containing addition functionality
    
    commit e156973f
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 16 14:31:45 2024 +0200
    
        k, v changed to key value which is more pythonic. also list creation omited since dict.items() returns an iterator and so in principle this faster
    
    commit a9e81650
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 16 14:30:34 2024 +0200
    
        adhoc fix to have a single cyclecontrol object for both CTECC and CTECO2
    
    commit 6765b2f2
    Author: JJDHooghiem <joramjd@gmail.com>
    Date:   Thu May 16 14:15:15 2024 +0200
    
        The command line syntax used is not compatible with 'sh' but only with bash. Opted to fix this by using bash as it is installed on moth systems anyway
    
    commit 87bc5d50
    Author: Ingrid Luijkx <ingrid.luijkx@wur.nl>
    Date:   Wed Jun 28 14:49:29 2023 +0200
    
        Added noise reduction scheme developed by Remco to the NH gridded statevector
    
    commit 6b366fbc
    Author: Ingrid Luijkx <ingrid.luijkx@wur.nl>
    Date:   Tue Aug 16 16:31:36 2022 +0200
    
        Fixed the sampling for the validation sites to all values, and not only representative hours. Plus small change to the preprocess observations pipeline.

Tutorial

Prerequisites: software and environment

CTDAS is mainly written in Python, so most Python 3 installations will work. Most of it relies only on the Python standard library; only a few additional packages need to be present:

  1. netCDF4
  2. numpy
  3. great_circle_calculator
  4. pyenkf

How you install these depends on your platform; see below. We also assume that you have access to the input data.
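A quick way to see which of the packages listed above are still missing is sketched below. It assumes that the import names are identical to the package names in the list, which may not hold for every installation.

```python
import importlib.util

# Assumption: the import names match the package names listed above.
REQUIRED_MODULES = ["netCDF4", "numpy", "great_circle_calculator", "pyenkf"]


def find_missing(modules):
    """Return the names in `modules` that the import system cannot find."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the same Python interpreter that you will use for CTDAS, otherwise the result says little about your actual environment.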

Snellius

Make sure that you are part of the ctdas and tm5meteo groups. You can check this by verifying that you can do:

cd /projects/0/ctdas/input/

or

cd /projects/0/tm5meteo/tm5-nc/

If you encounter permission errors, contact Joram.
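The same check can be done from Python, for example at the start of a script. This is only a sketch; the function name check_access is made up for this illustration and is not part of CTDAS.

```python
import os

# Paths taken from the text above.
SNELLIUS_PATHS = ["/projects/0/ctdas/input/", "/projects/0/tm5meteo/tm5-nc/"]


def check_access(paths):
    """Map each path to True if it is a directory that we can enter and list."""
    return {
        path: os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)
        for path in paths
    }


if __name__ == "__main__":
    for path, ok in check_access(SNELLIUS_PATHS).items():
        print(f"{path}: {'ok' if ok else 'no access'}")
```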

Go and read:

/projects/0/ctdas/modules/README.md

CTDAS

To clone the CTDAS repository:

git clone https://git.wur.nl/ctdas/CTDAS 

For some applications additional software will be required, for example tm5 zoom or tm5 mp.

Starting a ctdas experiment.

Once the repository is cloned and the required components are installed, an experiment can be started with the 'start_ctdas.sh' script in the CTDAS folder:

./start_ctdas.sh <target_directory> <name> <example_template> <platform>

This will copy the repo to 'target_directory/name' and copy the example_template and platform jobfiles. It will set the proper paths in these templates. The currently supported platforms are snellius and aether; the supported example templates are ctecc and cteco2. If you want to know more about these files and their structure, you can check the contents of the files in the template directory. As an example, if you want to start a cteco2 experiment in /scratch-shared/username with the name test_ctdas on snellius, you would use:

./start_ctdas.sh /scratch-shared/username test_ctdas cteco2 snellius

After this, change directory to where the experiment was installed (the exec folder inside target_directory/name). In the above example this would be:

cd /scratch-shared/username/test_ctdas/exec

This is where the main program and the da library (the da directory) are located. Note that the da directory is a copy and is not linked to the location where you cloned CTDAS. This ensures that the source code is archived properly when you archive your experiment. The other important files are:

test_ctdas.py
test_ctdas.rc
test_ctdas.jb

They have been given the same name as your experiment by the start_ctdas.sh script. Let's open up these files and briefly look at their contents.
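The layout described above can be summarized in a few lines of Python. This is a sketch that only derives the expected paths from the arguments of start_ctdas.sh; the function name experiment_layout is hypothetical, and the sketch does not run the start script.

```python
import os

# Values listed in the text above.
SUPPORTED_PLATFORMS = ("snellius", "aether")
SUPPORTED_TEMPLATES = ("ctecc", "cteco2")


def experiment_layout(target_directory, name, template, platform):
    """Return the exec directory and the three experiment files inside it."""
    if template not in SUPPORTED_TEMPLATES:
        raise ValueError(f"unsupported template: {template}")
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError(f"unsupported platform: {platform}")
    exec_dir = os.path.join(target_directory, name, "exec")
    files = [os.path.join(exec_dir, f"{name}.{ext}") for ext in ("py", "rc", "jb")]
    return exec_dir, files


exec_dir, files = experiment_layout(
    "/scratch-shared/username", "test_ctdas", "cteco2", "snellius"
)
print(exec_dir)
for path in files:
    print(path)
```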

test_ctdas.py

This is the main program that is started when we run the experiment. Before you start, you want to make sure that the correct da objects are imported. There are 8 main objects:

  1. cyclecontrol
  2. pipelines
  3. dasystem
  4. platform
  5. observations
  6. optimizer
  7. obsoperators
  8. statevectors

In general you might want to change the observations, statevectors and obsoperator objects. The platform is already set by the starter script. Currently only a single cyclecontrol and a single dasystem object exist. Changing the optimizer is something for advanced users and is not important at this point. Depending on the template you chose (cteco2 or ctecc) you have various options. Currently, in both cases, the only choice concerns the observation operator: random versus tm5-zoom (cteco2, ctecc) or tm5-mp (ctecc). More on valid combinations can be found in the section 'Valid object combinations'.

test_ctdas.rc

This file holds several options that control how the experiment is performed. The general format is derived from an Xresources file (X11 windowing system), and each line can be thought of as a key-value pair:

some.key    : some.value

It is internally converted into a Python dictionary. Simply go over the options and check them. They should be self-explanatory or documented in the rc file itself.
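To make the key-value idea concrete, the sketch below turns rc-style text into a dictionary. It is not the parser that CTDAS uses; in particular, treating lines that start with '!' or '#' as comments is an assumption made for this example.

```python
def parse_rc(text):
    """Parse 'key : value' lines into a dictionary of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '!' or '#' are skipped.
        if not line or line[0] in "!#":
            continue
        if ":" not in line:
            continue
        key, value = line.split(":", 1)  # split on the first colon only
        settings[key.strip()] = value.strip()
    return settings


example = """
! a comment
some.key    : some.value
da.system.rc        : da/rc/ctecc/ctecc.rc
"""
print(parse_rc(example))
# {'some.key': 'some.value', 'da.system.rc': 'da/rc/ctecc/ctecc.rc'}
```

All values stay strings in this sketch; converting them to numbers, dates or booleans is left to the caller.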

One parameter that should currently be checked is:

da.system.rc        :

Depending on whether you chose ctecc or cteco2 as the experiment to test with, it can be either

da/rc/cteco2/carbontracker_gcp2021_insitu.rc

or

da/rc/ctecc/ctecc.rc

Note: if you choose to run the cteco2 objects (see below) with the ctecc pipeline, still opt for ctecc.rc.
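The choice can be written down as a small lookup, keyed on the pipeline rather than on the objects. The dictionary and function below are illustrative only and do not exist in CTDAS.

```python
# Paths as given in the text above; the dictionary itself is illustrative.
DASYSTEM_RC = {
    "cteco2": "da/rc/cteco2/carbontracker_gcp2021_insitu.rc",
    "ctecc": "da/rc/ctecc/ctecc.rc",
}


def dasystem_rc_for(pipeline):
    """Return the da.system.rc value for the pipeline ('ctecc' or 'cteco2')."""
    return DASYSTEM_RC[pipeline]


# cteco2 objects run with the ctecc pipeline still use the ctecc rc file.
print(dasystem_rc_for("ctecc"))
```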

dasystem rc

The dasystem rc file you have chosen above contains additional settings for the data assimilation. Open it and again read through the settings. One thing that currently needs to be done manually is setting datadir, which specifies the location of the ctdas input data. On snellius this is

/projects/0/ctdas/input/ctdas_2012

and on aether this is

/mnt/beegfs/user/gkoren/ctdas_2012

Several settings here have sane defaults and do not need to be touched, with the exception of the obspack settings. A small test rc file is turned on by default, as processing observations can take a while.

test_ctdas.jb

This is the job file we submit to the batch system. For an introduction to jobfiles, you can find many tutorials online, for example the one from snellius. In general you want to pay attention to the amount of resources requested:

#SBATCH -n 90 # set this to how many you want

starting the experiment

The experiment can be started with:

sbatch test_ctdas.jb

This will submit it to the job queue. Once it starts running it will produce a test_ctdas.log text file that you can monitor with:

tail -f test_ctdas.log

which will continuously output the contents of the file to the terminal. Note that it sometimes takes a while for this file to appear.

Valid object combinations

Not all objects can work in concert, because the protocol for inter-object communication is not standardized. Typically, an observation operator works well with one statevector and a subset of the observation objects. Writing the interface implementation into the obsoperator can be a laborious task, and in addition it can benefit from hand optimization to minimize io and execution time, which makes it impractical to keep an obsoperator compatible with all objects.

The cteco2 pipeline works with the combination observationoperator_tm5_cteco2, obs_gvp_co2 and statevectorNHgridded. The observationoperator_tm5_cteco2 can also be substituted by the RandomMizedObservationOperator, which can be imported from the baseclass.

The above objects can also be run together with the ctecc template but the configuration is a bit different to achieve this. In the ctecc [ctecc dasystem rc] da/rc/ctecc/ctecc.rc under observations

known issues

  1. The number of processes is not always forwarded to the obsoperator installation, which can cause openmp or mpi errors. Make sure that the resources requested in the jobscript, in the rc file, and in the obsoperator are aligned.
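A small consistency check can catch this before submitting. The sketch below reads the task count from the text of a job file; how the counts for the rc file and the obsoperator are obtained depends on your setup, so they are simply passed in as integers here. Both function names are hypothetical.

```python
import re


def sbatch_ntasks(jobfile_text):
    """Return the integer given to '#SBATCH -n' (or --ntasks), or None if absent."""
    pattern = re.compile(
        r"^#SBATCH\s+(?:-n\s+|--ntasks[=\s]\s*)(\d+)", re.MULTILINE
    )
    match = pattern.search(jobfile_text)
    return int(match.group(1)) if match else None


def processes_aligned(jobfile_text, rc_processes, obsoperator_processes):
    """True if job file, rc file and obsoperator all request the same count."""
    return sbatch_ntasks(jobfile_text) == rc_processes == obsoperator_processes


job = "#SBATCH -n 90 # set this to how many you want\n"
print(sbatch_ntasks(job))              # 90
print(processes_aligned(job, 90, 45))  # False
```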

CTDATA

CTDAS ships with a data load wrapper around most of the datasets added in the datadir repo. This tool is found in:

da/ctdata/fluxload.py

To add a dataset, you have to

  1. create a dictionary with info about the file location and structure (explained below)
  2. add this dictionary to the ctefluxes dictionary found in the same file above
  3. add a similar entry in your obsoperator (this is described below for tm5-mp)

The dataset is then available to the fetch_data function. The first two steps are elaborated on in the next sections.

File format info (step 1 above)

The fetch_data function requires some information. First, it needs to be able to construct a filename. It does so from the fileroot entry, which is the location; then it uses strf to get the correct date identifier. Consider the entry below:

sib4biomean = {
    'fileroot': c13input + '/input/SiB4v2_246_split/daymean_',
    'strf': '%m-%d',
    'data_per_day': 1,
    'key': 'nee_daymean',
    'factor': 1.0,
    'ctname': 'co2_biom',
}

The filename is constructed as follows. For the date 01-01 it will produce:

c13input+'/input/SiB4v2_246_split/daymean_01-01.nc'
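The construction can be reproduced in a few lines of Python. This is a sketch of the idea and not the code in fluxload.py; the function build_filename and the value of c13input are placeholders chosen for this example.

```python
import datetime as dt

# Hypothetical root directory; in fluxload.py this comes from the c13input variable.
c13input = "/path/to/c13input"

sib4biomean = {
    "fileroot": c13input + "/input/SiB4v2_246_split/daymean_",
    "strf": "%m-%d",
    "data_per_day": 1,
    "key": "nee_daymean",
    "factor": 1.0,
    "ctname": "co2_biom",
}


def build_filename(entry, date):
    """Join the file root, the formatted date, and the .nc extension."""
    return entry["fileroot"] + date.strftime(entry["strf"]) + ".nc"


print(build_filename(sib4biomean, dt.datetime(2000, 1, 1)))
# /path/to/c13input/input/SiB4v2_246_split/daymean_01-01.nc
```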

Now that we have a file, it will be opened and a variable will be read from it. key is the name of the variable in the netcdf file. data_per_day tells how many entries per day the file has. factor multiplies the data in case a unit conversion is needed. ctname is the carbontracker output name, typically <species>_<flux>; examples are co2_ff, co_prod, co2c13_ocn, etc.

The output is structured as either (dates,data_per_day,lat,lon) or (dates,lat,lon). Please choose a variable name for the dictionary that fits the description and includes a version number:

new_flux_v1 = { .... 
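Before registering a new entry, it can help to check that it has the same keys as the sib4biomean example. The sketch below assumes that all six keys of that example are required, which the text does not state explicitly; the entry contents are made up for the illustration.

```python
# Assumption: every key used in the sib4biomean example is required.
REQUIRED_KEYS = ("fileroot", "strf", "data_per_day", "key", "factor", "ctname")


def missing_keys(entry):
    """Return the required keys that are absent from a flux entry dictionary."""
    return [key for key in REQUIRED_KEYS if key not in entry]


# Hypothetical new entry, used only to exercise the check.
new_flux_v1 = {
    "fileroot": "/path/to/new_flux/new_flux_",
    "strf": "%Y%m%d",
    "data_per_day": 8,
    "key": "flux",
    "factor": 1.0,
}
print(missing_keys(new_flux_v1))  # ['ctname']
```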

making it available (step 2 above)

Scroll down until you reach the ctefluxes dictionary definition and add the entry. By convention the key is the same as the variable name, so if your new flux is called new_flux_v1, add it like:

'new_flux_v1' : new_flux_v1

testing

Scroll down to the bottom; some lines of code are commented out as an example. Add your own lines of code to test loading, and run the python file:

# define some dates you want to test; they need to be continuous
dates=np.array([dt.datetime(2000,2,1) + dt.timedelta(i) for i in range(0,1)])
# load new_flux_v1
data=fetch_data('new_flux_v1',dates)
# some tests; can be whatever you want
print(np.shape(data))
print(np.mean(data,axis=0))

The file is then run with:

python da/ctdata/fluxload.py

adding a flux to tm5-mp

The tm5-mp ctdas interface has a flexible wrapper for reading data. It can handle both 2d datasets (i.e. surface emissions) and 3d datasets (i.e. atmospheric production terms). To add a dataset, go to your tm5-mp folder and open proj/enkf/base/read_data_definitions.F90. This file contains all the info that the reader needs to open and read data from a NetCDF file.

After studying the definition of the type read_dataset and its comments, shown at the top of the file, scroll down to the line:

  integer, parameter :: number_of_known_emissions = 50

and increase its value by the number of datasets you add.

Now have a look at an example dataset:

    read_datasets(2)%name = 'gfas_daily'
    read_datasets(2)%frequency = 'daily'
    read_datasets(2)%file_prefix = trim(ctdata)//'/fire/gfas_daily_1x1/gfas_1x1_'
    read_datasets(2)%varname = 'co2fire'
    read_datasets(2)%cdate = 'yyyymm'

The 2 within brackets means that this is entry number 2 of all datasets. The %<variable name> syntax accesses a component of the derived type (comparable to a class attribute). The name is used as the dataset name; for consistency it is nice to keep it the same as the ctdata entry in CTDAS. The frequency tells the reader how often the data should be read. To get an idea of the supported frequencies, study the other dataset definitions in this file. The file_prefix is the root of the file name. Together with cdate (character date), the reader uses it to construct the file name, in pseudo code:

file_prefix + cdate + file_postfix +.nc

Finally, to read the data, the reader uses varname to get the variable from the file. Based on frequency and cdate it determines how many timesteps are in a file. Other options can be used as well. Probably the most useful is unit_factor, for when the field needs to be multiplied by a constant (-1 for sib4 gpp, or about 83 (1000/12.011) to convert kg C to mol).
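The pseudo code and the unit factor can be mimicked in Python to check a new definition before touching the Fortran file. This is only a sketch: the reading of 'yyyymm' as four-digit year plus two-digit month, the empty default for file_postfix, and the value of ctdata are assumptions made for this example.

```python
import datetime as dt

MOLAR_MASS_C = 12.011  # g/mol


def expand_cdate(cdate, date):
    """Fill a 'yyyymm'-style template; the meaning of the tokens is assumed."""
    fmt = cdate.replace("yyyy", "%Y").replace("mm", "%m").replace("dd", "%d")
    return date.strftime(fmt)


def build_filename(file_prefix, cdate, date, file_postfix=""):
    """Mirror the pseudo code: file_prefix + cdate + file_postfix + .nc"""
    return file_prefix + expand_cdate(cdate, date) + file_postfix + ".nc"


# Hypothetical stand-in for the Fortran variable ctdata.
ctdata = "/path/to/ctdata"
print(build_filename(ctdata + "/fire/gfas_daily_1x1/gfas_1x1_", "yyyymm",
                     dt.date(2000, 2, 1)))
# /path/to/ctdata/fire/gfas_daily_1x1/gfas_1x1_200002.nc

# kg C -> mol C: 1000 g/kg divided by the molar mass of carbon.
print(round(1000.0 / MOLAR_MASS_C, 2))  # 83.26
```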

Now we can add datasets by example. Scroll down to the last available dataset in the file (which, in the above case of number_of_known_emissions = 50, has index 50), then copy, paste and modify one of the existing sets. Finally, change the index to the incremented value. So if you added dataset number 51, the index should be 51, as in: read_datasets(51)

A note on netCDF dimensions.

Fortran code is fast, but also dumb. Unlike in many high-level languages, the netCDF interface does not automatically recognise dimensions, axes and so on. It will simply read in a number of bytes and cast those into the variable. Hence something with the dimensions (time,lat,lon) can be read without error into (lon,lat,time). To make things more confusing, the dimension order that Fortran uses is often the opposite of the one used in Python. Without going into detail about why and how, a simple test is to use ncdump:

ncdump -h /projects/0/ctdas/input/ctdas_2012/biosphere/SiB4v2_BioFluxes_v247/SiB4v2_BioFluxes_v247_1x1_200001.nc

will output, among other header lines:

float reco(time, latitude, longitude)

which is correct. Also make sure that the fluxes are correctly gridded, as tm5 does not shift the longitudes if they start at 0.
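The dimension order issue can be illustrated with numpy alone, without a netCDF file. The sketch below uses a tiny made-up field; netCDF stores data with the last listed dimension varying fastest, while a Fortran array has its first index varying fastest.

```python
import numpy as np

ntime, nlat, nlon = 2, 3, 4

# A field stored as (time, latitude, longitude), as in the ncdump header above.
field = np.arange(ntime * nlat * nlon).reshape(ntime, nlat, nlon)

# The same values in the order they sit in the file (last index fastest).
flat = field.ravel(order="C")

# A column-major (Fortran-style) array sees these values correctly only when
# it is declared with the dimensions reversed: (lon, lat, time).
as_fortran = flat.reshape((nlon, nlat, ntime), order="F")
print(np.array_equal(as_fortran, field.transpose(2, 1, 0)))  # True

# Declaring (time, lat, lon) on the column-major side raises no error,
# but the values end up in the wrong cells.
wrong = flat.reshape((ntime, nlat, nlon), order="F")
print(np.array_equal(wrong, field))  # False
```

The second case is the dangerous one: the total number of values matches, so nothing fails, and the mistake only shows up when the field is plotted or compared to a reference.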

Object walkthrough

RandomMizedObservationOperator

The random observation operator currently only supports flask samples from obs_ctecc (co2 only) and obs_gvp_co2. The reason for this is that everything read in by add_simulations has to be written out by the observation operator. The best way to deal with this would be to make the randomized observation generation part of the observation object, so that the randomized observation operator can call sample.generate_random, whose output will be compatible with sample.add_simulations.

CycleControl

A dictionary that contains info and methods for running a Kalman smoother.

Obspack

ctecc obspack

rc

The rc format is slightly different. It requires an additional field that specifies whether the site is assimilated or not:

co2_aao_aircraft-pfp_1_allvalid                    : do-not-use   noaa       2006-06-07   2009-09-18   T T

instead of:

co2_aao_aircraft-pfp_1_allvalid                    : do-not-use   noaa       2006-06-07   2009-09-18   T
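To tell the two formats apart in a script, the fields after the colon can be counted. This is a sketch with hypothetical function names; it does not interpret the individual fields, since the text only states that the ctecc format carries one additional field.

```python
def parse_site_line(line):
    """Split an obspack rc line into the site name and its fields."""
    name, _, rest = line.partition(":")
    return name.strip(), rest.split()


def is_ctecc_format(line):
    """True when the line carries the extra sixth field of the ctecc format."""
    _, fields = parse_site_line(line)
    return len(fields) == 6


ctecc_line = ("co2_aao_aircraft-pfp_1_allvalid : "
              "do-not-use   noaa   2006-06-07   2009-09-18   T T")
other_line = ("co2_aao_aircraft-pfp_1_allvalid : "
              "do-not-use   noaa   2006-06-07   2009-09-18   T")
print(is_ctecc_format(ctecc_line), is_ctecc_format(other_line))  # True False
```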

Statevector hacking

LW

    def __init__(self, variance=1, nparams_per_region=None, cov_t_length=0,
                 cyclic=False, water=True, nmembers=150, mean=0,
                 regionsfile='/projects/0/ctdas/input/ctdas_2012/regions.nc',
                 regions='regions', constraint=None, datetime=None, freq='D',
                 cov_region_length=None, limit_to_transcom=True,
                 region_covariance_matrix=None, startdate=None, enddate=None,
                 func=None, fluxname=None, griddedtc=None,
                 gridded_length_scales=None, within_tcregions=None,
                 length_scales=None, nlag=1, nsmooth=1, lw=True, coupled=[],
                 limit_region_unc=False, limit_factor=1.0, bboxes=None,
                 applyregionmask=True, griddictfunc=None):

Region settings

water, regionsfile, regions, cov_region_length, limit_to_transcom, region_covariance_matrix, griddedtc, gridded_length_scales, within_tcregions, length_scales, applyregionmask, griddictfunc

Temporal settings