Commit 194c4e29 authored by Hans van den Heuvel

Fixed bug for specs Waldo

parent 53541e00
@@ -229,8 +229,9 @@ if dataset.food_composition.sheet is not None:
# We also have to do the food_composition translation
# First remove all but keep the P-code data
# Also use shorter name:
-fcs = dataset.food_composition.sheet[
-dataset.food_composition.sheet['idToFood'].str.startswith('P')]
+fcs = dataset.food_composition.sheet[(
+dataset.food_composition.sheet['idToFood'].str.startswith('P') &
+dataset.food_composition.sheet['idFromFood'].str.contains('-'))]
# Now split the first column
fs = pd.DataFrame()
# Bit of a mess, to combine again.
@@ -247,7 +248,8 @@ if dataset.food_composition.sheet is not None:
left_on='idFoodProcessed', right_on='idToFood-PC',
how='left').assign()
efsa_combined.loc[
-(efsa_combined['idToFood-PC'].notna()),
+(efsa_combined['idToFood-PC'].notna() &
+efsa_combined['idFoodProcessed'].str.contains('-')),
'idFoodProcessed'] = efsa_combined['idFromFood']
#############################################################################
@@ -55,7 +55,7 @@ The following is happening in the script, essentially
* Then the script will try to match both the ``FromFX`` and ``FXToRpc`` column of [FoodTranslations.csv](FoodTranslations.csv) with the columns ``Matrix FoodEx2 Code`` and ``Matrix Code`` from the EU sheet, *for all rows that didn't already match in the previous step*. If a match was found, then the value of ``FXToProcType`` will be copied to ``idProcessingType``.
* If no substance file was given, then just copy the field ``ParamCode Active Substance`` to ``idSubstance``. But if a substance file was given, then strip the dash from the ``CASNumber`` column in the substance file, and match the column ``ParamCode Active Substance`` in the EFSA sheet to ``code`` in the substances sheet. If a match was found then copy the modified (without dash) ``CASNumber`` to ``idSubstance``.
* If a foodcompositions file was given, then an additional translation is done. This table needs to have the layout of the MCRA FoodComposition.
-* All records of ``idToFood`` starting with ``P`` will be deleted (in memory, not on disk)
+* Only records of ``idToFood`` starting with ``P`` and ``idFromFood`` which contain a dash (-) will be used
* The ``idFromFood`` column is split on the dash (-)
* A new column is temporarily added combining ``idToFood`` and the right part of the split on ``idFromFood``
* For all matches of the new column with the field ``idFoodProcessed`` in ``ProcessingFactors``, the field ``idFoodProcessed`` will be replaced by the field ``idFromFood`` from the FoodComposition table, and duplicates will also be added
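
The food composition steps above can be sketched in pandas as follows. This is a minimal illustration and not the script itself. The sample rows, the ``factor`` column, and the variable names are made up, and it assumes the temporary column joins ``idToFood`` and the right part of ``idFromFood`` with a dash.

```python
import pandas as pd

# Hypothetical FoodComposition rows (only the two columns named in the text).
food_composition = pd.DataFrame({
    'idFromFood': ['A-10', 'B', 'C-20', 'D-30'],
    'idToFood': ['P1', 'P2', 'X3', 'P4'],
})

# Hypothetical ProcessingFactors rows; 'factor' is an illustrative column.
processing_factors = pd.DataFrame({
    'idFoodProcessed': ['P1-10', 'P4-30', 'P9-99'],
    'factor': [0.5, 0.8, 1.0],
})

# Keep only rows where idToFood starts with 'P' and idFromFood has a dash.
mask = (food_composition['idToFood'].str.startswith('P')
        & food_composition['idFromFood'].str.contains('-', regex=False))
fcs = food_composition[mask].copy()

# Split idFromFood on the first dash; column 1 holds the right-hand part.
parts = fcs['idFromFood'].str.split('-', n=1, expand=True)

# Temporary key (assumption: idToFood and the right part joined by a dash).
fcs['idToFood-PC'] = fcs['idToFood'] + '-' + parts[1]

# Left merge keeps every ProcessingFactors row; unmatched rows get NaN keys.
combined = processing_factors.merge(
    fcs, left_on='idFoodProcessed', right_on='idToFood-PC', how='left')

# Replace idFoodProcessed by idFromFood, but only where a match was found
# and the original idFoodProcessed contains a dash.
matched = (combined['idToFood-PC'].notna()
           & combined['idFoodProcessed'].str.contains('-', regex=False))
combined.loc[matched, 'idFoodProcessed'] = combined.loc[matched, 'idFromFood']

result = combined[['idFoodProcessed', 'factor']]
print(result)
```

Because the merge is a left join, a ``ProcessingFactors`` row that matches several FoodComposition rows appears once per match, which is how the duplicates mentioned above arise.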