Commit 012f05f2 authored by Sven Warris

ran codebaseai

parent 5a083235
Showing with 2355 additions and 322 deletions
# F500 Data Analytics

## Project Description
The F500 Data Analytics project provides a suite of tools and scripts for processing, analyzing, and visualizing point cloud data, particularly from Phenospex PLY PointCount files. The project leverages libraries such as Open3D and NumPy to handle 3D data and perform operations like NDVI computation and visualization. Additionally, it includes functionalities for interacting with the Fairdom SEEK API to manage data resources.
## Table of Contents
- [Installation Instructions](#installation-instructions)
- [Usage Guide](#usage-guide)
- [Features](#features)
- [Modules Overview](#modules-overview)
- [Configuration & Customization](#configuration--customization)
- [Testing & Debugging](#testing--debugging)
- [Contributing Guide](#contributing-guide)
- [License & Author Information](#license--author-information)
## Installation Instructions
To set up the project, ensure you have Python installed on your system. Then, install the required dependencies using pip:
```bash
pip install open3d numpy requests pandas isatools azure-storage-blob
```
## Usage Guide
### Visualizing Point Cloud Data
To visualize point cloud data and compute NDVI, use the `visualization_ply.py` script:
```bash
python analytics/visualizations/visualization_ply.py <path_to_ply_file>
```
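Under the hood, this kind of workflow can also be driven from Python using the `PointCloud` helper class included in this commit. A minimal sketch, assuming a local Phenospex PLY file (the file and sample names below are placeholders):

```python
# Minimal sketch: compute NDVI from a PlantEye PLY file and write a histogram.
# Assumes the PLY carries the Phenospex wavelength channels used by PointCloud.
from PointCloud import PointCloud

pc = PointCloud("example_plant.ply")              # wraps o3d.io.read_point_cloud
ndvi = pc.get_ndvi()                              # NDVI = (NIR - RED) / (NIR + RED)
pc.writeHistogram(ndvi, "ndvi_histogram.csv",     # CSV with bin edges and counts
                  timepoint="2024-01-01", sampleName="pot_A1",
                  bins=50, dataRange=(-1.0, 1.0))
```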
### Deleting Resources via Fairdom SEEK API
To delete resources from a Fairdom SEEK server, use the `deleteFAIRObject.py` script:
```bash
python analytics/f500/collecting/deleteFAIRObject.py <token>
```
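The script authenticates with a token passed in the request headers, in the same style as the `Fairdom` upload class later in this commit. A minimal sketch of such a call; the server URL and resource path are placeholders, and the exact endpoint depends on the resource type being deleted:

```python
# Minimal sketch: token-authenticated DELETE against a SEEK-style JSON API.
import requests

session = requests.Session()
session.headers.update({
    "Content-type": "application/vnd.api+json",
    "Accept": "application/vnd.api+json",
    "Authorization": "Token {}".format("<token>"),
})
response = session.delete("https://seek.example.org/data_files/123")  # hypothetical resource
response.raise_for_status()
```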
### F500 Toolkit
The `toolkit.py` script provides a command-line interface for various data processing tasks:
```bash
python analytics/f500/collecting/toolkit.py <command>
```
Available commands include (a sketch of the subcommand wiring follows this list):
- `restructure`
- `pointclouds`
- `verify`
- `histogram`
- `upload`
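Each command is implemented as a separate argparse subparser in `F500.commandLineInterface`, which then dispatches to the corresponding method. A minimal sketch of that pattern, limited to the `restructure` options visible in this commit:

```python
# Minimal sketch of the subcommand dispatch used by toolkit.py.
import argparse

parser = argparse.ArgumentParser(description='F500 PlantEye data processing tool.')
sub_parsers = parser.add_subparsers(dest="command")
restructure = sub_parsers.add_parser("restructure")
restructure.add_argument('--loglevel', default="INFO", help="Application log level (INFO/WARN/ERROR)")
restructure.add_argument('--logfile', help="Application log file")

args = parser.parse_args(["restructure", "--loglevel", "DEBUG"])
print(args.command, args.loglevel)  # -> restructure DEBUG
```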
## Features
- **Point Cloud Visualization**: Visualize and process 3D point cloud data.
- **NDVI Computation**: Compute and visualize NDVI from point cloud data.
- **Resource Management**: Interact with Fairdom SEEK API to manage data resources.
- **Data Processing**: Restructure, verify, and upload data using the F500 toolkit.
## Modules Overview
- **visualization_ply.py**: Handles point cloud visualization and NDVI computation.
- **clearWhites.py**: Loads point cloud data for further processing.
- **deleteFAIRObject.py**: Deletes resources from Fairdom SEEK server.
- **toolkit.py**: Command-line interface for F500 data processing tasks.
## Configuration & Customization
- **API Token**: Ensure you have a valid authorization token for accessing the Fairdom SEEK API.
- **File Paths**: Provide correct paths to PLY files when using visualization scripts.
## Testing & Debugging
- **Error Handling**: Scripts include basic error handling for file paths and API requests.
- **Future Work**: Consider adding more robust error handling and automated tests for critical functionalities.
## Contributing Guide
Contributions are welcome! Please follow these steps:
1. Fork the repository.
2. Create a new branch for your feature or bug fix.
3. Commit your changes with clear messages.
4. Push your changes to your fork.
5. Submit a pull request with a detailed description of your changes.
## License & Author Information
This project is licensed under the MIT License. For more information, see the LICENSE file.
Author: Sven Warris
---
This README provides a comprehensive overview of the F500 Data Analytics project, including installation, usage, and contribution guidelines. Feel free to update sections as the project evolves.
""" """
ISA & isamodel This script is a data processing tool for F500 PlantEye data. It provides functionalities to restructure raw data, process point clouds, combine histograms, and upload data to a specified platform. The script uses the ISA model for data representation and supports command-line interfaces for different operations.
https://isa-specs.readthedocs.io/en/latest/isamodel.html
Classes:
F500: A class to handle the processing of F500 PlantEye data, including restructuring, point cloud processing, histogram combination, and data upload.
Functions:
commandLineInterface: Sets up the command-line interface for the script.
setLogger: Configures the logging for the script.
removeAfterSpaceFromDataMatrix: Static method to clean up the 'DataMatrix' column in a DataFrame.
createISA: Initializes an ISA investigation object.
writeISAJSON: Writes the ISA investigation object to a JSON file.
copyPots: Static method to copy pot information from a reference DataFrame to a row.
measurementsToFile: Writes the measurements DataFrame to a file.
rawMeasurementsToFile: Static method to write raw measurements to a file.
addPointClouds: Static method to add point cloud file names to a row.
copyPointcloudFile: Static method to copy point cloud files to a specified location.
copyPlotPointcloudFile: Static method to copy plot point cloud files to a specified location.
createSample: Static method to create a sample object.
createAssay: Static method to create an assay object.
createAssayPlot: Static method to create an assay plot object.
correctDataMatrix: Corrects the 'DataMatrix' column in a row based on a reference DataFrame.
finalize: Finalizes the processing of measurements and creates assays.
getDirectoryListing: Returns a directory listing for a given root folder.
restructure: Restructures the raw data into an ISA-compliant format.
processPointclouds: Processes point cloud files and generates derived data.
combineHistograms: Combines histogram data from multiple assays into a single file.
upload: Uploads the processed data to a specified platform.
""" """
import argparse
import sys
import os
...@@ -28,6 +53,24 @@ import datetime
import string
class F500:
"""
A class to handle the processing of F500 PlantEye data, including restructuring, point cloud processing, histogram combination, and data upload.
Attributes:
description (defaultdict): A dictionary to store descriptions.
columnsToDrop (list): A list of columns to drop from the data.
ISA (dict): A dictionary to store ISA-related data.
datamatrix (list): A list to store data matrix information.
investigation (Investigation): An ISA investigation object.
checkAssayName (re.Pattern): A regex pattern to check assay names.
measurements (DataFrame): A DataFrame to store measurements.
currentFile (str): The current file being processed.
currentRoot (str): The current root directory being processed.
command (str): The command to execute.
assaysDone (set): A set to store completed assays.
samples (dict): A dictionary to store sample objects.
"""
description = defaultdict(str)
columnsToDrop = []
ISA = {}
...@@ -42,6 +85,9 @@ class F500:
samples = None
def __init__(self):
"""
Initializes the F500 object with default values and configurations.
"""
# Some columns contain the wrong data, remove those:
self.columnsToDrop = ["ndvi_aver","ndvi_bin0","ndvi_bin1","ndvi_bin2","ndvi_bin3","ndvi_bin4","ndvi_bin5",
"greenness_aver","greenness_bin0","greenness_bin1","greenness_bin2","greenness_bin3","greenness_bin4","greenness_bin5",
...@@ -55,11 +101,12 @@ class F500:
self.samples = {}
def commandLineInterface(self):
"""
Sets up the command-line interface for the script, defining arguments and subcommands.
"""
my_parser = argparse.ArgumentParser(description='F500 PlantEye data processing tool.')
sub_parsers = my_parser.add_subparsers(dest="command")
my_parser_restructure = sub_parsers.add_parser("restructure")
my_parser_restructure.add_argument('--loglevel', help="Application log level (INFO/WARN/ERROR)", default="INFO")
my_parser_restructure.add_argument('--logfile', help="Application log file")
...@@ -160,6 +207,9 @@ class F500:
def setLogger(self):
"""
Configures the logging for the script based on command-line arguments.
"""
self.logger = logging.getLogger("F500")
self.logger.setLevel(self.args.loglevel)
if len(str(self.args.logfile)) > 0:
...@@ -167,6 +217,15 @@ class F500:
@staticmethod
def removeAfterSpaceFromDataMatrix(row):
"""
Cleans up the 'DataMatrix' column in a DataFrame row by removing text after a space.
Args:
row (Series): A row from a DataFrame.
Returns:
Series: The modified row with cleaned 'DataMatrix' column.
"""
try:
row["DataMatrix"] = row["DataMatrix"].strip().split(" ")[0]
except:
...@@ -174,6 +233,9 @@ class F500:
return row
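# Example (hypothetical data): the cleaner is applied row-wise with pandas, e.g.
#   df = pandas.DataFrame({"DataMatrix": ["POT001 extra text", "POT002"]})
#   df = df.apply(F500.removeAfterSpaceFromDataMatrix, axis=1)
#   # df["DataMatrix"] is now ["POT001", "POT002"]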
def createISA(self):
"""
Initializes an ISA investigation object and sets up the study and metadata.
"""
# Create investigation
self.investigation = Investigation()
self.investigation.title = "_".join([self.datamatrix[4], self.datamatrix[3], self.datamatrix[2]])
...@@ -181,7 +243,6 @@ class F500:
self.investigation.measurements = pandas.DataFrame()
self.investigation.plots = set()
# Create study, title comes from the datamatrix file (ID...)
self.investigation.studies.append(Study())
if self.studyName != None:
...@@ -200,6 +261,9 @@ class F500:
def writeISAJSON(self):
"""
Writes the ISA investigation object to a JSON file.
"""
jsonOutput = open(self.args.json, "w")
jsonOutput.write(json.dumps(self.investigation, cls=ISAJSONEncoder, sort_keys=True, indent=4, separators=(',', ': ')))
jsonOutput.close()
...@@ -207,6 +271,17 @@ class F500:
@staticmethod
def copyPots(row, pots, f500):
"""
Copies pot information from a reference DataFrame to a row.
Args:
row (Series): A row from a DataFrame.
pots (DataFrame): A DataFrame containing pot information.
f500 (F500): An instance of the F500 class.
Returns:
Series: The modified row with pot information.
"""
try:
row["Pot"] = pots[ (pots["x"] == row["x"]) & (pots["y"] == row["y"]) ]["Pot"].iloc[0]
if "Treatment" in pots.columns:
...@@ -219,6 +294,9 @@ class F500:
return row
def measurementsToFile(self):
"""
Writes the measurements DataFrame to a file.
"""
path = "/".join([self.investigationPath, self.investigation.title, self.investigation.studies[0].title]) path = "/".join([self.investigationPath, self.investigation.title, self.investigation.studies[0].title])
filename = "derived/" + self.investigation.studies[0].title + ".csv" filename = "derived/" + self.investigation.studies[0].title + ".csv"
os.makedirs(path + "/derived", exist_ok=True) os.makedirs(path + "/derived", exist_ok=True)
...@@ -226,13 +304,31 @@ class F500: ...@@ -226,13 +304,31 @@ class F500:
@staticmethod @staticmethod
def rawMeasurementsToFile(path, filename, measurements): def rawMeasurementsToFile(path, filename, measurements):
"""
Writes raw measurements to a file.
Args:
path (str): The directory path to save the file.
filename (str): The name of the file.
measurements (list): A list of measurements to write.
"""
os.makedirs(path + "/derived", exist_ok=True)
df = pandas.DataFrame(measurements)
df = df.transpose()
df.to_csv(path + "/" + filename, sep=";", index=False)
@staticmethod
def addPointClouds(row, title):
"""
Adds point cloud file names to a row.
Args:
row (Series): A row from a DataFrame.
title (str): The title to use in the file names.
Returns:
Series: The modified row with point cloud file names.
"""
filename = "{}_{}_full_sx{:03d}_sy{:03d}.ply.gz".format( filename = "{}_{}_full_sx{:03d}_sy{:03d}.ply.gz".format(
title, row["timestamp_file"], title, row["timestamp_file"],
int(row["x"]), int(row["x"]),
...@@ -259,6 +355,14 @@ class F500: ...@@ -259,6 +355,14 @@ class F500:
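# Example (illustrative values): with title="EXP1_A_2024", timestamp_file="2024-05-01_10-00-00",
# x=3 and y=12, the format string above yields
# "EXP1_A_2024_2024-05-01_10-00-00_full_sx003_sy012.ply.gz".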
@staticmethod
def copyPointcloudFile(row, f500, fullPath):
"""
Copies point cloud files to a specified location.
Args:
row (Series): A row from a DataFrame.
f500 (F500): An instance of the F500 class.
fullPath (str): The destination path for the point cloud files.
"""
if f500.args.copyPointcloud == "True":
AB = f500.root.split("/")[-1]
pointcloudPath = "/".join(f500.root.split("/")[:-3]) + "/current/" + AB +'/I/'
...@@ -299,6 +403,15 @@ class F500:
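# Example (hypothetical layout): if f500.root is "/data/exp1/raw/day1/A", then AB is "A" and
# pointcloudPath becomes "/data/exp1/current/A/I/", i.e. the last three path components of
# root are replaced by "current/<AB>/I/".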
@staticmethod
def copyPlotPointcloudFile(row, f500, fullPath, title):
"""
Copies plot point cloud files to a specified location.
Args:
row (Series): A row from a DataFrame.
f500 (F500): An instance of the F500 class.
fullPath (str): The destination path for the plot point cloud files.
title (str): The title to use in the file names.
"""
if f500.args.copyPointcloud == "True":
f500.logger.warn("The copy plot point cloud will copy a lot of data. However, users are generally not interested in these plot files.")
...@@ -359,6 +472,20 @@ class F500:
@staticmethod
def createSample(samples, name, source, organism, taxon, term_source):
"""
Creates a sample object if it doesn't already exist.
Args:
samples (dict): A dictionary to store sample objects.
name (str): The name of the sample.
source (Source): The source object for the sample.
organism (str): The organism name.
taxon (str): The taxon ID.
term_source (OntologySourceReference): The ontology source reference.
Returns:
Sample: The created or existing sample object.
"""
if str(name) not in samples:
sample = Sample(name=str(name), derives_from=[source])
characteristic_organism = Characteristic(category=OntologyAnnotation(term="Organism"),
...@@ -372,6 +499,15 @@ class F500:
@staticmethod
def createAssay(row, f500, path, source):
"""
Creates an assay object and adds it to the investigation.
Args:
row (Series): A row from a DataFrame.
f500 (F500): An instance of the F500 class.
path (str): The directory path for the assay.
source (Source): The source object for the assay.
"""
assay = Assay()
assay.title = row["timestamp_file"]
assay.filename = row["timestamp_file"]
...@@ -451,6 +587,16 @@ class F500:
@staticmethod
def createAssayPlot(row, f500, path, source, title):
"""
Creates an assay plot object and adds it to the investigation.
Args:
row (Series): A row from a DataFrame.
f500 (F500): An instance of the F500 class.
path (str): The directory path for the assay plot.
source (Source): The source object for the assay plot.
title (str): The title to use in the file names.
"""
assay = Assay()
assay.title = row["timestamp_file"]
assay.filename = row["timestamp_file"]
...@@ -484,6 +630,16 @@ class F500:
f500.investigation.studies[0].assays.append(assay)
def correctDataMatrix(row, pots):
"""
Corrects the 'DataMatrix' column in a row based on a reference DataFrame.
Args:
row (Series): A row from a DataFrame.
pots (DataFrame): A DataFrame containing pot information.
Returns:
Series: The modified row with corrected 'DataMatrix' column.
"""
result = pots.loc[(pots['x'] == row["x"]) & (pots['y'] == row['y']), 'Pot']
# Access the result
...@@ -493,6 +649,12 @@ class F500:
return row
def finalize(self, title):
"""
Finalizes the processing of measurements and creates assays.
Args:
title (str): The title to use in the file names.
"""
# CSV will be combined data file (with corrected pot names) and ply file names
# Do this, if the data matrix contains pot names (otherwise it either went wrong or data is from a different project)
# Then list the ply files as Image File
...@@ -533,9 +695,21 @@ class F500:
self.logger.info("No pots in main measurement file")
def getDirectoryListing(self, rootFolder):
"""
Returns a directory listing for a given root folder.
Args:
rootFolder (str): The root folder to list.
Returns:
generator: A generator yielding directory listings.
"""
return os.walk(rootFolder)
def restructure(self):
"""
Restructures the raw data into an ISA-compliant format.
"""
self.source = Source(name=self.args.source)
self.sourceContainer = Source(name=self.args.sourceContainer)
self.datamatrix = os.path.basename(self.args.datamatrix_file).split(".")[0].split("_")
...@@ -618,6 +792,9 @@ class F500:
def processPointclouds(self):
"""
Processes point cloud files and generates derived data.
"""
from PointCloud import PointCloud
self.logger.info("Reading project ISA {}".format(self.args.json))
...@@ -691,6 +868,9 @@ class F500:
def combineHistograms(self):
"""
Combines histogram data from multiple assays into a single file.
"""
self.logger.info("Reading project ISA {}".format(self.args.json)) self.logger.info("Reading project ISA {}".format(self.args.json))
self.logger.info("Creating combined histogram of {}".format(self.args.histogram)) self.logger.info("Creating combined histogram of {}".format(self.args.histogram))
self.investigation = isajson.load(open(self.args.json, "r")) self.investigation = isajson.load(open(self.args.json, "r"))
...@@ -728,10 +908,11 @@ class F500: ...@@ -728,10 +908,11 @@ class F500:
self.logger.warning("Could not combine data for {}, exception: {}".format(hLabel, e)) self.logger.warning("Could not combine data for {}, exception: {}".format(hLabel, e))
def upload(self): def upload(self):
"""
Uploads the processed data to a specified platform.
"""
self.logger.info("Reading project ISA {}".format(self.args.json)) self.logger.info("Reading project ISA {}".format(self.args.json))
self.logger.info("Uploading data to {}".format(self.args.URL)) self.logger.info("Uploading data to {}".format(self.args.URL))
self.investigation = isajson.load(open(self.args.json, "r")) self.investigation = isajson.load(open(self.args.json, "r"))
fairdom = Fairdom(self.investigation, self.args, self.logger) fairdom = Fairdom(self.investigation, self.args, self.logger)
fairdom.upload() fairdom.upload()
from azure.storage.blob import BlobServiceClient
from F500 import F500
import os
import json
import pandas
import shutil
"""
This script provides functionality to interact with Azure Blob Storage for managing
and processing data related to plant imaging experiments. It extends the F500 class
to include methods for initializing Azure connections, transferring data, and handling
experiment metadata.
"""
class F500Azure(F500):
"""
A class to manage Azure Blob Storage interactions for plant imaging experiments.
This class extends the F500 class and provides additional methods to initialize
Azure connections, transfer data between source and target containers, and handle
experiment metadata.
"""
def __init__(self, experimentID):
"""
Initialize the F500Azure class with a specific experiment ID.
Args:
experimentID (str): The unique identifier for the experiment.
"""
super().__init__()
self.experimentID = experimentID
def initAzure(self, environment, metadata, logger):
"""
Initialize Azure-related settings and metadata for the experiment.
Args:
environment (dict): A dictionary containing environment-specific settings.
metadata (dict): Metadata related to the experiment.
logger (Logger): Logger instance for logging information.
Side Effects:
Sets various attributes related to the experiment and Azure configuration.
"""
self.logger = logger
self.args.technologyType = "Imaging"
self.args.technologyPlatform = "PlantEye"
self.args.sampleType = "Pot"
self.args.sampleTypeContainer = "Plot"
self.args.source = "Plant"
self.args.sourceContainer = "Plot"
self.args.copyPointcloud = "True"
self.args.investigationPath = str(self.experimentID)
self.args.datamatrix_file = environment["datamatrix_file"]
self.args.json = environment["json"]
self.args.organism = environment["organism"]
self.args.taxon = environment["taxon"]
self.args.start = environment["start"]
self.metadata = metadata
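# Example (hypothetical values), matching the keys read above:
#   environment = {"datamatrix_file": "ID_..._datamatrix.csv", "json": "investigation.json",
#                  "organism": "<organism or SEEK organism id>", "taxon": "<taxon id>",
#                  "start": "<start date>"}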
def connectToSource(self, sourceConnectionString, sourceContainerName, sourceBlobName):
"""
Connect to the source Azure Blob Storage container.
Args:
sourceConnectionString (str): Connection string for the source Azure Blob Storage.
sourceContainerName (str): Name of the source container.
sourceBlobName (str): Name of the source blob.
Side Effects:
Initializes the source blob service and container clients.
"""
self.sourceConnectionString = sourceConnectionString
self.sourceContainerName = sourceContainerName
self.sourceBlobName = sourceBlobName
self.sourceBlobServiceClient = BlobServiceClient.from_connection_string(sourceConnectionString)
self.sourceContainerClient = self.sourceBlobServiceClient.get_container_client(sourceContainerName)
def connectToTarget(self, targetConnectionString, targetContainerName, targetBlobName):
"""
Connect to the target Azure Blob Storage container.
Args:
targetConnectionString (str): Connection string for the target Azure Blob Storage.
targetContainerName (str): Name of the target container.
targetBlobName (str): Name of the target blob.
Side Effects:
Initializes the target blob service and container clients.
"""
self.targetConnectionString = targetConnectionString
self.targetContainerName = targetContainerName
self.targetBlobName = targetBlobName
...@@ -43,18 +97,41 @@ class F500Azure (F500):
self.targetContainerClient = self.targetBlobServiceClient.get_container_client(targetContainerName)
def writeISAJSON(self):
"""
Write the investigation data to a JSON file.
Side Effects:
Creates a JSON file with the investigation data.
"""
jsonOutput = open(self.args.json, "w")
jsonOutput.write(json.dumps(self.investigation, cls=ISAJSONEncoder, sort_keys=True, indent=4, separators=(',', ': ')))
jsonOutput.close()
def measurementsToFile(self):
"""
Write the measurements data to a CSV file.
Side Effects:
Creates directories and a CSV file with the measurements data.
"""
path = "/".join([self.investigationPath, self.investigation.title, self.investigation.studies[0].title])
filename = "derived/" + self.investigation.studies[0].title + ".csv" filename = "derived/" + self.investigation.studies[0].title + ".csv"
os.makedirs(path + "/derived", exist_ok=True) os.makedirs(path + "/derived", exist_ok=True)
self.investigation.measurements.to_csv(path + "/" + filename, sep=";") self.investigation.measurements.to_csv(path + "/" + filename, sep=";")
@staticmethod @staticmethod
def rawMeasurementsToFile(path, filename, measurements): def rawMeasurementsToFile(path, filename, measurements):
"""
Write raw measurements data to a CSV file.
Args:
path (str): The directory path where the file will be saved.
filename (str): The name of the file.
measurements (dict): The measurements data to be written.
Side Effects:
Creates directories and a CSV file with the raw measurements data.
"""
os.makedirs(path + "/derived", exist_ok=True)
df = pandas.DataFrame(measurements)
df = df.transpose()
...@@ -62,24 +139,46 @@ class F500Azure (F500):
@staticmethod
def copyPointcloudFile(row, f500, fullPath):
"""
Copy pointcloud files to a specified directory.
Args:
row (dict): A dictionary containing pointcloud file names.
f500 (F500): An instance of the F500 class.
fullPath (str): The destination directory path.
Side Effects:
Copies pointcloud files to the specified directory.
Exceptions:
Raises an exception if file copying fails.
"""
if f500.args.copyPointcloud == "True":
AB = f500.root.split("/")[-1]
pointcloudPath = "/".join(f500.root.split("/")[:-3]) + "/current/" + AB + '/I/'
f500.logger.info("Copying pointclouds from {}{} to {}".format(pointcloudPath, [row["pointcloud_full"], row["pointcloud_mr"], row["pointcloud_sl"], row["pointcloud_mg"]], fullPath))
try:
os.makedirs(fullPath, exist_ok=True)
shutil.copy(pointcloudPath + row["pointcloud_full"], fullPath)
shutil.copy(pointcloudPath + row["pointcloud_mr"], fullPath)
shutil.copy(pointcloudPath + row["pointcloud_sl"], fullPath)
shutil.copy(pointcloudPath + row["pointcloud_mg"], fullPath)
except Exception as e:
f500.logger.warn("Exception in copying files:\n{}".format(e))
# if f500.args.loglevel == "DEBUG":
# raise e
else:
f500.logger.info("Skipping copy (defined in command line)")
@staticmethod
def getDirectoryListing(rootFolder):
"""
Get a directory listing for the specified root folder.
Args:
rootFolder (str): The root folder path.
Returns:
generator: A generator yielding directory paths, directory names, and file names.
"""
return os.walk(rootFolder)
"""
This script is designed to interact with the FAIRDOM platform to create and manage investigations, studies, assays, samples, and data files.
It uses the ISA-Tools library to handle ISA-JSON data structures and the requests library to communicate with the FAIRDOM API.
The script is intended to facilitate the upload of structured experimental data to the FAIRDOM repository.
Classes:
Fairdom: Handles the creation and management of investigations, studies, assays, samples, and data files in FAIRDOM.
Functions:
__init__: Initializes the Fairdom class with investigation data, arguments, and a logger.
createInvestigationJSON: Creates a JSON structure for an investigation.
createStudyJSON: Creates a JSON structure for a study.
createAssayJSON: Creates a JSON structure for an assay.
createDataFileJSON: Creates a JSON structure for a data file.
addSampleToAssayJSON: Adds a sample to an assay JSON structure.
addDataFileToAssayJSON: Adds a data file to an assay JSON structure.
addDataFilesToSampleJSON: Adds data files from an assay to a sample JSON structure.
createSampleJSON: Creates a JSON structure for a sample.
upload: Uploads the investigation, studies, assays, samples, and data files to FAIRDOM.
Note: The script assumes that the user has a valid token for authentication with the FAIRDOM API.
"""
from isatools.isajson import ISAJSONEncoder
import isatools
from isatools.model import *
...@@ -10,22 +33,51 @@ import time
class Fairdom:
"""
A class to manage the creation and upload of investigations, studies, assays, samples, and data files to the FAIRDOM platform.
Attributes:
investigation: An ISA-Tools investigation object containing the data to be uploaded.
args: Command-line arguments or configuration settings for the upload process.
logger: A logging object to record the process of uploading data.
session: A requests session object configured with headers for authentication with the FAIRDOM API.
"""
def __init__(self, investigation, args, logger):
"""
Initializes the Fairdom class with the given investigation, arguments, and logger.
Args:
investigation: An ISA-Tools investigation object.
args: An object containing command-line arguments or configuration settings.
logger: A logging object for recording the upload process.
Side Effects:
Updates the session headers with authentication information.
"""
self.investigation = investigation
self.args = args
self.args.project = int(self.args.project)
self.args.organism = int(self.args.organism)
self.logger = logger
headers = {
"Content-type": "application/vnd.api+json",
"Accept": "application/vnd.api+json",
"Accept-Charset": "ISO-8859-1",
"Authorization": "Token {}".format(self.args.token)
}
self.session = requests.Session()
self.session.headers.update(headers)
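# Typical use (as in F500.upload): load an ISA-JSON investigation and push it to a SEEK server,
# e.g. (paths and argument values are illustrative)
#   investigation = isajson.load(open("investigation.json", "r"))
#   Fairdom(investigation, args, logger).upload()
# where args provides URL, token, project, organism and sample_type.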
def createInvestigationJSON(self):
"""
Creates a JSON structure for an investigation.
Returns:
A dictionary representing the JSON structure of the investigation.
"""
investigationJSON = {}
investigationJSON['data'] = {}
investigationJSON['data']['type'] = 'investigations'
...@@ -34,88 +86,156 @@ class Fairdom:
investigationJSON['data']['attributes']['description'] = "PlantEye data from NPEC"
investigationJSON['data']['relationships'] = {}
investigationJSON['data']['relationships']['projects'] = {}
investigationJSON['data']['relationships']['projects']['data'] = [{'id': str(self.args.project), 'type': 'projects'}]
return investigationJSON
def createStudyJSON(self, study, investigationID):
"""
Creates a JSON structure for a study.
Args:
study: An ISA-Tools study object.
investigationID: The ID of the investigation to which the study belongs.
Returns:
A dictionary representing the JSON structure of the study.
"""
studyJSON = {}
studyJSON['data'] = {}
studyJSON['data']['type'] = 'studies'
studyJSON['data']['attributes'] = {}
studyJSON['data']['attributes']['title'] = study.name
studyJSON['data']['attributes']['description'] = "F500 pot data"
studyJSON['data']['relationships'] = {}
studyJSON['data']['relationships']['investigation'] = {}
studyJSON['data']['relationships']['investigation']['data'] = {'id': str(investigationID), 'type': 'investigations'}
return studyJSON
def createAssayJSON(self, assay, studyID):
"""
Creates a JSON structure for an assay.
Args:
assay: An ISA-Tools assay object.
studyID: The ID of the study to which the assay belongs.
Returns:
A dictionary representing the JSON structure of the assay.
"""
assayJSON = {}
assayJSON['data'] = {}
assayJSON['data']['type'] = 'assays'
assayJSON['data']['attributes'] = {}
assayJSON['data']['attributes']['title'] = assay.filename
assayJSON['data']['attributes']['description'] = 'NPEC F500 measurement assay'
assayJSON['data']['attributes']['assay_class'] = {'key': 'EXP'}
assayJSON['data']['attributes']['assay_type'] = {'uri': "http://jermontology.org/ontology/JERMOntology#Metabolomics"}
assayJSON['data']['relationships'] = {}
assayJSON['data']['relationships']['study'] = {}
assayJSON['data']['relationships']['study']['data'] = {'id': str(studyID), 'type': 'studies'}
assayJSON['data']['relationships']['organisms'] = {}
assayJSON['data']['relationships']['organisms']['data'] = [{'id': str(self.args.organism), 'type': 'organisms'}]
return assayJSON
def createDataFileJSON(self, data_file):
"""
Creates a JSON structure for a data file.
Args:
data_file: An object representing a data file.
Returns:
A dictionary representing the JSON structure of the data file.
"""
data_fileJSON = {}
data_fileJSON['data'] = {}
data_fileJSON['data']['type'] = 'data_files'
data_fileJSON['data']['attributes'] = {}
data_fileJSON['data']['attributes']['title'] = data_file.filename
data_fileJSON['data']['attributes']['content_blobs'] = [{
'url': 'https://www.wur.nl/upload/854757ab-168f-46d7-b415-f8b501eebaa5_WUR_RGB_standard_2021-site.svg',
'original_filename': data_file.filename,
'content-type': 'image/svg+xml'
}]
data_fileJSON['data']['relationships'] = {}
data_fileJSON['data']['relationships']['projects'] = {}
data_fileJSON['data']['relationships']['projects']['data'] = [{'id': str(self.args.project), 'type': 'projects'}]
return data_fileJSON
def addSampleToAssayJSON(self, sampleID, assayJSON):
"""
Adds a sample to an assay JSON structure.
Args:
sampleID: The ID of the sample to be added.
assayJSON: The JSON structure of the assay to which the sample will be added.
"""
if 'samples' not in assayJSON['data']['relationships']:
assayJSON['data']['relationships']['samples'] = {}
assayJSON['data']['relationships']['samples']['data'] = []
assayJSON['data']['relationships']['samples']['data'].append({'id': str(sampleID), 'type': 'samples'})
def addDataFileToAssayJSON(self, data_fileID, assayJSON):
"""
Adds a data file to an assay JSON structure.
Args:
data_fileID: The ID of the data file to be added.
assayJSON: The JSON structure of the assay to which the data file will be added.
"""
if 'data_files' not in assayJSON['data']['relationships']:
assayJSON['data']['relationships']['data_files'] = {}
assayJSON['data']['relationships']['data_files']['data'] = []
assayJSON['data']['relationships']['data_files']['data'].append({'id': str(data_fileID), 'type': 'data_files'})
def addDataFilesToSampleJSON(self, assayJSON, sampleJSON):
"""
Adds data files from an assay to a sample JSON structure.
Args:
assayJSON: The JSON structure of the assay containing the data files.
sampleJSON: The JSON structure of the sample to which the data files will be added.
"""
if 'data_files' not in sampleJSON['data']['relationships']:
sampleJSON['data']['relationships']['data_files'] = {}
sampleJSON['data']['relationships']['data_files']['data'] = []
if 'data_files' in assayJSON['data']['relationships']:
sampleJSON['data']['relationships']['data_files']['data'].extend(assayJSON['data']['relationships']['data_files']['data'])
def createSampleJSON(self, sample):
"""
Creates a JSON structure for a sample.
Args:
sample: An ISA-Tools sample object.
Returns:
A dictionary representing the JSON structure of the sample.
"""
sampleJSON = {}
sampleJSON['data'] = {}
sampleJSON['data']['type'] = 'samples'
sampleJSON['data']['attributes'] = {}
sampleJSON['data']['attributes']['title'] = sample.name
sampleJSON['data']['attributes']['attribute_map'] = {'PotID': sample.name}
sampleJSON['data']['relationships'] = {}
sampleJSON['data']['relationships']['projects'] = {}
sampleJSON['data']['relationships']['projects']['data'] = [{'id': str(self.args.project), 'type': 'projects'}]
sampleJSON['data']['relationships']['sample_type'] = {}
sampleJSON['data']['relationships']['sample_type']['data'] = {'id': str(self.args.sample_type), 'type': 'sample_types'}
return sampleJSON
def upload(self):
"""
Uploads the investigation, studies, assays, samples, and data files to the FAIRDOM platform.
Side Effects:
Communicates with the FAIRDOM API to create and upload data structures.
Logs the process and any errors encountered.
Raises:
SystemExit: If an error occurs during the upload process that prevents continuation.
"""
# create investigation
investigationJSON = self.createInvestigationJSON()
self.logger.info("Creating investigation in FAIRDOM at {}".format(self.args.URL))
...@@ -123,7 +243,7 @@ class Fairdom:
if r.status_code == 201 or r.status_code == 200:
investigationID = r.json()['data']['id']
self.logger.info("Investigation id {} created. Status: {}".format(investigationID, r.status_code))
else:
self.logger.error("Could not create new investigation, error code {}".format(r.status_code))
exit(1)
...@@ -147,7 +267,7 @@ class Fairdom:
studyID = r.json()['data']['id']
self.currentStudies[sample.name]["id"] = studyID
self.logger.info("Study id {} with ({}) created. Status: {}".format(studyID, sample.name, r.status_code))
else:
self.logger.error("Could not create new study, error code {}".format(r.status_code))
exit(1)
...@@ -155,13 +275,13 @@ class Fairdom:
assayJSON = self.createAssayJSON(assay, studyJSON['id'])
# create and add data files
for data_file in assay.data_files:
if "derived" in data_file.filename or ".ply.gz" in data_file.filename or "ndvi" in data_file.filename: # for now, only upload phenotypic data
data_fileJSON = self.createDataFileJSON(data_file)
r = self.session.post(self.args.URL + '/data_files', json=data_fileJSON)
if r.status_code == 201 or r.status_code == 200:
data_fileID = r.json()['data']['id']
self.logger.info("Data file id {} created ({}). Status: {}".format(data_fileID, data_file.filename, r.status_code))
else:
self.logger.error("Could not create new data file, error code {}".format(r.status_code))
exit(1)
data_fileJSON['id'] = data_fileID
...@@ -174,20 +294,20 @@ class Fairdom:
sampleID = r.json()['data']['id']
self.samples[sample.name]['id'] = sampleID
self.logger.info("Sample id {} created ({}). Status: {}".format(sampleID, sample.name, r.status_code))
else:
self.logger.error("Could not create new sample, error code {}".format(r.status_code))
if r.status_code == 422:
self.logger.info(self.samples[sample.name])
self.logger.info(r.json())
exit(1)
sampleID = self.samples[sample.name]['id']
self.addSampleToAssayJSON(sampleID, assayJSON)
sampleJSON = self.samples[sample.name]
r = self.session.post(self.args.URL + '/assays', json=assayJSON)
if r.status_code == 201 or r.status_code == 200:
assayID = r.json()['data']['id']
self.logger.info("Assay id {} created. Status: {}".format(assayID, r.status_code))
else:
self.logger.error("Could not create new assay, error code {}".format(r.status_code))
if r.status_code == 422:
self.logger.info(assayJSON)
...@@ -198,8 +318,4 @@ class Fairdom:
r = self.session.post(self.args.URL + '/assays', json=assayJSON)
r.raise_for_status()
else:
exit(1)
...@@ -2,52 +2,114 @@ import open3d as o3d
import numpy
import os
"""
This script provides a class for handling point cloud data using the Open3D library.
It includes functionalities for reading point cloud data from a file, calculating various
spectral indices, trimming the point cloud based on z-values, and rendering images of the
point cloud with or without color rescaling.
"""
class PointCloud:
"""
A class to represent and manipulate a point cloud using Open3D.
Attributes:
----------
pcd : open3d.geometry.PointCloud
The point cloud data.
trimmed : bool
A flag indicating whether the point cloud has been trimmed.
"""
pcd = None
trimmed = False
def __init__(self, filename):
"""
Initializes the PointCloud object by reading point cloud data from a file.
Parameters:
----------
filename : str
The path to the point cloud file in PLY format.
"""
self.pcd = o3d.io.read_point_cloud(filename, format="ply")
self.trimmed = False
def writeHistogram(self, data, filename, timepoint, sampleName, bins, dataRange=None):
"""
Writes a histogram of the given data to a file.
Parameters:
----------
data : numpy.ndarray
The data for which the histogram is to be calculated.
filename : str
The path to the file where the histogram will be written.
timepoint : str
The timepoint associated with the data.
sampleName : str
The name of the sample.
bins : int
The number of bins for the histogram.
dataRange : tuple, optional
The lower and upper range of the bins. If not provided, range is (data.min(), data.max()).
Side Effects:
------------
Writes the histogram data to the specified file.
"""
data = data[numpy.isfinite(data)]
hist, bin_edges = numpy.histogram(data, bins=bins, range=dataRange)
with open(filename, "w") as f:
f.write("timepoint;sample;{}\n".format(";".join(["bin" + str(x) for x in range(0, len(bin_edges))])))
f.write("{};{};{}\n".format(timepoint, "edges", ";".join([str(x) for x in bin_edges])))
f.write("{};{};{}\n".format(timepoint, sampleName, ";".join([str(x) for x in hist])))
def getWavelengths(self):
"""
Retrieves the wavelengths from the point cloud.
Returns:
-------
numpy.ndarray
The wavelengths as a numpy array. If the point cloud is trimmed, returns a vertically stacked array.
"""
if self.trimmed:
return numpy.vstack(self.pcd.wavelengths)
else:
return numpy.asarray(self.pcd.wavelengths)
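# Note (derived from the index methods below): the wavelength columns are assumed to be
# ordered RED, GREEN, BLUE, NIR (columns 0-3), as used by get_psri, get_hue, get_ndvi and get_npci.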
def get_psri(self):
"""
Calculates the Plant Senescence Reflectance Index (PSRI).
Returns:
-------
numpy.ndarray
The PSRI values calculated as (RED - GREEN) / NIR.
"""
numpy.seterr(divide='ignore', invalid='ignore')
wavelengths = self.getWavelengths()
red = wavelengths[:, 0]
green = wavelengths[:, 1]
nir = wavelengths[:, 3]
return ((red - green) / nir)
def get_hue(self):
"""
Calculates the hue from the RGB wavelengths.
Returns:
-------
numpy.ndarray
The hue values calculated from the RGB wavelengths.
"""
numpy.seterr(divide='ignore', invalid='ignore')
wavelengths = self.getWavelengths()
red = wavelengths[:, 0]
green = wavelengths[:, 1]
blue = wavelengths[:, 2]
hue = numpy.zeros(len(red))
for c in range(len(hue)):
minColor = min([red[c], green[c], blue[c]])
...@@ -59,137 +121,167 @@ class PointCloud:
hue[c] = 2.0 + (blue[c] - red[c]) / (maxColor - minColor)
else:
hue[c] = 4.0 + (red[c] - green[c]) / (maxColor - minColor)
hue[c] = hue[c] * 60.0
if hue[c] < 0:
hue[c] = hue[c] + 360.0
return hue
def get_greenness(self):
numpy.seterr(divide='ignore',invalid='ignore') """
wavelengths=self.getWavelengths() Calculates the greenness index.
# (2*G-R-B)/(2*R+G+B)
#print(wavelengths) Returns:
return ((2.0*wavelengths[:,1]-wavelengths[:,0] -wavelengths[:,2]) / -------
(2.0*wavelengths[:,1]+wavelengths[:,0] +wavelengths[:,2])) numpy.ndarray
The greenness values calculated as (2*G - R - B) / (2*R + G + B).
"""
numpy.seterr(divide='ignore', invalid='ignore')
wavelengths = self.getWavelengths()
return ((2.0 * wavelengths[:, 1] - wavelengths[:, 0] - wavelengths[:, 2]) /
(2.0 * wavelengths[:, 1] + wavelengths[:, 0] + wavelengths[:, 2]))
    def get_ndvi(self):
        """
        Calculates the Normalized Difference Vegetation Index (NDVI).

        Returns:
        -------
        numpy.ndarray
            The NDVI values calculated as (NIR - RED) / (NIR + RED).
        """
        numpy.seterr(divide='ignore', invalid='ignore')
        wavelengths = self.getWavelengths()
        return (wavelengths[:, 3] - wavelengths[:, 0]) / (wavelengths[:, 3] + wavelengths[:, 0])
    def get_npci(self):
        """
        Calculates the Normalized Pigment Chlorophyll Index (NPCI).

        Returns:
        -------
        numpy.ndarray
            The NPCI values calculated as (RED - BLUE) / (RED + BLUE).
        """
        numpy.seterr(divide='ignore', invalid='ignore')
        wavelengths = self.getWavelengths()
        return ((wavelengths[:, 0] - wavelengths[:, 2]) / (wavelengths[:, 0] + wavelengths[:, 2]))
    def setColors(self, colors):
        """
        Sets the colors of the point cloud.

        Parameters:
        ----------
        colors : numpy.ndarray
            The colors to be set for the point cloud.
        """
        self.pcd.colors = o3d.utility.Vector3dVector(colors)
    def render_image(self, filename, image_width, image_height, rescale=True):
        """
        Renders an image of the point cloud.

        Parameters:
        ----------
        filename : str
            The path to the file where the image will be saved.
        image_width : int
            The width of the image.
        image_height : int
            The height of the image.
        rescale : bool, optional
            Whether to rescale the colors before rendering. Default is True.
        """
        if rescale:
            self.render_image_rescale(filename, image_width, image_height)
        else:
            self.render_image_no_rescale(filename, image_width, image_height)
    def trim(self, zIndex):
        """
        Trims the point cloud based on the z-values.

        Parameters:
        ----------
        zIndex : float
            The z-value threshold for trimming the point cloud.

        Side Effects:
        ------------
        Modifies the point cloud to only include points with z-values greater than or equal to zIndex.
        """
        if zIndex == 0:
            return
        self.untrimmedPCD = self.pcd
        points = numpy.asarray(self.pcd.points)
        mask = points[:, 2] >= zIndex
        filtered_points = points[mask]
        filtered_pcd = o3d.geometry.PointCloud()
        filtered_pcd.points = o3d.utility.Vector3dVector(filtered_points)
        if self.pcd.has_colors():
            colors = numpy.asarray(self.pcd.colors)
            filtered_colors = colors[mask]
            filtered_pcd.colors = o3d.utility.Vector3dVector(filtered_colors)
        wavelengths = numpy.asarray(self.pcd.wavelengths)
        filtered_wavelengths = wavelengths[mask]
        filtered_pcd.wavelengths = filtered_wavelengths
        self.pcd = filtered_pcd
        self.trimmed = True
    def render_image_no_rescale(self, filename, image_width, image_height):
        """
        Renders an image of the point cloud without rescaling the colors.

        Parameters:
        ----------
        filename : str
            The path to the file where the image will be saved.
        image_width : int
            The width of the image.
        image_height : int
            The height of the image.

        Side Effects:
        ------------
        Saves the rendered image to the specified file.
        """
        vis = o3d.visualization.Visualizer()
        vis.create_window(width=image_width, height=image_height, visible=False)
        vis.add_geometry(self.pcd)
        vis.update_geometry(self.pcd)
        vis.capture_screen_image(filename, do_render=True)
        vis.destroy_window()
    def render_image_rescale(self, filename, image_width, image_height):
        """
        Renders an image of the point cloud with rescaled colors.

        Parameters:
        ----------
        filename : str
            The path to the file where the image will be saved.
        image_width : int
            The width of the image.
        image_height : int
            The height of the image.

        Side Effects:
        ------------
        Saves the rendered image to the specified file.
        """
        colors = self.getWavelengths()
        colors = colors[:, :3]
        p01 = numpy.percentile(colors, 1)
        p99 = numpy.percentile(colors, 99)
        scaled_colors = ((colors - p01) / (p99 - p01) * 255)
        scaled_colors = numpy.clip(scaled_colors, 0, 255).astype(numpy.uint8)
        scaled_colors = scaled_colors.astype(numpy.float64) / 255
        if len(scaled_colors.shape) < 2:
            scaled_colors = numpy.reshape(scaled_colors, (-1, 3))
        self.pcd.colors = o3d.utility.Vector3dVector(scaled_colors)
        self.render_image_no_rescale(filename, image_width, image_height)
"""
This script is designed to delete various types of resources from a specified host using the Fairdom SEEK API.
It utilizes the requests library to send HTTP DELETE requests to remove data files, samples, assays, studies,
and investigations from the server. The script requires an authorization token to authenticate the requests.
Usage:
python script_name.py <token>
Where <token> is the authorization token required for accessing the API.
Note: This script performs destructive actions by deleting resources. Use with caution.
"""
import requests import requests
import sys import sys
token = sys.argv[1] def main():
"""Main function to execute the deletion of resources.
This function sets up the session with the necessary headers and iterates over predefined ranges
to delete resources from the server. It deletes data files, samples, assays, studies, and investigations.
Raises:
requests.exceptions.RequestException: If a network-related error occurs during the requests.
"""
token = sys.argv[1]
headers = {
"Content-type": "application/vnd.api+json",
"Accept": "application/vnd.api+json",
"Accept-Charset": "ISO-8859-1",
"Authorization": "Token {}".format(token)
}
session = requests.Session()
session.headers.update(headers)
r = 32000
host = "https://test.fairdom-seek.bif.containers.wurnet.nl/"
headers = {"Content-type": "application/vnd.api+json", # Delete data files
"Accept": "application/vnd.api+json", for i in range(1000, r):
"Accept-Charset": "ISO-8859-1", session.delete(host + "data_files/{}".format(i))
"Authorization": "Token {}".format(token)}
session = requests.Session() # Delete samples
session.headers.update(headers) for i in range(0, 500):
r = 32000 session.delete(host + "samples/{}".format(i))
host = "https://test.fairdom-seek.bif.containers.wurnet.nl/"
for i in range(1000,r):
session.delete(host + "data_files/{}".format(i))
for i in range(0,500):
session.delete(host + "samples/{}".format(i))
for i in range(0,1300): # Delete assays
session.delete(host + "assays/{}".format(i)) for i in range(0, 1300):
for i in range(0,50): session.delete(host + "assays/{}".format(i))
session.delete(host + "studies/{}".format(i))
for i in range(0,20): # Delete studies
session.delete(host + "investigations/{}".format(i)) for i in range(0, 50):
session.delete(host + "studies/{}".format(i))
# Delete investigations
for i in range(0, 20):
session.delete(host + "investigations/{}".format(i))
if __name__ == "__main__":
main()
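The loops above discard the HTTP responses, so failed deletions pass silently. A minimal sketch of a variant that reports anything other than success or a missing ID; the helper name and the decision to treat 404 as harmless are assumptions, not part of the script:

```python
import requests


def delete_range(session: requests.Session, host: str, resource: str, start: int, stop: int) -> None:
    """Send DELETE requests for a range of resource IDs and report unexpected failures."""
    for i in range(start, stop):
        response = session.delete(f"{host}{resource}/{i}")
        # A 404 only means the ID does not exist; other errors are worth surfacing.
        if not response.ok and response.status_code != 404:
            print(f"Could not delete {resource} {i}: HTTP {response.status_code}")
```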
""" """
SEEK FAIRDOM automatic upload This script automates the process of uploading data to the SEEK FAIRDOM platform. It reads metadata and measurement data from specified files, processes the data, and uploads it to the SEEK platform, creating the necessary structure of investigations, studies, and assays. The script is designed to work with PlantEye data from NPEC and assumes a specific data structure and naming convention.
The script requires the following command-line arguments:
1. Path to the datamatrix file.
2. Path to the investigation directory.
3. Paths to the CSV files containing measurement data.
The script uses the requests library to interact with the SEEK API and pandas for data manipulation.
""" """
import sys import sys
...@@ -13,12 +19,10 @@ import requests ...@@ -13,12 +19,10 @@ import requests
import json import json
import string import string
datamatrix_file = sys.argv[1]
investigationPath = sys.argv[2]
csvs = sys.argv[3:]

base_url = 'http://localhost:3000'

headers = {"Content-type": "application/vnd.api+json",
@@ -29,19 +33,8 @@ session = requests.Session()
session.headers.update(headers)
session.auth = ("capsicum.upload@wur.nl", "3#7B&GNC</yp2{k(")

containing_project_id = 2

columnsToDrop = ["ndvi_aver","ndvi_bin0","ndvi_bin1","ndvi_bin2","ndvi_bin3","ndvi_bin4","ndvi_bin5",
                 "greenness_aver","greenness_bin0","greenness_bin1","greenness_bin2","greenness_bin3","greenness_bin4","greenness_bin5",
                 "hue_aver","hue_bin0","hue_bin1","hue_bin2","hue_bin3","hue_bin4","hue_bin5",
@@ -51,15 +44,21 @@ columnsToDrop = ["ndvi_aver","ndvi_bin0","ndvi_bin1","ndvi_bin2","ndvi_bin3","nd

metadata = pandas.read_csv(datamatrix_file, sep=";")
def removeAfterSpaceFromDataMatrix(row):
    """
    Removes any text after a space in the 'DataMatrix' column of a row.

    Args:
        row (pandas.Series): A row from a DataFrame.

    Returns:
        pandas.Series: The modified row with updated 'DataMatrix' value.
    """
    row["DataMatrix"] = row["DataMatrix"].strip().split(" ")[0]
    return row

metadata = metadata.apply(removeAfterSpaceFromDataMatrix , axis=1)
datamatrix = os.path.basename(datamatrix_file).split(".")[0].split("_")

investigation = {}
investigation['data'] = {}
investigation['data']['type'] = 'investigations'
@@ -78,7 +77,6 @@ r = session.post(base_url + '/investigations', json=investigation)
investigation_id = r.json()['data']['id']
r.raise_for_status()

study = {}
study['data'] = {}
study['data']['type'] = 'studies'
@@ -94,10 +92,6 @@ study['data']['relationships']['investigation']['data'] = {'id' : investigation_
r = session.post(base_url + '/studies', json=study)
study_id = r.json()['data']['id']

os.makedirs("/".join([investigationPath, investigation['data']['attributes']['title']]), exist_ok=True)
metadata_csv = "/".join([investigationPath, investigation['data']['attributes']['title'], os.path.basename(datamatrix_file)])
metadata.to_csv(metadata_csv, sep="\t")
@@ -123,27 +117,28 @@ r.raise_for_status()
populated_data_file = r.json()

data_file_id = populated_data_file['data']['id']
data_file_url = populated_data_file['data']['links']['self']

blob_url = populated_data_file['data']['attributes']['content_blobs'][0]['link']

upload = session.put(blob_url, data=open(metadata_csv,"r").read(), headers={'Content-Type': 'application/octet-stream'})
upload.raise_for_status()

checkAssayName = re.compile(r"f[0-9]+")
measurements = pandas.DataFrame()
def copyPots(row, pots):
    """
    Copies pot information from a pots DataFrame to a row based on matching coordinates.

    Args:
        row (pandas.Series): A row from a DataFrame.
        pots (pandas.DataFrame): A DataFrame containing pot information.

    Returns:
        pandas.Series: The modified row with updated pot information.
    """
    row["Pot"] = pots[ (pots["x"] == row["x"]) & (pots["y"] == row["y"]) ]["Pot"].iloc[0]
    if "Treatment" in pots.columns:
        row["Treatment"] = pots[ (pots["x"] == row["x"]) & (pots["y"] == row["y"]) ]["Treatment"].iloc[0]
@@ -151,20 +146,55 @@ def copyPots(row, pots):
        row["Experiment"] = pots[ (pots["x"] == row["x"]) & (pots["y"] == row["y"]) ]["Experiment"].iloc[0]
    return row
def measurementsToFile(investigation, path, filename, measurements):
    """
    Saves the measurements DataFrame to a CSV file.

    Args:
        investigation (dict): The investigation dictionary.
        path (str): The directory path where the file will be saved.
        filename (str): The name of the file.
        measurements (pandas.DataFrame): The DataFrame containing measurements.

    Side Effects:
        Creates directories and writes a CSV file to the specified path.
    """
    os.makedirs(path + "/derived", exist_ok=True)
    measurements.to_csv(path + "/" + filename, sep=";")
def rawMeasurementsToFile(investigation, path, filename, measurements):
    """
    Saves the raw measurements to a CSV file.

    Args:
        investigation (dict): The investigation dictionary.
        path (str): The directory path where the file will be saved.
        filename (str): The name of the file.
        measurements (pandas.DataFrame): The DataFrame containing raw measurements.

    Returns:
        str: The full path to the saved file.

    Side Effects:
        Creates directories and writes a CSV file to the specified path.
    """
    os.makedirs(path + "/derived", exist_ok=True)
    df = pandas.DataFrame(measurements)
    df = df.transpose()
    df.to_csv(path + "/" + filename, sep="\t")
    return(path + "/" + filename)
def addPointClouds(row, title):
    """
    Adds a point cloud filename to a row based on its coordinates and timestamp.

    Args:
        row (pandas.Series): A row from a DataFrame.
        title (str): The title used in the filename.

    Returns:
        pandas.Series: The modified row with the point cloud filename added.
    """
    filename = "pointcloud/{}_{}_full_sx{:03d}_sy{:03d}.ply.gz".format(
        title, row["timestamp_file"],
        row["x"],
@@ -173,6 +203,18 @@ def addPointClouds(row, title) :
    return row
def createAssay(row, investigation, path, study_id):
    """
    Creates an assay and uploads the associated data file to the SEEK platform.

    Args:
        row (pandas.Series): A row from a DataFrame containing assay data.
        investigation (dict): The investigation dictionary.
        path (str): The directory path for saving files.
        study_id (str): The ID of the study to which the assay belongs.

    Side Effects:
        Creates directories, writes files, and uploads data to the SEEK platform.
    """
    data_file = {}
    filename = "derived/" + assay.title + ".csv"
@@ -199,17 +241,11 @@ def createAssay(row, investigation, path, study_id):
    populated_data_file = r.json()

    data_file_id = populated_data_file['data']['id']
    data_file_url = populated_data_file['data']['links']['self']

    blob_url = populated_data_file['data']['attributes']['content_blobs'][0]['link']

    upload = session.put(blob_url, data=open(fullFilename,"r").read(), headers={'Content-Type': 'application/octet-stream'})
    upload.raise_for_status()
@@ -226,16 +262,22 @@ def createAssay(row, investigation, path, study_id):
    assay['data']['relationships']['study']['data'] = {'id' : study_id, 'type' : 'studies'}
    assay['data']['relationships']['organism'] = {}
    assay['data']['relationships']['organism']['data'] = {'id' : 1, 'type' : 'organisms'}
def finalize(investigation, measurements, investigationPath, title, metadata, study_id):
    """
    Finalizes the processing of measurements by creating assays and saving data files.

    Args:
        investigation (dict): The investigation dictionary.
        measurements (pandas.DataFrame): The DataFrame containing measurements.
        investigationPath (str): The directory path for saving files.
        title (str): The title used in filenames.
        metadata (pandas.DataFrame): The DataFrame containing metadata.
        study_id (str): The ID of the study to which the assays belong.

    Side Effects:
        Creates directories, writes files, and uploads data to the SEEK platform.
    """
    if "Pot" in measurements.columns:
        pots = measurements.dropna(axis=0, subset=["Pot"])
        if len(pots) > 0 and "Pot" in pots.columns:
@@ -244,52 +286,35 @@ def finalize(investigation, measurements, investigationPath, title, metadata, st
            measurements = measurements.drop(columnsToDrop, axis=1)
            measurements = measurements.apply(copyPots , axis=1, pots=pots)
            measurements = measurements.apply(addPointClouds, axis=1, title=title)
            measurements.apply(createAssay, axis=1, investigation = investigation, path = investigationPath, study_id = study_id)
    investigation.measurements = pandas.concat([investigation.measurements, measurements], axis=0, ignore_index=True)
previousAssay = "" previousAssay = ""
for csv in csvs: for csv in csvs:
assayName = os.path.basename(csv).split("_")[0] assayName = os.path.basename(csv).split("_")[0]
timestamp = os.path.basename(csv).split("_")[1] timestamp = os.path.basename(csv).split("_")[1]
#print("Reading: {}".format(assayName))
if checkAssayName.match(assayName) != None: if checkAssayName.match(assayName) != None:
try: try:
currentMeasurements = pandas.read_csv(csv, sep="\t", skiprows=[1]) currentMeasurements = pandas.read_csv(csv, sep="\t", skiprows=[1])
currentMeasurements["timestamp_file"] = timestamp currentMeasurements["timestamp_file"] = timestamp
if previousAssay == assayName: if previousAssay == assayName:
# same assay
if len(measurements) == 0: if len(measurements) == 0:
measurements = currentMeasurements measurements = currentMeasurements
else: else:
measurements = pandas.concat([measurements, currentMeasurements], axis=0, ignore_index=True) measurements = pandas.concat([measurements, currentMeasurements], axis=0, ignore_index=True)
else: else:
if len(measurements) > 0: if len(measurements) > 0:
# new assay, process all
finalize(investigation, measurements, investigationPath, previousAssay, metadata, study_id) finalize(investigation, measurements, investigationPath, previousAssay, metadata, study_id)
measurements = currentMeasurements measurements = currentMeasurements
previousAssay = assayName previousAssay = assayName
except: except:
# No data?
pass pass
else: else:
#CSV file is not an assay file
pass pass
if len(measurements) > 0: if len(measurements) > 0:
finalize(investigation, measurements, investigationPath, previousAssay, metadata, study_id) finalize(investigation, measurements, investigationPath, previousAssay, metadata, study_id)
measurementsToFile(investigation, "/".join([investigationPath, investigation.title, investigation.studies[0].title]), "derived/" + investigation.studies[0].title + ".csv", investigation.measurements) measurementsToFile(investigation, "/".join([investigationPath, investigation.title, investigation.studies[0].title]), "derived/" + investigation.studies[0].title + ".csv", investigation.measurements)
#print(json.dumps(investigation, cls=ISAJSONEncoder, sort_keys=True, indent=4, separators=(',', ': '))) \ No newline at end of file
@@ -13,30 +13,69 @@ import open3d as o3d
import tempfile
import numpy

"""
This script processes ISA-JSON files and associated point cloud data files.
It extracts greenness information from point clouds and writes histograms
of the greenness values to specified output files. The script requires an
ISA-JSON file and a list of point cloud files as input.
"""

investigation = isajson.load(open(sys.argv[1], "r"))
study = investigation.studies[0]

BINS = 256
def writeHistogram(data, filename):
    """
    Writes a histogram of the given data to a file.

    Parameters:
        data (numpy.ndarray): The data for which the histogram is to be computed.
        filename (str): The name of the file where the histogram will be written.

    Outputs:
        A file containing the histogram data. The bin edges and histogram counts
        are written in separate lines, separated by semicolons.

    Side Effects:
        Creates or overwrites the specified file with histogram data.
    """
    hist, bin_edges = numpy.histogram(data, bins=BINS)
    with open(filename, "w") as f:
        f.write(";".join(map(str, bin_edges)))
        f.write("\n")
        f.write(";".join(map(str, hist)))
def get_greenness(pcd):
    """
    Calculates the greenness index for a point cloud.

    Parameters:
        pcd (open3d.geometry.PointCloud): The point cloud object containing wavelength data.

    Returns:
        numpy.ndarray: An array of greenness values for each point in the point cloud.

    Exceptions:
        May raise an exception if the point cloud does not contain wavelength data.

    Notes:
        The greenness index is calculated using the formula:
        (R - B + 2G) / (R + G + B), where R, G, and B are the red, green, and blue
        wavelength values, respectively.
    """
    np.seterr(divide='ignore', invalid='ignore')
    return ((np.asarray(pcd.wavelengths)[:,0] - np.asarray(pcd.wavelengths)[:,2] +
             2.0 * np.asarray(pcd.wavelengths)[:,1]) /
            (np.asarray(pcd.wavelengths)[:,0] + np.asarray(pcd.wavelengths)[:,1] +
             np.asarray(pcd.wavelengths)[:,2]))
# Find each point cloud in the file list
pointclouds = defaultdict(str)
for pcd in sys.argv[2:]:
    filename = os.path.basename(pcd)
    pointclouds[filename] = pcd

# Process ISA file
for a in study.assays:
    print(a.data_files)
    for df in a.data_files:
@@ -45,11 +84,10 @@ for a in study.assays:
            print(com.value)
            print(os.path.basename(df.filename))
            if ".ply" in df.filename and os.path.basename(df.filename) in pointclouds:
                # Copy ply
                shutil.copy2(pointclouds[os.path.basename(df.filename)], com.value)
                a.pointcloud = com.value

for a in study.assays:
    print(a.data_files)
    if a.pointcloud:
@@ -64,7 +102,3 @@ for a in study.assays:
            if "greenness.csv" in com.value:
                greenness = get_greenness(pcd)
                writeHistogram(greenness, com.value)
""" """
ISA & isamodel This script serves as a command-line interface for the F500 class, which provides various functionalities such as restructuring data, processing point clouds, verifying data, combining histograms, and uploading data. The script determines the command to execute based on user input.
https://isa-specs.readthedocs.io/en/latest/isamodel.html
Modules:
- sys: Provides access to some variables used or maintained by the interpreter and to functions that interact with the interpreter.
- os: Provides a portable way of using operating system-dependent functionality.
- pandas: A data manipulation and analysis library for Python.
- F500: A custom module that contains the F500 class with methods for different data processing tasks.
Usage:
Run the script with the desired command to execute the corresponding functionality.
""" """
import sys import sys
import os import os
import pandas import pandas
from F500 import F500 from F500 import F500
if __name__ == '__main__': if __name__ == '__main__':
f500 = F500() f500 = F500()
f500.commandLineInterface() f500.commandLineInterface()
if f500.args.command == "restructure": if f500.args.command == "restructure":
...@@ -25,8 +30,6 @@ if __name__ == '__main__': ...@@ -25,8 +30,6 @@ if __name__ == '__main__':
f500.combineHistograms() f500.combineHistograms()
elif f500.args.command == "upload": elif f500.args.command == "upload":
f500.upload() f500.upload()
"""
This script provides functions to calculate various vegetation indices from point cloud data (PCD) using specific wavelength channels.
These indices include the Normalized Difference Vegetation Index (NDVI) for visualization, NDVI, the Normalized Pigment Chlorophyll Index (NPCI),
and a greenness index. The calculations are based on the wavelengths corresponding to different spectral bands.

Functions:
    - get_ndvi_for_visualization: Computes NDVI for visualization purposes, scaling the result between 0 and 1.
    - get_ndvi: Computes the standard NDVI, with values ranging from -1 to 1.
    - get_npci: Computes the NPCI using the red and blue channels.
    - get_greenness: Computes a greenness index using the red, green, and blue channels.
"""
import numpy as np
def get_ndvi_for_visualization(pcd):
    """
    Calculate the NDVI for visualization purposes, scaling the result between 0 and 1.

    Parameters:
        pcd (object): A point cloud data object containing 'wavelengths' and 'ndvi' attributes.
            'wavelengths' is expected to be a 2D array where the columns correspond to different spectral bands.

    Returns:
        ndarray: A 1D array of NDVI values scaled between 0 and 1.

    Side Effects:
        - Modifies the 'ndvi' attribute of the input 'pcd' object.

    Notes:
        - This function ignores division and invalid operation warnings using numpy's seterr function.
    """
    np.seterr(divide='ignore', invalid='ignore')
    np.asarray(pcd.ndvi)[:, 0] = (((np.asarray(pcd.wavelengths)[:, 3] - np.asarray(pcd.wavelengths)[:, 0]) /
                                   (np.asarray(pcd.wavelengths)[:, 3] + np.asarray(pcd.wavelengths)[:, 0])) + 1) / 2
    return pcd.ndvi[:, 0]
def get_ndvi(pcd):
    """
    Calculate the standard NDVI, with values ranging from -1 to 1.

    Parameters:
        pcd (object): A point cloud data object containing 'wavelengths' and 'ndvi' attributes.
            'wavelengths' is expected to be a 2D array where the columns correspond to different spectral bands.

    Returns:
        ndarray: A 1D array of NDVI values ranging from -1 to 1.

    Side Effects:
        - Modifies the 'ndvi' attribute of the input 'pcd' object.

    Notes:
        - This function ignores division and invalid operation warnings using numpy's seterr function.
    """
    np.seterr(divide='ignore', invalid='ignore')
    np.asarray(pcd.ndvi)[:, 0] = (np.asarray(pcd.wavelengths)[:, 3] - np.asarray(pcd.wavelengths)[:, 0]) / \
                                 (np.asarray(pcd.wavelengths)[:, 3] + np.asarray(pcd.wavelengths)[:, 0])
    return pcd.ndvi[:, 0]
def get_npci(pcd):
    """
    Calculate the Normalized Pigment Chlorophyll Index (NPCI) using the red and blue channels.

    Parameters:
        pcd (object): A point cloud data object containing 'wavelengths' attribute.
            'wavelengths' is expected to be a 2D array where the columns correspond to different spectral bands.

    Returns:
        ndarray: A 1D array of NPCI values.

    Notes:
        - This function ignores division and invalid operation warnings using numpy's seterr function.
    """
    np.seterr(divide='ignore', invalid='ignore')
    return ((np.asarray(pcd.wavelengths)[:, 0] - np.asarray(pcd.wavelengths)[:, 2]) /
            (np.asarray(pcd.wavelengths)[:, 0] + np.asarray(pcd.wavelengths)[:, 2]))
def get_greenness(pcd):
    """
    Calculate a greenness index using the red, green, and blue channels.

    Parameters:
        pcd (object): A point cloud data object containing 'wavelengths' attribute.
            'wavelengths' is expected to be a 2D array where the columns correspond to different spectral bands.

    Returns:
        ndarray: A 1D array of greenness index values.

    Notes:
        - This function ignores division and invalid operation warnings using numpy's seterr function.
    """
    np.seterr(divide='ignore', invalid='ignore')
    return ((np.asarray(pcd.wavelengths)[:, 0] - np.asarray(pcd.wavelengths)[:, 2] +
             2.0 * np.asarray(pcd.wavelengths)[:, 1]) /
            (np.asarray(pcd.wavelengths)[:, 0] + np.asarray(pcd.wavelengths)[:, 1] + np.asarray(pcd.wavelengths)[:, 2]))
import numpy as np

"""
This script provides functionality to rescale the wavelengths of a point cloud data (PCD) object.
The script modifies the color and NIR (Near-Infrared) attributes of the PCD based on the provided scale.
"""
def rescale_wavelengths(pcd, scale):
    """
    Rescales the wavelengths of a point cloud data (PCD) object by a given scale factor.

    This function takes a PCD object with attributes for wavelengths, colors, and NIR values.
    It rescales the wavelengths by dividing them by the provided scale factor and updates the
    color and NIR attributes of the PCD accordingly.

    Parameters:
        pcd : object
            A point cloud data object that contains 'wavelengths', 'colors', and 'nir' attributes.
            The 'wavelengths' attribute is expected to be a 2D array with columns representing
            R, G, B, and NIR wavelengths.
        scale : float
            The scale factor by which to divide the wavelengths.

    Returns:
        object
            The modified PCD object with rescaled color and NIR attributes.

    Side Effects:
        Modifies the 'colors' and 'nir' attributes of the input PCD object in place.

    Exceptions:
        This function assumes that the input PCD object has the required attributes and that they
        are in the expected format. If not, it may raise AttributeError or IndexError.

    Future Work:
        Consider adding input validation to ensure the PCD object has the required attributes and
        that they are in the expected format. Additionally, handle potential exceptions more gracefully.
    """
    tmp_wvl = np.asarray(pcd.wavelengths) / scale
    np.asarray(pcd.colors)[:, 0] = tmp_wvl[:, 0]
    np.asarray(pcd.colors)[:, 1] = tmp_wvl[:, 1]
    np.asarray(pcd.colors)[:, 2] = tmp_wvl[:, 2]
    np.asarray(pcd.nir)[:, 0] = tmp_wvl[:, 3]
    return pcd
{
"title": "F500 analytics for NPEC",
"description": "Several Python and R scripts for processing the raw F500 data. Uses the ISA-JSON for metadata",
"developer": "Sven Warris",
"mail": "sven.warris@wur.nl",
"link": "https://git.wur.nl/NPEC/analytics"
}
### Summary of Key Findings from Each Report
#### Radon MI Report
- **Maintainability Index Scores**: The codebase has a mix of maintainability scores, with some files scoring low due to high complexity, lack of comments, large file sizes, poor modularization, and code duplication.
- **Improvement Suggestions**: Refactor complex functions, add documentation, modularize code, reduce duplication, simplify logic, and use descriptive naming.
#### Radon CC Report
- **Complexity Scores**: Some functions have high cyclomatic complexity, making them difficult to understand and maintain.
- **Refactoring Suggestions**: Break down complex functions, reduce nested logic, use design patterns, modularize code, and improve error handling.
#### Pylint Report
- **Linting Issues**: The codebase has convention violations, warnings, errors, and refactor suggestions, impacting readability, maintainability, and potential for bugs.
- **Improvement Strategies**: Adopt PEP 8 standards, improve documentation, optimize imports, refactor complex functions, use context managers, update string formatting, and resolve errors.
#### Vulture Report
- **Unused Code**: The codebase contains unused imports, variables, attributes, classes, and methods, which can be removed to streamline the project.
- **Critical Unused Code**: Some unused code might be critical if certain functionalities are expected, requiring careful review before removal.
### Common Issues Across Reports and High-Level Strategies for Improvement
1. **Complexity and Maintainability**: High complexity and low maintainability scores are common issues. Strategies include refactoring complex functions, simplifying logic, and improving modularization.
2. **Documentation and Naming**: Lack of comments and poor naming conventions are prevalent. Strategies include adding comprehensive docstrings and adhering to PEP 8 naming conventions.
3. **Code Duplication and Unused Code**: Code duplication and unused code are identified across reports. Strategies include removing unused code and refactoring duplicated code into reusable components (a sketch of one such reusable helper follows this list).
4. **Error Handling and Testing**: Potential runtime errors and lack of automated testing are concerns. Strategies include improving error handling and increasing test coverage.
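For example, the content-blob upload steps that appear in more than one place in the upload scripts could be collected into a single helper. A minimal sketch, assuming a `requests` session configured as in the existing scripts (the helper name is illustrative, not part of the codebase):

```python
import requests


def upload_blob(session: requests.Session, blob_url: str, local_path: str) -> None:
    """Upload a local file to a SEEK content-blob URL and fail loudly on HTTP errors."""
    with open(local_path, "rb") as handle:
        response = session.put(
            blob_url,
            data=handle.read(),
            headers={"Content-Type": "application/octet-stream"},
        )
    response.raise_for_status()
```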
### Overall Assessment of the Codebase’s Quality, Complexity, and Maintainability
- **Quality**: The codebase currently has a low quality score, primarily due to poor adherence to coding standards, lack of documentation, and potential runtime errors. Addressing these issues will significantly enhance the code's quality.
- **Complexity**: While the average complexity score is relatively low, there are outliers with high complexity that need attention. Regular refactoring and code reviews can help manage complexity.
- **Maintainability**: The codebase has a mix of maintainability scores, with some files requiring significant improvement. By focusing on refactoring, documentation, and modularization, maintainability can be improved.
### Recommendations for Improvement
1. **Refactor and Simplify**: Focus on refactoring complex functions and simplifying logic to improve readability and maintainability.
2. **Enhance Documentation**: Add comprehensive docstrings and inline comments to clarify code functionality and usage.
3. **Adopt Coding Standards**: Ensure adherence to PEP 8 standards for naming conventions, line length, and overall code style.
4. **Remove Unused Code**: Safely remove unused imports, variables, and functions to streamline the codebase.
5. **Improve Testing and Error Handling**: Increase automated test coverage and implement consistent error-handling strategies (a minimal test sketch is given at the end of this section).
6. **Regular Code Reviews**: Conduct regular code reviews to catch potential issues early and promote best practices.
By implementing these strategies, the codebase's quality, complexity, and maintainability can be significantly improved, making it easier to understand, modify, and extend in the future.
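To illustrate recommendation 5, here is a minimal pytest sketch for the NDVI formula used throughout the scripts; it tests a standalone copy of the formula rather than importing the project's own modules, so the function below is illustrative only:

```python
import pytest


def ndvi(nir: float, red: float) -> float:
    """Standalone copy of the NDVI formula used in the scripts: (NIR - RED) / (NIR + RED)."""
    return (nir - red) / (nir + red)


def test_ndvi_known_values():
    # Strong vegetation signal: NIR well above RED.
    assert ndvi(0.8, 0.2) == pytest.approx(0.6)
    # Equal NIR and RED reflectance gives an NDVI of zero.
    assert ndvi(0.5, 0.5) == pytest.approx(0.0)
```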
### Summary of Linting Issues
The pylint report highlights several types of issues across the codebase. These can be grouped into the following categories:
1. **Convention Violations (C):**
- **Trailing Whitespace and Newlines:** Frequent occurrences of trailing whitespace and newlines across multiple files.
- **Naming Conventions:** Many variables, functions, and module names do not conform to PEP 8 naming conventions (e.g., snake_case for variables and functions, UPPER_CASE for constants).
- **Missing Docstrings:** Many modules, classes, and functions lack docstrings, which are essential for understanding the purpose and usage of the code.
- **Line Length:** Numerous lines exceed the recommended maximum line length of 100 characters.
2. **Warnings (W):**
- **Unused Imports and Variables:** Several imports and variables are declared but not used, leading to unnecessary clutter.
- **Redefining Names:** Some variables are redefined from an outer scope, which can lead to confusion and potential errors.
- **Deprecated and Unused Methods:** Usage of deprecated methods and methods that are defined but not used.
3. **Errors (E):**
- **Import Errors:** Some modules are unable to import certain packages, indicating potential issues with dependencies or incorrect paths.
- **Undefined Variables:** Usage of variables that are not defined within the scope.
- **No-Member Errors:** Attempting to access non-existent members of modules, which could indicate incorrect usage or outdated libraries.
4. **Refactor Suggestions (R):**
- **Too Many Arguments/Attributes:** Functions and classes with too many arguments or attributes, suggesting a need for refactoring to improve readability and maintainability.
- **Too Many Branches/Statements:** Functions with excessive branching or statements, indicating complex logic that could be simplified.
5. **Specific Issues:**
- **Consider Using 'with' for Resource Management:** Several instances where resource-allocating operations like file handling should use context managers for better resource management.
- **Consider Using f-Strings:** Recommendations to use f-strings for string formatting for better readability and performance.
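As a minimal illustration of the last two points, the manual open/close and `.format()` pattern used in several scripts can be rewritten with a context manager and an f-string (the file name and variables here are illustrative):

```python
label, value = "edges", "0.1;0.2;0.3"

# Before: the file stays open if an exception is raised between open() and close().
f = open("histogram.csv", "w")
f.write("{};{}\n".format(label, value))
f.close()

# After: the context manager closes the file in all cases, and the f-string reads more directly.
with open("histogram.csv", "w") as f:
    f.write(f"{label};{value}\n")
```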
### Impact on Code Quality
- **Readability and Maintainability:** The lack of adherence to naming conventions and missing docstrings significantly impacts the readability and maintainability of the code. Developers may find it challenging to understand and modify the code.
- **Potential Bugs:** Undefined variables, import errors, and no-member errors can lead to runtime errors, affecting the reliability of the software.
- **Performance and Efficiency:** Unused imports and variables, along with inefficient string formatting, can lead to unnecessary memory usage and reduced performance.
### Suggested Fixes and Refactoring Strategies
1. **Adopt PEP 8 Standards:**
- Rename variables, functions, and modules to conform to PEP 8 naming conventions.
- Ensure all lines are within the recommended length.
2. **Improve Documentation:**
- Add docstrings to all modules, classes, and functions to describe their purpose and usage.
3. **Optimize Imports:**
- Remove unused imports and variables to clean up the code.
4. **Refactor Complex Functions:**
- Break down functions with too many arguments or complex logic into smaller, more manageable functions.
5. **Use Context Managers:**
- Implement context managers for file operations to ensure proper resource management.
6. **Update String Formatting:**
- Replace old string formatting methods with f-strings for improved readability and performance.
7. **Resolve Errors:**
- Investigate and resolve import errors and undefined variables to ensure the code runs correctly.
### Overall Quality Assessment
Based on the pylint report, the codebase currently has a low quality score of 0.92/10. The code suffers from poor adherence to coding standards, lack of documentation, and potential runtime errors. Addressing the highlighted issues will significantly improve the code's readability, maintainability, and reliability. Prioritizing the most critical issues, such as errors and convention violations, will be essential in enhancing the overall quality of the codebase.
analytics/lib/rescaleWavelength.py
F 8:0 rescale_wavelengths - A (1)
analytics/lib/computePhenotypes.py
F 6:0 get_ndvi_for_visualization - A (1)
F 13:0 get_ndvi - A (1)
F 19:0 get_npci - A (1)
F 24:0 get_greenness - A (1)
analytics/visualizations/histograms_ply.py
F 23:0 createPNG - A (1)
analytics/visualizations/animate_ply.py
F 28:0 play_motion - A (1)
analytics/f500/collecting/PointCloud.py
M 45:4 PointCloud.get_hue - B (6)
M 15:4 PointCloud.writeHistogram - A (4)
C 5:0 PointCloud - A (3)
M 98:4 PointCloud.trim - A (3)
M 25:4 PointCloud.getWavelengths - A (2)
M 92:4 PointCloud.render_image - A (2)
M 165:4 PointCloud.render_image_rescale - A (2)
M 11:4 PointCloud.__init__ - A (1)
M 33:4 PointCloud.get_psri - A (1)
M 69:4 PointCloud.get_greenness - A (1)
M 77:4 PointCloud.get_ndvi - A (1)
M 83:4 PointCloud.get_npci - A (1)
M 88:4 PointCloud.setColors - A (1)
M 130:4 PointCloud.render_image_no_rescale - A (1)
analytics/f500/collecting/F500.py
M 620:4 F500.processPointclouds - E (31)
M 538:4 F500.restructure - D (21)
M 261:4 F500.copyPointcloudFile - C (11)
M 693:4 F500.combineHistograms - B (10)
M 301:4 F500.copyPlotPointcloudFile - B (9)
M 495:4 F500.finalize - B (9)
C 30:0 F500 - B (6)
M 453:4 F500.createAssayPlot - A (5)
M 209:4 F500.copyPots - A (4)
M 162:4 F500.setLogger - A (2)
M 169:4 F500.removeAfterSpaceFromDataMatrix - A (2)
M 176:4 F500.createISA - A (2)
M 361:4 F500.createSample - A (2)
M 486:4 F500.correctDataMatrix - A (2)
M 44:4 F500.__init__ - A (1)
M 57:4 F500.commandLineInterface - A (1)
M 202:4 F500.writeISAJSON - A (1)
M 221:4 F500.measurementsToFile - A (1)
M 228:4 F500.rawMeasurementsToFile - A (1)
M 235:4 F500.addPointClouds - A (1)
M 374:4 F500.createAssay - A (1)
M 535:4 F500.getDirectoryListing - A (1)
M 730:4 F500.upload - A (1)
analytics/f500/collecting/processPointClouds.py
F 20:0 writeHistogram - A (1)
F 28:0 get_greenness - A (1)
analytics/f500/collecting/Fairdom.py
M 118:4 Fairdom.upload - D (24)
C 12:0 Fairdom - A (4)
M 96:4 Fairdom.addDataFilesToSampleJSON - A (3)
M 84:4 Fairdom.addSampleToAssayJSON - A (2)
M 90:4 Fairdom.addDataFileToAssayJSON - A (2)
M 13:4 Fairdom.__init__ - A (1)
M 28:4 Fairdom.createInvestigationJSON - A (1)
M 40:4 Fairdom.createStudyJSON - A (1)
M 53:4 Fairdom.createAssayJSON - A (1)
M 70:4 Fairdom.createDataFileJSON - A (1)
M 104:4 Fairdom.createSampleJSON - A (1)
analytics/f500/collecting/F500Azure.py
M 64:4 F500Azure.copyPointcloudFile - A (3)
C 5:0 F500Azure - A (2)
M 7:4 F500Azure.__init__ - A (1)
M 12:4 F500Azure.initAzure - A (1)
M 30:4 F500Azure.connectToSource - A (1)
M 38:4 F500Azure.connectToTarget - A (1)
M 45:4 F500Azure.writeISAJSON - A (1)
M 50:4 F500Azure.measurementsToFile - A (1)
M 57:4 F500Azure.rawMeasurementsToFile - A (1)
M 82:4 F500Azure.getDirectoryListing - A (1)
analytics/f500/collecting/fairdom.py
F 235:0 finalize - B (7)
F 146:0 copyPots - A (3)
F 53:0 removeAfterSpaceFromDataMatrix - A (1)
F 155:0 measurementsToFile - A (1)
F 160:0 rawMeasurementsToFile - A (1)
F 167:0 addPointClouds - A (1)
F 175:0 createAssay - A (1)
74 blocks (classes, functions, methods) analyzed.
Average complexity: A (3.135135135135135)
### Highlight of Functions/Methods with Highest Complexity Scores
1. **F500.processPointclouds - E (31)**
- **Impact on Maintainability**: This method has the highest cyclomatic complexity score of 31, indicating a very high level of complexity. Such a high score suggests that the method likely contains numerous conditional statements and branches, making it difficult to understand, test, and maintain. This complexity can lead to increased chances of bugs and errors, as well as making future modifications challenging.
2. **Fairdom.upload - D (24)**
- **Impact on Maintainability**: With a complexity score of 24, this method is also quite complex. It may contain multiple decision points and nested logic, which can obscure the flow of the code and make it harder to follow. This complexity can hinder the ability to quickly identify and fix issues or to extend the functionality.
3. **F500.restructure - D (21)**
- **Impact on Maintainability**: This method's complexity score of 21 suggests it is also quite intricate. Similar to the above methods, it likely involves numerous branches and conditions, which can complicate understanding and maintenance.
### Suggestions for Refactoring or Simplifying the Most Complex Functions/Methods
1. **F500.processPointclouds**
- **Break Down into Smaller Functions**: Identify logical sections within the method and extract them into smaller, well-named functions. This will help isolate different functionalities and make the code more readable.
- **Reduce Nested Logic**: Simplify nested if-else statements by using guard clauses or early returns where possible (see the sketch after this list).
- **Use Design Patterns**: Consider using design patterns like Strategy or Command to encapsulate varying behaviors and reduce complexity.
2. **Fairdom.upload**
- **Modularize Code**: Break down the method into smaller, single-responsibility functions. Each function should handle a specific part of the upload process.
- **Simplify Conditionals**: Use polymorphism or a configuration-driven approach to handle complex conditional logic.
- **Improve Error Handling**: Implement a consistent error-handling strategy to manage exceptions and edge cases more effectively.
3. **F500.restructure**
- **Refactor for Clarity**: Extract complex logic into helper functions with descriptive names to clarify the purpose of each code block.
- **Use Data Structures**: Consider using more appropriate data structures to simplify data manipulation and reduce the number of operations.
- **Optimize Loops**: Review loops for opportunities to simplify or combine them, reducing the overall complexity.
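As a sketch of the guard-clause and extract-function suggestions above (all names are hypothetical and not taken from `F500.processPointclouds` itself):

```python
from typing import List, Optional


def load_values(path: str) -> List[float]:
    """Stand-in for loading a point cloud; the real code reads Phenospex PLY files."""
    return [0.2, 0.5, 0.8]


def mean_index(values: List[float]) -> float:
    """Stand-in for an index computation such as NDVI or greenness."""
    return sum(values) / len(values)


def process_pointcloud(path: Optional[str]) -> Optional[float]:
    # Guard clauses keep the happy path flat instead of nesting it several levels deep.
    if path is None:
        return None
    if not path.endswith(".ply.gz"):
        return None
    # Each step is a small, separately testable helper extracted from the long method.
    return mean_index(load_values(path))


print(process_pointcloud("plant_full_sx001_sy001.ply.gz"))
```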
### General Summary of the Codebase’s Complexity and Recommendations for Improvement
- **Overall Complexity**: The average complexity score of the codebase is A (3.135), which indicates that most of the code is relatively simple and maintainable. However, there are a few outliers with significantly higher complexity scores that need attention.
- **Recommendations**:
- **Regular Code Reviews**: Implement regular code reviews focusing on complexity and maintainability to catch potential issues early.
- **Automated Testing**: Increase the coverage of automated tests, especially for complex methods, to ensure that changes do not introduce new bugs.
- **Continuous Refactoring**: Encourage continuous refactoring practices to gradually improve the codebase's structure and readability.
- **Documentation**: Improve documentation for complex methods to aid understanding and future maintenance efforts.
- **Training and Best Practices**: Provide training on best practices for writing clean, maintainable code, and encourage the use of design patterns where appropriate.
By addressing the most complex areas and promoting a culture of clean code, the maintainability and quality of the codebase can be significantly improved.