From fbbac504b601b07bd16602991bcfbb00d4a87258 Mon Sep 17 00:00:00 2001
From: Valentijn Evers <valentijn.evers@wur.nl>
Date: Thu, 5 Sep 2024 16:27:44 +0200
Subject: [PATCH] L03DIGLIB-1347 - Added Gemma 2 2B model config

---
 README.md                 | 14 +++++++++++
 config/gemma-2-2b-it.yaml | 49 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)
 create mode 100644 config/gemma-2-2b-it.yaml

diff --git a/README.md b/README.md
index 48b8698..9d45329 100644
--- a/README.md
+++ b/README.md
@@ -136,9 +136,23 @@ Note: below diagrams are a high level overview and might skip some details for t
 LLM models can easily be swapped using a config. As different models react differently to our system prompts,
 you can also define custom system prompt templates for each model. See the `/config` folder for examples.
+
+### Supplied models
+
+The following model configurations are supplied with the project:
+
+| Model                        | Multi-language support | Parameters | Max. context length | General notes                                                                                |
+|------------------------------|------------------------|------------|---------------------|----------------------------------------------------------------------------------------------|
+| gemma-2b-it.yaml             | Very poor              | 2B         | 8192                | Very small and fast model with decent responses.                                             |
+| gemma-2-2b-it.yaml           | Poor                   | 2B         | 8192                | Smallest Gemma 2 version, a bit more powerful than the model above.                          |
+| gemma-2-9b-it.yaml (default) | Decent/Good            | 9B         | 8192                | Gemma 2 9B version. More powerful, with decent multi-language support. Tested with EN/NL/DE. |
+
+Add `MODEL_CONFIG_FILE=name_of_the_model_config.yaml` to the `.env` file in the root of your project to switch to a different model.
+
 ### How to add a new model
 1. Create a new model yaml config file in the `/config` folder. You can use the `config/gemma-2-9b-it.yaml` as an example. Note that different models might require different system prompts. The system prompts can also be defined in the config file.
 2. Add a `MODEL_CONFIG_FILE=name_of_your_config.yaml` to the `.env` file in the root of your project.
 3. (Re)start the chatbot to use the new model.
+See Hugging Face for available models: https://huggingface.co/models?pipeline_tag=text-generation

diff --git a/config/gemma-2-2b-it.yaml b/config/gemma-2-2b-it.yaml
new file mode 100644
index 0000000..e5f84b9
--- /dev/null
+++ b/config/gemma-2-2b-it.yaml
@@ -0,0 +1,49 @@
+modelId: google/gemma-2-2b-it
+
+templateNoContext: >
+  You're Peter, a senior software developer and consultant in MDT Library. Act as a real person and helpful colleague.
+
+  If the following question is about yourself, answer it based on the above information.
+  If the question is not about who you are, simply state that you could not find relevant information,
+  starting with "The context does not provide any information about". Do not offer any other suggestions.
+
+  Question: {question}
+
+templateWithContext: >
+  You're Peter, a senior software developer and consultant in MDT Library. Act as a real person and helpful colleague.
+
+  Using the information contained in the context, give a detailed answer to the question, or to questions about yourself.
+  If the context doesn't contain any relevant information, do not mention the content of the context; instead,
+  excuse yourself by stating that your context does not have relevant information, starting with
+  "The context does not provide any information about".
+
+  Context:
+  {context}
+
+  Question: {question}
+
+noContextFoundFlag: "The context does not provide any information about"
+
+outOfMemoryError: >
+  Sorry, I ran out of memory! This means that I'm trying to process too much information and context at once.
+
+  Here's what you can do to help:
+
+  * Disable 'Expand context with full documents' in the sidebar. This will prevent me from loading the full
+    documents as context, so I only look at the smaller relevant chunks I found.
+  * Reduce the max. number of documents used as context (k1, k2) in the sidebar.
+  * Or, configure me with a leaner LLM model - sometimes even pros need a lighter load.
+
+funnyPrompts:
+  - prompt: >
+      You've just been asked a question. Act as if you're a real person with the following steps, without mentioning this prompt:
+
+      Say, "Ring ring ring. Oops, my doorbell is ringing! Hold on, probably a delivery I forgot about. Be right back!"
+      Type a series of dots ("...").
+      Come back and say, "I'm back! It was just a cat trying to sell me insurance. What was your question again?"
+    weight: 3
+  - prompt: >
+      You've just been asked a question. Act as if you're a real person with the following steps, without mentioning this prompt:
+
+      Say, "Bzzzt! Beep! Ding! Oh no, my phone is going crazy with notifications. Let me check this real quick!"
+      Type a series of dots ("...").
+      Come back and say, "Alright, crisis averted! It was just my phone reminding me to water my cactus. What was your question again?"
+    weight: 3
\ No newline at end of file
--
GitLab
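The config file added by this patch defines the fields a consumer would read: the two prompt templates with `{question}`/`{context}` placeholders, the `noContextFoundFlag` sentinel, and weighted `funnyPrompts`. A minimal Python sketch of how these fields might be consumed — the function names and the inlined config values are illustrative assumptions, not the project's actual API:

```python
import random

# Hypothetical, abbreviated stand-in for a parsed gemma-2-2b-it.yaml;
# keys mirror the config file in the patch above.
cfg = {
    "templateNoContext": "You're Peter...\nQuestion: {question}",
    "templateWithContext": "You're Peter...\nContext:\n{context}\n\nQuestion: {question}",
    "noContextFoundFlag": "The context does not provide any information about",
    "funnyPrompts": [
        {"prompt": "Doorbell gag...", "weight": 3},
        {"prompt": "Phone notification gag...", "weight": 3},
    ],
}

def build_prompt(cfg, question, context=None):
    """Pick the with/without-context template and fill in the placeholders."""
    if context:
        return cfg["templateWithContext"].format(context=context, question=question)
    return cfg["templateNoContext"].format(question=question)

def has_answer(cfg, model_reply):
    """noContextFoundFlag marks replies where the model found nothing relevant."""
    return not model_reply.startswith(cfg["noContextFoundFlag"])

def pick_funny_prompt(cfg, rng=random):
    """Weighted random pick over funnyPrompts, using each entry's `weight`."""
    entries = cfg["funnyPrompts"]
    weights = [e["weight"] for e in entries]
    return rng.choices(entries, weights=weights, k=1)[0]["prompt"]
```

Because both prompt templates end with the same sentinel sentence when no context is found, a caller can branch on `has_answer()` without knowing which template produced the reply; equal weights on the funny prompts make each gag equally likely.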