Module 3 - Understanding the AI Assistant application flow
==========================================================

.. image:: images/00.png

Let's start by explaining the different components. We will focus only on
the most relevant parts of the flow.

Arcadia Crypto's application architecture
-----------------------------------------

**Front End Application**

This is the **AI Assistant** chatbot that you will be interacting with.

**AI Orchestrator**

The AI Orchestrator acts as the central hub of the entire AI system,
managing the flow of information between various components. Here's a
detailed look at its functions:

-  **Request Handling**: It receives and processes user queries,
   preparing them for further processing.
-  **LLM Interaction**: The Orchestrator sends the constructed prompt to
   Ollama (Inference Service) and receives its responses.
-  **Response Formatting**: It processes the LLM's output, potentially
   formatting or filtering it before sending it back to the user.
-  **State Management**: The Orchestrator maintains the state of the
   conversation, ensuring continuity across multiple user interactions.
-  **Error Handling**: It manages any errors or exceptions that occur
   during the process, ensuring graceful failure modes.
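
The lab's orchestrator code isn't shown here, but conceptually these
responsibilities boil down to a loop like the following minimal Python
sketch (hypothetical names; the real service is more involved, and the
model tag is an assumption):

   .. code:: python

      import requests

      # Conversation state kept across turns (State Management).
      history = []

      def handle_user_query(user_text,
                            llm_endpoint="http://ollama_public_ip:11434/api/chat"):
          """Receive a query, forward it to the inference service, return the reply."""
          # Request Handling: add the new turn to the running conversation.
          history.append({"role": "user", "content": user_text})
          try:
              # LLM Interaction: send the whole conversation to Ollama's chat API.
              resp = requests.post(
                  llm_endpoint,
                  json={"model": "llama3.1:8b", "messages": history, "stream": False},
                  timeout=600,
              )
              resp.raise_for_status()
              reply = resp.json()["message"]["content"]
          except requests.RequestException as exc:
              # Error Handling: fail gracefully instead of crashing the chat.
              return f"Sorry, something went wrong: {exc}"
          # Response Formatting / State Management: store and return the answer.
          history.append({"role": "assistant", "content": reply})
          return reply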

**Inference Services**

Inference services run a model in order to generate predictions. This lab uses Ollama, a popular open-source solution for inference.
Ollama facilitates the local execution of large language models (LLMs) such as Llama 2, Mistral, and, in our case,
Llama 3.1 8B. Key features of Ollama include:

-  **Local Execution**: Users can run powerful language models directly
   on their machines, enhancing privacy and control over data.
-  **Model Customization**: Ollama supports the creation and
   customization of models, allowing users to tailor them for specific
   applications, such as chatbots or summarization tools.
-  **User-Friendly Setup**: The tool provides an intuitive interface for
   easy installation and configuration on macOS, Linux and Windows.
-  **Diverse Model Support**: Ollama supports a variety of models, making it
   versatile for different natural language processing tasks.
-  **Open Source**: Ollama is an open-source platform, which means its
   source code is publicly available, allowing for community
   contributions and transparency.
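
If you want to check what the inference service is actually serving, Ollama
exposes a small REST API on port 11434; for example, ``GET /api/tags`` lists
the models pulled onto the host. A short sketch (the host address below is
the same placeholder used later in the gateway config):

   .. code:: python

      import requests

      # List the models available on the Ollama host.
      resp = requests.get("http://ollama_public_ip:11434/api/tags", timeout=10)
      for model in resp.json()["models"]:
          print(model["name"])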

Traffic passes from the **Front End Application** to the **AI Orchestrator**, where the conversation state is maintained.
The **AI Orchestrator** then passes the chat to **Ollama** through the **AI Gateway**, which performs security inspection
on both the request and the response.
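
In code terms, inserting the gateway is just a change of upstream: the
Orchestrator calls the gateway's route instead of Ollama's endpoint, and the
gateway forwards inspected traffic to Ollama. A sketch (the gateway host and
port are placeholders; check your deployment for the actual listen address):

   .. code:: python

      # Direct path: AI Orchestrator -> Ollama, with no inspection.
      OLLAMA_URL = "http://ollama_public_ip:11434/api/chat"

      # Gateway path: AI Orchestrator -> AI Gateway -> Ollama. The gateway
      # inspects both the request and the response before anything reaches
      # the model or the user.
      AIGW_URL = "http://aigw_public_ip:4141/api/chat"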

Understanding the AI Gateway initial config
-------------------------------------------

To find the initial configuration for the AI Gateway, go to your VS Code tab, open the **Explorer**, then open the
**aigw_configs** folder and find ``initial_config.yaml``.

   .. image:: images/01.png

First, we define the endpoint of the **inference service** that hosts the LLM.

   .. code:: yaml

      services:
        - name: ollama
          executor: http
          config:
            endpoint: "http://ollama_public_ip:11434/api/chat"
            schema: ollama-chat

Next we need to define a **profile** that points to the service.

   .. code:: yaml

      profiles:
        - name: default
          services:
            - name: ollama

Next we define a **policy** and attach the profile to it.

   .. code:: yaml

      policies:
        - name: arcadia_ai_policy
          profiles:
            - name: default

Lastly, we combine everything with a **route**, which is the entry point the AI Gateway listens on.

   .. code:: yaml

      routes:
        - path: /api/chat
          policy: arcadia_ai_policy
          timeoutSeconds: 600
          schema: openai

The final configuration, which is currently applied to the AI Gateway, looks like this:

   .. code:: yaml

      # The entry point the AI Gateway listens on
      routes:
        - path: /api/chat
          policy: arcadia_ai_policy
          timeoutSeconds: 600
          schema: openai

      # Which policy is applied to the route
      policies:
        - name: arcadia_ai_policy
          profiles:
            - name: default

      # Which LLM endpoint we forward the request to
      services:
        - name: ollama
          executor: http
          config:
            endpoint: "http://ollama_public_ip:11434/api/chat"
            schema: ollama-chat

      # What we do with the request; at the moment we just forward it
      profiles:
        - name: default
          services:
            - name: ollama
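
Note that the route's ``schema: openai`` means the gateway accepts
OpenAI-format chat requests and translates them to the ``ollama-chat``
schema of the backing service. Assuming the gateway is reachable, you could
exercise the route directly with a request like the sketch below (the host
and port are placeholders, and the model tag is assumed):

   .. code:: python

      import requests

      # OpenAI-style chat payload; the gateway translates it to ollama-chat
      # before forwarding it to the inference service.
      resp = requests.post(
          "http://aigw_public_ip:4141/api/chat",
          json={
              "model": "llama3.1:8b",
              "messages": [{"role": "user", "content": "Hello, Arcadia!"}],
          },
          timeout=600,
      )
      print(resp.status_code)
      print(resp.json())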

Interact with the AI Assistant and review the logs
--------------------------------------------------

Go ahead and ask the **AI Assistant** a question.

   .. image:: images/02.png

Then review the **AI Gateway** logs in the **AI Gateway Web Shell** tab you opened earlier. The command you ran
previously should continue to show new log entries; you may need to scroll to the bottom of the screen to see them.
If you are back at the terminal prompt, run ``docker logs aigw-aigw-1 -f`` again to follow the logs.

   ::

      2025/01/12 13:58:19 INFO service selected name=http/
      2025/01/12 13:58:19 INFO executing http service
      2025/01/12 13:58:24 INFO service response name=http/ result="map[status:200 OK]"