Module 3 - Understanding the AI Assistant application flow
==========================================================

.. image:: images/00.png

Let's start by explaining the different functions. We will focus only on the most relevant parts of the flow.

Arcadia Crypto's application architecture
-----------------------------------------

**Front End Application**

This is the **AI Assistant** chatbot that you will be interacting with.

**AI Orchestrator**

The AI Orchestrator acts as the central hub of the entire AI system, managing the flow of information between the various components. Here's a detailed look at its functions:

- **Request Handling**: It receives and processes user queries, preparing them for further processing.
- **LLM Interaction**: The Orchestrator sends the constructed prompt to Ollama (the Inference Service) and receives its responses.
- **Response Formatting**: It processes the LLM's output, potentially formatting or filtering it before sending it back to the user.
- **State Management**: The Orchestrator maintains the state of the conversation, ensuring continuity across multiple user interactions.
- **Error Handling**: It manages any errors or exceptions that occur during the process, ensuring graceful failure modes.

**Inference Services**

Inference services run a model in order to generate predictions. This lab uses Ollama, a popular open-source solution for inference. Ollama facilitates the local execution of large language models (LLMs) such as Llama 2, Mistral, and, in our case, Llama 3.1 8B.

The key features of Ollama:

- **Local Execution**: Users can run powerful language models directly on their machines, enhancing privacy and control over data.
- **Model Customization**: Ollama supports the creation and customization of models, allowing users to tailor them for specific applications, such as chatbots or summarization tools.
- **User-Friendly Setup**: The tool provides an intuitive interface for easy installation and configuration on macOS, Linux, and Windows.
- **Diverse Model Support**: Ollama supports a variety of models, making it versatile for different natural language processing tasks.
- **Open Source**: Ollama is an open-source platform, which means its source code is publicly available, allowing for community contributions and transparency.

Traffic flows from the **Front End Application** to the **AI Orchestrator**, where the conversation state is maintained. The **AI Orchestrator** then passes the chat to **Ollama** through the **AI Gateway**, which performs security inspection on both the request and the response. A sketch of the underlying Ollama chat exchange is shown below.
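For reference, the exchange that ultimately reaches Ollama uses its ``/api/chat`` JSON format. The sketch below is illustrative only; it reuses the ``ollama_public_ip`` placeholder that appears in the AI Gateway configuration later in this module and assumes the ``llama3.1:8b`` model tag, so adjust both to match your lab environment.

.. code:: python

    # Illustrative sketch only: a direct Ollama /api/chat call, mirroring the
    # request/response that the AI Orchestrator exchanges with Ollama through
    # the AI Gateway. The host is the same placeholder used in the gateway
    # configuration; replace it (and the model tag) for your environment.
    import json
    import urllib.request

    OLLAMA_URL = "http://ollama_public_ip:11434/api/chat"  # placeholder address

    payload = {
        "model": "llama3.1:8b",  # assumed tag for the Llama 3.1 8B model
        "messages": [
            {"role": "user", "content": "What is Arcadia Crypto?"}
        ],
        "stream": False,  # request a single JSON response instead of a stream
    }

    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())

    # Ollama returns the assistant's reply under message.content
    print(body["message"]["content"])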
Understanding the AI Gateway initial config
-------------------------------------------

To find the initial configuration for the AI Gateway, go to your VS Code tab, navigate to the **Explorer**, open the **aigw_configs** folder and find ``initial_config.yaml``.

.. image:: images/01.png

First, we define the endpoint of the **inference service** that hosts the LLM.

.. code:: yaml

    services:
      - name: ollama
        executor: http
        config:
          endpoint: "http://ollama_public_ip:11434/api/chat"
          schema: ollama-chat

Next, we define a **profile** that points to the service.

.. code:: yaml

    profiles:
      - name: default
        services:
          - name: ollama

Next, we define a **policy** and attach the profile to it.

.. code:: yaml

    policies:
      - name: arcadia_ai_policy
        profiles:
          - name: default

Lastly, we combine everything with a **route**, which is the entry point that the AI Gateway listens on.

.. code:: yaml

    routes:
      - path: /api/chat
        policy: arcadia_ai_policy
        timeoutSeconds: 600
        schema: openai

The final configuration, which is currently applied to the AI Gateway, looks like the following:

.. code:: yaml

    routes:
      - path: /api/chat
        policy: arcadia_ai_policy
        timeoutSeconds: 600
        schema: openai

    # Which policy is applied to the route
    policies:
      - name: arcadia_ai_policy
        profiles:
          - name: default

    # Which LLM endpoint we forward the request to
    services:
      - name: ollama
        executor: http
        config:
          endpoint: "http://ollama_public_ip:11434/api/chat"
          schema: ollama-chat

    # What we do with the request; at the moment we just forward it
    profiles:
      - name: default
        services:
          - name: ollama

Interact with the AI Assistant and review the logs
--------------------------------------------------

Go ahead and ask the **AI Assistant** a question.

.. image:: images/02.png

Then review the **AI Gateway** logs from the **AI Gateway Web Shell** tab you previously opened. The command you ran earlier should continue to show new log entries; you may need to scroll to the bottom of the screen to see them. If you are back at the terminal prompt, run ``docker logs aigw-aigw-1 -f`` again to view the logs.

::

    2025/01/12 13:58:19 INFO service selected name=http/
    2025/01/12 13:58:19 INFO executing http service
    2025/01/12 13:58:24 INFO service response name=http/ result="map[status:200 OK]"
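If you want to generate additional log entries without going through the chatbot UI, you can also send a request straight to the AI Gateway route. The sketch below is illustrative only: the gateway address is a placeholder you must replace with your lab's AI Gateway host and port, and the payload follows an OpenAI-style chat format to match the ``schema: openai`` setting on the route.

.. code:: python

    # Illustrative sketch only: send one chat request directly to the AI Gateway
    # route. The gateway address is a placeholder; substitute the host and port
    # your lab's AI Gateway is listening on. The payload uses an OpenAI-style
    # chat body, matching the route's declared schema (schema: openai).
    import json
    import urllib.request

    AIGW_URL = "http://<aigw-address>/api/chat"  # placeholder: set your gateway address

    payload = {
        "model": "llama3.1:8b",  # assumed model tag; adjust to your deployment
        "messages": [
            {"role": "user", "content": "Give me a one-sentence overview of Arcadia Crypto."}
        ],
    }

    request = urllib.request.Request(
        AIGW_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

    with urllib.request.urlopen(request) as response:
        # Print the raw JSON response returned by the gateway
        print(json.loads(response.read()))

Each request sent this way should produce new ``service selected`` / ``service response`` entries in the ``docker logs`` output shown above.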