.. image:: ../images/darkbanner.png

Module 1 - F5 AI Gateway configuration walkthrough
==================================================

A deeper look into configuring the AI Gateway components
--------------------------------------------------------

In the introduction, we reviewed F5 AI Gateway's two main components,
**Core** and **Processors**. There are several key configuration
components to understand as well:

+------------------+------------------------------------------------------+
| Component        | Description                                          |
+==================+======================================================+
| ``routes``       | the applications exposed by AIGW and made            |
|                  | accessible to Gen AI apps and users                  |
+------------------+------------------------------------------------------+
| ``services``     | the LLM models AIGW will route requests to           |
+------------------+------------------------------------------------------+
| ``policies``     | a set of reusable rules that are evaluated at        |
|                  | runtime on a per-request basis, picking the best     |
|                  | processing profile for the request                   |
+------------------+------------------------------------------------------+
| ``profiles``     | apply processors to incoming requests and responses  |
+------------------+------------------------------------------------------+
| ``processors``   | the AI middleware that provides enhancements,        |
|                  | protections, and performance improvements for AI     |
|                  | applications                                         |
+------------------+------------------------------------------------------+

Component diagram
-----------------

.. mermaid::

   flowchart LR
       A[Upstream Proxy] --> AIGW[AIGW Core]
       subgraph AIGW[AIGW Core]
           direction LR
           Routes@{shape: procs} --> Policies@{shape: procs} --> Profiles@{shape: procs} --> Services@{shape: procs}
       end
       subgraph AIGWProcessors[AIGW Processors]
           direction LR
           Processors@{shape: procs}
       end
       Profiles --> AIGWProcessors
       Services --> OpenAI
       Services --> Anthropic
       Services --> Ollama
       Services --> OtherLLM[Other external LLMs]

The **AIGW Core** component acts as an HTTP router for incoming AI
requests. The core is configured by defining ``routes`` that accept
incoming requests forwarded by the upstream proxy. For each request, the
core evaluates the associated ``policies``, whose ``profiles`` define
whether to send the request to a ``processor`` and then on to an LLM
using one of the ``services`` that AIGW has been configured with.

| The **AIGW processors** are the AI middleware that provide
  protections, enhancements, and improvements for incoming AI-specific
  requests. There are several different F5 processors available,
  including processors for the `OWASP Top 10 for LLM and GenAI Apps
  <https://genai.owasp.org/llm-top-10/>`__.
| Once the ``processor`` has inspected the prompt and taken any
  necessary actions, it sends the request (or response) back to the
  **core** for routing.

**NOTE**: AIGW will support 9 of the 10 OWASP Top 10 protections.

Breaking down a sample AIGW configuration file
----------------------------------------------

AI Gateway's configuration is written in YAML and consists of a number
of sections, which are described below.

General
~~~~~~~

The general settings section allows you to configure the port on which
AIGW listens for incoming requests, as well as the ``adminServer`` port.

.. code:: yaml

   mode: standalone
   server:
     address: :4141
   adminServer:
     address: :8080

The additional configurable components for this section:

=================== =====================================================================================================
**Setting**         **Description**
=================== =====================================================================================================
**mode**            Options include ``standalone`` or ``upstream``. In ``upstream`` mode, AI Gateway expects a
                    proxy to forward requests to AI Gateway.
**server:**         This section defines the settings for the AI Gateway core server.
**- address**       The address and port where AI Gateway core listens for incoming requests.
**- tls**           Enable TLS authentication and configure the TLS cert and key paths.
**- mtls**          Enable mTLS authentication and provide the required ``clientCertPath``.
**adminServer:**    This section defines the AI Gateway admin server.
**- address**       The address and port where the admin server listens for incoming requests.
=================== =====================================================================================================

Here is an example of setting up mTLS with AIGW core:

.. code:: yaml

   mode: standalone
   server:
     address: :8443
     tls:
       enabled: true
       serverCertPath: .certs/server.crt
       serverKeyPath: .certs/server.key
     mtls:
       enabled: true
       clientCertPath: .certs/ca.crt
   adminServer:
     address: localhost:8080

Routes
~~~~~~

``Routes`` define the endpoints that F5 AI Gateway listens on and the
policy that applies to each route. ``routes`` have the following
settings:

.. code:: yaml

   routes:
     - path: /insecure
       policy: insecure
       schema: openai

The ``routes`` components that can be configured:

+-------------------+-----------------------------------------------------------------------------------------------------+
| **Setting**       | **Description**                                                                                     |
+===================+=====================================================================================================+
| **path**          | The URI of the endpoint where a service is offered. The ``path`` is user-defined and must be unique |
|                   | from other routes.                                                                                  |
+-------------------+-----------------------------------------------------------------------------------------------------+
| **policy**        | The policy that applies to the requests for this route.                                             |
+-------------------+-----------------------------------------------------------------------------------------------------+
| **schema**        | The input and output schema for the route. If the schema is not specified, raw text is expected.    |
|                   | Options are: raw, openai, anthropic, custom HTTP.                                                   |
+-------------------+-----------------------------------------------------------------------------------------------------+
| **timeoutSeconds**| The number of seconds before requests to this route will time out.                                  |
+-------------------+-----------------------------------------------------------------------------------------------------+

Policies
~~~~~~~~

``Policies`` are a set of reusable rules that pick the best processing
profile for a given request. They are evaluated at runtime and
dynamically apply a processing profile to each request received by F5
AIGW.

.. code:: yaml

   policies:
     - name: insecure
       profiles:
         - name: insecure
         - name: secure
     - name: secure
     - name: language
       profiles:
         - name: language

Profiles
~~~~~~~~

The ``profiles`` configuration component defines a set of ``processors``
and ``services`` that apply to the **input** and the **output** of the
AI model, based on a set of rules using the ``inputStages`` and
``responseStages`` definitions.

.. code:: yaml

   profiles:
     - name: phi3
       limits: []
       services:
         - name: ollama/phi
     - name: secure
       limits: []
       inputStages:
         - name: protect
           steps:
             - name: prompt-injection
       services:
         - name: ollama/llama3
     - name: language
       limits: []
       inputStages:
         - name: analyze
           steps:
             - name: language-id
       responseStages:
         - name: watermark
           steps:
             - name: watermark

Processors
~~~~~~~~~~

``Processors`` lists the processors that have been enabled for use by
AIGW. They are applied to incoming requests and responses using
``profiles``. Different processors serve different use cases: for
example, one processor can look for **prompt injection** attacks while
another inspects requests for **PII** data. You can also apply multiple
processors to any given request or response.

.. code:: yaml

   processors:
     - name: language-id
       type: external
       config:
         endpoint: "http://aigw-processors-f5:8000"
         namespace: "f5"
         version: 1
     - name: system-prompt
       type: external
       config:
         endpoint: "http://aigw-processors-f5:8000"
         namespace: "f5"
         version: 1
     - name: watermark
       type: external
       config:
         endpoint: "http://aigw-processors-f5:8000"
         namespace: "f5"
         version: 1
     - name: pii-redactor
       type: external
       config:
         endpoint: "http://aigw-processors-f5:8000"
         namespace: "f5"
         version: 1
     - name: prompt-injection
       type: external
       config:
         endpoint: "http://aigw-processors-f5:8000"
         namespace: "f5"
         version: 1
       params:
         allow_rejection: true

Processors can run in parallel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, when you apply multiple processors to a request, they run
sequentially, one after another.
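For example, a profile that applies two of the F5 processors shown above
in sequence might look like the following sketch (the profile name
``sequential-example`` is hypothetical; the order of the ``steps`` list
is the order of execution):

.. code:: yaml

   profiles:
     - name: sequential-example   # hypothetical profile name
       inputStages:
         - name: protect
           steps:
             - name: prompt-injection   # runs first
             - name: pii-redactor       # runs second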
Alternatively, you can configure ``processors`` to run in parallel using
the ``concurrency`` option in the ``profiles`` section of ``aigw.yaml``.

**NOTE:** When running ``processors`` with ``concurrency`` enabled, the
processors cannot modify the content of the input or output. They can
only add metadata and tags to the content.

**Example**

.. code:: yaml

   profiles:
     - name: parallel-example
       concurrency: parallel
       inputStages:
         - name: protect
           steps:
             - name: language-id
             - name: system-prompt

Services
~~~~~~~~

``Services`` are the upstream LLM services that AIGW has been configured
to route traffic to.

.. code:: yaml

   services:
     - name: ollama/phi
       type: phi3
       executor: ollama
       config:
         endpoint: "http://llmmodel01:11434/api/generate"
     - name: ollama/llama3
       type: llama3
       executor: ollama
       config:
         endpoint: "http://llmmodel01:11434/api/generate"
     - name: ollama/llama32
       type: llama3
       executor: ollama
       config:
         endpoint: "http://llmmodel01:11434/api/generate"

| The different components of ``services`` in F5 AIGW configuration:

+------------------------+-------------------------------------------------------------------------------------------------+
| **Setting**            | **Description**                                                                                 |
+========================+=================================================================================================+
| **name**               | The name of the service. User-defined and must be unique.                                       |
+------------------------+-------------------------------------------------------------------------------------------------+
| **type**               | Indicates the type of model that the service provides. For example, ``openAI/azure`` or         |
|                        | ``ollama/llama3``.                                                                              |
+------------------------+-------------------------------------------------------------------------------------------------+
| **executor**           | Indicates which executor to use to process the request. Options are: ``openai``, ``anthropic``, |
|                        | or ``ollama``.                                                                                  |
+------------------------+-------------------------------------------------------------------------------------------------+
| **config:**            | The configuration of the executor, allowing additional key-value pairs to be passed to the      |
|                        | executor.                                                                                       |
+------------------------+-------------------------------------------------------------------------------------------------+
| **- endpoint**         | The endpoint URL of the service.                                                                |
+------------------------+-------------------------------------------------------------------------------------------------+
| **- apiVersion**       | For azure type services, the version of the ``OpenAI API`` to use. Obtained from Azure AI       |
|                        | Studio.                                                                                         |
+------------------------+-------------------------------------------------------------------------------------------------+
| **- anthropicVersion** | For anthropic type services, the version of the ``Anthropic API`` to use.                       |
+------------------------+-------------------------------------------------------------------------------------------------+
| **- secrets**          | Defines the source and names of the secrets needed by the service (API Keys).                   |
+------------------------+-------------------------------------------------------------------------------------------------+

.. Using external LLM services
.. ~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. F5 AIGW also supports other cloud LLM services, including Anthropic,
.. OpenAI (public and azure). You will need to provide your own API key in
.. order to use the cloud service with AIGW.

.. Refer to `here`_ for examples of how to set this up.

.. .. _here: ../externalllm.html

.. image:: ../images/Designer.jpeg
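Putting it all together
~~~~~~~~~~~~~~~~~~~~~~~

The fragments above can be combined into a single minimal ``aigw.yaml``.
The sketch below is illustrative only, reusing the names and endpoints
from the examples in this module: it accepts OpenAI-schema requests on
``/insecure``, runs them through the ``prompt-injection`` processor, and
forwards them to the ``ollama/llama3`` service.

.. code:: yaml

   mode: standalone
   server:
     address: :4141
   adminServer:
     address: :8080

   routes:
     - path: /insecure
       policy: insecure
       schema: openai

   policies:
     - name: insecure
       profiles:
         - name: secure

   profiles:
     - name: secure
       limits: []
       inputStages:
         - name: protect
           steps:
             - name: prompt-injection
       services:
         - name: ollama/llama3

   processors:
     - name: prompt-injection
       type: external
       config:
         endpoint: "http://aigw-processors-f5:8000"
         namespace: "f5"
         version: 1

   services:
     - name: ollama/llama3
       type: llama3
       executor: ollama
       config:
         endpoint: "http://llmmodel01:11434/api/generate"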