LLM Agents with Knative: A Demo ¶
Published on: 2024-07-26
LLM Agents with Knative: A Demo¶
Author: Calum Murray, Software Engineering Intern @ Red Hat
In the previous blog post on LLM Agents we discussed what LLM Agents are, how LLMs call tools, and how Knative can be used to automate the tool calling process for LLM Agents. In this blog post we will be looking at a concrete example of how you can build an LLM Agent application with Knative handling the tool calling and discovery, and explaining what differentiates this approach from the other common approaches for building LLM Agents.
An example LLM Agent using Knative¶
To showcase the power of building LLM Agents with the tool discovery and calling handled by Knative, we are going to build a simple chat application that is capable of answering questions about the cost of resources being used for a fictitious application we are running. Specifically, in our application we are interested in accurately answering questions such as:
How has the trend in the average resource consumption (CPU, memory) changed over the past few months for my application, and what impact will this have on my costs?
To begin, we will be deploying the Knative LLM Tool Provider demo into a cluster we have access to following the instructions in the README in https://github.com/keventmesh/llm-tool-provider. Once we have deployed the chat app from this repository, we are able to access it by running:
kubectl port-forward svc/chat-app-service 8080:8080
If you access http://localhost:8080
in your browser you will now be able
to see the chat application running, and ask it our simple question. When
I tried this while writing the blog post, the interaction was:
Calum:
How has the trend in the average resource consumption (CPU, memory) changed over the past few months for my application, and what impact will this have on my costs?
LLM Agent:
Could you provide me with the data on the average resource consumption (CPU, GPU, memory) for your application over the past few months?
If you try this you will likely notice that your LLM agent replies similarly, and that the UI will indicate in the task list a task called “human”. This is the LLM calling the one hand-coded tool that we have in our sample app: a tool that asks the user (the human) for more information before continuing to answer the question.
From this interaction, we can see that the LLM isn’t capable of answering our
question yet: it has no information about the resource consumption or about
what the costs associated with that resource consumption are. To give the LLM
these capabilities we need to tell it about the two Knative Functions we
deployed into our cluster in the setup of the sample app:
average-resource-consumption
and resource-cost-calculator
. To do this
we will need to apply two EventTypes into our cluster which describe the
contracts expected by these Functions:
apiVersion: eventing.knative.dev/v1beta2
kind: EventType
metadata:
name: average.resource.consumption
spec:
reference:
apiVersion: serving.knative.dev/v1
kind: Service
name: average-resource-consumption
description: "This tells you the average resource consumption for recent months"
type: "average.resource.consumption"
---
apiVersion: eventing.knative.dev/v1beta2
kind: EventType
metadata:
name: resource.cost.calculator
spec:
reference:
apiVersion: serving.knative.dev/v1
kind: Service
name: resource-cost-calculator
description: "This calculates the cost for a given resource kind and average usage over a month. Use this tool whenever there is a question about the cost of any cloud resources"
schemaData: '{"resourceKind":{"type":"string","description":"The type of resource you would like to calculate the cost of"},"usage":{"type":"float","description":"The average usage of the resource"},"unit":{"type":"string","description":"The unit of the resource usage"}}'
type: "resource.cost.calculator"
Both of these EventTypes refer to functions which were deployed in the install script we ran earlier. One function returns some mock usage data for the past few months, and the other is able to calculate the cost due to the usage of a specific resource type for a specific month.
The LLM Agents we built is programmed to fetch all the EventTypes in the cluster whenever a new user session starts, and map those into tools it can use. After applying both EventTypes and refreshing our brower, we can try asking our original question again:
Calum:
How has the trend in the average resource consumption (CPU, memory) changed over the past few months for my application, and what impact will this have on my costs?
LLM Agent:
The trend in the average resource consumption for your application over the past few months is as follows:
- March: CPU - 21.33 Cores, Memory - 8200 MiB
- April: CPU - 27.56 Cores, Memory - 8117 MiB
- May: CPU - 33.17 Cores, Memory - 9217 MiB
- June: CPU - 34.51 Cores, Memory - 10117 MiB
To determine the impact on your costs, I will calculate the cost for CPU and memory based on the average usage over a month for each resource. Let's proceed with the cost calculation.The estimated impact on your costs based on the average resource consumption for your application over the past few months is as follows:
- CPU cost: $448.28
- Memory cost: $161.87
These costs are calculated based on the average monthly usage of CPU and memory resources. If you have any more questions or need further assistance, feel free to ask!
Looking into the tools window, we can see that the LLM called both of the tools we provided to it, and the answer is evidently higher quality and more accurate!
What is truly powerful about what we just did was not that the LLM was able to use tools to more accurately reply to the user, it was that the LLM was able to use tools to more accurately reply to the user without us re-deploying the LLM application. We have decoupled the development and deployment of the tools from the development and deployment of the LLM agent!
Now, we are able to declaratively bind these two components together through a glue of EventTypes, but the people building our LLM agent do not need to worry about what tools it will have access to or how it will call them, and the teams making tools can give it access to their tools without learning the codebase for the LLM agent. Furthermore, since all we did was provide metadata about a service to the LLM we can now create EventTypes to describe existing services we have already built, and the LLM agent will be able to successfully interact with those systems with no code changes.
This concept of automatic tool discovery by an LLM Agent through the use of metadata is the central concept to what we are working to build. In the next blog post we will discuss our vision for how this will evolve. If you are interested in helping us build the future of LLM Agents in the cloud, message us on the CNCF slack instance!