How-To: Define and use agents with Ragbits#

Ragbits Agent combines the reasoning power of LLMs with the ability to execute custom code through tools. This makes it possible to handle complex tasks by giving the model access to your own Python functions.

When tools are enabled, the LLM reviews the system prompt and incoming messages to decide whether a tool should be called. Rather than only generating a text response, the model can invoke a tool, answer directly, or combine the two.

Before using tools, you can check whether your selected model supports function calling with:

import litellm

litellm.supports_function_calling(model="your-model-name")  # returns True or False

If function calling is supported and tools are enabled, the agent interprets the user input, decides whether a tool is needed, executes it if necessary, and returns a final response enriched with tool results.

This response is encapsulated in an AgentResult, which includes the model's output, additional metadata, conversation history, and any tool calls performed.

How to build an agent with Ragbits#

This guide walks you through building a simple agent that uses a get_weather tool to return weather data based on a location.

Define a tool function#

First, define the function you want your agent to call. It should take regular Python arguments and return a JSON-serializable result.

import json

def get_weather(location: str) -> str:
    """
    Returns the current weather for a given location.

    Args:
        location: The location to get the weather for.

    Returns:
        The current weather for the given location.
    """
    if "tokyo" in location.lower():
        return json.dumps({"location": "Tokyo", "temperature": "10", "unit": "celsius"})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "72", "unit": "fahrenheit"})
    elif "paris" in location.lower():
        return json.dumps({"location": "Paris", "temperature": "22", "unit": "celsius"})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})
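
Because the tool is a plain Python function, you can sanity-check it directly before handing it to an agent:

print(get_weather("Paris, France"))
# {"location": "Paris", "temperature": "22", "unit": "celsius"}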

Define a prompt#

Use a structured prompt to instruct the LLM. For details on writing prompts with Ragbits, see the Guide to Prompting.

from pydantic import BaseModel
from ragbits.core.prompt import Prompt

class WeatherPromptInput(BaseModel):
    """
    Input format for the WeatherPrompt.
    """

    location: str


class WeatherPrompt(Prompt[WeatherPromptInput]):
    """
    Prompt that returns weather for a given location.
    """

    system_prompt = """
    You are a helpful assistant that responds to user questions about weather.
    """

    user_prompt = """
    Tell me the temperature in {{ location }}.
    """

Run the agent#

Create the agent, attach the prompt and tool, and run it:

import asyncio
from ragbits.agents import Agent
from ragbits.core.llms import LiteLLM

async def main() -> None:
    """
    Run the example.
    """
    llm = LiteLLM(model_name="gpt-4o-2024-08-06", use_structured_output=True)
    agent = Agent(llm=llm, prompt=WeatherPrompt, tools=[get_weather])
    response = await agent.run(WeatherPromptInput(location="Paris"))
    print(response)


if __name__ == "__main__":
    asyncio.run(main())

Printing the response shows an AgentResult with the model's output, metadata, conversation history, and any tool calls performed.
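
For example, inside main() you could inspect the result's fields individually. The content and history attributes appear elsewhere in this guide; metadata and tool_calls below are assumed attribute names matching the description, so verify them against the AgentResult API reference:

response = await agent.run(WeatherPromptInput(location="Paris"))

print(response.content)  # the model's final text answer
print(response.history)  # conversation history (see below)
# NOTE: `metadata` and `tool_calls` are assumed attribute names
# based on the prose above; check the AgentResult API reference.
print(response.metadata)
print(response.tool_calls)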

You can find the complete code example in the Ragbits repository.

Conversation history#

Agents can retain conversation context across multiple interactions by enabling the keep_history flag when initializing the agent. This is useful when you want the agent to understand follow-up questions without needing the user to repeat earlier details.

With keep_history=True set when constructing the agent, the full exchange, including messages, tool calls, and results, is stored and can be accessed via the AgentResult.history property.

Example of context preservation#

The following example demonstrates how an agent with history enabled maintains context between interactions:

async def main() -> None:
    """Run the weather agent with conversation history."""
    llm = LiteLLM(model_name="gpt-4o-2024-08-06", use_structured_output=True)
    agent = Agent(llm=llm, prompt=WeatherPrompt, tools=[get_weather], keep_history=True)

    await agent.run(WeatherPromptInput(location="Paris"))

    # Follow-up question about Tokyo - the agent retains weather context
    response = await agent.run("What about Tokyo?")
    print(response)

In this scenario, the agent recognizes that the follow-up question "What about Tokyo?" refers to weather information due to the preserved conversation history. The expected output would be an AgentResult containing the response:

AgentResult(content='The current temperature in Tokyo is 10°C.', ...)
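
You can also walk the preserved exchange through the history property mentioned above. A minimal sketch, assuming history is an iterable of message and tool-call records:

# Print each stored entry from the preserved conversation;
# the exact record format is assumed here, not documented in this guide.
for message in response.history:
    print(message)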

Streaming agent responses#

For use cases where you want to process partial outputs from the LLM as they arrive (e.g., in chat UIs), the Agent class supports streaming through the run_streaming() method.

This method returns an AgentResultStreaming object — an async iterator that yields parts of the LLM response and tool-related events in real time.
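
A minimal sketch of consuming the stream is shown below; the exact shape of the yielded items is not covered in this guide, so the loop simply prints each one as it arrives:

import asyncio

async def stream_weather() -> None:
    """Stream the agent's response for a single query."""
    llm = LiteLLM(model_name="gpt-4o-2024-08-06")
    agent = Agent(llm=llm, prompt=WeatherPrompt, tools=[get_weather])

    # run_streaming() returns an AgentResultStreaming async iterator
    # that yields response fragments and tool-related events in real time
    async for chunk in agent.run_streaming(WeatherPromptInput(location="Tokyo")):
        print(chunk)

if __name__ == "__main__":
    asyncio.run(stream_weather())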