Build AI Agents Faster: 24 Examples with IBM's CUGA

Ever felt like building an AI agent involves more “plumbing” than actual intelligence? You’re not alone. Many agentic applications can demand a full week of setup (choosing a framework, wiring up model clients, writing tool adapters, and figuring out state streaming) before the agent does anything useful. The interesting part, the agent’s actual purpose, often comes last.

But what if you could flip that script? IBM Research’s open-source Configurable Generalist Agent (CUGA), available with a simple pip install cuga, aims to do just that. CUGA handles the mundane but essential orchestration: the planning, the execution loop, the tool calls, and the state management. This leaves you free to focus on what truly matters: defining your agent’s tools and its core prompt.

Simplifying Agent Development with CUGA

CUGA acts as a robust agent harness for enterprise applications, taking care of the intricate details so you don’t have to. It inverts the traditional development process, allowing you to jump straight into defining your agent’s capabilities. To demonstrate this shift, the team has developed cuga-apps: two dozen lightweight, single-file applications.

From a movie recommender to an IBM Cloud architecture advisor, each app is a working example built around a single CugaAgent instance. These examples serve as a practical guide, showcasing how to build powerful agents with minimal boilerplate. You can explore the live gallery of these apps or read through one end-to-end to see the efficiency firsthand.

CUGA’s strength lies in its ability to abstract away complexity. It plans actions, executes them using a mix of tool calls and generated code (CodeAct), and maintains state throughout long, multi-step tasks. A built-in reflection step can detect a bad call and re-plan, so the agent does not blindly push forward. According to the team, this machinery is why CUGA has ranked at the top of agent benchmarks such as AppWorld and WebArena.
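To make the plan, execute, and reflect cycle concrete, here is a minimal standalone sketch. It does not use CUGA and is not how CUGA is implemented; the function run_with_reflection, the replan callback, and the two toy tools are all hypothetical names chosen for illustration.

```python
def run_with_reflection(plan, tools, replan, max_replans=2):
    """Run (tool_name, arg) steps in order; on a failed call, ask `replan`
    for a revised list of remaining steps instead of pushing forward."""
    results = []
    replans = 0
    queue = list(plan)
    while queue:
        tool_name, arg = queue.pop(0)
        try:
            results.append((tool_name, tools[tool_name](arg)))
        except Exception as exc:
            # Reflection step: stop, look at the failure, and revise the plan.
            if replans >= max_replans:
                raise RuntimeError(f"gave up after {replans} re-plans") from exc
            replans += 1
            queue = list(replan((tool_name, arg), exc, queue))
    return {"results": results, "replans": replans}


# Toy tools (hypothetical): one always fails, one always succeeds.
def flaky_lookup(arg):
    raise ValueError("upstream unavailable")


tools = {
    "flaky_lookup": flaky_lookup,
    "cached_lookup": lambda arg: f"cached:{arg}",
}


def replan(failed_step, error, remaining):
    # Swap the failed step for a fallback tool and keep the rest of the plan.
    _, arg = failed_step
    return [("cached_lookup", arg)] + remaining


state = run_with_reflection(
    [("flaky_lookup", "db2"), ("cached_lookup", "cos")], tools, replan
)
print(state)
```

In this toy run the first step fails, the replan callback substitutes the fallback tool, and the loop finishes with one recorded re-plan. A real harness would let the model produce the revised plan; here a fixed callback stands in for that.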

Beyond its core orchestration, CUGA offers flexible configuration options, allowing you to adjust the cost/latency tradeoff with “Fast,” “Balanced,” and “Accurate” reasoning modes. You can also specify your preferred code execution sandbox (local, Docker/Podman, or E2B cloud). The same agent definition can therefore behave differently depending on your needs, and a smaller, open-weight model can often be sufficient when supported by a robust harness.

Building Your First CUGA App: A Walkthrough

Let’s look at the IBM Cloud advisor app as an example. This agent recommends actual IBM Cloud services for an architecture. The entire application fits into one main.py file, encompassing the agent factory, tools, prompt, and a small UI. The CugaAgent constructor requires just four key arguments:

model: The Large Language Model (LLM) provider, which can be dynamically switched between OpenAI, Anthropic, watsonx, LiteLLM, or Ollama via an environment variable.
tools: A list of functions the agent can access.
special_instructions: The core prompt guiding the agent’s behavior.
cuga_folder: Where the app stores its state and policies.
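As a rough sketch of how those four arguments might be assembled, the snippet below builds them as a plain dictionary. It deliberately does not import cuga, since the exact import path and the type expected for model are not given here. The environment variable name LLM_PROVIDER, the helper names, and the use of a provider string for model are assumptions made for illustration only.

```python
import os

# Providers named in the article; the lowercase spellings are an assumption.
SUPPORTED_PROVIDERS = {"openai", "anthropic", "watsonx", "litellm", "ollama"}


def resolve_provider(env=None, default="openai"):
    """Pick the LLM provider from an environment variable.

    LLM_PROVIDER is a hypothetical variable name, not a documented CUGA setting.
    """
    env = os.environ if env is None else env
    name = env.get("LLM_PROVIDER", default).strip().lower()
    if name not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {name!r}")
    return name


def build_agent_kwargs(tools, prompt, env=None, cuga_folder=".cuga"):
    """Collect the four constructor arguments described above."""
    return {
        "model": resolve_provider(env),   # assumption: a provider name stands in for the model
        "tools": list(tools),             # functions the agent may call
        "special_instructions": prompt,   # the core prompt
        "cuga_folder": cuga_folder,       # where state and policies live
    }


kwargs = build_agent_kwargs(
    tools=[],
    prompt="Recommend IBM Cloud services for the described architecture.",
    env={"LLM_PROVIDER": "Ollama"},
)
print(kwargs["model"])
# The dictionary would then be passed to the constructor, e.g. CugaAgent(**kwargs),
# assuming the keyword names match the ones listed in the article.
```

Keeping provider selection in one small function means switching from a hosted model to a local one is a matter of changing the environment, not the agent definition.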

The tools themselves can be a mix of local Python functions and hosted Model Context Protocol (MCP) tools. For instance, the cloud advisor uses a search_ibm_catalog function defined inline, alongside generic capabilities like web search pulled from shared MCP servers. This “borrow the rest, write your one” philosophy drastically reduces development time.

A crucial convention across CUGA apps is the standardized tool return envelope: {"ok": true, "data": {...}} for success and {"ok": false, "code": "...", "error": "..."} for failure. This seemingly simple pattern is load-bearing, enabling CUGA’s planner to gracefully handle declared failures and recover, rather than derailing on unexpected exceptions.
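The envelope is easy to apply in a local tool. The sketch below shows one way a function like search_ibm_catalog could return it; the ok and fail helpers, the error codes, and the tiny in-memory catalog are placeholders invented for this example, not the app's actual implementation or data.

```python
def ok(data):
    """Success envelope: {"ok": true, "data": {...}}."""
    return {"ok": True, "data": data}


def fail(code, error):
    """Failure envelope: {"ok": false, "code": "...", "error": "..."}."""
    return {"ok": False, "code": code, "error": error}


# Placeholder catalog for illustration; a real tool would query a live source.
_CATALOG = {
    "object storage": ["IBM Cloud Object Storage"],
    "kubernetes": ["IBM Cloud Kubernetes Service", "Red Hat OpenShift on IBM Cloud"],
}


def _lookup(term):
    return _CATALOG.get(term, [])


def search_ibm_catalog(query):
    """Return matching services, always wrapped in the standard envelope."""
    if not isinstance(query, str) or not query.strip():
        return fail("invalid_query", "query must be a non-empty string")
    try:
        matches = _lookup(query.strip().lower())
    except Exception as exc:
        # Convert unexpected errors into a declared failure the planner can read.
        return fail("catalog_error", str(exc))
    if not matches:
        return fail("not_found", f"no services match {query!r}")
    return ok({"query": query, "services": matches})


print(search_ibm_catalog("Kubernetes"))
print(search_ibm_catalog("quantum toaster"))
```

Because every outcome, including an unexpected exception, comes back as a dictionary with an ok flag, the caller can branch on a single field instead of wrapping each tool call in its own error handling.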

Ensuring Responsible AI with Declarative Guardrails

While a demo agent recommending cloud services is low-stakes, deploying an agent that writes files, runs shell commands, or interacts with production systems demands robust guardrails. CUGA addresses this critical concern with a built-in policy system that integrates directly into the agent’s runtime, not as an afterthought.

You can attach various policy types to your CugaAgent instance. For example, an Intent Guard can block destructive commands such as a forced push (git push --force) before the agent even picks a tool. Other policy types include:

Tool Approval: Inspects which tools the agent’s generated code uses.
Output Formatter: Fires only after the final message is generated, ensuring adherence to specific formats or content restrictions.
Input Cleaner: Sanitizes user input.
Factuality Check: Verifies the accuracy of generated statements.
Tool Guide: Directs the agent to use specific tools under certain conditions.

These policies can trigger based on semantic similarity, agent state, or specific tool invocations, not just exact keyword matches. The policies themselves are stored in the .cuga folder, versioned alongside your code, ensuring consistency and governance. For a hands-on example of governance in action, explore the Ouroboros app, a seven-agent lead-generation system that demonstrates the power of CUGA’s policy framework.
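To illustrate the idea of an intent guard that runs before any tool is chosen, here is a deliberately simplified standalone sketch. It is not CUGA's policy API: the GuardDecision class, the intent_guard function, and the pattern list are hypothetical, and it matches with regular expressions only, whereas the policies described above can also trigger on semantic similarity or agent state.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardDecision:
    allowed: bool
    reason: str = ""


# Hypothetical deny-list: (compiled pattern, human-readable reason).
DESTRUCTIVE_PATTERNS = [
    (re.compile(r"\bgit\s+push\b.*--force"), "forced git push"),
    (re.compile(r"\brm\s+-rf\b"), "recursive delete"),
]


def intent_guard(user_message):
    """Check a request before the agent plans anything or picks a tool."""
    text = user_message.lower()
    for pattern, reason in DESTRUCTIVE_PATTERNS:
        if pattern.search(text):
            return GuardDecision(allowed=False, reason=f"blocked: {reason}")
    return GuardDecision(allowed=True)


print(intent_guard("please run git push --force origin main"))
print(intent_guard("show me git status"))
```

The first request is rejected with a reason and the second passes through. Note that this simple pattern also blocks git push --force-with-lease; a production policy would need to decide whether that is intended.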

Source: Hugging Face Blog

Kristine Vior

With a deep passion for the intersection of technology and digital media, Kristine leads the editorial vision of HubNextera News. Her expertise lies in deciphering technical roadmaps and translating them into comprehensive news reports for a global audience. Every article is reviewed by Kristine to ensure it meets our standards for original perspective and technical depth.
