Cicaddy: How Red Hat Turned CI/CD Pipelines Into AI Agent Runtimes
Every organization building AI agents faces the same infrastructure question: where do these things actually run? The default answer — spin up dedicated agent servers, provision GPU nodes, manage a separate orchestration layer — creates a parallel infrastructure that doubles your operational surface area overnight.
Red Hat's answer is different, and it might be the most pragmatic take on production AI agents in 2026: run them inside the CI/CD pipeline you already have.
Cicaddy is a Python-based, platform-agnostic framework that turns any CI/CD job into an AI agent runtime [1]. Not a sidecar. Not a separate service. The agent executes directly inside your existing pipeline step, using the same container, the same permissions model, and the same scheduling triggers your team already understands. When the job finishes, the agent is gone. No lingering processes. No infrastructure to manage between runs.
The spine of this idea is simple: your CI/CD system is already a secure, scheduled, event-driven execution environment with access to your code. Why build another one?
The Problem With Dedicated Agent Infrastructure
Platform engineering teams that have piloted AI agents in production know the pattern. You start with a proof-of-concept: an agent that reviews merge requests or runs daily health checks. It works on a developer's laptop. Then comes the question of deployment.
Suddenly you need a long-running service, a queue for incoming events, webhook routing, secrets management separate from your existing vault, container orchestration for the agent itself, monitoring, alerting, and a security review that asks uncomfortable questions about what this always-on process can access.
Most pilot projects die here. Not because the agent did not work, but because the operational cost of running it in production exceeded the value it delivered. The agent was useful; the infrastructure around it was not worth building.
Cicaddy sidesteps this entirely by reframing the question. Instead of asking "where do we deploy agents?", it asks "where do we already run automated tasks that respond to code events?" The answer, for virtually every engineering organization, is the CI/CD system.
Three Agent Types, Three Trigger Patterns
Cicaddy defines three agent types, each mapped to a CI/CD event pattern that platform teams already use [1]:
MR Agent triggers on merge request events — opened, updated, commented. This is your code review agent. It reads the diff, analyzes the changes against project conventions, and posts structured feedback. It runs in the same pipeline job that would normally run linters or tests.
Branch Agent triggers on branch pushes. Push to main and the agent runs analysis, generates changelogs, or updates downstream configurations. Push to a feature branch and it validates architecture decisions against documented standards.
Task Agent triggers on schedules or manual dispatch. Daily service health monitoring. Weekly DORA metrics computation. On-demand security scans. These are the agents that replace the cron jobs your team is already maintaining — except now they can reason about what they find.
The mapping is deliberate. Every CI/CD system already supports these three trigger patterns. Cicaddy does not invent new event primitives; it reuses the ones your GitLab CI, GitHub Actions, or Tekton pipeline already understands.
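In GitLab CI terms, the three trigger patterns map onto `rules` conditions the pipeline already supports. This is a hypothetical sketch, not Cicaddy's actual template — the job names and the `cicaddy run` command are invented for illustration:

```yaml
# Hypothetical GitLab CI jobs showing the three trigger patterns.
# Job names and the `cicaddy run` invocation are illustrative only.
mr-review-agent:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - cicaddy run --task code-review   # MR Agent

branch-agent:
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  script:
    - cicaddy run --task changelog     # Branch Agent

task-agent:
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
  script:
    - cicaddy run --task health-check  # Task Agent
```

Nothing here is new to a platform team; the same `rules` blocks already gate linters, tests, and scheduled jobs.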
One-Shot Execution, Multi-Turn Reasoning
Here is where Cicaddy diverges from most agent frameworks. The agent does not persist between runs. Each pipeline invocation is a one-shot execution. But within that single execution, the agent performs multi-turn reasoning using the ReAct pattern — typically 10 to 30+ inference turns per run [1].
This matters because it solves the statefulness problem that plagues production agents. A long-running agent accumulates state, drifts from its intended behavior, and becomes harder to debug over time. A one-shot agent starts clean every time. Its behavior is reproducible. Its logs are bounded. Its failure modes are the same failure modes your team already knows how to debug in CI/CD.
The execution follows a three-phase pattern: pre-processing, AI reasoning, and post-processing [1]. Pre-processing gathers context — diffs, file contents, configuration values. The AI phase performs the multi-turn reasoning loop. Post-processing formats the output, posts comments, updates labels, or writes reports. Each phase is explicit and inspectable.
DSPy Task Definitions: YAML All the Way Down
Agent behavior is defined through DSPy task specifications written in YAML [1]. If your team already writes pipeline configurations in YAML, the authoring model will feel familiar.
A task definition specifies a persona, inputs, expected outputs, and tool constraints:
```yaml
task:
  persona: >
    You are a senior software engineer reviewing merge requests.
    You focus on correctness, security implications, and adherence
    to the project's coding standards documented in CONTRIBUTING.md.
  inputs:
    - merge_request_diff
    - project_conventions
    - recent_commit_messages
  outputs:
    - review_summary: structured feedback with severity levels
    - approval_recommendation: approve, request_changes, or comment
  tool_constraints:
    allowed:
      - read_file
      - glob_files
    denied:
      - write_file
      - execute_command
```
The tool_constraints block is worth pausing on. It enforces the principle of least privilege at the task level. A code review agent can read files but not write them. A metrics agent can read and write reports but cannot execute arbitrary commands. The constraints are declarative, auditable, and enforced by the framework — not by trusting the LLM to follow instructions.
The persona field shapes the agent's reasoning. The inputs and outputs fields define the contract. When the agent finishes, its output is validated against the declared output schema. If the agent hallucinates a field or omits a required one, the framework catches it before the result reaches your pipeline's post-processing step.
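That contract check reduces to a simple schema comparison. The sketch below is a hand-rolled illustration of the principle only — Cicaddy's actual mechanism is the DSPy task definition, not this function:

```python
# Hand-rolled illustration of output-contract validation.
# Cicaddy uses DSPy task definitions for this; the function below
# only demonstrates the principle of validating declared outputs.
DECLARED_OUTPUTS = {"review_summary", "approval_recommendation"}
ALLOWED_RECOMMENDATIONS = {"approve", "request_changes", "comment"}

def validate_output(agent_result: dict) -> list:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    for missing in DECLARED_OUTPUTS - agent_result.keys():
        errors.append(f"missing required output field: {missing}")
    for extra in agent_result.keys() - DECLARED_OUTPUTS:
        errors.append(f"hallucinated output field: {extra}")
    rec = agent_result.get("approval_recommendation")
    if rec is not None and rec not in ALLOWED_RECOMMENDATIONS:
        errors.append(f"invalid recommendation: {rec}")
    return errors

# A result that omits a required field and hallucinates an extra one:
bad = {"review_summary": "LGTM", "confidence": 0.9}
violations = validate_output(bad)
```

Both failure modes the article names — a hallucinated field and an omitted required one — surface as explicit errors before post-processing runs.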
MCP Integration: External Tools Without External Infrastructure
Cicaddy's power scales with the tools available to it. Built-in tools cover file operations — read_file and glob_files scoped to the CI container's workspace [1]. But real-world agents need more: documentation servers, code search APIs, vulnerability databases, project management systems.
This is where Model Context Protocol (MCP) integration comes in. Cicaddy supports four MCP transport protocols — HTTP, stdio, SSE, and WebSocket — configured through a YAML block [1]:
```yaml
MCP_SERVERS_CONFIG:
  context7:
    transport: http
    url: https://context7.example.com/mcp
    description: "Library documentation and API reference lookup"
  local_scanner:
    transport: stdio
    command: /usr/local/bin/security-scanner
    args: ["--mcp-mode"]
    description: "Static analysis and vulnerability detection"
```
Each MCP server becomes a tool the agent can invoke during its reasoning loop. The context7 server in the example above is not hypothetical — Red Hat uses Context7 MCP in production for merge request code review, giving the agent access to up-to-date library documentation while it evaluates code changes [1].
The stdio transport is particularly interesting for CI/CD environments. A locally installed binary — a linter, a scanner, a custom analysis tool — can expose itself as an MCP server without any network configuration. The agent communicates with it through standard I/O, the same way Unix pipes have connected tools for fifty years.
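To make the stdio pattern concrete, here is a hedged sketch of the underlying mechanism: JSON-RPC messages written to a subprocess's stdin and read back from its stdout. The inline echo "server" stands in for a real MCP binary; the request shape is simplified, not the full MCP handshake:

```python
# Sketch of stdio-style tool transport: JSON-RPC over a subprocess's
# stdin/stdout. The inline Python "server" stands in for a real MCP
# binary such as a locally installed scanner. Simplified, not the
# full MCP initialization sequence.
import json
import subprocess
import sys

SERVER = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"],
            "result": {"echoed": req["params"]}}
    print(json.dumps(resp), flush=True)
"""

proc = subprocess.Popen([sys.executable, "-c", SERVER],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        text=True)

# One request over stdin, one response over stdout -- no sockets, no TLS.
request = {"jsonrpc": "2.0", "id": 1,
           "method": "tools/call", "params": {"path": "src/main.py"}}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()
response = json.loads(proc.stdout.readline())
proc.terminate()
```

No ports are opened and no network policy is touched, which is exactly why this transport suits a locked-down CI container.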
Security: The Container Is the Sandbox
The security model inherits directly from the CI/CD environment. Cicaddy agents run inside containerized CI jobs, typically based on Red Hat's Universal Base Image (UBI) [1]. No elevated privileges. No host network access. No persistent storage beyond what the pipeline job provides.
This is a significant advantage over dedicated agent infrastructure. When you deploy a long-running agent service, you own its security posture: network policies, secret rotation, access control lists, audit logging. When you run an agent inside a CI/CD job, the security posture is the one your platform team already defined, reviewed, and hardened for every other pipeline job.
The agent cannot access secrets it was not granted through the pipeline's secret management system. It cannot reach network endpoints outside the CI environment's egress rules. It cannot persist data between runs unless you explicitly configure artifact storage. The blast radius of a misbehaving agent is identical to the blast radius of a misbehaving test suite — bounded by the container, terminated when the job ends.
What Red Hat Runs in Production
This is not a theoretical framework. Red Hat runs Cicaddy agents in production across multiple use cases [1]:
Daily service health monitoring. A Task Agent runs on a schedule, queries service endpoints, analyzes response patterns, and generates health reports. When it detects anomalies — increased latency, error rate spikes, certificate expiration approaching — it files issues with structured severity assessments.
DORA metrics computation. Another Task Agent calculates deployment frequency, lead time for changes, mean time to recovery, and change failure rate across multiple repositories. The agent does not just aggregate numbers; it reasons about trends and flags deterioration before it becomes critical.
Merge request code review with Context7 MCP. An MR Agent triggers on every merge request, reads the diff, pulls relevant library documentation through Context7, and posts structured review feedback. The review covers correctness, security implications, and adherence to project conventions — the same things a human reviewer checks, executed consistently on every MR without reviewer fatigue.
Template-Based Adoption: One Line to Opt In
Red Hat solved the adoption problem with shared pipeline templates [1]. A team that wants to add AI-powered code review to their merge request pipeline adds a single include to their existing CI configuration:
```yaml
include:
  - project: 'platform/cicaddy-templates'
    file: '/templates/mr-review.yml'

variables:
  CICADDY_TASK: "code-review"
  CICADDY_MODEL: "granite-3.1-8b"
```
Two lines of configuration. The template handles everything else: pulling the Cicaddy container, loading the DSPy task definition, configuring MCP servers, running the agent, and posting results. Teams do not need to understand agent internals. They need to understand two variables.
This template pattern is how Red Hat scales from "one team piloting AI agents" to "every team using AI agents." The platform team maintains the templates. Product teams consume them. The operational burden stays centralized.
Token-Aware Execution: Budget Without Surprise Bills
Production AI agents burn tokens, and tokens cost money. Cicaddy tracks token usage throughout the agent's reasoning loop and enforces budgets at the task level [1]. When an agent approaches its token limit, the framework triggers recovery mechanisms — compressing context, summarizing intermediate results, or gracefully terminating with a partial output rather than failing silently.
This is the kind of production concern that proof-of-concept frameworks ignore and production teams discover painfully. An agent that enters an infinite reasoning loop on a pathological input can consume thousands of dollars in API calls before anyone notices. Cicaddy caps the damage by design.
The budget tracking also produces operational data. Over time, teams learn the token cost of each agent type, correlate cost with output quality, and right-size their model choices. A code review agent that performs adequately with a smaller model does not need the most expensive one.
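The budget-enforcement idea reduces to a running counter with recovery thresholds. This is an illustrative sketch with invented names and thresholds, not Cicaddy's implementation:

```python
# Illustrative token budget tracker; class name, thresholds, and the
# "compress"/"terminate" actions are invented for this sketch.
class TokenBudget:
    def __init__(self, limit: int, warn_ratio: float = 0.8):
        self.limit = limit          # hard cap for the whole run
        self.warn_ratio = warn_ratio  # when to start recovery measures
        self.used = 0

    def record(self, tokens: int) -> str:
        """Record usage and report which action the framework should take."""
        self.used += tokens
        if self.used >= self.limit:
            return "terminate"      # stop gracefully, emit partial output
        if self.used >= self.limit * self.warn_ratio:
            return "compress"       # summarize context before the next turn
        return "continue"

budget = TokenBudget(limit=10_000)
actions = [budget.record(t) for t in (3_000, 3_000, 3_000, 3_000)]
```

The crucial property is the graceful path: the run degrades through compression to partial output instead of looping until someone notices the bill.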
The Shift This Represents
Cicaddy is not trying to be the next LangChain or CrewAI. It is not a general-purpose agent framework for building chatbots, research assistants, or autonomous coding agents. It solves one specific problem: running AI-powered automation in the execution environment your organization already operates, secures, and trusts.
The implication for platform engineering teams is worth stating directly. You do not need to build agent infrastructure. You already have it. Your CI/CD system is an event-driven, containerized, scheduled, permission-controlled execution environment with access to your code, your secrets, your deployment targets, and your monitoring. Cicaddy treats it as exactly that.
For teams evaluating where to start: pick the MR Agent. Configure a code review task against one repository. Use the template-based approach to limit the blast radius. Let it run for two weeks alongside your human reviewers, compare the output, and calibrate. The investment is one YAML include and a task definition. The risk is a CI job that posts unhelpful comments until you tune it.
That is the lowest-stakes entry point into production AI agents you will find anywhere — and the infrastructure it runs on is already in your budget.
References
[1] Red Hat Developer, "How to develop agentic workflows in a CI pipeline with cicaddy."