Essay, March 26, 2026

Integrating MuxAgent into your CI/CD pipeline

AI coding agents are not just interactive tools. The same CLI that powers your terminal workflows can run headless in CI, turning structured agent tasks into automated pipeline steps with approval controls intact.

Most teams discover AI coding agents through interactive use. You sit at a terminal, describe a task, and watch the agent work. That mode is powerful for exploratory work, but it leaves out a large category of repetitive engineering tasks that should not require a human at the keyboard at all.

Code review on every pull request. Migration verification after schema changes. Dependency update audits when Dependabot opens its weekly batch. Test generation for untested modules. These tasks benefit from an agent’s ability to read code, reason about it, and produce structured output, but they are triggered by CI events or schedules, not by a human deciding “now is a good time to review this PR.”

MuxAgent’s CLI was designed around a daemon architecture that separates the task runtime from the terminal. That same separation makes it possible to run agent tasks headless in CI environments. The daemon manages the agent process, the workflow graph controls the execution stages, and the mobile app provides oversight when needed — even for automated tasks.

This guide covers how to set that up, what it looks like in practice, and where the boundaries are.

The architecture: why MuxAgent works headless

MuxAgent can run in CI because of its daemon architecture.

When you run muxagent daemon start on a machine, the daemon process starts in the background. It connects to the relay over WebSocket, registers the machine, and waits for task commands. The daemon is the long-running process — individual agent sessions are started and managed by the daemon.

In interactive mode, you launch tasks through the TUI (muxagent with no arguments), which talks to the daemon over a local HTTP API. The daemon spawns the agent runtime (Claude Code, Codex, or another configured runtime), manages the workflow graph, and streams events back to the TUI and the mobile app.

The key insight for CI integration is that the daemon’s local HTTP API is not limited to the TUI. Any process that can make HTTP requests to the daemon can start tasks, provide input, and read status. The daemon handles the complexity of managing the agent process, enforcing the workflow graph, and routing encrypted events to the mobile app.

In a CI environment, this means:

The daemon runs on the CI machine (or a persistent runner)
A CI script starts tasks via the daemon’s API
The workflow graph controls what happens at each stage
Approval gates can be configured to auto-approve or hold for human input
Results are captured from the daemon and used in CI pass/fail decisions
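
The flow above can be sketched as a small shell helper. This is a minimal sketch, not MuxAgent’s documented interface: it uses only the muxagent --config ... --task ... invocation that appears in this guide, and it assumes (unverified) that the CLI prints the task result to standard output and exits non-zero when the task fails. The function name run_agent_task and the output file are hypothetical.

```shell
# Run one agent task and capture its output for later CI steps.
# Assumptions (not confirmed by the MuxAgent docs): the CLI prints the task
# result to stdout and returns a non-zero exit status when the task fails.
run_agent_task() {
  config="$1"    # workflow configuration, e.g. autonomous
  task="$2"      # natural-language task description
  outfile="$3"   # file that receives the captured output

  if muxagent --config "$config" --task "$task" >"$outfile" 2>&1; then
    echo "agent task succeeded; output in $outfile"
    return 0
  else
    status=$?
    echo "agent task failed with status $status; output in $outfile" >&2
    return "$status"
  fi
}
```

A CI step would call it after the daemon has started, for example run_agent_task autonomous "Review the changes in this PR" review.txt, and let the return status decide whether the step passes.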

Setting up the daemon on a CI runner

The first decision is where the daemon runs. There are two practical options:

Option A: Persistent runner with pre-installed daemon

If you have a self-hosted GitHub Actions runner, GitLab runner, or Jenkins agent, install MuxAgent once and keep the daemon running:

# One-time setup on the runner
curl -fsSL https://raw.githubusercontent.com/LaLanMo/muxagent/main/install.sh | sh
muxagent daemon start

The daemon starts and triggers authentication. Pair it with your mobile app by scanning the QR code. This is a one-time step — the credentials persist in ~/.muxagent/ and the daemon reconnects automatically on restart.

The advantage of a persistent runner is that the daemon stays authenticated and connected. Tasks start immediately without setup overhead on each CI run.

Option B: Ephemeral runner with daemon lifecycle in the job

For ephemeral CI environments (GitHub Actions hosted runners, container-based CI), the daemon needs to start and authenticate within the job:

# .github/workflows/agent-review.yml
name: Agent Code Review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  agent-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install MuxAgent CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/LaLanMo/muxagent/main/install.sh | sh

      - name: Start daemon
        run: |
          muxagent daemon start
        env:
          MUXAGENT_RELAY_URL: ${{ secrets.MUXAGENT_RELAY_URL }}

      - name: Run agent review task
        run: |
          muxagent --config autonomous --runtime claude-code \
            --task "Review the changes in this PR for correctness, security issues, and style. Report findings as structured output."

      - name: Stop daemon
        if: always()
        run: muxagent daemon stop

The trade-off with ephemeral runners is that each run needs to authenticate. For automated pipelines, you would pre-provision credentials as CI secrets rather than scanning a QR code on each run.
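
One way to pre-provision credentials is to restore the contents of ~/.muxagent/ from a CI secret before the daemon starts. The sketch below is an assumption, not a documented procedure: the secret name MUXAGENT_CREDENTIALS_B64 is hypothetical, and this guide does not say whether credentials copied from another machine are accepted, so check the CLI’s configuration guide before relying on it. It also assumes GNU base64 and tar on the runner.

```shell
# Restore a base64-encoded tar.gz archive of credentials into a directory.
# The archive layout is whatever was captured from ~/.muxagent/ on the
# machine that was paired; nothing here is specific to MuxAgent itself.
restore_credentials() {
  target_dir="$1"   # e.g. "$HOME/.muxagent"
  encoded="$2"      # e.g. "$MUXAGENT_CREDENTIALS_B64" (hypothetical secret)

  if [ -z "$encoded" ]; then
    echo "no credentials provided" >&2
    return 1
  fi

  # Keep the directory private to the current user.
  mkdir -p "$target_dir" || return 1
  chmod 700 "$target_dir" || return 1

  # Decode and unpack without echoing the secret into the job log.
  printf '%s' "$encoded" | base64 -d | tar -xzf - -C "$target_dir"
}
```

In a job, restore_credentials "$HOME/.muxagent" "$MUXAGENT_CREDENTIALS_B64" would run before muxagent daemon start.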

Workflow configurations for CI

The choice of workflow configuration matters more in CI than in interactive use because no human is sitting at a terminal waiting for prompts.

Autonomous mode for fully automated tasks

For tasks that should run without any human intervention (code review comments, lint-like analysis, dependency audits), use the autonomous configuration:

muxagent --config autonomous --task "..."

In autonomous mode, the workflow graph skips approval gates. The agent plans, implements, and verifies without waiting for human input. The task either succeeds or fails, and the CI script captures the result.

This is appropriate when:

The task is read-only (analysis, review) so there is no blast radius
The output is advisory (comments on a PR) rather than authoritative (merging code)
The cost of a wrong result is low (you review the output before acting on it)

Default mode with mobile approval for sensitive tasks

For tasks where you want the agent to plan and then wait for a human to approve before implementing — like automated migrations or refactoring — use the default configuration:

muxagent --config default --task "..."

The agent will plan the work and then pause at the approval checkpoint. You see the approval request on your mobile app, review the plan, and approve or reject it. If you approve, the agent continues to implementation and verification. If you reject, the CI job fails gracefully.

This creates a workflow where CI triggers the task automatically, but a human still controls the decision to proceed. The mobile app makes this practical even when you are not at your desk — managing approvals from your phone works the same way in CI as it does for interactive tasks.

Plan-only mode for analysis and reporting

For tasks where you want the agent to analyze the code and produce a plan without implementing anything:

muxagent --config plan-only --task "..."

The agent produces a plan and stops. The plan is captured as the task output and can be used in CI for documentation, PR comments, or decision inputs. No code changes are made.

This is useful for:

Generating migration plans for review before executing them
Producing architecture impact assessments for large PRs
Creating test coverage analysis reports
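
A CI script can make the choice between the three configurations explicit. The following is a minimal sketch: the task categories (read-only, plan, writes-code) and the function name are hypothetical labels for illustration, while the configuration names are the ones used above.

```shell
# Map a coarse task category to a workflow configuration name.
choose_config() {
  case "$1" in
    read-only)   echo "autonomous" ;;  # advisory output, no code changes
    plan)        echo "plan-only" ;;   # produce a plan and stop
    writes-code) echo "default" ;;     # hold at the approval gate
    *)           echo "default" ;;     # unknown categories keep the gate
  esac
}
```

It would be used as muxagent --config "$(choose_config read-only)" --task "...". Falling back to default for unknown categories keeps the approval gate in place whenever the risk has not been classified.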

Practical CI patterns

Pattern 1: Automated PR review

The most straightforward CI integration. On every PR, an agent reviews the diff and posts findings as a comment.

agent-review:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0  # Full history for diff analysis

    - name: Install and start MuxAgent
      run: |
        curl -fsSL https://raw.githubusercontent.com/LaLanMo/muxagent/main/install.sh | sh
        muxagent daemon start

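    - name: Run review
      env:
        BASE_REF: ${{ github.base_ref }}  # passed through env rather than inlined into the script
      run: |
        # Three-dot diff: changes on the PR branch since it diverged from the base
        DIFF=$(git diff "origin/${BASE_REF}...HEAD")
        muxagent --config autonomous --runtime claude-code \
          --task "Review this diff for bugs, security issues, and design problems. Be specific about line numbers and files. Focus on correctness, not style.

        $DIFF"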

The agent reads the diff, analyzes it against the full codebase context, and produces a structured review. The output can be captured and posted as a PR comment using the GitHub API.
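
Posting the captured review could look like the sketch below. It assumes the agent output was saved to a file by an earlier step, that the GitHub CLI (gh) is available on the runner and authenticated through GH_TOKEN, and that the review is plain text or Markdown. The function name and the size cap are illustrative choices; the cap keeps the body under GitHub’s 65,536-character limit for comments.

```shell
# Post a saved agent review as a pull request comment using the GitHub CLI.
post_review_comment() {
  pr_number="$1"
  review_file="$2"
  max_bytes=60000   # stay below GitHub's 65,536-character comment limit

  if [ ! -s "$review_file" ]; then
    echo "review file is missing or empty; nothing to post" >&2
    return 1
  fi

  # Copy at most max_bytes into a temporary file and post that.
  trimmed=$(mktemp) || return 1
  head -c "$max_bytes" "$review_file" >"$trimmed"
  status=0
  gh pr comment "$pr_number" --body-file "$trimmed" || status=$?
  rm -f "$trimmed"
  return "$status"
}
```

In GitHub Actions the job would also need pull-requests: write permission for the token, and the call would be post_review_comment "$PR_NUMBER" review.md, where PR_NUMBER is a hypothetical variable holding the pull request number.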

Pattern 2: Migration verification

After a database migration runs, an agent verifies that the application code correctly handles the schema changes.

verify-migration:
  needs: run-migration
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4

    # Assumes MuxAgent is installed and the daemon is running, as in Pattern 1
    - name: Run verification agent
      run: |
        muxagent --config default --task "Verify that all application code correctly handles the schema changes in the latest migration. Check for: missing column references, broken queries, type mismatches, and missing index usage. Report any issues found."

This uses default mode because the verification might involve running queries or inspecting application behavior — you want the approval gate so you can review the agent’s verification plan before it executes.

Pattern 3: Dependency update audit

When Dependabot or Renovate opens a batch of dependency updates, an agent audits each update for breaking changes.

audit-deps:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4

    # Assumes MuxAgent is installed and the daemon is running, as in Pattern 1
    - name: Audit dependency changes
      run: |
        muxagent --config autonomous --task "Review the dependency changes in this PR. For each updated package, check the changelog for breaking changes, verify that our usage patterns are compatible with the new version, and flag any required code changes."

This is autonomous because the task is read-only analysis. The agent’s output informs the decision to merge, but the agent does not modify code.

Pattern 4: Test generation for new code

When new code is added without tests, an agent generates test cases.

generate-tests:
  runs-on: ubuntu-latest
  if: github.event_name == 'pull_request'
  steps:
    - uses: actions/checkout@v4

    # Assumes MuxAgent is installed and the daemon is running, as in Pattern 1
    - name: Generate tests for uncovered code
      run: |
        muxagent --config default --task "Identify new functions and methods added in this PR that lack test coverage. Generate appropriate unit tests for them. Follow the existing test patterns in the codebase."

This uses default mode because the agent will write code (test files). The approval gate lets you review the test plan before the agent generates tests, ensuring the approach matches your testing philosophy.
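
Because this pattern lets the agent write files, a cheap guard after the run is to confirm that only test files changed. This is a minimal sketch, assuming test files can be recognized by path; the patterns below (tests/, test/, test_ prefixes, _test., .test., .spec.) are hypothetical and should be adjusted to the repository’s conventions.

```shell
# Read changed paths on stdin (one per line) and fail if any path does not
# look like a test file. Offending paths are printed to stderr.
only_test_files_changed() {
  # grep -v exits non-zero when every path matched, hence the "|| true".
  offending=$(grep -Ev '(^|/)tests?/|(^|/)test_[^/]+$|_test\.|\.test\.|\.spec\.' || true)
  if [ -n "$offending" ]; then
    echo "non-test files were modified:" >&2
    echo "$offending" >&2
    return 1
  fi
  return 0
}
```

After the agent step, git ls-files --modified --others --exclude-standard | only_test_files_changed would list modified and untracked files in the checkout and fail the step if the agent touched anything else.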

Handling approval gates in CI

The most interesting architectural question in CI integration is how to handle approval gates. There are three approaches:

Auto-approve: skip the gate

For low-risk, high-frequency tasks (like read-only reviews), configure the workflow to auto-approve. The agent proceeds through all stages without human input.

Mobile approval: keep the gate

For tasks with meaningful blast radius, keep the approval gate active. When the agent reaches the approval checkpoint, you receive a notification on the mobile app. You review and approve from your phone, and the CI job continues.

This works because MuxAgent’s relay architecture means the mobile app stays connected to the daemon regardless of where the daemon is running. A CI runner in GitHub’s infrastructure is as reachable as your local workstation: the relay handles the routing, and end-to-end encryption between the daemon and the app keeps the agent’s output unreadable to the relay in transit.

Timeout with fallback: gate with a deadline

For tasks where you want human approval but cannot guarantee someone is available, configure a timeout. If no approval arrives within the timeout window, the task either fails gracefully (the conservative choice) or auto-approves (the permissive choice).

muxagent --config default --approval-timeout 30m --task "..."

The timeout approach is practical for teams where CI runs happen during off-hours. The mobile notification gives the on-call person a window to intervene, but the pipeline does not block indefinitely.
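
Independently of the approval timeout, the CI step can enforce a hard outer deadline so that a stuck job cannot hold a runner. The sketch below assumes GNU coreutils timeout is available on the runner (it exits with status 124 when the deadline is reached); the function name and the conservative handling are illustrative. How MuxAgent itself reports an expired approval is not specified here, so the sketch only interprets the outer deadline.

```shell
# Run a command under a hard deadline and treat expiry as a failure.
# Usage: run_with_deadline DURATION COMMAND [ARGS...]
run_with_deadline() {
  deadline="$1"
  shift
  status=0
  timeout "$deadline" "$@" || status=$?

  if [ "$status" -eq 124 ]; then
    # GNU timeout returns 124 when it had to stop the command.
    echo "deadline of $deadline reached; failing conservatively" >&2
  fi
  return "$status"
}
```

For example, run_with_deadline 45m muxagent --config default --approval-timeout 30m --task "..." gives the approval window 30 minutes and the whole step 45.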

Monitoring CI agent runs

One of the advantages of running agent tasks through MuxAgent’s daemon (rather than invoking the agent CLI directly) is that CI runs are visible in the mobile app just like interactive sessions.

You can see:

Which CI jobs have active agent sessions
What stage each session is in (planning, implementing, verifying)
The full event stream including agent messages and tool calls
Cost information for the agent’s API usage

This visibility matters for debugging failed CI runs. Instead of reading log files, you can see exactly what the agent attempted and where it went wrong — the same session event stream you see for interactive tasks.

For teams running multiple CI jobs with agent tasks, the mobile app provides a fleet view: all active sessions across all CI runners on one screen. That is the same multi-machine management pattern described in the multi-machine guide, applied to CI infrastructure.

Limitations and boundaries

This approach has real limitations.

Latency. Agent tasks take minutes, not seconds. A CI step that invokes an agent to review a PR will add several minutes to the pipeline. This is acceptable for async workflows (post-merge verification, scheduled audits) but may be too slow for pre-merge gates on every commit.

Cost. Each agent invocation uses API tokens (for Claude Code, Codex, etc.). Running an agent on every PR in a high-volume repository adds up. Budget the API costs as a CI expense and consider rate-limiting agent tasks to significant PRs.
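
One way to limit agent runs to significant PRs is a size threshold on the diff. This is a minimal sketch, assuming “significant” means a minimum number of changed lines; the threshold and the function names are hypothetical, and the input is the output of git diff --numstat.

```shell
# Sum added and deleted lines from `git diff --numstat` output on stdin.
# Binary files are reported as "-" and count as zero here.
changed_lines() {
  awk '{ if ($1 != "-") total += $1 + $2 } END { print total + 0 }'
}

# Succeed when the change is large enough to justify an agent run.
is_significant_change() {
  threshold="$1"
  count=$(changed_lines)
  [ "$count" -ge "$threshold" ]
}
```

A step could then run the agent only when git diff --numstat "origin/$BASE_REF...HEAD" | is_significant_change 50 succeeds, where BASE_REF holds the base branch name and 50 is an arbitrary example threshold.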

Determinism. Agent output is non-deterministic. The same task run twice may produce different results. For CI use cases that require exact reproducibility (like test suites), agents work better as advisory tools than as pass/fail gates.
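
Treating the agent as advisory can be as simple as never letting its exit status fail the job. A minimal sketch with a hypothetical helper name:

```shell
# Run a command, report a non-zero exit status, and always succeed so that
# a non-deterministic agent result cannot fail the pipeline on its own.
run_advisory() {
  status=0
  "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "advisory step exited with status $status (not failing the job)" >&2
  fi
  return 0
}
```

It would wrap the agent call as run_advisory muxagent --config autonomous --task "...". In GitHub Actions, setting continue-on-error: true on the step achieves a similar effect without a wrapper.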

State. Ephemeral CI runners lose all state between runs. The daemon’s session history, cost tracking, and event log start fresh each time. For persistent state across CI runs, use a persistent runner or external storage.

Authentication. Setting up the initial pairing requires scanning a QR code, which is interactive. For CI, you need to pre-provision credentials as secrets. The details of credential provisioning for CI are documented in the CLI’s configuration guide.

When CI integration makes sense

CI integration with MuxAgent is most valuable when:

You have recurring tasks that an agent can handle and that currently require manual intervention on every PR or deploy
The tasks benefit from codebase context — the agent needs to understand your code, not just the diff
You want human oversight without human presence — the mobile approval pattern lets you keep control without being at a terminal
You are already using MuxAgent interactively and want to extend the same workflow patterns to automation

If you are not yet using MuxAgent, the getting-started guide covers the initial setup. The CI integration builds on that foundation — the daemon, relay, and workflow system are the same whether the task comes from a human at a TUI or from a CI script.

Getting started

The minimal path to a working CI integration:

Install the CLI on your runner
Start the daemon and authenticate (one-time for persistent runners)
Write a CI step that invokes muxagent with the appropriate config and task
Choose your approval strategy (auto, mobile, timeout)
Test with a single low-risk task before expanding

Start with an autonomous review task on a non-critical repository. Once you see the output quality and understand the latency profile, expand to more tasks and more repositories. The workflow configuration system lets you tune the approval level per task, so you can be conservative where it matters and fully automated where it does not.