How I Run Five Projects From One Claude Command Center

Jordan Ellis · 7 min read

Five projects, one control plane, about $180 a month in Claude usage. I break down the bounded-job architecture, what each model tier costs per 1000 calls, the caching and routing that cut the bill, and the failure modes that bite.

Last month my command center spent about $180 in Claude usage to run five projects at once. The whole thing is dozens of small jobs, each a cron entry or a self-paced loop that does one task and then exits, all reporting cost and status back to a single dashboard.

What is an AI command center?

An AI command center is one control plane where a set of agents schedules, runs, and monitors automated work across several projects at the same time. Mine drives five: two product blogs, two marketing sites, and an internal tooling repo. One scheduler. One cost ledger. One status board.

The agents are short-lived. A job wakes up, does one task, writes its result somewhere durable, and exits. That last part is the entire design, and it is what makes command center automation pay off: you read a board instead of babysitting agents. I learned it the expensive way.

Why a single always-on agent falls over

The first version was one persistent agent with every tool attached and a prompt that said "manage everything." It worked in the demo. In week one it drifted to roughly 180K tokens of accumulated context, and every single call resent that history. At Sonnet 4.6 input rates of $3 per million tokens, one fat call cost $0.54 before it generated a word. Overnight a stuck retry loop on one project quietly burned about $40 while I slept.

The longer an agent stays alive, the more you pay to resend its own history on every call. Bounded jobs that exit are the cheapest reliability fix I know.
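The arithmetic behind that claim, as a rough sketch. The $3 per million rate and the 180K figure come from the paragraph above; the 90 turns, the 2K tokens added per turn, and the flat 8K context for a bounded job are illustrative assumptions, and the function names are hypothetical.

```javascript
// Input cost in USD for one call that sends `tokens` tokens of context.
function inputCostUsd(tokens, ratePerMillion) {
  return (tokens / 1_000_000) * ratePerMillion;
}

// Total input spend for an agent whose context grows every turn,
// because each call resends everything accumulated so far.
function growingAgentCostUsd(turns, startTokens, tokensAddedPerTurn, ratePerMillion) {
  let total = 0;
  for (let turn = 0; turn < turns; turn++) {
    total += inputCostUsd(startTokens + turn * tokensAddedPerTurn, ratePerMillion);
  }
  return total;
}

const SONNET_INPUT_RATE = 3; // USD per 1M input tokens, from the post

// One call at 180K tokens of accumulated context.
console.log(inputCostUsd(180_000, SONNET_INPUT_RATE).toFixed(2)); // 0.54

// Assumed: 90 turns, starting at 2K tokens and adding 2K per turn (ends at 180K).
const persistent = growingAgentCostUsd(90, 2_000, 2_000, SONNET_INPUT_RATE);
// Assumed: the same 90 tasks as bounded jobs with a flat 8K context each.
const bounded = 90 * inputCostUsd(8_000, SONNET_INPUT_RATE);
console.log(persistent.toFixed(2), bounded.toFixed(2)); // 24.57 2.16
```

Under those assumptions the persistent agent pays roughly eleven times more in input tokens for the same number of tasks, before any output is counted.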

So I tore it down. Now each project's work is a set of small jobs that share nothing at runtime except a Postgres table and a log directory.

How I structure the work: bounded jobs that exit

Three mechanisms cover almost everything. Cron and systemd timers for anything on a fixed schedule. A self-paced loop for work that needs to check back at its own cadence. A one-shot invocation for everything triggered by an event.

# crontab: one bounded job per line, each exits when done
# blog drafts across the studio sites, 6am
0 6 * * *     cd /srv/cc && node jobs/draft-posts.mjs  >> /var/log/cc/draft.log 2>&1
# inbox triage, every 30 minutes
*/30 * * * *  cd /srv/cc && node jobs/triage-inbox.mjs >> /var/log/cc/triage.log 2>&1

Each job picks its own model, emits a status line the dashboard parses, and stops. The status markers are plain stdout:

echo "[CC-LABEL] pylonworks-blog"
echo "[CC-STATUS] step 2/4 . drafting the command center post"

The board reads those lines and shows me what every session is doing without my opening a single log file. When a job goes quiet for 30 seconds, a cheap Haiku summarizer writes a fallback status so nothing looks frozen.
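A minimal sketch of how a board could read those markers, assuming it has each job's stdout as a string. The function names and the shape of the returned object are hypothetical, not the actual dashboard code; only the two marker prefixes and the 30-second threshold come from the post.

```javascript
// Pull "[CC-LABEL] ..." and "[CC-STATUS] ..." markers out of a job's stdout.
// Returns the last label and last status seen, or null for either if absent.
function parseStatus(stdout) {
  const marker = /^\[CC-(LABEL|STATUS)\]\s+(.*)$/;
  const state = { label: null, status: null };
  for (const line of stdout.split(/\r?\n/)) {
    const match = marker.exec(line.trim());
    if (!match) continue; // ordinary log output, ignore it
    const [, kind, text] = match;
    if (kind === "LABEL") state.label = text;
    else state.status = text;
  }
  return state;
}

// A job counts as quiet when its last marker is at least `thresholdMs` old.
function isQuiet(lastMarkerAtMs, nowMs, thresholdMs = 30_000) {
  return nowMs - lastMarkerAtMs >= thresholdMs;
}

const sample = [
  "[CC-LABEL] pylonworks-blog",
  "fetching sources...",
  "[CC-STATUS] step 2/4 . drafting the command center post",
].join("\n");
console.log(parseStatus(sample));
```

Keeping only the most recent status per job means the board never needs the full log, just the tail of stdout.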

How much does it cost to run five projects on Claude?

Most of the bill comes down to which model runs which task. I route deliberately. Haiku 4.5 handles classification and triage. Sonnet 4.6 does the bulk of the real writing and code work. Opus 4.7 only gets called when a decision actually needs judgment.

Here is what each tier costs me per 1000 calls at a typical job size, using current Anthropic API pricing:

Task | Model | Rate (in / out, USD per 1M tokens) | Tokens per call | Cost per 1000 calls
Inbox + lead triage | Haiku 4.5 | $1 / $5 | 8K in, 200 out | ~$9
Blog draft + revision | Sonnet 4.6 | $3 / $15 | 40K in, 3K out | ~$165
Build / scaffold judgment | Opus 4.7 | $5 / $25 | 30K in, 5K out | ~$275

Triage runs thousands of times a month and barely registers. The Sonnet writing jobs are the real line item. Opus runs maybe 30 times a month, so its higher rate never dominates the total.
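The per-1000 figures are straightforward to reproduce. A small sketch, using the rates and token counts from the table; the keys in `RATES` are labels chosen for this example, not API model IDs.

```javascript
// Rates in USD per 1M tokens, copied from the table.
const RATES = {
  "haiku-4.5": { input: 1, output: 5 },
  "sonnet-4.6": { input: 3, output: 15 },
  "opus-4.7": { input: 5, output: 25 },
};

// Cost in USD of 1000 calls at a fixed job size.
// Per call: (tokens * rate) / 1,000,000. Times 1000 calls: divide by 1000.
function costPerThousandCalls(model, inputTokens, outputTokens) {
  const rate = RATES[model];
  if (!rate) throw new Error(`unknown model: ${model}`);
  return (inputTokens * rate.input + outputTokens * rate.output) / 1000;
}

console.log(costPerThousandCalls("haiku-4.5", 8_000, 200));    // 9
console.log(costPerThousandCalls("sonnet-4.6", 40_000, 3_000)); // 165
console.log(costPerThousandCalls("opus-4.7", 30_000, 5_000));   // 275
```

Multiplying each result by the monthly call count divided by 1000 gives that job's share of the bill.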

What actually cuts the bill: caching and model routing

Two things move the number more than anything else.

Prompt caching. My system prompt, tool definitions, and brand voice notes are about 12K tokens that never change between calls in a batch. Cache reads are billed at 10% of the base input rate, so on Sonnet 4.6 those tokens read at $0.30 per million instead of $3, which is 90% off. On a triage burst that fires 40 times inside the 5-minute cache window, that one change cut the run cost by more than half.
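To see why a burst benefits so much, here is a rough cost model. It assumes Anthropic's published multipliers for the 5-minute cache (writes at 1.25x the input rate, reads at 0.1x), Haiku 4.5 rates for triage, and that all 40 calls land inside the window. The 2K tokens of per-call variable input is an assumption made for the example, and the function name is hypothetical.

```javascript
const CACHE_WRITE_MULTIPLIER = 1.25; // first call writes the prefix to the cache
const CACHE_READ_MULTIPLIER = 0.1;   // later calls read it back

// Cost in USD of a burst of calls that share a fixed prompt prefix.
function burstCostUsd({ calls, prefixTokens, variableTokens, outputTokens, inputRate, outputRate, cached }) {
  const usd = (tokens, ratePerMillion) => (tokens / 1_000_000) * ratePerMillion;
  let total = 0;
  for (let i = 0; i < calls; i++) {
    let prefixRate = inputRate;
    if (cached) {
      prefixRate = inputRate * (i === 0 ? CACHE_WRITE_MULTIPLIER : CACHE_READ_MULTIPLIER);
    }
    total += usd(prefixTokens, prefixRate)   // shared system prompt and tools
           + usd(variableTokens, inputRate)  // the part that changes per call
           + usd(outputTokens, outputRate);
  }
  return total;
}

const burst = {
  calls: 40,
  prefixTokens: 12_000,
  variableTokens: 2_000, // assumed
  outputTokens: 200,
  inputRate: 1,          // Haiku 4.5, USD per 1M input tokens
  outputRate: 5,
};
const uncached = burstCostUsd({ ...burst, cached: false });
const withCache = burstCostUsd({ ...burst, cached: true });
console.log(uncached.toFixed(2), withCache.toFixed(2)); // 0.60 0.18
```

The saving shrinks as the variable part of the prompt grows relative to the cached prefix, so the ratio matters more than the absolute token count.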

Model routing. The instinct is to run Sonnet for everything because it is good. Resist it. A classifier that decides "is this a real sales lead" does not need Sonnet. Moving every triage and routing decision to Haiku 4.5 dropped that whole category of work to about a cent per call. The rule I follow: the cheapest model that passes a real test gets the job.
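That rule can be written down in a few lines. This is a sketch only: `passes` stands in for whatever evaluation you trust (a labeled test set, a golden-output diff), and both it and the tier labels are hypothetical.

```javascript
// Candidates ordered cheapest first. These are labels, not API model IDs.
const TIERS = ["haiku", "sonnet", "opus"];

// Return the first (cheapest) tier for which `passes(model)` is true.
// `passes` is a hypothetical hook that runs the task's test against a model.
function cheapestPassing(tiers, passes) {
  for (const model of tiers) {
    if (passes(model)) return model;
  }
  return null; // nothing passed: fix the prompt or the test before spending more
}

// Made-up accuracy scores for a lead-classification test set.
const scores = { haiku: 0.97, sonnet: 0.99, opus: 0.99 };
const pick = cheapestPassing(TIERS, (model) => scores[model] >= 0.95);
console.log(pick); // haiku
```

Running the check once per task type and storing the result keeps the routing decision out of the hot path.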

What breaks, and the guardrails that hold

Three failure modes show up over and over.

MCP boot tax. The Claude Agent SDK boots every configured MCP server on each query() spawn, which pushed startup to about 6 seconds even for jobs that did not need any tools. For a tight bounded task, skip it:

import { query } from "@anthropic-ai/claude-agent-sdk";

const res = query({
  prompt,
  options: {
    model: "claude-haiku-4-5",
    mcpServers: {},      // skip the MCP boot tax
    strictMcpConfig: true,
    settingSources: [],
  },
});

That one change took an interactive agent from a 6-second start to 2.5 seconds. On a job that runs every 30 minutes it is invisible. On anything a person waits for it is the difference between usable and abandoned.

Concurrency. I cap concurrent agent processes at two. Past that I hit OAuth token races and the box starts swapping during a build. Five projects run through one queue, drained two at a time. There is never a need for five agents alive at once.
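One way to drain a queue two at a time, sketched with plain promises. The function name and the result shape are assumptions for illustration; a real setup would launch agent processes where this example resolves timers.

```javascript
// Run async tasks with at most `limit` in flight at once.
// `tasks` is an array of zero-argument functions that return promises.
async function drainQueue(tasks, limit = 2) {
  const results = new Array(tasks.length);
  let next = 0;

  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claim the next job; safe because JS runs this synchronously
      try {
        results[i] = { ok: true, value: await tasks[i]() };
      } catch (err) {
        results[i] = { ok: false, error: String(err) }; // record and keep draining
      }
    }
  }

  // Start `limit` workers; each pulls jobs until the queue is empty.
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Five stand-in jobs, one per project, drained two at a time.
const projects = ["blog-a", "blog-b", "site-a", "site-b", "tooling"];
const jobs = projects.map((name) => () =>
  new Promise((resolve) => setTimeout(() => resolve(`${name} done`), 10)));
drainQueue(jobs, 2).then((results) => console.log(results.map((r) => r.value)));
```

A failed job is recorded and the worker moves on, so one bad project does not stall the other four.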

Retry storms. When a job hits a 429, naive code retries immediately and makes the rate limit worse. I cap retries at 3 with exponential backoff and jitter, then log the blocker and move on rather than hammering the API. A failed job that exits cleanly is cheaper than a job that fights.
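A sketch of that policy, assuming the caller's API wrapper throws an error with a numeric `status` and, when the server sent one, a `retryAfterSec` value. Both property names are hypothetical, as are the function names and the default base and cap; the retry limit of 3 is from the post. The jitter here is the "full jitter" variant, a random wait between zero and the exponential ceiling.

```javascript
// Delay in milliseconds before retry number `attempt` (1-based).
// Never waits less than the server's retry-after hint, if one was given.
function retryDelayMs(attempt, { baseMs = 1000, capMs = 30_000, retryAfterSec = 0, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** (attempt - 1));
  const jittered = Math.floor(random() * ceiling);
  return Math.max(jittered, retryAfterSec * 1000);
}

// Call `fn`; on a 429 wait and retry up to `maxRetries` times, then rethrow
// so the job can log the blocker and exit.
async function withRetry(fn, { maxRetries = 3, sleep = (ms) => new Promise((r) => setTimeout(r, ms)), random } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (err.status !== 429 || attempt >= maxRetries) throw err;
      await sleep(retryDelayMs(attempt + 1, { retryAfterSec: err.retryAfterSec ?? 0, random }));
    }
  }
}

// Deterministic example: with random() fixed at 0.5 the ceilings 1s, 2s, 4s halve.
console.log([1, 2, 3].map((n) => retryDelayMs(n, { random: () => 0.5 }))); // [ 500, 1000, 2000 ]
```

Errors other than 429 are rethrown immediately, which keeps a genuine bug from being retried three times.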

This post, drafted and revised through the same pipeline, cost about $0.21 in Sonnet 4.6 tokens. The meter runs on the meta posts too.

The one thing to change first

If you are running one big agent today, split off a single task into one cron job that does that task and exits, and add one log line recording its token cost. Watch it for a week. Once you can see the per-job number, every other decision (which model, how often, what to cache) gets obvious. You cannot tune a bill you cannot see.
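That log line can be as small as this. A sketch, assuming the job has the `usage` object from an Anthropic Messages API response (`input_tokens`, `output_tokens`); the `[CC-COST]` marker and the function name are hypothetical, and cached tokens are ignored to keep it short.

```javascript
// Build one log line with the token cost of a finished job.
// `rate` is USD per 1M tokens: { input, output }.
function costLogLine(job, model, usage, rate, now = new Date()) {
  const usd = (usage.input_tokens * rate.input + usage.output_tokens * rate.output) / 1_000_000;
  return [
    now.toISOString(),
    "[CC-COST]",
    `job=${job}`,
    `model=${model}`,
    `in=${usage.input_tokens}`,
    `out=${usage.output_tokens}`,
    `usd=${usd.toFixed(4)}`,
  ].join(" ");
}

console.log(costLogLine(
  "triage-inbox",
  "claude-haiku-4-5",
  { input_tokens: 8_000, output_tokens: 200 },
  { input: 1, output: 5 },
));
```

Key=value pairs on one line are easy to sum with standard text tools after a week of runs.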


How much does a Claude agent run cost per 1000 calls?

It depends on the model and the context size. A small Haiku 4.5 classification job at 8K input tokens runs about $9 per 1000 calls. A Sonnet 4.6 writing job at 40K input and 3K output runs about $165 per 1000. An Opus 4.7 judgment call at 30K in and 5K out runs about $275. Context size matters as much as the rate, which is why trimming what you resend is the first optimization.

What happens when a scheduled agent hits the rate limit?

The API returns a 429 with a retry-after hint. The job should back off exponentially with jitter, cap retries at 3 to 5, then stop and log the blocker instead of looping. Immediate retries are the single most common way to turn a brief limit into a sustained one. With bounded jobs, a clean exit and a retry on the next scheduled run is almost always fine.

When should I use Haiku instead of Sonnet for agent work?

Use Haiku 4.5 for any task with a checkable right answer: classification, routing, extraction, yes/no triage. It runs at $1/$5 per million tokens versus Sonnet's $3/$15, and for those jobs the quality difference is noise. Keep Sonnet 4.6 for open-ended writing, multi-step reasoning, and code.

