Landscape

Five shapes. Pick yours.

AI coding tools are routinely compared against each other in procurement spreadsheets that treat them as alternatives. They are mostly not alternatives. Below is the field as we see it — five distinct shapes, what each one is good for, when it’s the wrong choice, and the architectural facts that distinguish them. If Colony is the right shape, you’ll keep reading. If a different shape is, you’ll know which one.

01

Editor-resident assistants

Cursor · Windsurf · Copilot in the IDE

Good for

Augmenting a single human at the keyboard. Inline completions, refactor suggestions, chat about the file in front of you.

Wrong choice for

Anything that needs to run while you sleep. There is no pipeline, no cross-task state, no merge discipline; the unit of work is a session that ends when you close the tab.

02

Single-task agents

Devin · Copilot Coding Agent · OpenAI Codex

Good for

Handing the agent one ticket and getting one PR back. Suits narrow, well-specified tasks where the cost and risk of a single autonomous attempt are bounded.

Wrong choice for

A backlog. No cross-task orchestration, no shared pipeline state, no aggregate cost attribution across the portfolio of work. Forty parallel attempts at forty tickets do not add up to a pipeline.

03

Bulk-generation platforms

Blitzy

Good for

A project-shaped engagement where the deliverable is a large chunk of generated code in one go — greenfield modules, codebase translations, big migrations.

Wrong choice for

Continuous engineering. The unit of work is a long-running session that produces a deliverable, not a pipeline that picks up the next ticket on its own, and the next, and the next.

04

Agent SDKs and toolkits

OpenHands · open-source agent frameworks

Good for

Building your own agent on opinionated primitives. Research, experimentation, bespoke internal tooling where the team explicitly wants to own the orchestration model.

Wrong choice for

Anyone who doesn’t want to be in the business of building the pipeline. The toolkit is generous, but it is a toolkit; you bring the state machine, the cost ledger, and the audit trail.

05

Orchestration substrate

Colony

Good for

A pipeline of specialized agents that runs against your backlog continuously. State machine, per-issue cost ledger, audit trail, configurable merge gate. Resident, not ephemeral.

Wrong choice for

Anyone who wants inline autocompletion in the editor, or a single hands-on attempt at one ticket and nothing more. Colony is the layer around those things, not a replacement for them.

Architectural facts

The dimensions a procurement team should actually compare.

Marketing claims are slippery. The facts below are publicly verifiable from each vendor’s own documentation, pricing pages, and license files: the things a buyer can confirm without a sales call. Last verified 2026-05-19.

Shape

Cursor / Windsurf: Editor-resident; per-keystroke session
Devin / Copilot Coding Agent / Codex: Per-task agent; one ticket in, one PR out
Blitzy: Project-shaped batch; long-running generation session
OpenHands: SDK; you compose the agent and the pipeline
Colony: Continuous pipeline; multi-agent, multi-issue, resident

Runtime presence

Cursor / Windsurf: Ephemeral (invoked per keystroke; no context across sessions)
Devin / Copilot Coding Agent / Codex: Ephemeral per task (agent spawns, processes one issue, exits)
Blitzy: Ephemeral per project (long-running session, then teardown)
OpenHands: Whatever you build into your agent
Colony: Resident (runs continuously; state and context accumulate in Postgres across every issue)

License

Cursor / Windsurf / Devin / Codex / Copilot Coding Agent / Blitzy: Closed-source
OpenHands: MIT
Colony: AGPL-3.0 (OSS core); proprietary Cloud

Pipeline state

Cursor / Windsurf: Session-local; not queryable
Devin / Codex: Vendor-internal session model; not publicly queryable
Copilot Coding Agent: GitHub Actions logs
OpenHands: Whatever you build into your agent
Colony: Queryable Postgres state machine; thirteen documented states
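
To make "queryable state machine" concrete, here is a minimal Python sketch of the general idea. It is an illustration under loud assumptions: the four states, the transition table, and the `Pipeline` class are hypothetical placeholders, not Colony's thirteen documented states or its schema, and an in-memory dict stands in for a Postgres table.

```python
from collections import Counter
from enum import Enum


class State(Enum):
    # Hypothetical states for illustration only; not Colony's documented set.
    QUEUED = "queued"
    IN_PROGRESS = "in_progress"
    IN_REVIEW = "in_review"
    MERGED = "merged"


# Allowed transitions; anything not listed here is rejected.
TRANSITIONS = {
    State.QUEUED: {State.IN_PROGRESS},
    State.IN_PROGRESS: {State.IN_REVIEW, State.QUEUED},
    State.IN_REVIEW: {State.MERGED, State.IN_PROGRESS},
    State.MERGED: set(),
}


class Pipeline:
    """In-memory stand-in for a database-backed issue state table."""

    def __init__(self):
        self.issues = {}   # issue id -> current State
        self.history = []  # (issue id, from state, to state), append-only

    def add(self, issue_id):
        self.issues[issue_id] = State.QUEUED

    def advance(self, issue_id, new_state):
        current = self.issues[issue_id]
        if new_state not in TRANSITIONS[current]:
            raise ValueError(f"{current.name} -> {new_state.name} is not allowed")
        self.issues[issue_id] = new_state
        self.history.append((issue_id, current, new_state))

    def count_by_state(self):
        # The "queryable" part: how many issues sit in each state right now.
        return Counter(self.issues.values())


pipeline = Pipeline()
pipeline.add("ISSUE-1")
pipeline.add("ISSUE-2")
pipeline.advance("ISSUE-1", State.IN_PROGRESS)
print(pipeline.count_by_state())
```

The point of the sketch is the contrast drawn in the rows above: state that lives in a session or in vendor-internal logs cannot answer "how many issues are in review right now", while an explicit state table with a transition history can.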

Cost attribution

Cursor / Windsurf / Devin / Codex / Copilot Coding Agent: Per-seat or per-task; not per-issue or per-agent
Blitzy: Per line of generated code (per public pricing), plus deployment tiers
OpenHands: Whatever you instrument
Colony: Per-issue, per-agent, per-tenant ledger; CSV export; queryable
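
As a rough illustration of what a per-issue, per-agent, per-tenant ledger with CSV export means in practice, the following Python sketch may help. Everything in it is an assumption made for the example: the `LedgerEntry` fields, the `total_by` and `to_csv` helpers, and the sample tenant, issue, and agent names are hypothetical and do not describe Colony's actual schema or export format.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LedgerEntry:
    # Hypothetical fields; one row per unit of agent work.
    tenant: str
    issue: str
    agent: str
    cost_usd: Decimal


def total_by(entries, *keys):
    """Sum cost grouped by any combination of tenant, issue, agent."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for entry in entries:
        group = tuple(getattr(entry, key) for key in keys)
        totals[group] += entry.cost_usd
    return dict(totals)


def to_csv(entries):
    """Render the raw ledger as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["tenant", "issue", "agent", "cost_usd"])
    for entry in entries:
        writer.writerow([entry.tenant, entry.issue, entry.agent, entry.cost_usd])
    return buffer.getvalue()


# Made-up sample data.
entries = [
    LedgerEntry("acme", "ISSUE-1", "planner", Decimal("0.15")),
    LedgerEntry("acme", "ISSUE-1", "coder", Decimal("0.50")),
    LedgerEntry("acme", "ISSUE-2", "coder", Decimal("0.30")),
]

per_issue = total_by(entries, "issue")
per_agent = total_by(entries, "agent")
print(per_issue)
print(to_csv(entries))
```

Because each entry carries all three dimensions, the same rows answer "what did this issue cost", "what did this agent cost", and "what did this tenant cost" without separate bookkeeping. Per-seat or per-task billing does not record that breakdown.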

Pricing surface

Cursor / Windsurf: Per-seat subscription
Codex / Copilot Coding Agent: Bundled with ChatGPT or Copilot plan
Devin: Published tiers (Free · Pro $20/mo · Max $200/mo · Teams $80/mo per seat · Enterprise custom)
Blitzy: Published deployment tiers (eval $0–$250k; deployment $500k+/yr); per-line unit price
Colony: Pilot from $35k flat (6 weeks); production engagements scoped

Which shape is right for you?

A short decision aid.

  1. You want a faster typist sitting next to a human engineer. An editor-resident assistant. Cursor / Windsurf / Copilot in the IDE.
  2. You have a small number of narrow, well-specified tickets you’d like an agent to attempt autonomously, one at a time. A single-task agent. Devin / Copilot Coding Agent / Codex.
  3. You need a large amount of code generated in a defined engagement — a migration, a greenfield module, a translation. A bulk-generation platform. Blitzy.
  4. You explicitly want to build your own agent system, and you’re ready to own the orchestration. An agent SDK. OpenHands.
  5. You want a pipeline that runs against your backlog continuously, with per-issue cost attribution, a merge gate you configure, and an audit trail you can hand an auditor. An orchestration substrate. Colony — the receipts are public.

If the right shape is Colony, three ways to run it.

OSS, Cloud, or Colony-as-a-Service — compare them →