Landscape

Five shapes. Pick yours.

AI coding tools are routinely compared against each other in procurement spreadsheets that treat them as alternatives. They are mostly not alternatives. Below is the field as we see it — five distinct shapes, what each one is good for, when it’s the wrong choice, and the architectural facts that distinguish them. If Colony is the right shape, you’ll keep reading. If a different shape is, you’ll know which one.

Editor-resident assistants

Cursor · Windsurf · Copilot in the IDE

Good for

Augmenting a single human at the keyboard. Inline completions, refactor suggestions, chat about the file in front of you.

Wrong choice for

Anything that needs to run while you sleep. There is no pipeline, no cross-task state, no merge discipline; the unit of work is a session that ends when you close the tab.

Single-task agents

Devin · Copilot Coding Agent · OpenAI Codex

Good for

Handing the agent one ticket and getting one PR back. Best for narrow, well-specified tasks where the cost and risk of a single autonomous attempt are bounded.

Wrong choice for

A backlog. No cross-task orchestration, no shared pipeline state, no aggregate cost attribution across the portfolio of work. Forty parallel attempts at forty tickets do not add up to a pipeline.

Bulk-generation platforms

Blitzy

Good for

A project-shaped engagement where the deliverable is a large chunk of generated code in one go — greenfield modules, codebase translations, big migrations.

Wrong choice for

Continuous engineering. The unit of work is a long-running session that produces a deliverable, not a pipeline that picks up the next ticket on its own and the next and the next.

Agent SDKs and toolkits

OpenHands · open-source agent frameworks

Good for

Building your own agent on opinionated primitives. Research, experimentation, bespoke internal tooling where the team explicitly wants to own the orchestration model.

Wrong choice for

Teams that don’t want to be in the business of building the pipeline. The toolkit is generous, but it is a toolkit; you bring the state machine, the cost ledger, and the audit trail.

Orchestration substrate

Colony

Good for

A pipeline of specialized agents that runs against your backlog continuously. State machine, per-issue cost ledger, audit trail, configurable merge gate. Resident, not ephemeral.

Wrong choice for

Inline autocompletion in your editor, or a single hands-on attempt at one ticket and nothing more. Colony is the layer around those things, not a replacement for them.

Architectural facts

The dimensions a procurement team should actually compare.

Marketing claims are slippery. The facts below are publicly verifiable from each vendor’s own documentation, pricing pages, and license files: the things a buyer can confirm without a sales call. Last verified 2026-05-19.

Shape

Cursor / Windsurf	Editor-resident; per-keystroke session
Devin / Copilot Coding Agent / Codex	Per-task agent; one ticket in, one PR out
Blitzy	Project-shaped batch; long-running generation session
OpenHands	SDK; you compose the agent and the pipeline
Colony	Continuous pipeline; multi-agent, multi-issue, resident

Runtime presence

Cursor / Windsurf	Ephemeral — invoked per keystroke; no context across sessions
Devin / Copilot Coding Agent / Codex	Ephemeral per task — agent spawns, processes one issue, exits
Blitzy	Ephemeral per project — long-running session, then teardown
OpenHands	Whatever you build into your agent
Colony	Resident — runs continuously; state and context accumulate in Postgres across every issue

License

Cursor / Windsurf / Devin / Codex / Copilot CA / Blitzy	Closed-source
OpenHands	MIT
Colony	AGPL-3.0 (OSS core); proprietary Cloud

Pipeline state

Cursor / Windsurf	Session-local; not queryable
Devin / Codex	Vendor-internal session model; not publicly queryable
Copilot Coding Agent	GitHub Actions logs
OpenHands	Whatever you build into your agent
Colony	Queryable Postgres state machine; thirteen documented states
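To make "queryable state machine" concrete, here is a minimal sketch of the idea, not Colony's actual schema. The five state names, the transition table, and the `issues` table are all hypothetical (the thirteen documented states are not reproduced here), and an in-memory SQLite database stands in for Postgres so the example runs on its own.

```python
import sqlite3

# Hypothetical states and allowed transitions, for illustration only.
TRANSITIONS = {
    "queued": {"in_progress"},
    "in_progress": {"in_review", "failed"},
    "in_review": {"merged", "in_progress"},
    "merged": set(),
    "failed": {"queued"},
}

# In-memory SQLite stands in for Postgres in this sketch.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE issues (id TEXT PRIMARY KEY, state TEXT NOT NULL)")


def enqueue(issue_id):
    """Register a new issue in the initial state."""
    db.execute("INSERT INTO issues VALUES (?, 'queued')", (issue_id,))


def advance(issue_id, new_state):
    """Move an issue to new_state, rejecting transitions the table does not allow."""
    row = db.execute(
        "SELECT state FROM issues WHERE id = ?", (issue_id,)
    ).fetchone()
    if row is None:
        raise KeyError(issue_id)
    if new_state not in TRANSITIONS[row[0]]:
        raise ValueError(f"illegal transition {row[0]} -> {new_state}")
    db.execute("UPDATE issues SET state = ? WHERE id = ?", (new_state, issue_id))


def count_by_state():
    """The 'queryable' part: pipeline status is an ordinary SQL query."""
    rows = db.execute("SELECT state, COUNT(*) FROM issues GROUP BY state")
    return dict(rows.fetchall())


enqueue("ISSUE-1")
enqueue("ISSUE-2")
advance("ISSUE-1", "in_progress")
advance("ISSUE-1", "in_review")
snapshot = count_by_state()
```

The point of the design is that pipeline status lives in rows anyone can query, rather than in logs or a vendor-internal session.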

Cost attribution

Cursor / Windsurf / Devin / Codex / Copilot CA	Per-seat or per-task; not per-issue or per-agent
Blitzy	Per line of generated code (per public pricing), plus deployment tiers
OpenHands	Whatever you instrument
Colony	Per-issue, per-agent, per-tenant ledger; CSV export; queryable
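As a rough illustration of what per-issue, per-agent attribution with CSV export involves, the sketch below keeps one ledger row per agent run and aggregates on demand. The `LedgerEntry` fields, the agent names, and the dollar amounts are hypothetical and are not drawn from Colony's actual ledger format.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    tenant: str      # hypothetical tenant identifier
    issue: str       # hypothetical issue key
    agent: str       # hypothetical agent role name
    cost_usd: float  # spend attributed to this one agent run


def total_by(entries, *fields):
    """Sum cost over entries, grouped by the given LedgerEntry fields."""
    totals = defaultdict(float)
    for e in entries:
        key = tuple(getattr(e, f) for f in fields)
        totals[key] += e.cost_usd
    return dict(totals)


def to_csv(entries):
    """Render the raw ledger as CSV text, one row per agent run."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["tenant", "issue", "agent", "cost_usd"])
    for e in entries:
        writer.writerow([e.tenant, e.issue, e.agent, f"{e.cost_usd:.2f}"])
    return buf.getvalue()


# Made-up sample data.
ledger = [
    LedgerEntry("acme", "ISSUE-1", "planner", 0.25),
    LedgerEntry("acme", "ISSUE-1", "coder", 1.50),
    LedgerEntry("acme", "ISSUE-2", "coder", 0.75),
]

per_issue = total_by(ledger, "issue")
per_agent = total_by(ledger, "agent")
csv_text = to_csv(ledger)
```

Because every row carries tenant, issue, and agent, the same ledger answers "what did this ticket cost?" and "what does the coder agent cost us?" without separate instrumentation, which per-seat or per-task billing cannot do.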

Pricing surface

Cursor / Windsurf	Per-seat subscription
Codex / Copilot CA	Bundled with ChatGPT or Copilot plan
Devin	Published tiers: Free · Pro $20/mo · Max $200/mo · Teams $80/mo per seat · Enterprise custom
Blitzy	Published deployment tiers (eval $0–$250k; deployment $500k+/yr); per-line unit price
Colony	Pilot from $35k flat (6 weeks); production engagements scoped

Which shape is right for you?

A short decision aid.

You want a faster typist sitting next to a human engineer. An editor-resident assistant. Cursor / Windsurf / Copilot in the IDE.
You have a small number of narrow, well-specified tickets you’d like an agent to attempt autonomously, one at a time. A single-task agent. Devin / Copilot Coding Agent / Codex.
You need a large amount of code generated in a defined engagement — a migration, a greenfield module, a translation. A bulk-generation platform. Blitzy.
You explicitly want to build your own agent system, and you’re ready to own the orchestration. An agent SDK. OpenHands.
You want a pipeline that runs against your backlog continuously, with per-issue cost attribution, a merge gate you configure, and an audit trail you can hand an auditor. An orchestration substrate. Colony — the receipts are public.

If the right shape is Colony, three ways to run it.

OSS, Cloud, or Colony-as-a-Service — compare them →
