Show HN: Optio – Orchestrate AI coding agents in K8s to go from ticket to PR

saltpath · 2026-03-26T09:48:24 1774518504

The parallel execution model makes sense for independent tickets but I'm wondering what happens when agent A is halfway through a PR touching shared/utils.py and agent B gets assigned a ticket that needs the same file. Does the orchestrator do any upfront dependency analysis to detect that, or do you just let them both run and deal with the conflict at merge time?
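To make the question concrete: the simplest upfront check I can picture compares the files each ticket is expected to touch. This is only a sketch. The per-ticket file lists are a hypothetical input (nothing here says Optio produces them), and `find_conflicts` is a name made up for illustration.

```python
from itertools import combinations


def find_conflicts(ticket_files):
    """Return (ticket_a, ticket_b, shared_files) for every overlapping pair.

    ticket_files maps a ticket id to the files it is *expected* to touch.
    How those predictions are obtained is assumed, not specified.
    """
    conflicts = []
    # Sort by ticket id so the output order is deterministic.
    for (a, files_a), (b, files_b) in combinations(sorted(ticket_files.items()), 2):
        shared = set(files_a) & set(files_b)
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts
```

An orchestrator could serialize the conflicting pairs and still run everything else in parallel.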

stingraycharles · 2026-03-26T09:28:36 1774517316

I’ve come to the realization that these kinds of systems don’t work, and that a human in the loop is crucial for task planning. The LLM’s role is to identify issues and communicate the design and architecture before the task is handed off; otherwise the LLM always ends up doing something that isn’t quite right.

How is this part tackled when all that you have is GH issues? Doesn’t this work only for the most trivial issues?

mshark · 2026-03-26T10:25:46 1774520746

Had the same realization which inspired eforge (shameless plug) https://github.com/eforge-build/eforge - planning stays in the developer’s control with all engineering (agent orchestration) handed off to eforge. This has been working well for a solo or siloed developer (me) that is free to plan independently. Allows the developer to confidently stay in the planning plane while eforge handles the rest using a methodology that in my experience works well. Of course, garbage in garbage out - thorough human planning (AI assisted, not autonomous) is key.

stingraycharles · 2026-03-26T10:35:20 1774521320

To me that doesn't do enough yet in terms of up-front planning and visualization, but it's a step in the right direction. I prefer Traycer myself.

denysvitali · 2026-03-26T00:55:52 1774486552

FWIW, a "cheaper" version of this is triggering Claude via GitHub Actions and `@claude`ing your agents like that. If you run your CI on Kubernetes (ARC), it sounds pretty much the same.

pianopatrick · 2026-03-26T05:49:03 1774504143

I wonder, based on your experience, how hard would it be to improve your system to have an AI agent review the software and suggest tickets?

Like, can an AI agent use a browser, attempt to use the software, find bugs and create a ticket? Can an AI agent use a browser, try to use the software and suggest new features?

ramon156 · 2026-03-26T06:48:47 1774507727

I think it's more important to pin down where a human must be in order for this not to become a mess. Or have we skipped that step entirely?

stingraycharles · 2026-03-26T09:29:48 1774517388

AI agents can absolutely use web browsers to do these things, but the hard part is accurately defining the acceptance criteria.

mlsu · 2026-03-26T07:09:28 1774508968

perhaps we can give the AI a bit of money, make it the customer, then we can all safely get off the computer and go outside :)

smokeyfish · 2026-03-26T06:22:10 1774506130

Datadog have a feature like that.

naultic · 2026-03-26T02:01:35 1774490495

I'm working on something a little similar, though mine's more of a dev tool than process automation, and I love where yours is headed. The biggest issue I've run into is handling retries with agents. My current solution is to have them set checkpoints so they can revert easily: when they can't make an edit or can't get a test passing, they just restart from an earlier state. The problem is that this uses up lots of tokens on retries. How did you handle this issue in your app?
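Roughly, the checkpoint-and-revert loop looks like the sketch below. It is a simplified in-memory illustration, not the actual tool: `CheckpointedState`, `run_with_retries`, and the `max_retries` budget are hypothetical names, and a real agent would snapshot a worktree rather than a dict.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointedState:
    """Working state plus a stack of snapshots to roll back to."""
    files: dict[str, str] = field(default_factory=dict)
    _snapshots: list[dict[str, str]] = field(default_factory=list)

    def checkpoint(self) -> None:
        # Copy the state so later edits cannot mutate the snapshot.
        self._snapshots.append(dict(self.files))

    def revert(self) -> None:
        # Restore the latest snapshot, keeping it for further retries.
        if self._snapshots:
            self.files = dict(self._snapshots[-1])


def run_with_retries(state, attempt, max_retries=3):
    """Call attempt(state, n) until it returns True or the budget runs out.

    Returns the attempt number that succeeded, or None if all failed.
    """
    state.checkpoint()
    for n in range(1, max_retries + 1):
        if attempt(state, n):
            return n
        state.revert()  # discard the failed attempt's edits
    return None
```

Capping `max_retries` bounds the token spend per task, but every retry still pays for a full attempt, which is the cost I'm asking about.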

jawiggins · 2026-03-26T02:04:10 1774490650

Generally I've found agents are capable of self correcting as long as they can bash up against a guardrail and see the errors. So in optio the agent is resumed and told to fix any CI failures or fix review feedback.

raised_hand · 2026-03-26T06:12:07 1774505527

Why K8s? Is there a way I could run it without

MrDarcy · 2026-03-25T23:18:37 1774480717

Looks cool, congrats on the launch. Is there any sandbox isolation from the k8s platform layer? Wondering if this is suitable for multiple tenants or customers.

jawiggins · 2026-03-25T23:27:31 1774481251

Oh good question, I haven't thought deeply about this.

Right now nothing special happens, so claude/codex can access their normal tools and make web calls. I suppose that also means they could figure out they're running in a k8s pod and do service discovery and start calling things.

What kind of features would you be interested in seeing around this? Maybe a toggle to disable internet connections or other connections outside of the container?

nevon · 2026-03-26T07:21:24 1774509684

Network policies controlling egress would be one thing. I haven't seen how you make secrets available to the agent, but I would imagine you would need to proxy calls through a mitm proxy to replace tokens with real secrets, or some other way to make sure the agent cannot access the secrets themselves. Specifically for an agent that works with code, I could imagine being able to run docker-in-docker will probably be requested at some point, which means you'll need gvisor or something.

conception · 2026-03-26T00:23:17 1774484597

What’s the most complicated, finished project you’ve done with this?

jawiggins · 2026-03-26T00:28:51 1774484931

Recently I used it to finish up my re-implementation of curl/libcurl in Rust (https://news.ycombinator.com/item?id=47490735). I started by trying to have a single Claude Code session run in an iterative loop, but eventually I found it was way too slow.

I started tasking subagents for each remaining chunk of work, and then found I was really just repeating the need for a normal sprint tasking cycle but where subagents completed the tasks with the unit tests as exit criteria. So optio came to my mind, where I asked an agent to run the test suite, see what was failing, and make tickets for each group of remaining failures. Then I use optio to manage instances of agents working on and closing out each ticket.
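The "make tickets for each group of remaining failures" step can be approximated with something like the sketch below. It is illustrative only: in practice an agent did the grouping, and the test-id format (`module::test_name`) and the ticket fields are assumptions.

```python
from collections import defaultdict


def group_failures(failing_tests):
    """Group failing test ids by module and draft one ticket per group.

    Everything before the last "::" in an id is treated as the module
    (assumed id format, e.g. "http::redirect::follows_301").
    """
    groups = defaultdict(list)
    for test_id in failing_tests:
        module, _, name = test_id.rpartition("::")
        groups[module or "(ungrouped)"].append(name)
    return [
        {
            "title": f"Fix {len(names)} failing test(s) in {module}",
            "tests": sorted(names),
        }
        for module, names in sorted(groups.items())
    ]
```

Each ticket then has a clear exit criterion: the listed tests pass.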

antihero · 2026-03-25T23:19:10 1774480750

And what stops it making total garbage that wrecks your codebase?

jawiggins · 2026-03-25T23:24:55 1774481095

There are a few things:

a) you can create CI/build checks that run in GitHub, and the agents will make sure they pass before merging anything

b) you can configure a review agent with any prompt you'd like to make sure any specific rules you have are followed

c) you can disable all the auto-merge settings and review all the agent code yourself if you'd like.

kristjansson · 2026-03-26T00:41:15 1774485675

> to make sure

you've really got to be careful with absolute language like this in reference to LLMs. A review agent provides no guarantees whatsoever, just shifts the distribution of acceptable responses, hopefully in a direction the user prefers.

jawiggins · 2026-03-26T00:44:43 1774485883

Fair, it's something like a semantic enforcement rather than a hard one. I think current AI agents are good enough that if you tell it, "Review this PR and request changes anytime a user uses a variable name that is a color", it will do a pretty good job. But for complex things I can still see them falling short.

SR2Z · 2026-03-26T03:11:57 1774494717

I mean, having unit tests and not allowing PRs in unless they all pass is pretty easy (or requiring human review to remove a test!).

A software engineer takes a spec which "shifts the distribution of acceptable responses" for their output. If they're 100% accurate (snort), how good does an LLM have to be for you to accept its review as reasonable?

59nadir · 2026-03-26T05:37:10 1774503430

We've seen public examples where LLMs literally disable or remove tests in order to pass. I'm not sure it matters much that having tests and asking LLMs not to merge before they pass is "easy" when the failure modes here are so plentiful and broad in nature.

ElFitz · 2026-03-26T07:00:16 1774508416

My favourite so far was Claude "fixing" deployment checks with `continue-on-error: true`
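A cheap guardrail for this particular trick is to scan the workflow files in a PR for the key and flag it for human review. The sketch below is a plain-text regex check, not a YAML parser, so it is a heuristic; `find_continue_on_error` is a made-up helper name.

```python
import re

# Matches "continue-on-error: true", optionally behind a list dash
# and optionally followed by a trailing comment.
PATTERN = re.compile(
    r"^\s*-?\s*continue-on-error:\s*true\s*(#.*)?$", re.IGNORECASE
)


def find_continue_on_error(workflow_text):
    """Return 1-based line numbers where continue-on-error is set to true."""
    return [
        lineno
        for lineno, line in enumerate(workflow_text.splitlines(), start=1)
        if PATTERN.match(line)
    ]
```

It only catches this one pattern, of course; deleted tests or skipped jobs need their own checks.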

upupupandaway · 2026-03-25T23:24:35 1774481075

Ticket -> PR -> Deployment -> Incident

abybaddi009 · 2026-03-26T02:43:48 1774493028

Does this support skills and MCP?

jawiggins · 2026-03-26T05:25:22 1774502722

Yup. MCP can be configured on a repo level. At task execution time, enabled MCP servers are written as a .mcp.json file into the agent's worktree. Enabled skills are written as .claude/commands/{name}.md files in the worktree, making them available as slash commands to the agent
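As a rough sketch of that write step (not Optio's actual code): the function name, the argument shapes, and the top-level `mcpServers` key in the JSON are assumptions made for illustration.

```python
import json
from pathlib import Path


def write_agent_config(worktree, mcp_servers, skills):
    """Write .mcp.json and .claude/commands/{name}.md into a worktree.

    mcp_servers: dict of server name -> config (assumed shape).
    skills: dict of skill name -> markdown body (assumed shape).
    Returns the list of paths written.
    """
    root = Path(worktree)
    mcp_path = root / ".mcp.json"
    # The "mcpServers" wrapper key is an assumption about the file layout.
    mcp_path.write_text(json.dumps({"mcpServers": mcp_servers}, indent=2))

    commands_dir = root / ".claude" / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)

    written = [mcp_path]
    for name, body in skills.items():
        skill_path = commands_dir / f"{name}.md"
        skill_path.write_text(body)
        written.append(skill_path)
    return written
```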

verdverm · 2026-03-26T05:58:05 1774504685

I love k8s, but having it as a requirement for my agent setup is a non-starter. Kubernetes is one method for running, not the centerpiece.

hmokiguess · 2026-03-26T00:28:28 1774484908

The misaligned columns in the Claude-made ASCII diagrams in the README really throw me off. Why not fix them?

| | | |

jawiggins · 2026-03-26T00:37:16 1774485436

Should be fixed now :)

QubridAI · 2026-03-26T00:01:43 1774483303

[flagged]

knollimar · 2026-03-26T00:03:25 1774483405

I don't want to accuse you of being an LLM but geez this sounds like satire

weird-eye-issue · 2026-03-26T01:17:44 1774487864

It's AI.