An agentic coding IDE with agent hierarchy, full traceability, and rollback. In the browser, but your dev environment stays local.
You describe a change. Agents break it down, execute it across your codebase, and every step is recorded.
Agent hierarchy
Work flows through three layers:
- Alchemist — the orchestrator. Reads your request, understands the codebase, delegates to specialists.
- Specialists — hold deep context on specific parts of the codebase. Each specialist owns a set of tasks and decides how to implement them.
- Coders — do the actual edits. Each coder works in its own isolated git worktree, so parallel execution doesn't cause conflicts.
When coders finish, their branches merge back. The specialist reviews the result against the original task. If something's wrong, it goes back.
Beyond the core three, there are purpose-built agent types: research agents for codebase exploration, planners for architecture and design, history agents that trace changes back to their origin, testers that verify work against acceptance criteria, and a workflow architect that designs automation rules. Each type has a locked-down toolset: a research agent can read but not write; a tester can run commands but not edit files.
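One way to picture those locked-down toolsets is a per-type allowlist. This is a minimal sketch; the agent-type and tool names are illustrative, not AlchemyLab's actual API:

```python
# Hypothetical per-agent-type tool allowlists. Each agent type may
# only call tools that appear in its set.
AGENT_TOOLSETS = {
    "alchemist": {"read", "search", "delegate"},
    "specialist": {"read", "search", "delegate", "review"},
    "coder": {"read", "search", "edit", "commit"},
    "researcher": {"read", "search"},    # can read but not write
    "tester": {"read", "run_command"},   # can run commands but not edit
}

def can_use(agent_type: str, tool: str) -> bool:
    """True only if the tool is in the agent type's allowlist."""
    return tool in AGENT_TOOLSETS.get(agent_type, set())
```

The point of the allowlist shape is that permissions are declared once per type, not negotiated per call.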
Traceability
Every request, task, tool call, and file change is a record with a unique ID. You can walk the full chain from a changed line, to the tool call that changed it, to the task that requested it, to the original request.
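Conceptually, that chain is a walk up parent links between records. A minimal sketch, with invented record IDs and a hypothetical store shape:

```python
# Illustrative record store: every record has a unique id and a parent,
# so a changed line can be traced back to the original request.
RECORDS = {
    "req_1":   {"kind": "request", "parent": None},
    "task_7":  {"kind": "task", "parent": "req_1"},
    "call_42": {"kind": "tool_call", "parent": "task_7"},
    "chg_9":   {"kind": "file_change", "parent": "call_42"},
}

def trace(record_id: str) -> list[str]:
    """Follow parent links from any record up to the root request."""
    chain = []
    while record_id is not None:
        chain.append(record_id)
        record_id = RECORDS[record_id]["parent"]
    return chain

# trace("chg_9") → ["chg_9", "call_42", "task_7", "req_1"]
```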
Git blame is integrated with the record system. Running blame on a file shows not just which commit changed each line, but which request and task caused it. This works because every agent commit is tagged with a record ID.
The record view is how you review what agents did, approve changes, and understand decisions.
Collaborative planning
You talk through a feature or architecture change with a planner agent, and it builds the plan document in real time, section by section, so you can steer as it takes shape. You can annotate specific sections with inline comments, and the planner picks those up and revises.
When the plan is ready, the planner decomposes it into linked requests and offers a handoff to build mode. Requests carry a reference back to the plan, so implementation traces back to the design decision. Plans are versionable if you need to snapshot or restore, but the core value is the back-and-forth: you and the AI iterating on a design before any code gets written.
Agents can also generate journals: narrative summaries of completed requests covering what was done, what decisions were made, and what files were touched.
Acceptance criteria and evidence
Requests have typed acceptance criteria. When a coder completes work, it writes what it did to satisfy each criterion and how it verified the result. Testers can independently validate criteria by running configured test commands and reporting results.
Rollback
Completed requests can be reverted with one action. Rollbacks follow LIFO ordering so dependencies unwind cleanly: if request #4 built on top of request #3, you can't revert #3 without reverting #4 first.
File changes, git state, and records are all unwound together.
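The LIFO rule amounts to a stack of completed requests where only the top can be popped. A minimal sketch, with invented names:

```python
class RollbackStack:
    """Illustrative LIFO rollback: only the most recently completed
    request can be reverted, so dependencies unwind in order."""

    def __init__(self):
        self._completed = []  # oldest first, newest last

    def complete(self, request_id: str) -> None:
        self._completed.append(request_id)

    def revert(self, request_id: str) -> None:
        # Reverting anything but the newest request is refused.
        if not self._completed or self._completed[-1] != request_id:
            raise ValueError(f"revert later requests before {request_id}")
        self._completed.pop()
```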
Agent communication
Agents coordinate through typed messages (reports, questions, handoffs), each with a sender, recipient, and status. Messages are stored and queryable, so the full coordination chain is visible.
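A sketch of a typed, queryable message log; the message kinds come from the description above, but the field names and store shape are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Message:
    """Hypothetical typed agent message."""
    kind: str       # "report" | "question" | "handoff"
    sender: str
    recipient: str
    body: str
    status: str = "sent"

log: list[Message] = []
log.append(Message("question", "coder_1", "specialist_a", "Rename the field too?"))
log.append(Message("report", "coder_1", "specialist_a", "Task done, branch pushed."))

# Because messages are stored, the coordination chain is a query away,
# e.g. everything one specialist received:
inbox = [m for m in log if m.recipient == "specialist_a"]
```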
Hooks, recipes, and workflows
Automation is built in layers. Hooks are event-triggered prompts: when a specific event occurs (a task completes, a session goes idle, a request is created), a hook can fire an agent with instructions. Recipes group related hooks together. Workflows bundle recipes into a named configuration you can activate per alchemist.
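The three layers nest naturally as configuration. A minimal sketch of the shape, with invented names and events drawn from the examples above:

```python
# Illustrative layering: hooks fire on events, recipes group hooks,
# a workflow bundles recipes into one activatable configuration.
workflow = {
    "name": "auto-review",
    "recipes": [
        {
            "name": "quality-gate",
            "hooks": [
                {"event": "task_completed",
                 "prompt": "Review the diff against the original task."},
                {"event": "session_idle",
                 "prompt": "Summarize progress so far."},
            ],
        },
    ],
}

def hooks_for(workflow: dict, event: str) -> list[str]:
    """Collect the prompts of every hook listening for an event."""
    return [
        hook["prompt"]
        for recipe in workflow["recipes"]
        for hook in recipe["hooks"]
        if hook["event"] == event
    ]
```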
A workflow architect agent can design these for you. Describe the behavior you want, and it builds the hooks, recipes, and workflow structure using purpose-built tools.
Tooling
Agents have no bash access.* This is a deliberate choice. Most AI coding tools give agents a shell and let them run wild. It works remarkably well. Until it doesn't, and something gets blown up beyond repair. The popular answer is to put everything in a container, which brings its own set of problems. We're making a different bet: that the future of AI tooling isn't bash commands from half a century ago. You can't reliably record what a bash command did, you can't undo it, and you can't constrain what the model decides to run. Some tools now use a second agent to check whether the first agent's commands are safe, but who checks the checker?
AlchemyLab replaces this with purpose-built tools: structured file edits, search, git blame wired into the request system, file recovery. Every tool call records exactly what it changed, so the audit trail is complete and rollback can actually unwind it.
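The reason a structured edit can be unwound while a bash command can't is that the tool call itself captures both sides of the change. A toy sketch (not AlchemyLab's actual tool API) of an edit that returns an invertible record:

```python
def apply_edit(files: dict, path: str, old: str, new: str) -> dict:
    """Hypothetical structured edit: the returned record captures
    exactly what changed, so it can later be inverted for rollback."""
    if files[path].count(old) != 1:
        raise ValueError("edit target must match exactly once")
    files[path] = files[path].replace(old, new)
    return {"tool": "edit", "path": path, "old": old, "new": new}

def undo(files: dict, record: dict) -> None:
    """Invert a recorded edit by swapping the new text back to the old."""
    files[record["path"]] = files[record["path"]].replace(
        record["new"], record["old"]
    )
```

A shell command has no equivalent record to invert; that asymmetry is the whole argument.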
The trade-off is that agents can't do arbitrary things. That's the point.
* Bash can be enabled per agent when the task calls for it. But it's off by default, and the built-in tools cover the vast majority of development work.
Local-first
AlchemyLab runs on your machine, built on OpenCode. Files stay on disk, and all changes go through actual git commits. Metadata (requests, tasks, records, plans) lives on a server that powers the UI and multi-device access. Provider configuration uses OpenCode's system, so any provider that works in OpenCode works in AlchemyLab. Model selection is configurable per agent type.
The UI runs in the browser and is accessible from any device: phone, tablet, another computer. The browser is the IDE.
Current status
AlchemyLab is early and actively developed. The core loop — requests, tasks, agents, records, rollback — works. The hook and workflow system is functional but still being refined. Testing integration is in progress. AlchemyLab is built with AlchemyLab.
If you've gotten this far, you might as well try it.