Someone Finally Built a Local Debugger for AI Agents

Raindrop AI shipped Workshop, an open-source local debugger that records every token, tool call, and decision an agent makes. For builders tired of watching agents fail silently, it is the visibility layer the stack so

May 18, 2026 · 3 min read


Debugging an AI agent right now is mostly an act of faith. You wire up a model to a few tools, give it a system prompt, and hope it does not call the weather API when the user asks for a refund. When it inevitably goes sideways, you end up reading a wall of unstructured logs or praying your cloud observability dashboard caught the right span. It is guesswork dressed up as engineering.

Raindrop AI just shipped something that actually fixes this. Workshop is an open-source, MIT-licensed tool that runs locally on your machine. It sets up a lightweight daemon and streams every token, tool call, and decision your agent makes into a single SQLite database file. You open a local dashboard and watch the agent think in real time. No cloud bill, no vendor lock-in, no waiting for a SaaS dashboard to load.

Running locally matters more than it sounds. Most existing observability tools for agents are cloud-native services that charge by the trace. They are fine for production, but terrible for the iteration phase, where you are running a hundred local test prompts while tweaking system prompts or tool schemas. Workshop keeps everything on your laptop. The database is a plain .db file you can query with SQL. If you want to know exactly how many times your agent tried to call the same broken endpoint yesterday, you can ask it directly.
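To make the "ask it directly" point concrete, here is a minimal sketch of that kind of query using Python's built-in sqlite3 module. Workshop's actual schema is not described here, so the tool_calls table, its columns, and the sample rows are all assumptions made for illustration. Against a real Workshop database you would first inspect the tables and adapt the query.

```python
import sqlite3

# ASSUMPTION: a hypothetical schema for illustration only.
# Workshop's real table and column names may differ.
SCHEMA = """
CREATE TABLE tool_calls (
    id        INTEGER PRIMARY KEY,
    run_id    TEXT NOT NULL,
    endpoint  TEXT NOT NULL,
    status    TEXT NOT NULL,   -- 'ok' or 'error'
    called_at TEXT NOT NULL    -- ISO 8601 timestamp
);
"""

# Made-up sample rows standing in for recorded agent runs.
SAMPLE_ROWS = [
    ("run-1", "/refunds", "error", "2026-05-17T10:00:00"),
    ("run-1", "/refunds", "error", "2026-05-17T10:00:05"),
    ("run-2", "/refunds", "error", "2026-05-17T14:30:00"),
    ("run-2", "/refunds", "ok",    "2026-05-17T14:30:09"),
    ("run-2", "/weather", "error", "2026-05-17T14:31:00"),
    ("run-3", "/refunds", "error", "2026-05-18T09:00:00"),
]


def demo_connection():
    """Build an in-memory database shaped like the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO tool_calls (run_id, endpoint, status, called_at) "
        "VALUES (?, ?, ?, ?)",
        SAMPLE_ROWS,
    )
    return conn


def repeated_failures(conn, day):
    """Return (endpoint, failures) for endpoints that failed more than once on `day`."""
    return conn.execute(
        """
        SELECT endpoint, COUNT(*) AS failures
        FROM tool_calls
        WHERE status = 'error' AND date(called_at) = ?
        GROUP BY endpoint
        HAVING COUNT(*) > 1
        ORDER BY failures DESC, endpoint
        """,
        (day,),
    ).fetchall()


if __name__ == "__main__":
    for endpoint, failures in repeated_failures(demo_connection(), "2026-05-17"):
        print(f"{endpoint}: {failures} failed calls")
```

To run the same function against a file on disk, swap the in-memory connection for sqlite3.connect() pointed at the .db file.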

The dashboard surfaces what you actually need to see. Token streams show you exactly what the model generated before it reached for a tool. Tool call traces expose the inputs and outputs your agent passed around. Decision logs capture the branching logic. For the first time, builders have a flight recorder for agent runs that is as easy to inspect as a local web server.

The Observability Gap Nobody Talks About

We have spent the last two years stuffing language models into every product we can find, but we have not spent nearly enough time on the infrastructure that lets us understand what they are doing. Vector databases got love. Agent frameworks got hype. Even prompt registries became a conversation. But the humble debugger got ignored. That is weird. You would not ship a React app without DevTools. You should not ship an agent without trace-level visibility.

Workshop treats an agent run as a structured event stream instead of a blob of text. That shift in framing is subtle but important. When agents communicate by generating raw text, errors hide inside paragraphs. A model might apologize for a mistake, then repeat it. Or it might hallucinate a parameter name that looks right but fails at the API boundary. Seeing the raw token flow makes these failures obvious instead of buried.
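A small sketch shows why the structured framing helps. Once each tool call is a record rather than a sentence, a hallucinated parameter name becomes a simple set difference. The ToolCall type, the TOOL_PARAMS table, and the issue_refund tool below are hypothetical names invented for this example, not Workshop's event format.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One tool invocation in an agent run (hypothetical event shape)."""
    step: int
    tool: str
    args: dict


# ASSUMPTION: parameter names each tool accepts, as you would declare
# them in your own tool schemas.
TOOL_PARAMS = {
    "issue_refund": {"order_id", "amount"},
}


def unknown_params(calls, tool_params):
    """Return (step, tool, bad_names) for calls using undeclared parameters."""
    findings = []
    for call in calls:
        allowed = tool_params.get(call.tool)
        if allowed is None:
            # The tool itself is not declared anywhere.
            findings.append((call.step, call.tool, ["<unknown tool>"]))
            continue
        extra = sorted(set(call.args) - allowed)
        if extra:
            findings.append((call.step, call.tool, extra))
    return findings


if __name__ == "__main__":
    run = [
        ToolCall(1, "issue_refund", {"order_id": "A1", "amount": 20}),
        # Looks right, but the declared name is order_id, not orderId.
        ToolCall(2, "issue_refund", {"orderId": "A1", "amount": 20}),
    ]
    for step, tool, names in unknown_params(run, TOOL_PARAMS):
        print(f"step {step}: {tool} got undeclared parameters {names}")
```

The same check run over a blob of generated text would need fragile parsing. Run over events, it is a few lines.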

Local Traces Are Only Half the Picture

Still, tracing tokens and tool calls is just the beginning. Real agents do not live in a chat window. They update databases, trigger workflows, and rerender user interfaces. A local SQLite file will tell you that your agent decided to mark an order as shipped, but it will not show you whether that mutation actually hit your database or whether the reactive query updated the customer's mobile screen. The agent is one piece of a larger machine.

This is where the full-stack preview layer becomes non-negotiable. At Botflow, we see builders shipping agents inside web and mobile apps that need to stay reactive. When an agent writes to Convex, the database pushes that change to every connected client instantly. You need to see the agent decision and the UI update in the same breath. Workshop gives you the agent's black box. Make sure your platform gives you the rest of the app.

Build With the Lights On

The larger lesson here is that visibility is not a luxury. It is a prerequisite for shipping. Raindrop open-sourced Workshop because the community needed a shared foundation for agent observability. That foundation will grow. Expect people to build plugins, CI integrations, and eval pipelines on top of that SQLite schema. Smart move.

If you are building with agents today, start by making sure you can see what they are doing. Download Workshop for the agent traces. Then ship your app on a stack that lets you preview the whole thing while it runs. Building blind was never a good idea. Now, at least for the agent layer, we do not have to.