
Guaranteeing AI agent output quality with output handlers
If you’ve ever tried to put Large Language Model (LLM)-based functionality, for example AI agents, into production, you will have noticed that while it’s extremely easy to build something that sort of works, it can seem hard, if not impossible, to build something that works reliably enough, let alone provides any guarantees of output quality.
Why is AI agent quality control hard?
A large part of the reason is the way LLMs work: their output is inherently stochastic, so the same input is not guaranteed to produce the same output. Another important reason is the limited information flow inside a traditional AI agent implementation (such as those in LangChain and LlamaIndex, for example). To wit, the agent is given a text prompt, which includes, firstly, the question it’s supposed to answer (along with any relevant context) and, secondly, a description of the tools it can use. It then calls these tools in the order it chooses, giving them the inputs it chooses, and when it thinks it’s done, it terminates and returns the final output. One common refinement is that certain tools can return directly: if such a tool is called, the agent automatically terminates and the tool’s output is treated as the agent’s output.
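To make that information flow concrete, here is a deliberately simplified sketch of the loop in Python. It is not the API of any particular framework; `decide_next_step` and the `action` object are hypothetical stand-ins for the LLM call that picks the next step.

```python
# Schematic sketch of a traditional agent loop (not any specific framework's API).
# The agent alone decides when to stop, and each tool sees only the arguments
# the agent chooses to pass it.

def run_agent(llm, prompt: str, tools: dict) -> str:
    """tools maps tool name -> (callable, return_direct flag)."""
    history = [prompt]
    while True:
        action = llm.decide_next_step(history)      # hypothetical LLM call
        if action.kind == "final_answer":
            return action.text                      # the agent decides it's done
        func, return_direct = tools[action.tool_name]
        result = func(**action.tool_args)           # the tool sees only these args
        if return_direct:
            return result                           # tool output becomes the agent's output
        history.append(result)                      # otherwise, feed the result back in
```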
Where would you put validation in this flow? Not in the agent’s prompt: its behavior is the very thing you’re trying to validate. So a validator/critic must be supplied as a tool. But this comes with its own set of limitations. Firstly, a tool doesn’t see the agent’s inputs, so any kind of formal validation of inputs against outputs (for example, making sure that all the links contained in the input data are also contained in the output) is impossible. Secondly, the agent doesn’t actually have to call the validator tool (you can tell it to in the prompt, but that only makes the call more likely, not certain), nor does it have to take the validator’s feedback into account when producing its final reply.
As a last resort, you can do formal validation after the agent is done and ask it to try again if the check fails; this is what people mostly resort to. But who’s to say the next attempt will be any better than the first one? This approach also misses the opportunity to feed the validator’s findings back to the agent during the first run, which would have made it much more likely that the final output is valid.
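In code, this last-resort pattern typically looks something like the sketch below, where `run_agent` is the loop from the previous sketch and `is_valid` is a hypothetical formal check. Note that the validator’s findings never reach the agent while it is working; every attempt starts from scratch.

```python
def run_with_retries(llm, task: str, tools: dict, max_attempts: int = 3) -> str:
    """Naive post-hoc validation: run the whole agent, check the result, retry on failure."""
    for _ in range(max_attempts):
        result = run_agent(llm, task, tools)   # one full, unguided agent run
        if is_valid(task, result):             # hypothetical formal check, applied after the fact
            return result
        # The next attempt is just another roll of the dice: the validator's
        # feedback is never fed into the run itself.
    raise RuntimeError(f"No valid output after {max_attempts} attempts")
```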
Solution: output handlers
Feeling a bit of despair by now? Fear not: MotleyCrew has a solution. Since the traditional agent structure doesn’t allow for good validation flows, we have improved it so that it does!
From the user’s point of view, you supply an additional argument to the agent (to be precise, to the MotleyCrew agent wrapper, which works with most popular agent frameworks), called the output handler. It’s a special kind of tool with the following properties:
- The agent can only return a result by calling the output handler, not directly
- It’s a structured tool, so it can have multiple inputs if you want
- The output handler has direct access to the agent’s inputs, so it can use them in validation
- If the agent calls the output handler, the output handler validates the inputs, and if it doesn’t like them, it raises an exception describing the reason, which is fed back to the agent
- If the output handler returns without raising an exception, its output becomes the agent’s output
You can see how this allows for a whole new level of validation guarantees.
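Here is what the modified loop looks like, continuing the schematic sketch from above. Again, these are hypothetical names rather than MotleyCrew’s actual implementation; the point is the shape of the interaction: the agent can only terminate through the output handler, the handler also receives the agent’s original input, and a rejection is turned into feedback instead of a silent failure.

```python
class InvalidOutput(Exception):
    """Raised by an output handler to send feedback back to the agent."""

def run_agent_with_output_handler(llm, prompt: str, tools: dict,
                                  output_handler, max_attempts: int = 5):
    history = [prompt]
    attempts = 0
    while True:
        action = llm.decide_next_step(history)           # hypothetical LLM call, as before
        if action.kind == "final_answer":
            # The agent cannot return directly; remind it to go through the handler.
            history.append("Please submit your result via the output_handler tool.")
            continue
        if action.tool_name == "output_handler":
            attempts += 1
            try:
                # Unlike an ordinary tool, the handler also gets the agent's input.
                return output_handler(prompt, **action.tool_args)
            except InvalidOutput as exc:
                if attempts >= max_attempts:
                    return None                           # give up with a null result
                history.append(f"Output rejected: {exc}")  # feedback goes back to the agent
        else:
            func, _ = tools[action.tool_name]
            history.append(func(**action.tool_args))      # ordinary tool call
```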
Structured outputs
The simplest case is when you want the agent to return not just any string, but a valid structured output, such as a Mermaid diagram or a PGSQL query that is valid against a certain schema. The output handler then simply runs the required validation and passes any errors back to the agent until it succeeds.
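For the PGSQL case, one possible output handler is sketched below, using the sqlglot parser and the `InvalidOutput` exception from the sketch above; the set of allowed table names is made up for illustration, and the handler follows the same two-argument convention as before even though the agent’s input isn’t needed for this particular check.

```python
import sqlglot
from sqlglot import exp
from sqlglot.errors import ParseError

ALLOWED_TABLES = {"suspects", "transactions", "reports"}   # example schema, made up for illustration

def validate_sql_output(agent_input: str, query: str) -> str:
    """Output handler for an agent that must return a valid PostgreSQL query."""
    try:
        parsed = sqlglot.parse_one(query, read="postgres")
    except ParseError as exc:
        # The parse error is fed back to the agent, which then revises the query.
        raise InvalidOutput(f"The query does not parse as PostgreSQL: {exc}")

    unknown = {table.name for table in parsed.find_all(exp.Table)} - ALLOWED_TABLES
    if unknown:
        raise InvalidOutput(f"The query references unknown tables: {sorted(unknown)}")
    return query   # accepted: this becomes the agent's final output
```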
Not losing information
Let’s take the earlier example of summarizing a body of text containing links, without losing any of them. For instance, the AI agent could be writing a Suspicious Activity Report based on (human) agent notes, and the links might refer to individual suspects’ profiles within our system. The output handler could then simply be a code snippet that compares the set of links in the agent’s input (which, unlike a traditional tool, it can access directly) to the set of links in the proposed output. If they don’t match, the output handler raises an exception listing the mismatches, which is passed back to the agent, which then tries again until it succeeds.
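A minimal version of such a handler, following the same two-argument convention as the sketches above, could look like this (the URL regex is deliberately crude):

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")   # deliberately crude link matcher

def check_links(agent_input: str, summary: str) -> str:
    """Output handler: every link present in the input notes must appear in the summary."""
    input_links = set(URL_PATTERN.findall(agent_input))
    output_links = set(URL_PATTERN.findall(summary))
    missing = input_links - output_links
    if missing:
        # The list of dropped links is fed back to the agent, which then revises the summary.
        raise InvalidOutput(
            "The summary drops these links from the source notes: " + ", ".join(sorted(missing))
        )
    return summary   # all links preserved: accepted as the agent's final output
```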
Of course, sometimes it might not be possible for the agent to call the output handler in a way that it accepts as valid at all, either because the output handler is misspecified, or because the LLM is not smart enough or doesn’t have enough data. In that case, after a predefined number of attempts, the agent will return a null result. This is not great, but it is certainly better than getting an invalid answer.
Beyond basic validation: the writer-critic pattern
The usefulness of output handlers goes beyond simple algorithmic validation like the above. An output handler could itself call an LLM to generate its response: for example, the agent could be writing a blog post, and the output handler could propose improvements so that the post doesn’t sound like it was written by an AI. To prevent an endless improvement loop, in this case you might want to have the output handler auto-succeed after a predefined number of iterations.
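A sketch of such a writer-critic handler is below; `critique_llm` is a hypothetical callable that returns a critique string, or an empty string if it has no complaints.

```python
class CriticOutputHandler:
    """LLM-backed output handler that auto-succeeds after a fixed number of rounds."""

    def __init__(self, critique_llm, max_iterations: int = 3):
        self.critique_llm = critique_llm       # hypothetical callable: prompt -> critique text
        self.max_iterations = max_iterations
        self.iterations = 0

    def __call__(self, agent_input: str, draft: str) -> str:
        self.iterations += 1
        if self.iterations >= self.max_iterations:
            return draft                       # auto-succeed to prevent an endless loop
        critique = self.critique_llm(
            "Suggest concrete edits so this post doesn't read as AI-written:\n\n" + draft
        )
        if critique.strip():
            raise InvalidOutput(critique)      # suggestions go back to the writer agent
        return draft                           # the critic has no complaints
```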
Conclusion
Ensuring the reliability of AI agents’ output is a ubiquitous problem that is very hard to solve with existing tools. Output handlers act as a guardian of the output, giving the agent feedback on what still needs improving and only letting through a result that is good enough, taking the robustness of the agent to a whole new level with almost no effort from the agent author.