
Why MotleyCrew?
Large language models (LLMs), such as GPT-4 or Claude 3, can do many things that were very hard only a couple of years ago, especially interpreting and generating natural language. They become even more powerful with simple wrappers around them, such as ChatGPT, which allow them to use external tools: for example, to execute code or search the web. An LLM surrounded by such a wrapper is called an "AI agent".
But for many applications, a single agent is not smart and powerful enough, and solving them requires combining several such agents. Common patterns for this include one agent using one or several others as tools, and the writer-critic pattern, where one agent attempts a task, another critiques its output, and the first then revises the output to address the critique, and so on.

Another key component of AI systems that has been getting increased attention lately is the knowledge graph - a way of representing structured relationships among different chunks of information, such as snippets of text or code. Knowledge graphs can both be created by LLMs, which extract relationship information from unstructured text, and be used to retrieve relevant information to feed into an LLM prompt.
When we started trying to build multi-agent systems, we tried using the existing open-source frameworks for that task, such as Autogen, CrewAI, and MetaGPT. However, none of them proved to be quite what we were looking for: production-ready, fully open-source, easy to use, and allowing really flexible, arbitrarily nested, user-defined agent interactions as a matter of course.
Some didn't have built-in observability (of the kind LangSmith or Lunary provide); some only allowed simple chaining of agents into a sequence, or, at most, agents using other agents as tools (one level deep, with no further nesting). Where more interesting patterns were implemented (such as Autogen's nested chats or Society of Mind), they were changes to the core package API, not something a user could just create in passing.
A common limitation we saw was that virtually all the existing multi-agent frameworks expected the user to use their specific implementation of AI agents. But building good agents is a difficult task that is distinct from multi-agent orchestration, so it's very unlikely that a single framework would have both the best agents and the best multi-agent communication layer.
That was the starting point for MotleyCrew: providing wrappers for all the above frameworks' agents, and focusing on making their interaction as simple and powerful as possible.
Thus, for example, in MotleyCrew you can directly pass agents as tools to other agents (without introducing an additional "delegation" concept with its own semantics), and these, in turn, can have other agents as tools, and so on.
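Here is a rough sketch of what that can look like in code. The class name, import path, and constructor arguments below are illustrative assumptions rather than the exact MotleyCrew API, so please check the documentation for specifics:

```python
# Illustrative sketch only: the agent class name and its constructor
# arguments are assumptions, not necessarily the exact MotleyCrew API.
from langchain_core.tools import tool
from motleycrew.agents.langchain import ReActToolCallingMotleyAgent


@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())


# A specialist agent equipped with an ordinary tool.
editor = ReActToolCallingMotleyAgent(
    name="editor",
    description="Polishes a draft and reports its word count",
    tools=[word_count],
)

# The writer gets the editor agent directly in its tools list:
# no separate "delegation" concept, an agent simply is a tool,
# and the editor could in turn hold further agents as its own tools.
writer = ReActToolCallingMotleyAgent(
    name="writer",
    description="Writes a short article, handing drafts to the editor",
    tools=[editor],
)
```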
Another MotleyCrew feature, output handlers, is a simple way of implementing the writer-critic pattern mentioned above and of providing user-specified guarantees about an agent's output. It works as follows: the agent is told (under the hood) that it must return its final result only via the output handler. The output handler runs whatever verification logic the user specifies; if that fails, it tells the agent what went wrong and asks it to try again, until it succeeds. On success, the output handler's output is returned as the agent's output.
This pattern takes the reliability of AI agents to a whole new level: the output handler can contain algorithmic logic (for example, verifying that all the links contained in the agent's input are also present in the output), agentic logic (for example, critiquing the writing style, or double-checking that the output fulfills the instructions given to the agent), or any combination of the two.
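A minimal sketch of an algorithmic output handler might look like the following. The InvalidOutput exception and the output_handler argument are assumptions about the API; the actual names may differ:

```python
# Hypothetical sketch of an output handler; the InvalidOutput exception and
# the output_handler argument are assumptions, not verbatim MotleyCrew code.
from motleycrew.common.exceptions import InvalidOutput


def check_draft(output: str) -> str:
    """Reject drafts that contain no links or leftover TODO markers."""
    if "http" not in output:
        raise InvalidOutput("The draft must cite at least one link.")
    if "TODO" in output:
        raise InvalidOutput("Please resolve the remaining TODO items.")
    return output  # accepted: this becomes the agent's final output


# The agent would then be constructed with e.g. output_handler=check_draft.
# Every InvalidOutput message is fed back to the agent, which revises its
# answer and tries again until the check passes. The same hook could just
# as well call a critic agent instead of (or alongside) these simple checks.
```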
A final MotleyCrew feature worth highlighting is its built-in knowledge graph backend, which allows agents to exchange information in ways more flexible than calling each other; in particular, agents can inspect and modify the knowledge graph to dynamically identify new task units for themselves, or create them for other agents.
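For instance, an agent might leave a note in the shared graph for others to act on. The sketch below is purely illustrative: the node base class, graph store name, and method names are assumptions rather than the exact MotleyCrew API.

```python
# Purely illustrative: the node base class and graph store names below are
# assumptions about the MotleyCrew API, not verbatim code.
from motleycrew.storage import MotleyGraphNode, MotleyKuzuGraphStore


class OpenQuestion(MotleyGraphNode):
    """A question one agent raises for another agent to answer later."""
    question: str
    raised_by: str


graph_store = MotleyKuzuGraphStore.from_persist_dir("./knowledge_graph")
graph_store.insert_node(
    OpenQuestion(question="Which benchmark should we cite?", raised_by="writer")
)
# Another agent (or the orchestrator) can later query the graph for
# unanswered OpenQuestion nodes and turn each one into a new task unit.
```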
We are not stopping there: by continuously tackling new use cases, we will continue to identify natural usage patterns for AI agents, and make them easy to use for MotleyCrew users.
In addition to having such patterns available for use with existing or custom-written agents, MotleyCrew provides what we consider basic prerequisites for a production-worthy multi-agent library: built-in observability (via Lunary), optional disk-based caching of LLM calls for easier development and testing (via motleycache), as well as synchronous, asyncio, and multithreading backends. It is also open source under the MIT license, so you can run it freely for personal or commercial projects.
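For example, turning on the LLM call cache during development could be as simple as the following. The function names here are our best guess at the motleycache interface; see its documentation for the exact calls:

```python
# Sketch only: enable_cache / disable_cache are assumptions about the
# motleycache interface.
from motleycache import enable_cache, disable_cache

enable_cache()   # subsequent LLM/HTTP calls get cached on disk

# ... run your agents; repeated runs replay cached responses,
# which makes development and testing faster and cheaper ...

disable_cache()  # switch back to live calls when you're done
```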
Multi-agent AI systems hold huge promise for tackling many challenges that simpler systems cannot. MotleyCrew's goal is to provide a way to build them that is easy, reliable, and powerful. Give it a try, and if you have any other cool patterns you'd like us to implement, please get in touch!
