MotleyCrew with Llama 3.1: fully open source multi-agent AI

Monday, September 16, 2024

MotleyCrew, like many other AI frameworks, currently uses OpenAI’s GPT-4o as the default model for its agents. It’s state-of-the-art, quite reliable, cheap enough, and supports functionality essential for building agents, such as function calling. But what if you want a different LLM? Perhaps you have to use an in-house LLM hosted by your company, or you want to save money on inference at scale. In that case, you can look at open-source LLMs, which you can host yourself and which are becoming increasingly sophisticated, now approaching the quality of proprietary models. We wanted to make it easy to use such models with MotleyCrew.

First of all, MotleyCrew is flexible about the models you use: all you need to do is provide a suitable client for the agent backend you’re using (https://motleycrew.readthedocs.io/en/latest/choosing_llms.html#using-custom-llms).
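Many hosted Llama 3.1 providers (and Ollama locally) expose OpenAI-compatible endpoints, so one way to build such a client is to point an OpenAI-style client at the provider’s base URL. Below is a minimal sketch: `make_client_kwargs` is a hypothetical helper, not part of MotleyCrew, and the base URLs are the providers’ commonly documented OpenAI-compatible endpoints (verify them against each provider’s docs).

```python
# Hypothetical helper: map a provider name to the kwargs you could pass to an
# OpenAI-compatible client (e.g. openai.OpenAI or LangChain's ChatOpenAI).
# Base URLs below are the commonly documented OpenAI-compatible endpoints;
# verify them against each provider's documentation.

OPENAI_COMPATIBLE_BASE_URLS = {
    "groq": "https://api.groq.com/openai/v1",
    "together": "https://api.together.xyz/v1",
    "ollama": "http://localhost:11434/v1",  # local Ollama server
}


def make_client_kwargs(provider: str, model: str, api_key: str = "") -> dict:
    """Return kwargs for constructing an OpenAI-compatible client."""
    if provider not in OPENAI_COMPATIBLE_BASE_URLS:
        raise ValueError(f"No OpenAI-compatible endpoint known for {provider!r}")
    return {
        "base_url": OPENAI_COMPATIBLE_BASE_URLS[provider],
        "model": model,
        # Ollama ignores the API key, but most OpenAI clients require a
        # non-empty one, so we fall back to a placeholder.
        "api_key": api_key or "ollama",
    }
```

You could then unpack these kwargs into, e.g., `openai.OpenAI(base_url=..., api_key=...)` or LangChain’s `ChatOpenAI`, and hand the resulting client to your agent as described in the docs linked above.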

We investigated several platforms that host Meta’s Llama 3.1 models for use with MotleyCrew. We tested our agents with Llama 3.1 70B and 405B hosted on Together, Replicate, and Groq. Of these platforms, only Groq had an API that supported tool calling (as of September 2024). All the agents we tested, even our sophisticated ReAct agent, worked well with Llama 3.1 405B hosted on Groq.
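Tool calling over Groq’s API follows the OpenAI function-calling schema: you attach a `tools` array of JSON-schema function definitions to the chat-completions request, and the model may respond with a tool call instead of plain text. Here is a sketch of building such a request body; the `get_weather` tool and the model name are illustrative, not something from our test suite.

```python
import json


def build_tool_call_request(model: str, user_message: str) -> dict:
    """Build a chat-completions request body with one example tool attached,
    following the OpenAI function-calling schema that Groq's API mirrors."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # illustrative tool
                    "description": "Get the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


body = build_tool_call_request("llama-3.1-405b", "What's the weather in Paris?")
print(json.dumps(body, indent=2))
```

A provider that supports tool calling will return a structured `tool_calls` entry in the response message when the model decides to invoke the tool; providers without this support are what made the other platforms unsuitable for our agents.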

What about self-hosting? We tried running Llama 3.1 70B via one of the most popular solutions, Ollama. It worked quite well with simpler agents, but it struggled with larger prompts, such as the one used by our ReAct agent: when the prompt was too large, it usually returned empty output. It’s possible that the 405B model would fare much better, as it did on Groq, but we haven’t tested it yet because it is much more expensive to self-host. We also haven’t tried other self-hosting solutions, such as llama.cpp. If you have any experience using it for agents, please let us know!
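If you want to poke at a local Ollama server directly before wiring it into an agent, its native HTTP API takes a simple JSON body. A stdlib-only sketch, assuming Ollama runs on its default port and the model tag has been pulled (e.g. with `ollama pull llama3.1:70b`):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint


def build_generate_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint.
    stream=False asks for a single JSON response instead of a stream."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """Send the prompt to a local Ollama server and return the response text.
    Requires a running Ollama server with the model already pulled."""
    data = json.dumps(build_generate_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("llama3.1:70b", "Say hello in one word.")` would return the model’s completion; an empty `response` field on large prompts is the failure mode we described above.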

As a result, we now support the Ollama, Together, Replicate, and Groq providers out of the box (https://motleycrew.readthedocs.io/en/latest/choosing_llms.html#providing-an-llm-to-an-agent). Try them out and tell us what you think! And if you want us to support other providers, please let us know through our GitHub.
