Building a customer support chatbot using MotleyCrew and Ray

If you read our latest post on event-driven workflows, you may have noticed that MotleyCrew is becoming more about “building the Motley way” than about being a do-it-all toolkit.

The initial idea of MotleyCrew was to integrate different agentic frameworks, allowing users to get any agents to work together in one system. By the same logic, just as real-world applications may require different frameworks depending on the use case, they may also require different orchestration mechanisms. The original MotleyCrew pattern involves a DAG of interdependent tasks, each generating one or more units of work for the agents, which can be processed in parallel. While this mechanism is useful for many cases, like the research agent, it just can’t be the solution for everything. While we bounced this thought around in our heads, many frameworks were releasing their takes on the perfect tool for creating agentic workflows. We, on the other hand, are becoming more and more convinced that “agentic apps” are no more than compound systems, and that the choice of tools for building such systems depends on the use case more than on anything else.

That’s why for this new demonstration of MotleyCrew we step away from our crew semantics and create a customer support chatbot app using Ray for task management and deployment.

At the heart of the support chatbot lies the issue database. It stores information about what issues can arise and how they can be solved in the form of a tree, where the intermediate nodes are the issue categories and the leaves are the individual issues. The simplest way to compose such a tree is to use information about past issues solved by human support agents. If those issues are not already categorized, an LLM can categorize them.
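As a rough illustration of the tree described above, here is a minimal in-memory sketch. The class names, the `add_issue` helper, and the sample tickets are all hypothetical and are not taken from the demo code, which keeps the tree in a graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """Leaf node: a past issue and how it was resolved."""
    description: str
    resolution: str


@dataclass
class Category:
    """Intermediate node: holds subcategories and individual issues."""
    name: str
    subcategories: dict[str, "Category"] = field(default_factory=dict)
    issues: list[Issue] = field(default_factory=list)


def add_issue(root: Category, path: list[str], issue: Issue) -> None:
    """Insert an issue under a category path, creating categories as needed."""
    node = root
    for name in path:
        node = node.subcategories.setdefault(name, Category(name))
    node.issues.append(issue)


# Hypothetical resolved tickets, already categorized (by hand or by an LLM).
root = Category("root")
add_issue(root, ["Account", "Login"],
          Issue("I forgot my password", "Send a password reset link."))
add_issue(root, ["Account", "Billing"],
          Issue("I was charged twice", "Refund the duplicate charge."))
```

Each past ticket only needs a category path and a resolution, so growing the tree from a support history is mostly a matter of assigning paths.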

The purpose of the AI support agent is to try to resolve the customer’s issue based on past solutions to similar problems. When a customer asks a question, the agent first needs to determine the relevant category in the issue tree. Often, the initial problem description is vague, so the agent can walk itself down the tree by asking the customer clarifying questions. When it reaches the individual issues, the agent examines them and proposes a solution if one is present among, or can be derived from, the resolved issues. Otherwise, the issue is escalated to a human agent.

The architecture is simple. We store the issue tree in a graph database. We provide a view of the issue tree as a list of direct children (subcategories and issues) of a given node category. The view is provided both in the initial prompt (as a list of top-level categories) and interactively via a tool that is given to the agent.
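The view described above might look something like the following sketch. A plain dictionary stands in for the graph database here, and `ISSUE_TREE`, `list_children`, and the sample entries are assumptions made for illustration only.

```python
# In-memory stand-in for the graph database: category -> direct children.
# Each child is a (kind, name) pair; all names here are hypothetical.
ISSUE_TREE = {
    "root": [("category", "Account"), ("category", "Payments")],
    "Account": [("category", "Login"), ("issue", "How do I change my username?")],
    "Login": [("issue", "I forgot my password")],
    "Payments": [],
}


def list_children(category: str) -> str:
    """Return the direct children of a category as text the agent can read."""
    children = ISSUE_TREE.get(category)
    if children is None:
        return f"Unknown category: {category}"
    if not children:
        return f"Category '{category}' has no subcategories or issues."
    return "\n".join(f"- [{kind}] {name}" for kind, name in children)


# The top-level listing would go into the initial prompt...
initial_view = list_children("root")
# ...while deeper levels are fetched on demand through the tool.
login_view = list_children("Login")
```

Showing only one level at a time keeps the prompt short and lets the agent decide which branch to explore next.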

To ask the customer additional questions, the agent is provided with a chat tool. Also, there is a resolution tool, which is used either to submit a solution or to escalate to a human.

The agent follows the “every response is a tool call” design, which we believe is great for most agentic applications. The agent loop can only end with a ResolveIssueTool call (or when some constraint, like the maximum number of iterations, is reached). We achieve this by making the tool an output handler: its output is returned directly, and any attempt by the agent to return something that bypasses the tool is blocked.
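To make the control flow concrete, here is a framework-free sketch of such a loop. It does not use MotleyCrew's actual API: `run_agent`, the tool functions, the iteration limit, and the scripted stand-in for the LLM are all hypothetical, and escalating when the limit is hit is just one possible choice.

```python
from typing import Callable

MAX_ITERATIONS = 5  # assumed limit, not taken from the demo


def resolve_issue(resolution: str = "", escalate: bool = False) -> str:
    """Output handler: whatever it returns becomes the agent's final answer."""
    return "ESCALATED to a human agent" if escalate else f"RESOLVED: {resolution}"


def ask_customer(question: str) -> str:
    """Chat tool stub; a real one would wait for the customer's reply."""
    return "I can't log in."


TOOLS = {"ask_customer": ask_customer}
OUTPUT_HANDLER = "resolve_issue"


def run_agent(next_action: Callable[[list[str]], dict]) -> str:
    """Run the loop; it ends only via the output handler or the limit."""
    history: list[str] = []
    for _ in range(MAX_ITERATIONS):
        action = next_action(history)
        tool, args = action.get("tool"), action.get("args", {})
        if tool == OUTPUT_HANDLER:
            return resolve_issue(**args)
        if tool in TOOLS:
            history.append(f"{tool} -> {TOOLS[tool](**args)}")
        else:
            # Plain-text answers and unknown tools are blocked.
            history.append("Rejected: respond only by calling a tool.")
    return resolve_issue(escalate=True)


def scripted_model(history: list[str]) -> dict:
    """Stand-in for the LLM: replays a fixed script, one step per turn."""
    script = [
        {"text": "Have you tried restarting?"},  # bypasses tools, gets blocked
        {"tool": "ask_customer", "args": {"question": "What exactly fails?"}},
        {"tool": "resolve_issue", "args": {"resolution": "Send a password reset link."}},
    ]
    return script[len(history)]


result = run_agent(scripted_model)
```

In this sketch the first, plain-text reply is rejected and fed back to the model, so the only way out of the loop is through `resolve_issue`.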

If the customer has further questions after the resolution, a summary of the previous conversation is provided to the agent.

Deployment using Ray

With the agent logic set up, we need a way to expose it as an API, allowing multiple customers to connect to it at the same time. Ray is a wonderful tool for managing Python workflows (and we’re planning to expand Ray integration in the near future). Its core functionality allows scheduling and running distributed tasks, which already makes it a good backend solution for building scalable agent applications. Among other nice things, it provides Ray Serve as a way to create APIs.

In this demo, we used Ray Serve to create an API endpoint for the chatbot. It communicates with client apps using WebSockets over FastAPI. We included a simple web interface for testing the app.

In Ray, a deployment is just a Python class, which we can configure using a simple decorator like this:

from ray import serve

# Three replicas, each reserving one CPU and no GPU
@serve.deployment(num_replicas=3, ray_actor_options={"num_cpus": 1, "num_gpus": 0})
class SupportAgentDeployment:
    ...

Naturally, this app supports multiple simultaneous sessions that are balanced between replicas.

Here’s our simple interface with an example interaction:

The issue tree included the answer to the “I forgot my password” question, but not to the “I don’t have access to my email” one, so the second question was escalated.

Check out the brief code for the app in our repository.

We encourage you to clone or download it and play around. It shows the flexibility of MotleyCrew for building agentic apps: you don’t have to stick to any particular agent framework, orchestration model, or deployment solution. Try building the Motley way: use whatever building blocks you deem fit, define your own tools in a few lines of code, and use trusted solutions for deployment and scaling. And share your ideas on our Discord!
