In my last video I talked about how Bloomfilter is implementing chains of large language models to accomplish difficult tasks more accurately. Among AI implementers this is often known as an agentic workflow. Andrew Ng has described four approaches to agentic workflows: reflection, tool use, planning, and multi-agent collaboration. At Bloomfilter we are currently implementing tool use and multi-agent collaboration, though our agent collaboration is still in its early stages, and we plan to expand our capabilities to take advantage of each of these approaches.
There are a few no-code tools for implementing agentic workflows; CrewAI and CassidyAI are a couple of examples. We are implementing our agents, and their assistants, in code, because we are relying heavily on the code and models we have already built for our SaaS application. Today I want to walk through how we are implementing these agents. My use case for this discussion is eliminating tedious tasks from my routine, which is exactly what we hope AI will do for us. The specific case I have in mind is having AI use Bloomfilter to pull the data needed to fill in my weekly KPIs.
First, let’s talk about what needs to happen for AI to give me my KPIs. It will need to know what my KPIs are. I could tell it my KPIs every week and hope it fetches the same ones each time, but I’d rather tell it once and have it remember my preferences. Fortunately, we already have an AI assistant that can record preferences in the database. Next, we will need an assistant that understands the language of KPIs and knows how to use Bloomfilter to fetch those values. Finally, we’ll want to let the assistant manager know about these tools, and that’s when the fun begins: we get to check whether any of this works.
We will start with our context assistant, adding the ability to store and retrieve KPIs. We give it instructions that tell it what this capability is, so it can update its stored understanding of my KPIs, and we implement methods that use our existing infrastructure to save and get those KPIs.
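Here is a rough sketch of what those save and get methods might look like. The in-memory dictionary stands in for the preference store we already have in the database, and all of the names are illustrative rather than Bloomfilter's actual API.

```python
import json

# Stand-in for the existing database-backed preference store; names are illustrative.
_preferences: dict[str, str] = {}

def save_kpis(user_id: str, kpi_names: list[str]) -> str:
    """Persist the KPIs a user wants tracked so they only have to state them once."""
    _preferences[f"{user_id}:kpis"] = json.dumps(kpi_names)
    return f"Saved KPIs: {', '.join(kpi_names)}"

def get_kpis(user_id: str) -> list[str]:
    """Return the KPIs this user previously asked us to track (empty list if none)."""
    raw = _preferences.get(f"{user_id}:kpis")
    return json.loads(raw) if raw else []
```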
The next thing we want to do is create a new file for KPIs. This KPI assistant will be similar to our task detail assistant. We’ll import JSON and other necessary modules. We’ll set it up to get detailed information about specific KPIs.
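As a hedged sketch, here is how one of this assistant's tools might be described to the model, assuming an OpenAI-style function schema; the tool name and parameters here are illustrative, and the actual wiring in our assistants may differ.

```python
import json

# Hypothetical declaration of one KPI tool in an OpenAI-style function schema;
# the real assistant may register its tools differently.
GET_KPIS_FOR_PORTFOLIO_TOOL = {
    "type": "function",
    "function": {
        "name": "get_kpis_for_portfolio",
        "description": (
            "Fetch a KPI value for a portfolio over a date range, "
            "or list the available KPIs when no name is given."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "kpi_name": {"type": "string", "description": "e.g. throughput or cycle_time"},
                "portfolio": {"type": "string"},
                "start_date": {"type": "string", "description": "ISO date, e.g. 2024-10-07"},
            },
            "required": [],  # everything optional so the model can also ask for a listing
        },
    },
}

print(json.dumps(GET_KPIS_FOR_PORTFOLIO_TOOL, indent=2))
```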
I got interrupted there, came back, started coding, and forgot to turn my video on, so you missed some things here. I’m going to say it wasn’t very exciting, just some coding, so I will catch you up on what you missed. The context assistant, which you watched me build (or rather, which Cursor built for me with my guidance), is straightforward: it saves the context for our KPIs, so we can save the KPIs and get them back.
After that, I built a KPIs assistant. The job of this assistant is to fetch those KPIs. It can take a portfolio, a project, a start date, and a KPI name, and everything is optional because we also want to give it the ability to list things off. You could just come in and say, “I don’t know what KPIs I can use,” and it should give you a list of them. Beyond that, we have get-KPIs-for-portfolio and get-KPIs-for-project; all either of them does is construct a call to our existing get work period measure for that specific scope. We had already built that for the work period assistant, so we’re just reusing it.
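As a sketch of what that reuse might look like, here are the two scoped tools delegating to a stub of get work period measure; the stub, the KPI list, and all of the names are placeholders standing in for the existing Bloomfilter code rather than its real signatures.

```python
from typing import Optional

AVAILABLE_KPIS = ["throughput", "cycle_time", "velocity"]  # illustrative list

def get_work_period_measure(kpi_name: str, scope: dict, start_date: Optional[str]) -> float:
    """Stand-in for the existing work-period measure; the real version queries Bloomfilter."""
    return 0.0  # placeholder value

def list_kpis() -> dict:
    """Answer 'what KPIs can I use?' when no KPI name is given."""
    return {"available_kpis": AVAILABLE_KPIS}

def get_kpis_for_portfolio(kpi_name: Optional[str] = None,
                           portfolio: Optional[str] = None,
                           start_date: Optional[str] = None) -> dict:
    """Fetch a KPI scoped to a portfolio, or list the KPIs if no name is supplied."""
    if kpi_name is None:
        return list_kpis()
    value = get_work_period_measure(kpi_name, scope={"portfolio": portfolio}, start_date=start_date)
    return {"kpi": kpi_name, "portfolio": portfolio, "start_date": start_date, "value": value}

def get_kpis_for_project(kpi_name: Optional[str] = None,
                         project: Optional[str] = None,
                         start_date: Optional[str] = None) -> dict:
    """Fetch a KPI scoped to a project, or list the KPIs if no name is supplied."""
    if kpi_name is None:
        return list_kpis()
    value = get_work_period_measure(kpi_name, scope={"project": project}, start_date=start_date)
    return {"kpi": kpi_name, "project": project, "start_date": start_date, "value": value}
```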
Let’s take a look at the results. On a fresh thread, I have prompted it with “get me last week’s KPIs.” It has no context, so it says, “Hey, you should tell me what KPIs you are interested in so I can track those.” Then we say, “Okay, great. I am interested in tracking throughput and cycle time.”
Now we can start to see some of the problems with AI and large language models, even with chains. The first problem centers on this call to the calendar assistant with the time period “last week.” I created the calendar assistant in a new file because, when I would say “last week,” the model would answer, “Great, I know what last week is. That is October 9th, 2023.” That is not an acceptable result for last week, so I gave it a tool that takes “last week” as a parameter and returns the actual dates for last week.
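A minimal sketch of that calendar tool follows; the function name and the Monday-through-Sunday week boundaries are my assumptions, but the point is that the dates come from code running today rather than from the model’s memory.

```python
from datetime import date, timedelta

def resolve_time_period(period: str) -> dict:
    """Turn a relative period like 'last week' into explicit start and end dates."""
    today = date.today()
    if period.strip().lower() == "last week":
        start = today - timedelta(days=today.weekday() + 7)  # Monday of the previous week
        end = start + timedelta(days=6)                      # Sunday of the previous week
        return {"start_date": start.isoformat(), "end_date": end.isoformat()}
    raise ValueError(f"Unsupported time period: {period!r}")
```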
However, it’s still calling the KPIs assistant with dates from 2023, even though it now knows the correct dates for last week. Because of that, we run into another problem: it gets confused and starts asking for different KPIs, some of which we don’t even have, like defect density and code coverage. The real problem is that it’s not going to be able to find a velocity for those dates, because we don’t have data going back that far.
It then gets into a cycle of trying to work through the issue, talking to itself rather than to me. This is the chain talking to itself, trying to figure it out: it keeps calling the KPIs assistant with the wrong dates, each tool responds that it can’t work with that time frame, and we end up stuck in a loop where it doesn’t know what to do.
As you can see, you didn’t miss much coding. The real challenge is how we structure these assistants so that they can make proper use of the tools they have, retrieve valuable information, and then present it to the user. For an implementer, somebody doing the engineering work to take the science and get it into the hands of end users, there is still some work to be done here.
I’m going to keep working on this and maybe come back with another video where we talk about how we solved the problem, because it is going to get solved one way or another. I will figure out a way; it’s just not clear to me yet how. Until that video, best of luck to you.