Agents
Recently, I have been putting more time into building my voice-recognition software, alzo. It uses the SpeechRecognition library, which modularly supports either calling a web API or running a local STT model.
I bring my software up to discuss how one would talk about and classify agents in the current discourse. No one agrees on what they mean when they use the term "agent" in a discussion. The confusion is not new: people once argued over the distinctions between "Artificial Intelligence", "Machine Learning", and "Deep Learning" in much the same way.
Agents vs Models
I'm willing to commit to a functional distinction between machine/deep learning and agents. Machine and deep learning are best thought of as modeling: you interact with them by calling some pred(X) function provided by a model library. (What distinguishes machine learning from deep learning is simply whether the model is a "traditional" statistical model or a neural network.) Agents use a model, but do something extra based on its outputs, like running tools or re-prompting.
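That distinction can be sketched in a few lines. The `pred` and `run_agent` names here are hypothetical, and the "model" is a trivial stand-in rather than anything trained:

```python
# A "model" is just a prediction function: features in, label out.
def pred(x):
    return 1 if sum(x) > 0 else 0  # stand-in for a trained classifier

# An "agent" wraps the model and acts on its output:
# instead of returning the raw label, it dispatches to a tool.
def run_agent(x, tools):
    label = pred(x)
    action = tools[label]  # decide what to do based on the prediction
    return action(x)       # the "something extra": run a tool

run_agent([2, -1], {0: lambda x: "declined", 1: lambda x: "approved"})
```

The model alone ends at `pred`; everything after it is what makes the system agentic.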
Agentic Frameworks
What does it mean for an AI company to deploy "Agents" rather than models? From my understanding, descriptions of "Skills" or "Tools" are fed into an LLM as context. If the output contains a "tool call", some special formatting for the software to parse and call code, then the tool framework is what enables "agentic" behavior. So the initial context passed into the LLM essentially prompts it to "think" about which tools or skills it would need to execute. The software then runs the tool and prompts the LLM again to actually generate a response1.
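The loop described above can be sketched as follows. Everything here is an assumption for illustration: `fake_llm` is a stub standing in for a real model, and the JSON tool-call format is made up (real frameworks each define their own):

```python
import json

# Stubbed LLM: on the first call it emits a "tool call" in a special
# format; once the tool result is in context, it emits a final answer.
def fake_llm(prompt):
    if "TOOL_RESULT" not in prompt:
        return json.dumps({"tool": "get_time", "args": {}})
    return "It is currently " + prompt.split("TOOL_RESULT: ")[1]

def get_time():
    return "12:00"

TOOLS = {"get_time": get_time}

def agent_step(user_prompt):
    out = fake_llm(user_prompt)
    try:
        call = json.loads(out)  # parse the special tool-call format
    except json.JSONDecodeError:
        return out              # no tool call: it's already a plain answer
    result = TOOLS[call["tool"]](**call["args"])  # run the tool
    # Re-prompt the LLM with the tool result appended to the context.
    return fake_llm(user_prompt + "\nTOOL_RESULT: " + result)
```

The key point is that the LLM never executes anything itself; the surrounding software parses the output, runs the code, and feeds the result back.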
Reflecting on how you would go about implementing an agentic framework, it seems relatively simple: associate the string representation, i.e., the function name, with the function pointer, and when the parser recognizes the format, it can search through the tool names to find the correct function. In an interpreted language this might be even easier, since functions themselves automatically carry metadata about their own name.
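In Python, for instance, that metadata is the `__name__` attribute, so the name-to-function mapping can be built directly from the function objects. A minimal sketch (the tool functions here are hypothetical):

```python
# Each function object already knows its own name via __name__,
# so the registry builds itself from a list of function objects.
def search_web(query):
    return f"results for {query}"

def add(a, b):
    return a + b

TOOLS = {f.__name__: f for f in (search_web, add)}

# Once the parser extracts a tool name from the model output,
# dispatch is just a dictionary lookup followed by a call.
def dispatch(name, *args):
    return TOOLS[name](*args)
```

In a compiled language without this reflection, you would typically register each name-to-pointer pair by hand, or generate the table at build time.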
-
They would have had to train the models to know how to use tools, right? ↩