AI Agents will change the Workplace forever…
Welcome to this new post about my Data Analytics journey.
We’ve all heard about Artificial Intelligence, and most of us have already embraced ChatGPT as our new best friend. But what if I told you that we’re only at the beginning of an era in which AI takes over many more routine computer tasks?
In this new blog I want to create some awareness about a phenomenon that many of you may never have heard of: AI agents. AI agents are ‘virtual beings’ that do their job in a workplace that you set up in a computer environment. They are highly intelligent and lightning fast. They can be assigned any role you can imagine: Data Analyst, Code Developer, Tester, Web Designer, Customer Assistant, Blog Editor, Supervisor, Researcher, Office Manager, Web Scraper, Project Manager, Secretary… The sky is the limit. All the routine computer tasks that people do can be handed to these models. Even better, you can create your own team of agents and let them do the work for you. Truly mind-blowing.
My blog focuses on the ‘how’ of AI agents: how they work, and how you can take advantage of them. In brief, I’m going to elaborate on:
1. The way the agents take on tasks, execute them and pass them on
2. The tools required to set up the AI agent environment
So, if you’re ready for an entirely new workforce waiting for you at your fingertips, keep reading…
The ‘Why’ of AI Agents
AI agents are designed to perform tasks autonomously, minimizing human intervention. Early examples are voice assistants like Amazon Alexa, Google Assistant, and Apple Siri, which allow you to interact through natural language.
The more advanced AI agents came about when large language models (LLMs) became widely available, notably with the launch of ChatGPT in late 2022. LLMs are systems you can communicate with and ask to do more intelligent things for you: office tasks (writing texts and emails, summarizing long texts or images into documents and presentations), software engineering, analysis work, image creation, video editing and many more. The models mimic human-like behavior in understanding questions (prompts) and generating responses.
In these early days AI agents were ‘born’, probably with two goals.
1: Making it easier for humans to communicate with technology
We have noticed the advantages of communicating with human-like technology and we like it a lot. OpenAI has grown tremendously, and so have all the other tech giants that followed in its footsteps.
But human communication with AI is not just chatting with ChatGPT. Intelligent chats can be deployed in many other places, first of all in customer-facing apps and websites. Think about chatting with Amazon when you’re interested in a particular product, or with a virtual travel advisor at Booking.com. These chat AI agents can lead you to the purchase. Other applications of these “CoPilots” are assistants that can perform various computing tasks like Excel calculations or coding suggestions.
2: Taking ‘the human’ out of the loop (Flow engineering)
Humans interact with AI models and get great benefits out of it. Getting work done quicker is a nice productivity gain, but from an overall efficiency point of view it is not enough. Let me give you five reasons why:
– Humans interact with AI models in an inefficient way, mostly manually typing in questions
– In a lot of cases multiple attempts have to be made to get it right
– It is hard to use previous steps for recurring tasks
– Chatting with a GPT takes place in an isolated environment
– It is hard to connect the efficiency improvements that AI gives to an entire process
These reasons made it clear from the start that AI models needed to be equipped with a more powerful, flexible and configurable layer around them in order to automate not only routine tasks but even entire processes and projects (‘flows’). This new layer is where the AI Agent domain lies.
What do AI teams look like?
In the introduction of this blog I described AI agents as ‘virtual beings’ that can be deployed in your daily work, in projects or in business processes. Human management or intervention is always possible. AI agents can be configured in systems that handle AI flows. I will explain more of this later on. But first let’s describe how the AI agents operate. There are four situations:
– AI agents operate in a two-agent chat
– AI agents operate in a sequential chat
– AI agents operate in a group chat
– AI agents operate in combined or nested chats (out of scope in this blog)
Let’s explore these options a bit further with some diagrams.
1. Two Agent Chat
2-agent chat (source Microsoft)
The input message is initialized and put in a workflow, along with some context such as the LLM to be used and the tools that have been assigned to help do the work (a PDF, a scraped website, a CSV file, etc.). The initializer converts the message to a data format that the agents can understand. Depending on their role, the agents then start executing their tasks while communicating with each other. Once they are finished, the result is summarized and fed back to the user.
Example: a user communicates with a website. One agent serves as the customer assistant; the other agent answers the assistant’s questions about things that are on the website.
Example: a user communicates with their computer in order to plan an event. An agent can access tools like the user’s calendar, keep the information from the calendar in its memory and plan the next step.
Example of the way an agent operates (source LangChain)
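To make the two-agent pattern a bit more concrete, here is a minimal sketch in plain Python. It is not the AutoGen or LangChain API: the `Agent` class, `two_agent_chat` and `summarize` are hypothetical names, and the `reply_fn` lambdas are stand-ins for real LLM calls. It only illustrates the flow described above: a message goes in, two agents take turns, and a summary comes back to the user.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for an LLM call: incoming message in, reply out.
ReplyFn = Callable[[str], str]


@dataclass
class Agent:
    name: str
    role: str
    reply_fn: ReplyFn

    def reply(self, message: str) -> str:
        return self.reply_fn(message)


def two_agent_chat(initiator: Agent, responder: Agent, task: str,
                   max_turns: int = 2) -> List[Tuple[str, str]]:
    """Let two agents take turns on a task and return the transcript."""
    transcript = [("user", task)]
    message = task
    speakers = [initiator, responder]
    for turn in range(max_turns):
        speaker = speakers[turn % 2]      # alternate between the two agents
        message = speaker.reply(message)  # each reply becomes the next input
        transcript.append((speaker.name, message))
    return transcript


def summarize(transcript: List[Tuple[str, str]]) -> str:
    """Naive summary (an assumption): just hand back the last message."""
    return transcript[-1][1]


# The website example: a customer assistant asks a site expert.
assistant = Agent("assistant", "customer assistant",
                  lambda m: f"Looking up: {m}")
site_expert = Agent("site_expert", "website knowledge",
                    lambda m: f"Answer for '{m}'")

transcript = two_agent_chat(assistant, site_expert,
                            "What is the return policy?")
result = summarize(transcript)
print(result)
```

In a real framework the lambdas would be replaced by model calls, and the summary step would usually be done by the model as well.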
2. Sequential Chat
A sequential chat is a sequence of chats between two agents, tied together by a carryover, which carries the outcome of one chat as a new message to the next pair of agents. The carryover follows a sequential pattern. This pattern is useful for complex tasks that can be broken down into interdependent sub-tasks. The figure below illustrates how it works.
Sequential Chat (source Microsoft)
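The carryover idea can be sketched in a few lines of plain Python. Again, this is an illustration under my own assumptions and not the API of any framework: `run_sequence` and the three `*_chat` functions are hypothetical, and each chat function stands in for a full conversation between a pair of agents.

```python
from typing import Callable, List, Tuple

# A chat is modelled as a function: (message, carryover so far) -> outcome.
ChatFn = Callable[[str, List[str]], str]


def run_sequence(steps: List[Tuple[str, ChatFn]]) -> List[str]:
    """Run chats one after another, passing earlier outcomes along."""
    carryover: List[str] = []
    for message, chat in steps:
        # Each chat sees a copy of everything that was produced before it.
        outcome = chat(message, list(carryover))
        carryover.append(outcome)
    return carryover


def research_chat(message: str, carryover: List[str]) -> str:
    return "findings: three competitor prices"


def analysis_chat(message: str, carryover: List[str]) -> str:
    # Depends on the outcome of the previous chat.
    return f"analysis based on [{carryover[-1]}]"


def report_chat(message: str, carryover: List[str]) -> str:
    return f"report with {len(carryover)} inputs"


outcomes = run_sequence([
    ("Collect competitor prices", research_chat),
    ("Compare with our prices", analysis_chat),
    ("Write a short report", report_chat),
])
for outcome in outcomes:
    print(outcome)
```

The point of the sketch is the dependency: the analysis step cannot run without the research outcome, and the report step receives both.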
3. Group Chat
A group chat is orchestrated by a special agent called the “GroupChatManager”. In the first step of the group chat, the Group Chat Manager selects an agent to give input. The input is given back to the Group Chat Manager, who then distributes it to the other agents. So, each agent is kept in the loop of the chat. This iterative conversation can go on for as many rounds as are configured in the software.
Group Chat (source Microsoft)
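Here is a small plain-Python sketch of that select-and-broadcast loop. It is not Microsoft’s implementation: `SimpleGroupChatManager` and `GroupAgent` are hypothetical names, the replies are stubbed with lambdas instead of LLM calls, and I assume a simple round-robin rule for picking the next speaker (a real framework may let the model decide who speaks next).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (speaker name, text)


@dataclass
class GroupAgent:
    name: str
    reply_fn: Callable[[List[Message]], str]  # stand-in for an LLM call
    history: List[Message] = field(default_factory=list)


class SimpleGroupChatManager:
    """Hypothetical manager: round-robin selection plus broadcast."""

    def __init__(self, agents: List[GroupAgent], max_rounds: int):
        self.agents = agents
        self.max_rounds = max_rounds

    def select_speaker(self, round_index: int) -> GroupAgent:
        # Assumption: take turns in a fixed order.
        return self.agents[round_index % len(self.agents)]

    def _broadcast(self, message: Message) -> None:
        # Every agent receives every message, so all stay in the loop.
        for agent in self.agents:
            agent.history.append(message)

    def run(self, task: str) -> List[Message]:
        self._broadcast(("user", task))
        for round_index in range(self.max_rounds):
            speaker = self.select_speaker(round_index)
            text = speaker.reply_fn(speaker.history)
            self._broadcast((speaker.name, text))
        return list(self.agents[0].history)


planner = GroupAgent("planner", lambda h: f"plan for: {h[0][1]}")
coder = GroupAgent("coder", lambda h: f"code after {len(h)} messages")
tester = GroupAgent("tester", lambda h: f"tests for: {h[-1][1]}")

manager = SimpleGroupChatManager([planner, coder, tester], max_rounds=3)
log = manager.run("Build a CSV importer")
for speaker_name, text in log:
    print(f"{speaker_name}: {text}")
```

The `max_rounds` parameter plays the role of the configured number of rounds mentioned above.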
Google’s AI Agents: A Closer Look
AI agent platforms are developing rapidly. Where most platforms leave it up to you what you want to call your agents and what roles you want to give them, Google has already taken it a step further and has done this part for us.
Here’s a brief overview of each type of agent that Google has identified:
- Customer Agents: Mimic the best sales and service personnel. These agents are experts in understanding customer needs and delivering product recommendations. They can operate across multiple platforms, like the web, mobile, and in-store interfaces.
- Employee Agents: These agents will enhance productivity and streamline workflow processes, assisting in everything from data entry to complex problem-solving tasks.
- Creative Agents: These agents assist teams in generating marketing materials and other creative outputs efficiently.
- Data Agents: These agents specialize in managing and analyzing large datasets.
- Code Agents: Targeted towards the IT and software development industry, code agents assist in writing, reviewing, and debugging code, significantly speeding up the development process.
- Security Agents: With a focus on cyber security, these agents are programmed to detect and respond to security threats.
AI Agent Frameworks for flow engineering
Configurable (flexible) AI Agent Frameworks will be used to automate workflows. At this point the question is what type of workflows are ‘ideal’ for workflow engineering, and how many agents should be deployed to do the job for you. Another important element is the quality and reliability of the AI flow that will be put into place. For reasons of quality control and adjustments to the flow and to the model itself, a human would still be required to stay in the loop.
Right now almost every large IT company is working on, or has already released, an AI agent platform where users can configure agents to automate a workflow. Important players right now are:
– AutoGen (Microsoft)
– Vertex AI Agent Builder (Google), using Gemini
– LangChain
– CrewAI
– SuperAGI
– AI Agents (OpenAI)
– Devin AI (Software Engineering)
– Palantir AIP
– Perplexity AI
Spin-offs
There are now many, many smaller AI companies that have built their own business around LLMs (Large Language Models) and offer specialized services for smaller businesses, like intelligent chat assistants, news updates, automatic content, etc. It is my expectation that these businesses and services will grow substantially over the next few years.
Conclusion
In this blog post, I’ve described a new, upcoming workforce: AI agents. These agents interact with advanced AI models and perform (routine) tasks previously done by humans. When set up the right way, AI agents can work independently and make intelligent decisions based on their roles and AI capabilities.
AI agents can do amazing things like analyze entire code repositories or generate end-to-end software solutions. Basically anything that has a workflow where information or data is analyzed and handed over to others is a candidate. And because AI agents understand human input, they can reduce your reading or search work as well.
AI agents can be trained to update computer systems, thus reducing the need for manual input of data. AI agents can create management reports, do their own research & analysis and summarize everything. We just have to ask the pre-programmed worker what we need, and it will be done in the background.
Right now big tech companies are building platforms where AI agents can be configured and connected to language models and business processes. We’re still in the early stages of this process and need to evaluate how all of this will evolve and can be implemented into businesses. But if things can be made easier and faster, there’s always a market for it.
Thanks for reading this blog. Nick.