We are in an era of rapid digital transformation where artificial intelligence is fundamentally and permanently reshaping the business landscape. This change is not just a passing trend; it is the future. Most AI applications today aim to enhance specific tasks using large language models (LLMs) and automation. As AI introduces new capabilities into business processes, companies must rethink which tasks are essential and how often they are performed. Additionally, organizations must determine the roles humans will play versus the roles machines will take over.
As the AI revolution advances, I believe that LLMs and large action models (LAMs) will completely transform the future of business operations. To fully understand this shift, we can look back at the 1990s, when the internet revolutionized business practices. Just as companies had to adapt to the digital age, they must embrace the age of artificial intelligence.
From the 1990s to Today: The Internet Revolution to AI
In the 1990s and early 2000s, machine learning gained popularity in academia and industry. Its success was driven by the availability of powerful computing technology, the collection of massive datasets, and the use of robust mathematical models. Then, in 2012, deep learning emerged as a groundbreaking technology, outperforming earlier methods on benchmarks such as image recognition. Fast forward to 2018: OpenAI introduced GPT-1, the first of its large language models, building on Google’s invention of the transformer architecture in 2017. The launch of GPT-3 in June 2020 marked a significant leap forward. By November 2022, individuals could directly interact with ChatGPT, ushering in a wave of innovations from platforms like Gemini to Claude, Falcon, and LLaMA. Meanwhile, India made notable advancements with projects such as Krutrim and Sarvam AI.
LLMs like ChatGPT excel at understanding prompts and generating primarily text or visual outputs based on those prompts. While they can be programmed to perform actions using APIs, this requires additional setup and programming. This method, often referred to as AI agentic workflows, has gained a lot of attention lately. These workflows use sophisticated algorithms and data processing techniques to automate tasks, ranging from simple data entry and validation to intricate decision-making processes. At their core, AI agentic workflows are a structured series of actions and operations managed and executed by AI agents.
AI agents are advanced computer programs that leverage their core language models (like GPT-3.5 or GPT-4) to interpret goals and formulate action plans to achieve desired outcomes. Depending on the requirements, a workflow can involve one or multiple AI agents (known as multi-agent systems or MAS). Most AI agents operate based on condition-action rules, and some can also use machine learning to refine their understanding and improve the quality of their outputs as they process new data and feedback.
Improving LLM Output with Structured Prompts
Before diving into AI agents in a practical context, it is essential to understand our interactions with AI. Consider, for example, when you ask ChatGPT to generate a product description for your e-commerce website by simply providing the product category name without any additional details. For instance, your prompt might be: “Describe a pair of running shoes.” This approach is known as zero-shot prompting (often called zero-shot learning, or ZSL), where ChatGPT is given no example of the desired output and relies on its extensive pre-trained knowledge base. In contrast, one-shot or few-shot prompting places one or more examples in the prompt itself to refine the response; the model is not retrained. The more context we provide, the better the response.
For instance, a more detailed zero-shot prompt might be: “Write a product description for a pair of running shoes. Describe the features, benefits, and unique selling points.” Adding a worked example turns it into a one-shot prompt, which is usually more effective: “Write a product description for a pair of running shoes. Describe the features, benefits, and unique selling points. Here is an example: ‘These lightweight running shoes are engineered for peak performance, featuring a breathable mesh upper and responsive cushioning for all-day comfort. The durable rubber outsole provides superior traction on various surfaces, making them perfect for trail and road running.’ Now, write a description within 100 words for a pair of running shoes that highlights their key features and benefits as mentioned.” Supplying several such examples instead of one makes it a few-shot prompt.
Adding instructions and examples in this way increases prompt specificity, a practice also known as prompt enrichment, and helps the model provide better results.
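As a rough illustration, the number of "shots" is simply the number of worked examples included in the prompt. The sketch below assembles a prompt from an instruction, a list of examples, and a final request. The function name `build_few_shot_prompt` and the "Example N:" labels are assumptions made for illustration, not part of any library.

```python
def build_few_shot_prompt(instruction: str, examples: list[str], request: str) -> str:
    """Join an instruction, zero or more worked examples, and the final request."""
    parts = [instruction.strip()]
    for i, example in enumerate(examples, start=1):
        parts.append(f"Example {i}: {example.strip()}")
    parts.append(request.strip())
    return "\n\n".join(parts)


instruction = (
    "Write a product description for a pair of running shoes. "
    "Describe the features, benefits, and unique selling points."
)
examples = [
    "These lightweight running shoes are engineered for peak performance, "
    "featuring a breathable mesh upper and responsive cushioning.",
]
request = "Now, write a description within 100 words for a pair of running shoes."

zero_shot = build_few_shot_prompt(instruction, [], request)       # no examples
one_shot = build_few_shot_prompt(instruction, examples, request)  # one example
print(one_shot)
```

Passing two or more entries in `examples` would produce a few-shot prompt with the same structure.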
If you are following along, you now understand how we can improve our prompts and help LLMs respond more effectively. In a practical scenario, you might need to generate descriptions for thousands of products. This task would be exhausting for a human, but we can streamline it by creating a batch process for an LLM. First, list all product categories and key features in a document or spreadsheet. Then, use automation to process the spreadsheet, providing the LLM with specific inputs and generating descriptions based on structured prompts for each entry in the batch.
A structured prompt might look like this: “Generate a product description for [product category]. Include features like [feature 1], [feature 2], and [feature 3]. Describe how these features benefit the user and highlight what makes this product unique.”
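The batch step could be sketched roughly as follows. The column names (`category`, `feature_1`, and so on) and the sample rows are assumptions made for illustration, and the inline CSV text stands in for a real spreadsheet export. The sketch only builds the prompts; sending them to a model is left out.

```python
import csv
import io

TEMPLATE = (
    "Generate a product description for {category}. "
    "Include features like {feature_1}, {feature_2}, and {feature_3}. "
    "Describe how these features benefit the user and highlight "
    "what makes this product unique."
)

# Stand-in for a spreadsheet export; a real run would open a CSV file instead.
SHEET = """category,feature_1,feature_2,feature_3
running shoes,breathable mesh upper,responsive cushioning,durable rubber outsole
yoga mat,non-slip surface,6 mm padding,carry strap
"""


def build_prompts(csv_text: str) -> list[str]:
    """Turn each spreadsheet row into one filled-in structured prompt."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [TEMPLATE.format(**row) for row in reader]


prompts = build_prompts(SHEET)
for prompt in prompts:
    # Each prompt would be sent to the LLM here; printing keeps the sketch offline.
    print(prompt)
```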
Configuring Efficient Agentic AI Systems
We need to keep in mind that LLMs are stateless by default: the model itself keeps no memory of earlier requests unless the application sends that history again. We therefore need to include the structured prompt in each request to the LLM. We humans can learn and recall repetitive tasks effortlessly, so it may seem surprising that LLMs, despite their human-like capabilities, cannot remember the template. However, this limitation reflects the current state of the art. This is where agentic AI comes in, enabling us to automate and repeat actions in a loop, ensuring consistency and efficiency.
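One way to picture this is a small wrapper that stores the fixed instructions and prepends them to every request, so the memory lives in the agent rather than in the model. This is a minimal sketch: `PromptAgent` and `echo_llm` are hypothetical names, and `echo_llm` is only a stand-in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PromptAgent:
    """Holds the fixed instructions so the caller does not have to repeat them."""
    instructions: str
    llm: Callable[[str], str]
    log: list[str] = field(default_factory=list)

    def run(self, item: str) -> str:
        # The model remembers nothing, so the full instructions go out every time.
        prompt = f"{self.instructions}\n\nProduct: {item}"
        self.log.append(prompt)
        return self.llm(prompt)


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return f"(model saw {len(prompt.splitlines())} lines)"


agent = PromptAgent(
    instructions="Generate a product description. Include three key features.",
    llm=echo_llm,
)
results = [agent.run(item) for item in ["running shoes", "yoga mat"]]
print(results)
```

Every entry in `agent.log` begins with the same instructions, which is exactly the repetition a stateless model requires.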
To configure an efficient Agentic AI system, you need to provide the LLM agent with context, knowledge, and form.
- Context: Who is the AI acting as, and what is the goal of the task?
- Knowledge: What task should the AI execute, and which instructions should it follow?
- Form: What style and output format should the response use? For example, consider the copywriting styles of David Ogilvy and Leo Burnett. Ogilvy’s style is direct and feature-focused, while Burnett’s is more emotive and aspirational. You should also specify the output format, such as JSON, Markdown, or a table.
For example, when generating a product description, your Agentic AI could use a three-part prompt like this:
Context: You are a copywriter for e-commerce product descriptions. Your goal is to create compelling and informative descriptions that engage customers and enhance product listings.
Knowledge: Below are some instructions to help you craft the description:
- Start with a catchy headline that includes [product category].
- Detail [feature 1], [feature 2], and [feature 3] in short paragraphs.
- Describe how these features benefit the user and highlight what makes this product unique.
Form: Present the description in Markdown format. Maintain a professional and engaging tone suitable for an e-commerce platform.
This prompt contains five distinct instructions that follow the context, knowledge, and form structure to guide the creation of the product description:
- Catchy Headline: Start with a headline that includes the product category.
- Feature Details: Describe [feature 1], [feature 2], and [feature 3] in short paragraphs.
- Unique Selling Points: Describe how the features benefit the user and highlight what makes the product unique.
- Markdown Format: Present the description in Markdown format.
- Professional and Engaging Tone: Maintain a tone suitable for an e-commerce platform.
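The three-part prompt above could be represented in code roughly like this. The class name `AgentPrompt`, its `render` method, and the curly-brace placeholders (used instead of the square brackets in the text) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentPrompt:
    context: str          # role and goal
    knowledge: list[str]  # task instructions
    form: str             # output format and tone

    def render(self, **values: str) -> str:
        """Join the three parts and fill in the placeholders."""
        steps = "\n".join(f"- {step}" for step in self.knowledge)
        text = f"Context: {self.context}\nKnowledge:\n{steps}\nForm: {self.form}"
        return text.format(**values)


prompt = AgentPrompt(
    context="You are a copywriter for e-commerce product descriptions.",
    knowledge=[
        "Start with a catchy headline that includes {category}.",
        "Detail {feature_1}, {feature_2}, and {feature_3} in short paragraphs.",
        "Describe how these features benefit the user.",
    ],
    form="Present the description in Markdown format with a professional tone.",
)
rendered = prompt.render(
    category="running shoes",
    feature_1="breathable mesh upper",
    feature_2="responsive cushioning",
    feature_3="durable rubber outsole",
)
print(rendered)
```

Keeping the three parts as separate fields makes it easy to swap the form (for example, JSON instead of Markdown) without touching the task instructions.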
Key Types of AI Agents
As mentioned earlier, there are various types of AI agents, ranging from simple to advanced. The type we discussed above can be loosely described as maintaining an internal model of the environment, which in this case is the product category and features. This agent involves more complex reasoning and adaptation than a fixed rule, using its internal model to process input (product features) and generate the desired output (an effective e-commerce description). Such agents are known as Model-based Reflex Agents.
Another type, Goal-based agents, not only maintain an internal model but also have a specific goal or set of goals. These agents search for action sequences that help them achieve their goals and plan these actions before executing them. In our example of an AI agent tasked with writing product descriptions, a goal could be defined as adhering to character limits.
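A goal check of this kind might be sketched as a loop that keeps acting until the goal test passes. The example below is deliberately naive and purely illustrative: the only available "action" is dropping the last sentence, sentences are split on full stops, and the names `meets_goal` and `shorten_to_goal` are hypothetical.

```python
def meets_goal(text: str, max_chars: int) -> bool:
    """Goal test: the description fits within the character limit."""
    return len(text) <= max_chars


def shorten_to_goal(draft: str, max_chars: int) -> str:
    """Drop trailing sentences one at a time until the goal test passes."""
    sentences = [s.strip() for s in draft.split(".") if s.strip()]
    candidate = ". ".join(sentences) + "."
    while not meets_goal(candidate, max_chars) and len(sentences) > 1:
        sentences.pop()  # the single action available to this toy agent
        candidate = ". ".join(sentences) + "."
    return candidate


draft = (
    "Light and fast. Breathable mesh keeps feet cool. "
    "Durable outsole grips wet roads."
)
result = shorten_to_goal(draft, max_chars=60)
print(result)
```

An LLM-backed agent would more likely ask the model to rewrite the draft than truncate it, but the structure (act, test the goal, repeat) would be similar.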
A Utility-based Agent uses a utility function to assess and optimize its output to ensure it maximizes overall effectiveness while adhering to constraints, such as character limits. The goal is to create a description that achieves the highest possible utility by being clear, persuasive, and engaging. In such AI agents, we can set rules for each criterion and calculate the score for each version. For example, the utility function for a description could be:
Utility = Clarity + Persuasiveness + Engagement + Adherence to Constraints (character limit)
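As a minimal sketch, the sum above can be computed per candidate and the highest-scoring version kept. The 0-to-1 score ranges, the example scores, and the choice to award 1.0 for meeting the character limit are all assumptions; in practice the criterion scores would come from rules or a separate evaluator.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    clarity: float
    persuasiveness: float
    engagement: float


def utility(scores: Scores, text: str, max_chars: int) -> float:
    """Sum the three criterion scores, plus 1.0 if the character limit is met."""
    within_limit = 1.0 if len(text) <= max_chars else 0.0
    return scores.clarity + scores.persuasiveness + scores.engagement + within_limit


# Illustrative candidates with made-up scores.
candidates = {
    "Short and punchy running shoes copy.": Scores(0.9, 0.6, 0.7),
    "A much longer description " * 10: Scores(0.8, 0.9, 0.9),
}
best = max(candidates, key=lambda text: utility(candidates[text], text, max_chars=120))
print(best)
```

Here the longer candidate scores higher on the three criteria but exceeds the limit, so the shorter one wins.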
A more advanced Learning Agent assesses metrics, such as engagement rates, conversion rates, and customer feedback, once the content is deployed on a live e-commerce website. Based on this new data, the agent refines future descriptions. A Learning Agent continuously learns from ongoing data, allowing it to evolve with changing customer preferences and market trends.
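A very reduced version of this feedback loop could track conversion rates per writing style and prefer the best performer for future descriptions. The class name `StyleLearner`, the style labels, and the numbers are hypothetical, and the purely greedy choice is a simplification; a real system would also need to keep exploring alternatives.

```python
from collections import defaultdict


class StyleLearner:
    """Tracks conversions per style and picks the best-performing one."""

    def __init__(self) -> None:
        self.impressions: dict[str, int] = defaultdict(int)
        self.conversions: dict[str, int] = defaultdict(int)

    def record(self, style: str, conversions: int, impressions: int) -> None:
        """Fold a new batch of live metrics into the running totals."""
        self.conversions[style] += conversions
        self.impressions[style] += impressions

    def rate(self, style: str) -> float:
        shown = self.impressions[style]
        return self.conversions[style] / shown if shown else 0.0

    def best_style(self) -> str:
        return max(self.impressions, key=self.rate)


learner = StyleLearner()
learner.record("direct", conversions=10, impressions=100)
learner.record("emotive", conversions=30, impressions=200)
learner.record("direct", conversions=5, impressions=100)
print(learner.best_style())
```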
Orchestrating Multiple AI Agents
So far, we have discussed how a single AI agent can address specific business problems. However, in the real world, many business challenges exceed the capacity of a single discrete agent. Solutions to such problems require multiple interacting agents that collaborate to achieve a common goal. A Multi-Agent System (MAS) distributes computational resources and capabilities across a network of interconnected agents.
In a simplified format, an MAS begins with core language models (like GPT-3.5 or GPT-4) to interpret the business problem and generate a sequence of tasks. This task list serves as the agents’ roadmap to achieving the set objective. Specialized AI agents then start executing tasks independently. Some of these agents must run in a specific order, and coordinating that order is known as orchestration. Orchestration defines how these agents collaborate, whether in a sequential, hierarchical, or bi-directional flow.
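The sequential case can be sketched without any framework: each agent is a function that reads a shared state and returns its contribution, and the orchestrator runs them in order. The three agents below (`planner`, `writer`, `reviewer`) are hypothetical stand-ins that return canned values rather than calling a model.

```python
from typing import Callable

Agent = Callable[[dict], dict]


def planner(state: dict) -> dict:
    return {"tasks": ["draft", "review"]}


def writer(state: dict) -> dict:
    return {"draft": f"Description for {state['product']}"}


def reviewer(state: dict) -> dict:
    return {"approved": len(state["draft"]) <= 160}


def run_sequential(agents: list[tuple[str, Agent]], state: dict) -> dict:
    """Run agents in order; each one sees everything produced before it."""
    trace = []
    for name, agent in agents:
        state = {**state, **agent(state)}
        trace.append(name)
    return {**state, "trace": trace}


result = run_sequential(
    [("planner", planner), ("writer", writer), ("reviewer", reviewer)],
    {"product": "running shoes"},
)
print(result)
```

Hierarchical or bi-directional flows would replace the simple `for` loop with routing logic, which is the part orchestration frameworks typically provide.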
MAS also opens the door to multi-modal approaches, which aim to enhance information gathering and generation across various channels, such as language and vision. Once individual AI agents are developed, there is an opportunity to reuse them for different use cases using orchestration frameworks like AutoGen, CrewAI, and others.
Case Study: Dynamic Pricing Optimization with Multi-Agent System (MAS)
Let us explore a hypothetical e-commerce marketing scenario: dynamic pricing optimization, to understand how business problems can be solved through a Multi-Agent System (MAS).
For dynamic pricing optimization, the system relies on the following independent AI agents:
- Demand Predictor Agent: Forecasts how many units of each product are likely to sell.
- Competitor Price Checker Agent: Monitors the prices of similar products from competitors.
- Inventory Manager Agent: Tracks current stock levels and predicts when to reorder.
- Pricing Adjuster Agent: Suggests price changes based on demand, competitor prices, and stock levels. These suggestions are then pushed to the e-commerce platform.
- Feedback Checker Agent: Monitors the performance of pricing changes and collects feedback.
An orchestration framework connects these agents and ensures smooth coordination among them. A well-designed Multi-Agent System can improve not only model accuracy but also trustworthiness. To achieve this, MAS often incorporates both machine and human agents, a concept known as human-in-the-loop (HiTL). In this scenario, data from the Feedback Checker Agent can be reviewed and labeled by human agents, and these insights are then transferred to the Pricing Adjuster Agent. This collaboration between machine and human agents allows for more comprehensive feedback analysis and model refinement, helping keep the pricing strategy both accurate and reliable.
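To make the flow concrete, here is a toy sketch in which each agent is a plain function and a human approval callback gates the final price change. Everything in it is an assumption for illustration: the pricing rule, the thresholds, the sample data, and the placement of the human check before the price is pushed. It is not a real pricing model.

```python
from typing import Callable


def predict_demand(sales_history: list[int]) -> float:
    """Demand Predictor: naive forecast as the mean of recent daily unit sales."""
    return sum(sales_history) / len(sales_history)


def check_competitors(prices: list[float]) -> float:
    """Competitor Price Checker: lowest competing price."""
    return min(prices)


def stock_cover_days(stock: int, daily_demand: float) -> float:
    """Inventory Manager: days of stock left at the forecast demand."""
    return stock / daily_demand if daily_demand else float("inf")


def suggest_price(current: float, competitor_low: float, cover_days: float) -> float:
    """Pricing Adjuster: toy rule combining stock cover and competitor price."""
    if cover_days < 7:                # scarce stock: raise the price by 5%
        return round(current * 1.05, 2)
    if competitor_low < current:      # ample stock and undercut: match the competitor
        return round(competitor_low, 2)
    return current


def apply_with_review(
    current: float, suggestion: float, approve: Callable[[float, float], bool]
) -> float:
    """Human-in-the-loop gate: the change goes live only if the reviewer approves."""
    return suggestion if approve(current, suggestion) else current


demand = predict_demand([12, 15, 9])
low = check_competitors([54.99, 49.99, 52.00])
cover = stock_cover_days(240, demand)
suggestion = suggest_price(52.50, low, cover)
# Stand-in reviewer: approve any change of at most 10%.
final_price = apply_with_review(
    52.50, suggestion, lambda old, new: abs(new - old) / old <= 0.10
)
print(final_price)
```

A Feedback Checker would sit after this step, comparing sales before and after the change and feeding labeled results back into `suggest_price`.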
Limitations of LLMs in Real-World Applications
We now understand how complex business problems can be addressed using agentic AI; however, we are still dealing with Large Language Models (LLMs), which come with the following constraints:
- Statelessness: LLMs are stateless, meaning that on their own they cannot retain information about previous interactions or make decisions based on past events.
- Fixed Dataset: LLMs are trained on a fixed dataset and cannot access new information after training, incorporate it, or adapt to changing circumstances in real-time.
- Limited Interaction: LLMs cannot interact directly with tools such as APIs or software applications; they only generate output, and surrounding code must carry out any action.
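The third constraint is commonly worked around by having the model emit a structured request that application code then executes. The sketch below assumes a made-up JSON shape (`tool` and `arguments` keys) and a hypothetical `get_stock_level` tool with canned data; it does not reflect any particular vendor's tool-calling format.

```python
import json


def get_stock_level(sku: str) -> int:
    """Hypothetical tool: a real version would query an inventory API."""
    return {"SHOE-001": 42}.get(sku, 0)


# Registry of functions the application is willing to run on the model's behalf.
TOOLS = {"get_stock_level": get_stock_level}


def dispatch(model_output: str) -> int:
    """Parse a JSON tool request emitted by the model and run the matching function."""
    request = json.loads(model_output)
    tool = TOOLS[request["tool"]]
    return tool(**request["arguments"])


# Text the model might produce when asked about current stock (assumed format).
model_output = '{"tool": "get_stock_level", "arguments": {"sku": "SHOE-001"}}'
result = dispatch(model_output)
print(result)
```

The result would normally be passed back to the model in a follow-up request, which also gives it access to information newer than its training data.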
In real-world business operations, the complexity is much greater, typically involving hundreds of use cases interconnected through various workflows. These workflows integrate and coordinate different aspects of the business. For instance, a content creation workflow for an e-commerce website may include product descriptions, category page content, blog posts, and automated emails. A search engine marketing (SEM) workflow might include keyword research, campaign creation, ad copy development, and more.
LLM agents typically perform one use case at a time. Therefore, when combined with human-in-the-loop (HiTL) processes, LLM agents can significantly slow down overall business operations. One promising development is the potential integration of LLMs into autonomous agents. However, for autonomous agents to operate effectively in the real world, they require access to memory, real-time knowledge, and tools, capabilities that LLMs do not provide on their own.
The Emergence of Large Action Models (LAMs)
Large Action Models (LAMs) have immense potential to bridge the gap between LLMs and autonomous agents. A LAM combines the vast knowledge and capabilities of LLMs with the real-time decision-making and operational abilities of autonomous agents.
LAMs can interact with various user interfaces and perform tasks much like a human would. They are designed not only to understand prompts but also to execute the actions associated with these prompts. LAMs are still relatively new in research and development, and their full potential is yet to be realized. Even so, they represent a promising shift in how computers interact with humans and respond to changing circumstances.
The Road Ahead for Businesses in the Age of AI
Over the next few years, AI is going to significantly change the internet, impacting big tech, startups, product leaders, marketers, advertisers, media, customer care, e-commerce, and all of us. Large Action Models (LAMs) will not just enhance existing workflows—they will also replace and eliminate redundant parts of them. The main barrier to this operational transformation is people’s readiness to embrace the changes AI brings. Fortunately, C-level executives are likely to champion this shift, and early adopters will be rewarded with increased market share, much like in the early days of search engine marketing and social media.
AI adoption will not only boost organizational productivity but also create new types of dynamic organizations. Under this new AI-first paradigm, companies will likely move toward outcome-based pricing as operational costs decrease significantly. This shift will incentivize companies to focus on delivering tangible results for their customers rather than just providing a service, ultimately leading to greater customer satisfaction and loyalty. As more companies adopt AI technology and adapt to this new paradigm, we can expect a wave of innovation and disruption across industries, leading to a more efficient and competitive market landscape.