AI Agents vs. Traditional Compute: Balancing Cost, Efficiency, and Scalability in the Future of IT Solutions

Artificial intelligence’s beautiful offspring, the large language model (LLM), has forced a collective pivot from traditional computer coding to the conversational, declarative machines we know as “agents.” Just as surely as “agentic” will be a new dictionary word by 2026, we are going to witness further democratization of AI across all aspects of our lives. As IT geeks move forward with these new tools and start hacking together awesome solutions, let’s first remember that traditional and AI-based engineering in our industry share many of the same concerns, including (but not limited to):

  • Remaining compliant with regulations
  • Data safety
  • High availability
  • Maintainability (tech debt)
  • Cost

Although I have opinions on all of these, this article covers how cost factors into your solutions when using AI agents vs. traditional compute-based solutions.

Stages for Creating a Solution

Let’s follow this outline for how we develop solutions.

  1. Define Objectives: Clearly outline the goals and expected outcomes of the solution.
  2. Identify Requirements: Determine the necessary resources, data inputs, and integrations needed for the solution.
  3. Design Architecture: Plan the overall structure, including the types of agents required and their interactions.
  4. Develop Services/Agents: Create individual services/agents with specific functionalities, ensuring they can operate independently and collaboratively.
  5. Integrate Systems: Connect services/agents with external APIs, databases, and other systems to enable data flow and interaction.
  6. Test and Validate: Conduct thorough testing to ensure each agent performs as expected and the overall solution meets the objectives.
  7. Deploy Solution: Roll out the solution to the production environment, ensuring all components are operational.
  8. Monitor and Optimize: Continuously monitor the performance of the services and make necessary adjustments to improve efficiency and effectiveness.
  9. Maintain and Update: Regularly update the solution and underlying systems to adapt to new requirements and technological advancements.

If you pit traditional compute against agent-based solutions and assign each stage a relative cost, it might turn out like this:

| Step | Traditional Compute | Agentic Compute | Reasoning |
|---|---|---|---|
| 1. Define Objectives | Very Low | Very Low | Defining objectives is similar in complexity for both approaches. |
| 2. Identify Requirements | Low | Low | Identifying requirements is comparable, though agentic systems may need additional considerations for autonomy. |
| 3. Design Architecture | Medium | High | Agentic systems require more complex designs to handle autonomous interactions and decision-making. |
| 4. Develop Services/Agents | High | Very High | Developing agents involves creating autonomous, collaborative entities, which is more complex than traditional services. |
| 5. Integrate Systems | Medium | High | Agentic systems require more sophisticated integration to enable seamless communication between autonomous agents. |
| 6. Test and Validate | High | Very High | Testing agentic systems is more complex due to the need to validate autonomous behavior and interactions. |
| 7. Deploy Solution | Medium | High | Deploying agentic systems may involve additional challenges, such as ensuring agents operate correctly in dynamic environments. |
| 8. Monitor and Optimize | Medium | High | Agentic systems require continuous monitoring and optimization to handle evolving scenarios and improve decision-making. |
| 9. Maintain and Update | High | Very High | Maintaining agentic systems is more demanding due to the need to adapt to new requirements and ensure agents remain effective. |
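
To get a rough feel for how the two columns above compare overall, one option is to map the labels onto an ordinal scale and add them up. The 1 to 5 scale below is purely my assumption for illustration (the table only gives labels), and the names `LEVEL_SCORE`, `STAGE_COSTS`, `total_score`, and `stages_where_agentic_costs_more` are hypothetical.

```python
# Assumed ordinal scale; the article only provides the labels, not numbers.
LEVEL_SCORE = {"Very Low": 1, "Low": 2, "Medium": 3, "High": 4, "Very High": 5}

# (step, traditional, agentic) copied from the comparison table.
STAGE_COSTS = [
    ("Define Objectives", "Very Low", "Very Low"),
    ("Identify Requirements", "Low", "Low"),
    ("Design Architecture", "Medium", "High"),
    ("Develop Services/Agents", "High", "Very High"),
    ("Integrate Systems", "Medium", "High"),
    ("Test and Validate", "High", "Very High"),
    ("Deploy Solution", "Medium", "High"),
    ("Monitor and Optimize", "Medium", "High"),
    ("Maintain and Update", "High", "Very High"),
]


def total_score(column: int) -> int:
    """Sum the ordinal scores for one column (1 = traditional, 2 = agentic)."""
    return sum(LEVEL_SCORE[row[column]] for row in STAGE_COSTS)


def stages_where_agentic_costs_more() -> list[str]:
    """List the stages where the agentic label ranks above the traditional one."""
    return [
        step
        for step, traditional, agentic in STAGE_COSTS
        if LEVEL_SCORE[agentic] > LEVEL_SCORE[traditional]
    ]


if __name__ == "__main__":
    print("traditional:", total_score(1))
    print("agentic:", total_score(2))
    print("agentic costs more in:", stages_where_agentic_costs_more())
```

Under this assumed scale the agentic column scores higher in every stage from design onward, which matches the table: the first two stages tie and the remaining seven lean toward traditional compute being cheaper.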

Traditional Compute Solutions

Developing solutions traditionally requires us to create some software and then deploy it for use. Standard compute is your traditional infrastructure cost plus the cost of application development and automated delivery to your chosen platform. The available platforms span much of the known IT infrastructure landscape: serverless, VMs, Kubernetes containers, bare metal, and more. In this model, costs weigh in heavily at the start for initial design, development, and testing. The receipt for a traditional IT solution for your business may itemize to a longer bill than you realize.

App – Custom development, unit tests, design (often x2 for frontend vs. backend)
State – Online storage, relational/document databases, cache
CI/CD Substrate – GitHub/GitLab, pipelines, DevOps automation, pipeline compute
Artifact Repository – Containers, packages, Artifactory/other
Deployment Scaffolding – Certificates, domains, backup/restore plans, disaster recovery plans, business continuity plans, and more
Infrastructure – Kubernetes cluster, serverless, VMs, hosted services, cloud resources
Maintenance – Ongoing upgrades, updates, and support

Important: A good design at the very start can often seriously reduce the costs at the infrastructure end of the solution.

AI Compute

It helps to mentally abstract AI as just another kind of compute that you have access to when creating your solutions. AI compute just happens to be far more expensive (usually in both money and time!) than standard compute substrates for your workloads. It also requires a fastidious eye for text and a penchant for logic; in other words, it requires a developer. It just requires much less from that developer, as they will not be coding out every aspect of an API or web service for each individual agent. Instead, the developer is creating multiple English-language declarative state machines working in tandem.

AI compute has sub-categories, and they are clearly not all doing the same work. An agent that categorizes an image is not the same as an agent that does a web search and then spins out multiple additional agents to process the results.

I had DeepSeek R1 create a decent categorized list for me with the following prompt:

Create a categorized list of AI agents by cost. For example, an AI agent that only accepts input and uses the LLM and no external searches would cost less than a web search agent which may cost less than an agent that consumes a full RAG on demand. I'd like every kind of categorized agent you can think of that would differentiate their costs. Create this list as a markdown table with qualifying attributes and a short name.

| Short Name | Qualifying Attributes | Cost Level | Reason for Cost |
|---|---|---|---|
| Basic LLM Agent | Accepts input, processes it using a pre-trained LLM, and generates output. No external integrations. | Low | Minimal computational resources; no external API calls or additional processing. |
| Search-Augmented Agent | Uses LLM + external search APIs (e.g., web search, knowledge graphs) to retrieve information. | Medium | Adds cost of API calls, latency, and data processing for search results. |
| RAG Agent | Implements Retrieval-Augmented Generation (RAG) to fetch and process external data on demand. | High | Requires embedding models, vector databases, and additional compute for retrieval and synthesis. |
| Multi-Modal Agent | Processes text, images, audio, or video inputs using multi-modal models (e.g., GPT-4 Vision). | High | Higher computational cost due to processing diverse data types and larger model sizes. |
| Fine-Tuned Agent | Uses a custom fine-tuned LLM for specific tasks or domains. | High | Cost of fine-tuning, hosting, and maintaining a specialized model. |
| Autonomous Agent | Operates autonomously, making decisions, taking actions, and iterating without human intervention. | Very High | Requires advanced planning, memory, and decision-making capabilities; often involves multiple tools. |
| Real-Time Agent | Processes inputs and generates outputs in real-time with low latency (e.g., for customer support). | Medium-High | Requires optimized infrastructure and potentially dedicated resources for low-latency performance. |
| Batch Processing Agent | Processes large datasets or batches of inputs offline. | Medium | Lower per-unit cost but scales with data volume; may require significant compute resources. |
| Personalized Agent | Tailors responses based on user history, preferences, or context (e.g., recommendation systems). | Medium-High | Requires user profiling, memory storage, and context-aware processing. |
| Multi-Agent System | Coordinates multiple AI agents to work together on complex tasks. | Very High | High coordination costs, communication overhead, and resource allocation across agents. |
| Edge AI Agent | Runs on edge devices (e.g., smartphones, IoT devices) with limited compute resources. | Low-Medium | Optimized for low-power environments but may require custom development and maintenance. |
| Enterprise Agent | Integrates with enterprise systems (e.g., CRM, ERP) and handles sensitive data securely. | High | Requires robust security, compliance, and integration with legacy systems. |
| Generative Agent | Focuses on creative tasks like writing, design, or code generation. | Medium | May require larger models or specialized fine-tuning for high-quality outputs. |
| Analytics Agent | Performs data analysis, generates insights, and creates visualizations. | Medium-High | Requires integration with data pipelines, analytics tools, and potentially large datasets. |
| Conversational Agent | Engages in extended, context-aware dialogues with users (e.g., chatbots). | Medium | Requires memory and context management, but lower cost than autonomous or multi-agent systems. |
| Hybrid Agent | Combines multiple capabilities (e.g., RAG + multi-modal + autonomous decision-making). | Very High | High complexity and resource requirements due to combining multiple advanced features. |

 

Key Cost Drivers:

  • Compute Resources: Larger models or real-time processing increase costs.
  • External Integrations: APIs, databases, and tools add to operational expenses.
  • Customization: Fine-tuning or specialized models require additional investment.
  • Scalability: Agents handling large volumes of data or users incur higher costs.
  • Autonomy: Agents with decision-making capabilities are more expensive to develop and maintain.
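
The cost drivers above can be turned into a back-of-the-envelope model: each agent call pays for the tokens it reads and writes, plus any external integrations it touches. This is only a sketch under stated assumptions. The three price constants are placeholders I made up, not any vendor's real pricing, and `AgentCall`, `call_cost`, and `workflow_cost` are hypothetical names.

```python
from dataclasses import dataclass

# Placeholder prices for illustration only; NOT real vendor pricing.
PRICE_PER_1K_INPUT_TOKENS = 0.001
PRICE_PER_1K_OUTPUT_TOKENS = 0.002
PRICE_PER_EXTERNAL_CALL = 0.005


@dataclass
class AgentCall:
    """One request made by an agent."""
    input_tokens: int
    output_tokens: int
    external_calls: int = 0  # web searches, vector lookups, CRM queries, etc.


def call_cost(call: AgentCall) -> float:
    """Cost of a single agent request under the placeholder prices."""
    return (
        call.input_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
        + call.output_tokens / 1000 * PRICE_PER_1K_OUTPUT_TOKENS
        + call.external_calls * PRICE_PER_EXTERNAL_CALL
    )


def workflow_cost(calls: list[AgentCall]) -> float:
    """Total cost of every request a workflow makes."""
    return sum(call_cost(call) for call in calls)


if __name__ == "__main__":
    basic_agent = [AgentCall(input_tokens=1000, output_tokens=500)]
    search_agent = [AgentCall(input_tokens=1000, output_tokens=500, external_calls=2)]
    print(f"basic:  {workflow_cost(basic_agent):.4f}")
    print(f"search: {workflow_cost(search_agent):.4f}")
```

Swap in your own provider's numbers and the shape of the result should hold: every extra request and every external integration adds to the bill, which is why the agent categories in the table climb in cost as they take on more tools.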

Simple Example – Large Document Processing

This is a very practical use of LLMs: summarizing a document. But there are token limits. Using several agents to chunk out the work and bring it all together at the end is a reasonable way around these limits. In the scenario below we end up with 6x the number of compute requests (five section summaries plus one merge, instead of a single call).

Scenario: Multiplex LLM processing of very large documents.
Objective: Suppose you have a 10,000-token document and need to generate a detailed summary. Here’s how you could use a multi-agent framework:

  1. Partition the Document: Split the document into 5 sections of 2,000 tokens each.
  2. Assign Summarization Agents: Use 5 agents, each summarizing one section.
  3. Run Agents in Parallel: Execute all 5 agents simultaneously.
  4. Aggregate Summaries: Use a final agent to combine the 5 summaries into a single, coherent summary.
Flowchart showing user submitting a large document, divided into 2000 tokens processed by three agents, each creating a summary, merged by a fourth agent into the final summary.
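
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a real integration: `StubLLM` is a hypothetical stand-in that counts requests and fakes a summary instead of calling a model, and tokens are approximated as list items rather than real tokenizer output.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StubLLM:
    """Hypothetical stand-in for an LLM client; counts requests, calls no model."""

    def __init__(self) -> None:
        self.requests = 0
        self._lock = threading.Lock()

    def summarize(self, text: str) -> str:
        with self._lock:  # agents run in threads, so guard the counter
            self.requests += 1
        return " ".join(text.split()[:5])  # fake "summary": the first five words


def partition(tokens: list[str], size: int) -> list[list[str]]:
    """Step 1: split the document into sections of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def summarize_document(tokens: list[str], llm: StubLLM, section_size: int = 2000) -> str:
    sections = partition(tokens, section_size)
    # Steps 2 and 3: one summarization agent per section, run in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(sections))) as pool:
        partials = list(pool.map(lambda s: llm.summarize(" ".join(s)), sections))
    # Step 4: a final agent merges the partial summaries.
    return llm.summarize("\n".join(partials))


if __name__ == "__main__":
    llm = StubLLM()
    document = [f"tok{i}" for i in range(10_000)]
    summarize_document(document, llm)
    print("requests:", llm.requests)
```

For the 10,000-token document this makes five section requests plus one merge request, which is where the 6x figure comes from. With a real model the merge step also has to fit its own token limit, so very large documents may need more than one merge pass.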

Advanced Example – ERP Customer Service Bot

This is a more fleshed-out example of doing some realistic work. This agent workflow serves a user a bunch of great info by integrating with internal systems to provide relevant data and responses. That said, I’m pretty certain one could code out the logic for everything beyond the conversational agent and eliminate all the other agents if so inclined.

Scenario: Enterprise Customer Support System
Objective: Provide end-to-end customer support by resolving inquiries, analyzing customer data, retrieving relevant information, and offering personalized solutions in real-time.

Agents Involved

| Short Name | Qualifying Attributes | Cost Level | Reason for Cost |
|---|---|---|---|
| Search-Augmented Agent | Uses LLM + external search APIs (e.g., web search, knowledge graphs) to retrieve information. | Medium | Adds cost of API calls, latency, and data processing for search results. |
| RAG Agent | Implements Retrieval-Augmented Generation (RAG) to fetch and process external data on demand. | High | Requires embedding models, vector databases, and additional compute for retrieval and synthesis. |
| Autonomous Agent | Operates autonomously, making decisions, taking actions, and iterating without human intervention. | Very High | Requires advanced planning, memory, and decision-making capabilities; often involves multiple tools. |
| Real-Time Agent | Processes inputs and generates outputs in real-time with low latency (e.g., for customer support). | Medium-High | Requires optimized infrastructure and potentially dedicated resources for low-latency performance. |
| Personalized Agent | Tailors responses based on user history, preferences, or context (e.g., recommendation systems). | Medium-High | Requires user profiling, memory storage, and context-aware processing. |
| Enterprise Agent | Integrates with enterprise systems (e.g., CRM, ERP) and handles sensitive data securely. | High | Requires robust security, compliance, and integration with legacy systems. |
| Analytics Agent | Performs data analysis, generates insights, and creates visualizations. | Medium-High | Requires integration with data pipelines, analytics tools, and potentially large datasets. |
| Conversational Agent | Engages in extended, context-aware dialogues with users (e.g., chatbots). | Medium | Requires memory and context management, but lower cost than autonomous or multi-agent systems. |

Workflow

  1. Customer Interaction:
    • The Conversational Agent receives a customer query (e.g., “Why was my order delayed?”). It classifies the query and determines if it requires external data or personalized insights.
  2. Information Retrieval:
    • If the query requires external data, the Search-Augmented Agent retrieves real-time information (e.g., shipping status from a logistics API).
    • If the query requires internal data, the RAG Agent fetches relevant information from internal documents or databases (e.g., order history, policies).
  3. Data Analysis:
    • The Enterprise Agent pulls customer data from the CRM (e.g., past orders, preferences).
    • The Analytics Agent analyzes this data to identify patterns or insights (e.g., frequent delays for a specific product).
  4. Personalization:
    • The Personalized Agent uses the analytics insights to tailor the response (e.g., “Your order was delayed due to a supplier issue. We’ve applied a 10% discount to your next purchase.”).
  5. Real-Time Coordination:
    • The Real-Time Agent ensures the response is delivered quickly and coordinates with other agents to prioritize urgent queries.
  6. Autonomous Oversight:
    • The Autonomous Agent monitors the system, resolves conflicts (e.g., if two agents provide conflicting responses), and escalates complex issues to human agents.
Flowchart depicting a process for handling order delay queries using various agents: conversational, enterprise, analytics, search-augmented, personalized, real-time, and autonomous.
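
To make the routing in this workflow concrete, here is a small sketch that records which agents a query would pass through. Everything in it is hypothetical: the keyword rules in `conversational_agent` stand in for what would be an LLM classification call, and the remaining agents are reduced to names on a trail rather than real integrations.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """A customer query plus the routing decisions made for it."""
    query: str
    needs_external: bool = False
    needs_internal: bool = False
    trail: list[str] = field(default_factory=list)  # agents invoked, in order


def conversational_agent(query: str) -> Ticket:
    """Step 1: classify the query. Keyword rules stand in for an LLM call."""
    text = query.lower()
    return Ticket(
        query=query,
        needs_external="delayed" in text or "shipping" in text,
        needs_internal="order" in text or "policy" in text,
        trail=["conversational"],
    )


def handle(query: str) -> Ticket:
    ticket = conversational_agent(query)
    # Step 2: retrieval agents run only when the classifier asks for them.
    if ticket.needs_external:
        ticket.trail.append("search-augmented")
    if ticket.needs_internal:
        ticket.trail.append("rag")
    # Steps 3 to 6 run for every query in the workflow as described.
    ticket.trail += ["enterprise", "analytics", "personalized", "real-time", "autonomous"]
    return ticket


if __name__ == "__main__":
    for question in ("Why was my order delayed?", "Hello there"):
        ticket = handle(question)
        print(f"{question!r}: {len(ticket.trail)} agents -> {ticket.trail}")
```

Counting the trail is a cheap way to see the cost shape: the order-delay question touches all eight agents, while a simple greeting still touches six. Any step you can replace with plain code like the routing above is one less agent call to pay for.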

Multi-Compute Future

It will take a careful balance between agents, traditional services, and agents that build traditional services to create durable and cost-effective IT solutions moving forward. As our little human-language declarative state machines become more prevalent, so too will their solutions. I cannot wait to see what we develop with all of these new tools at our disposal.