Emerging Architectures of LLM Applications 2025



AI Summary

Summary of Emerging Architectures of LLM Applications in 2025

Introduction

  • Speaker: Gad, CEO of T robs, an AI consulting company.
  • Special Guest: Jeff, a partner engineer at Google focusing on data and AI strategy.
  • AI Solutions Architect: Gabriel from T robs.

Main Topics

  • Agentic Workflows: Moving beyond the concept of agents to focus on workflows.
  • Underlying Architectures: Supporting smart workflows and future LLM applications.
  • Architecture Evolution: From a16z’s 2023 LLM application architecture to updated models.
  • Design Patterns: How to build applications on top of modern LLM architectures.

LLM Applications and User Interaction

  • Chatbots and Co-pilots: Short, iterative cycles of user-AI interaction.
  • Fault Tolerance: The human in the loop acts as the final gateway for catching and correcting errors.
  • LLM Limitations: Broad knowledge but lack of specific, contextual awareness.

Cognitive Architecture

  • Reference: The paper “Cognitive Architectures for Language Agents” (CoALA).
  • Components: LLM, memory, planning, retrieval tools, actions, interfaces.
  • Orchestration Focus: Crucial for complex task management and error minimization.
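The components above can be sketched as a single agent loop. This is a minimal, hypothetical illustration (the LLM is a stub and all names are invented for the example), not an implementation from the talk:

```python
# Minimal sketch of the cognitive-architecture loop: an LLM core plus
# memory, planning, and tools behind a dispatch interface. The "llm"
# callable is a stub standing in for a real model call.

from typing import Callable, Dict, List


class CognitiveAgent:
    def __init__(self, llm: Callable[[str], str], tools: Dict[str, Callable[[str], str]]):
        self.llm = llm
        self.tools = tools
        self.memory: List[str] = []  # episodic memory of past steps

    def step(self, task: str) -> str:
        # Planning: ask the model what to do next, with memory as context.
        context = "\n".join(self.memory)
        plan = self.llm(f"Task: {task}\nMemory: {context}")
        tool_name, _, arg = plan.partition(":")
        # Action: dispatch to a tool through a well-defined interface.
        result = self.tools.get(tool_name, lambda a: plan)(arg)
        self.memory.append(f"{plan} -> {result}")
        return result


# Stub LLM that always "plans" a retrieval call.
agent = CognitiveAgent(
    llm=lambda prompt: "search:LLM agents",
    tools={"search": lambda query: f"3 documents about {query}"},
)
print(agent.step("Summarize recent work on agents"))  # dispatches to the search tool
```

The orchestration concern from the talk lives in `step`: the loop, not the model, decides which component runs next.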

Inference Time Scaling

  • Model Level Planning: Allowing models to “think” longer for better answers.
  • Examples: Riddles demonstrating the effectiveness of reasoning models.
  • Scaling: Improving accuracy by increasing test-time compute.
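One common form of inference-time scaling is self-consistency: sample several answers and take a majority vote. A hedged sketch with a stubbed sampler (the 70% accuracy and the `sample_answer` function are invented for illustration):

```python
# Self-consistency sketch: spend more test-time compute by drawing more
# samples, then majority-vote. "sample_answer" stands in for one sampled
# reasoning-model completion.

import random
from collections import Counter


def sample_answer(question: str, rng: random.Random) -> str:
    # Stub: one sampled completion, correct ~70% of the time.
    return "42" if rng.random() < 0.7 else str(rng.randint(0, 41))


def self_consistent_answer(question: str, n_samples: int, seed: int = 0) -> str:
    rng = random.Random(seed)
    votes = Counter(sample_answer(question, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


# More samples (more test-time compute) make the majority answer more reliable.
print(self_consistent_answer("What is 6 * 7?", n_samples=25))
```

With 25 samples, the correct answer wins the vote even though any single sample is wrong 30% of the time; accuracy scales with compute, not with model weights.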

Orchestrating LLM Systems

  • Auto-regressive Models: Each token is predicted from the previous ones, so small errors compound over long generations.
  • Looping and Planning: External model loops and planning to avoid errors.
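The external-loop idea can be sketched as an orchestrator that validates each model output and retries with feedback, so autoregressive errors don't silently compound. The model call, validator, and retry policy below are all illustrative stand-ins:

```python
# External orchestration loop sketch: validate outside the model, retry
# with feedback on failure. "call_model" is a stub for a real LLM call.

from typing import Callable


def orchestrate(call_model: Callable[[str], str],
                validate: Callable[[str], bool],
                prompt: str, max_retries: int = 3) -> str:
    feedback = ""
    for _ in range(max_retries):
        output = call_model(prompt + feedback)
        if validate(output):  # external check instead of trusting the model
            return output
        feedback = f"\nPrevious answer '{output}' was invalid; try again."
    raise RuntimeError("validation failed after retries")


# Stub model that fails once, then succeeds.
attempts = iter(["not-json", '{"ok": true}'])
result = orchestrate(lambda p: next(attempts),
                     validate=lambda out: out.startswith("{"),
                     prompt="Return JSON")
print(result)  # → {"ok": true}
```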

Agent Architectures

  • Graph-Based Orchestration: Like LangGraph, inspired by NetworkX.
  • Event-Driven Architecture: Pub/Sub systems where the publisher doesn’t know the consumer.
  • Hybrid Approaches: Combining graph and event-driven patterns for complex systems.
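The event-driven pattern can be sketched with a minimal in-memory broker: publishers emit to a topic and never reference consumers directly. This toy `Broker` is invented for illustration and omits the durability and delivery guarantees of a real pub/sub system:

```python
# Minimal pub/sub sketch: the publisher knows only the topic name,
# never the subscribers, which is the decoupling the pattern provides.

from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:  # publisher stays decoupled
            handler(event)


broker = Broker()
received: List[dict] = []
broker.subscribe("task.done", received.append)
broker.publish("task.done", {"agent": "summarizer", "status": "ok"})
print(received)  # → [{'agent': 'summarizer', 'status': 'ok'}]
```

A hybrid system might run a graph of steps inside each agent while using events like `task.done` to coordinate between agents.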

Multimodal Capabilities

  • Gemini 2.0: Google’s experimental model allowing for multimodal inputs and outputs.
  • Features: Live streaming, responsiveness, function calling, grounding in Google Search.

Vertex AI

  • End-to-End Development: Managing LLM operations, safety, and governance.
  • Capabilities: Model garden, fine-tuning, data set management, experiment tracking.

Multi-Agent Systems

  • Specialization: Improving specific tasks without affecting others.
  • Communication: Minimal and relevant information exchange between agents.
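The "minimal and relevant information exchange" point can be illustrated with two specialized agents that share only a narrow message schema rather than their full internal state. The agent names and the `ResearchResult` schema below are hypothetical:

```python
# Multi-agent communication sketch: the researcher hands the writer only
# a small typed message, never its prompts, memory, or scratch work.

from dataclasses import dataclass
from typing import List


@dataclass
class ResearchResult:
    sources: List[str]  # the only fields the writer needs


def researcher(query: str) -> ResearchResult:
    # Specialized agent: can be improved or swapped without touching the writer.
    return ResearchResult(sources=[f"doc about {query}"])


def writer(result: ResearchResult) -> str:
    return "Report based on: " + ", ".join(result.sources)


print(writer(researcher("agent architectures")))
```

Because each agent is specialized behind a small interface, one can be tuned or replaced without affecting the others.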

Conclusion

  • Context and Orchestration: Key to connecting LLMs to the environment and use cases.
  • Collaborative AI: The potential for AI to work with humans and other AI.
  • Architecture Fit: Importance of choosing the right architecture for the specific use case.

Detailed Instructions and Tips (Not Present)

No detailed instructions such as CLI commands, website URLs, or specific tips were provided in the transcript.