LLM routing explained with 3 examples, simple to advanced



AI Summary

Video Summary: Routing with LLMs

Introduction

  • The video discusses routing user queries to different language learning models (LLMs) based on query context.

Basic Routing Example

  • File 1: Basic Routing
    • Complex questions are routed to GP4 Omni.
    • Code-related questions are sent to 3.5 Sonet.
    • Simple conversations go to Lama 38 billion.
  • Routing in Action
    • Demonstrates real-time routing during a conversation.
    • Example: “Hi” routes to Lama 38 billion, while a code example request routes to Cloud 3.5 Sonet.
  • Dynamic Routing Considerations
    • Suggests using smaller models for longer conversations to save costs.
  • Availability
    • First file is free on Patreon; other files for Conosur Plus patrons.

Advanced Routing Example

  • File 2: Advanced Routing
    • Different system messages for each model based on the query.
    • Examples include routing to a friendly conversational assistant, an expert software engineer, or an empathetic listener.
  • Routing with System Messages
    • System messages tailored to the user’s query and routed model.
    • Example: “I want to learn fast API” routes to Cloud 3.5 Sonet with a full-stack web developer system message.

Customer Service Routing Example

  • File 3: Customer Service Routing
    • Routes customer service queries to appropriate departments (Electronics, Fashion, Home & Garden, Books & Media).
    • Department-specific system messages provide relevant product information.
  • Routing Demonstration
    • Example: Query about gardening peppers routes to the Home & Garden department.

Code Review and Setup

  • OpenAI and Open Router
    • Uses OpenAI’s GPT-4 in JSON mode for routing.
    • Open Router models are required for this setup.
  • Initialization
    • OpenAI and Open Router clients are initialized with respective API keys.
  • Message Handling
    • Generic message handler takes model name, user query, and optional system message.
  • Routing Logic
    • Primary and secondary routers determine the appropriate model and system message.
  • Customer Service Router
    • A dictionary maps department names to system messages.
    • User input determines the department, and the corresponding system message is used.

Patreon and Courses

  • Patreon Benefits
    • Access to code files, courses, and one-on-one sessions.
    • Mention of THX Master Class, Streamlit, and Fast API courses.
  • Requirements
    • OpenAI and term color libraries are needed.

Conclusion

  • Routing can be creative and complex.
  • Encourages checking out the free file and considering Patreon for more resources.
  • Invites viewers to follow on Twitter for additional content.