Mixture of Models (MoM) - SHOCKING Results on Hard LLM Problems!
AI Summary
Summary: Mixture of Models Experiment
Concept:
- Based on the wisdom of the crowd principle.
- Multiple models (referred to as “peasants”) solve the same problem.
- Different architectures (King, Duopoly, Democracy) are used to synthesize answers.
Architectures:
- King:
  - Multiple LLMs provide answers to a query.
  - A “king” model (GPT-4 Turbo) synthesizes these answers together with the original query into a final response.
- Duopoly:
  - Similar to King, but with two “co-founder” models (GPT-4 Turbo and Claude 3 Opus) that discuss and agree on the best answer.
- Democracy:
  - Each model has equal weight.
  - Models vote on the best answer, and a “teller” model (GPT-4) counts the votes to determine the final answer.
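The King and Democracy flows can be sketched as follows. This is a minimal illustration with stubbed model calls, not the video's actual code: the function names (`call_peasant`, `call_king`, `teller_count`) are hypothetical, and in a real setup each stub would be an LLM API call (e.g. GPT-4 Turbo for the king).

```python
from collections import Counter

def call_peasant(name, query):
    # Placeholder for a real LLM API call by one "peasant" model.
    return f"{name}'s answer to: {query}"

def call_king(query, peasant_answers):
    # The king sees the original query plus every peasant answer
    # and synthesizes one final response (stubbed here).
    return f"Final answer synthesized from {len(peasant_answers)} answers."

def king_architecture(query, peasants):
    answers = [call_peasant(p, query) for p in peasants]
    return call_king(query, answers)

def teller_count(answers):
    # Democracy: the "teller" tallies identical answers and
    # returns the majority vote.
    tally = Counter(answers)
    return tally.most_common(1)[0][0]

print(king_architecture("marble puzzle", ["Llama 3 8B", "Claude 3 Opus"]))
print(teller_count(["4", "4", "5"]))
```

Note the structural difference: King delegates synthesis to one stronger model, while Democracy reduces the answers to a simple majority tally.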
Implementation:
- Models include LLMs such as Llama 3 8B, Claude 3 Opus, and Claude 3 Haiku.
- User queries are processed, and answers are collected as context.
- System messages guide models towards desired outcomes.
- HTML responses are generated for user-friendly display.
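The "answers as context" step might look like the sketch below. The message schema follows the common OpenAI-style chat format; the system-message wording and the helper name `build_king_messages` are illustrative assumptions, not the video's exact prompt.

```python
def build_king_messages(query, answers):
    # System message guiding the "king" toward the desired outcome:
    # synthesize the peasants' answers and reply in HTML for display.
    system = (
        "You are the king. Below are answers from several models. "
        "Synthesize the single best final answer and format it as HTML."
    )
    # Collect every peasant answer into one context block.
    context = "\n\n".join(f"Answer {i + 1}: {a}" for i, a in enumerate(answers))
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Query: {query}\n\n{context}"},
    ]

msgs = build_king_messages("How old is the sister?", ["She is 4.", "She is 6."])
print(msgs[1]["content"])
```

The resulting list would then be passed to the synthesizer model's chat-completion endpoint.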
Testing:
- Problems tested include a marble logic puzzle, an age-related logic question, a hard coding problem from LeetCode, and a creative writing task.
- Results varied by architecture, with the King setup generally performing best.
Code Setup:
- Functions for different models are defined.
- Main function orchestrates the process according to the chosen architecture.
- System messages are crafted to guide the “king” or “co-founders” in their synthesis of answers.
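A main function of the kind described could dispatch on the chosen architecture roughly like this. All names here (`get_answers`, `synthesize`, `main`) are hypothetical stand-ins for the per-model functions mentioned above, with API calls stubbed out.

```python
from collections import Counter

def get_answers(query):
    # Placeholder for the per-model API calls; maps model -> answer.
    return {"llama-3-8b": "4", "claude-3-opus": "4", "claude-3-haiku": "5"}

def synthesize(query, answers, n_synthesizers):
    # Stub for King (one synthesizer) or Duopoly (two co-founders);
    # a real version would send the answers plus a crafted system
    # message to the synthesizer model(s).
    return f"synthesized by {n_synthesizers} model(s)"

def main(query, architecture):
    answers = get_answers(query)
    if architecture == "democracy":
        # The "teller" counts the votes and returns the majority.
        return Counter(answers.values()).most_common(1)[0][0]
    if architecture == "duopoly":
        return synthesize(query, answers, n_synthesizers=2)
    return synthesize(query, answers, n_synthesizers=1)  # default: king

print(main("marble puzzle", "democracy"))
```

This keeps the three architectures behind one entry point, so swapping strategies only changes the dispatch argument.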
Results:
- King architecture solved most problems correctly.
- Duopoly had mixed results, with some correct and some incorrect answers.
- Democracy was the least reliable, since a majority of models can converge on an incorrect answer.
Conclusion:
- The experiment showcased different ways to combine model outputs.
- The King architecture was the most successful in this test.
- The code and further details are available for channel members on GitHub and the community Discord.