Local AI Just Got Crazy Smart—And It’s Only 8B Thinking LLM!
AI Summary
Summary of Deep Hermes 3 Model Evaluation
- Model Overview:
- Deep Hermes 3 is a local thinking model with long Chain of Thought reasoning.
- It can toggle between thinking and non-thinking modes using a system prompt.
- The model is based on llama 3.1 and has 8 billion parameters.
- It was tested using LM Studio.
- Model Performance:
- Google Sheets Formula Test:
- Without thinking mode, the model failed to generate a correct formula.
- With thinking mode enabled, it successfully created a formula based on given conditions.
- Wolfram Alpha Factorization Test:
- Without thinking mode, the model provided an incorrect factorization.
- With thinking mode, it eventually arrived at the correct factorization after extensive deliberation.
- Physics Simulation Test:
- The model performed poorly in programming a bouncing ball inside a spinning hexagon.
- The provided Python code did not meet the task requirements.
- Chemistry Compound Identification Test:
- Deep Hermes 3 correctly identified the compound vanillin, outperforming other models.
- Medical Diagnosis Test:
- The model incorrectly diagnosed a set of symptoms, providing an unrelated disease.
- Math Problem Test:
- The model failed to solve an IIT-JEE entrance exam problem, even after 10 minutes of processing.
- Overall Experience:
- The model excels in reasoning within context and performs well on local machines.
- It shows improvement over the llama 3.1 8 billion parameter model, especially with thinking enabled.
- The system prompt can be customized to direct the model’s behavior.
- Usage Instructions:
- To use Deep Hermes 3, download it from the Discover Tab in LM Studio.
- The model requires about 4.7 GB of storage.
- Add the system prompt from the Hugging Face model page to LM Studio to enable thinking mode.
- Conclusion:
- The model is highly capable in certain contexts but has limitations.
- The creator encourages users to try the model and provide feedback on the tests conducted.
Detailed Instructions and URLs
- No specific CLI commands, website URLs, or detailed instructions were provided in the transcript.