DeepSeek LLM NEW Model - Best Open-Source Coding Model - Closest to GPT-4!
AI Summary
Summary of DeepSeek Coder Update
- DeepSeek Coder Overview:
- Advanced open-source coding language model.
- Available in 67-billion- and 7-billion-parameter versions.
- Trained on 2 trillion tokens.
- Outperforms LLaMA 2 70B and is comparable to GPT-3.5.
- Competes with GPT-4’s coding capabilities.
- Recent Developments:
- New technical reports released.
- New model version (1.5) launched.
- Performance and Capabilities:
- Leaderboard shows DeepSeek Coder surpassing most models except GPT-4.
- The CRUXEval benchmark indicates it’s the closest open-source model to GPT-4 Turbo.
- Version 1.5 adds 1.4 trillion tokens of additional code-focused training data.
- Improved at natural language tasks, programming, and math reasoning.
- Future Plans:
- The team hints that a larger, stronger coder model is in development.
- Accessing DeepSeek Coder:
- Available on Hugging Face and LM Studio.
- Detailed instructions provided for downloading and using the new model.
- Fine-tuned on 2 billion tokens of instruction data.
- Pretrained with a 4K context window and a next-token-prediction objective.
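As a concrete starting point, the steps above can be sketched with Hugging Face `transformers`. This is a minimal sketch, not the video's exact instructions: the repo id `deepseek-ai/deepseek-coder-7b-instruct-v1.5` and the `### Instruction:` prompt template are assumptions based on common DeepSeek Coder conventions, so verify both against the model card before running.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user request in the instruction format commonly used by
    DeepSeek Coder instruct checkpoints (assumed template -- check the
    model card on Hugging Face)."""
    return (
        "You are an AI programming assistant.\n"
        f"### Instruction:\n{instruction}\n### Response:\n"
    )

def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Download the model (first run) and generate a completion.
    Needs ~14 GB of weights and a GPU for usable speed."""
    # Heavy dependencies imported lazily so the prompt helper above
    # stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepseek-ai/deepseek-coder-7b-instruct-v1.5"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Write a Python function that reverses a string."))
```

LM Studio offers the same models through a GUI with quantized GGUF builds, which avoids writing any code at all.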
- Demonstration and Resources:
- Demonstrations of coding tasks and model comparisons are available on the blog.
- The new tech report refines the original research paper and provides detailed comparisons with other models.