DeepSeek LLM NEW Model - Best Open-Source Coding Model - Closest to GPT-4!



AI Summary

Summary of DeepSeek Coder Update

  • DeepSeek Coder Overview:
    • Advanced open-source coding language model.
    • Available in 67-billion and 7-billion parameter versions.
    • Trained on 2 trillion tokens.
    • Outperforms Llama 2’s 70-billion-parameter model and is comparable to GPT-3.5.
    • Competes with GPT-4’s coding capabilities.
  • Recent Developments:
    • New technical reports released.
    • New model version (1.5) launched.
  • Performance and Capabilities:
    • Leaderboard shows DeepSeek Coder surpassing most models except GPT-4.
    • The CRUXEval benchmark indicates it is the closest open-source model to GPT-4 Turbo.
    • Version 1.5 was trained on an additional 1.4 trillion tokens of code-related data.
    • Improved performance in natural language, programming, and math reasoning.
  • Future Plans:
    • Hint of a bigger and stronger coder model in development.
  • Accessing DeepSeek Coder:
    • Available on Hugging Face and LM Studio.
    • Detailed instructions provided for downloading and using the new model (a loading sketch follows this list).
    • Fine-tuned on 2 billion tokens of instruction data.
    • Pre-trained with a 4K context window and a next-token-prediction objective (a toy sketch also follows this list).
  • Demonstration and Resources:
    • Demonstrations of coding tasks and comparisons are available on the blog.
    • The new tech report refines the research paper and provides detailed comparisons with other models.
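
As referenced above, here is a minimal sketch of loading and prompting the instruct model through Hugging Face's `transformers` library. The repository name, prompt, and generation settings are assumptions for illustration rather than the video's exact steps:

```python
# Minimal sketch, assuming the Hugging Face repo name below; swap in the exact
# checkpoint you downloaded (LM Studio uses quantized builds with its own GUI,
# so no Python is needed on that route).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-7b-instruct-v1.5"  # assumed repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so the 7B model fits on one GPU
    device_map="auto",
)

# Instruct checkpoints ship a chat template; apply it to build the prompt.
messages = [{"role": "user",
             "content": "Write a Python function that checks whether a number is prime."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```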
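And a toy illustration of the pre-training objective mentioned above: next-token prediction over a 4K-token window. The base-model repository name is an assumption, and the labels simply mirror the inputs because `transformers` performs the one-position shift internally for causal language models:

```python
# Toy sketch of the stated objective: score the model on predicting token t+1
# from tokens 0..t, truncated to the 4K context window.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-7b-base-v1.5"  # assumed base-model repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

code = "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)\n"
ids = tokenizer(code, truncation=True, max_length=4096,  # 4K context window
                return_tensors="pt").input_ids

# Passing labels=input_ids makes transformers compute the shifted
# cross-entropy loss used for next-token prediction.
loss = model(input_ids=ids, labels=ids).loss
print(f"per-token cross-entropy: {loss.item():.3f}")
```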