Mistral’s new 7B Model with Native Function Calling



Summary: Mistral 7B Model Update

  • New Release:
    • Mistral released a new version (v0.3) of their 7B model.
    • Available on Hugging Face without prior announcement.
  • Model Changes:
    • Includes a new base model, not just fine-tuning.
    • No extensive benchmarks available yet.
    • Preliminary benchmarks show Llama 3 outperforms in certain areas.
  • New Features:
    • Support for function calling with an extended tokenizer vocabulary.
    • New special tokens related to tool calls.
  • Testing the Model:
    • A notebook is provided for testing the instruct model.
    • Different ways to run the model: Hugging Face transformers directly, the transformers pipeline, and Mistral's own SDK.
    • The new tokenizer has 768 more tokens (32,768 total), possibly to cover additional languages.
    • The model keeps the alternating user/assistant message format and does not support a system role.
  • Performance Observations:
    • Llama 3 is generally better, but Mistral is less censored.
    • Mistral models favor bullet-point formats.
    • Companies are creating fine-tuned datasets with specific output styles.
  • Functionality:
    • The model handles math problems well when broken down into steps.
    • Shows improvement in function calling, especially with ReAct-style prompts.
  • Mistral Inference Package:
    • Allows downloading model weights from Hugging Face.
    • Includes a command-line interface and code for chat completion.
    • Function calling uses special tokens for tool-related interactions.
  • Licensing and Access:
    • Model is under the Apache 2.0 license, but downloading requires opting in with personal information.
  • Conclusion:
    • Mistral models are good for fine-tuning and specific styles.
    • Llama 3 8B may be a better choice for general use.
    • The community is expected to explore and fine-tune the new Mistral model further.
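Since the model keeps a plain user/assistant format with no system role, a common workaround is to fold any system prompt into the first user turn before sending the conversation to the model. A minimal sketch of that idea (the helper name and merging style are illustrative, not part of Mistral's API):

```python
def merge_system_prompt(messages):
    """Fold a leading system message into the first user turn.

    Mistral's chat format only accepts alternating user/assistant
    roles, so system-style instructions are typically prepended to
    the first user message instead. Assumes the turn that follows a
    system message is a user turn.
    """
    if not messages or messages[0]["role"] != "system":
        return [dict(m) for m in messages]  # nothing to merge
    system_text = messages[0]["content"]
    merged = []
    for i, msg in enumerate(messages[1:]):
        if i == 0 and msg["role"] == "user":
            # Prepend the system instructions to the first user message.
            merged.append({
                "role": "user",
                "content": f"{system_text}\n\n{msg['content']}",
            })
        else:
            merged.append(dict(msg))
    return merged


chat = [
    {"role": "system", "content": "Answer concisely."},
    {"role": "user", "content": "What is Mistral 7B?"},
]
chat = merge_system_prompt(chat)
# chat now starts with a single user turn containing both texts
```

The merged list can then be passed to any of the usual runners (a transformers pipeline, or Mistral's SDK) without triggering an unsupported-role error.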

For more details, users can refer to the provided notebook and test the model themselves.
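To make the function-calling flow concrete, here is a rough sketch of how the new special tokens compose a prompt. The authoritative template lives in Mistral's tokenizer/SDK, so the exact token order and spacing below are an approximation, and `get_weather` is a hypothetical tool:

```python
import json

# Hypothetical tool schema, in the OpenAI-style function format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def build_tool_prompt(user_message, tools):
    """Approximate a v0.3-style tool prompt.

    [AVAILABLE_TOOLS] ... [/AVAILABLE_TOOLS] advertises the tool
    schemas to the model; the user turn is wrapped in the usual
    [INST] ... [/INST] pair. The real formatting is handled by the
    tokenizer, so treat this layout as illustrative.
    """
    return (
        f"[AVAILABLE_TOOLS]{json.dumps(tools)}[/AVAILABLE_TOOLS]"
        f"[INST] {user_message} [/INST]"
    )


prompt = build_tool_prompt("What is the weather in Paris?", TOOLS)
# When the model decides to call a tool, it responds with a
# [TOOL_CALLS] segment containing JSON such as:
#   [{"name": "get_weather", "arguments": {"city": "Paris"}}]
# which the caller parses, executes, and feeds back wrapped in
# [TOOL_RESULTS] ... [/TOOL_RESULTS].
```

In practice you would let the tokenizer or Mistral's SDK render this template rather than building the string by hand; the sketch only shows where the new special tokens fit.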