Mistral’s new 7B Model with Native Function Calling
AI Summary: Mistral 7B Model Update
- New Release:
  - Mistral released a new version of their 7B model (v0.3).
  - It appeared on Hugging Face without any prior announcement.
- Model Changes:
  - Includes a new base model, not just an instruct fine-tune.
  - No extensive benchmarks are available yet.
  - Preliminary benchmarks show Llama 3 outperforming it in certain areas.
- New Features:
  - Support for function calling, backed by an extended tokenizer vocabulary.
  - New special tokens related to tool calls.
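As a rough illustration of the new tool-related tokens, the sketch below assembles a prompt that advertises a tool to the model by hand. In practice the official tokenizer's chat template should do this for you; the exact spacing, and the `get_weather` tool itself, are assumptions made for illustration.

```python
import json

# Special token names from the released v0.3 tokenizer; the exact
# whitespace handling around them is an assumption.
AVAILABLE_TOOLS = "[AVAILABLE_TOOLS]"
AVAILABLE_TOOLS_END = "[/AVAILABLE_TOOLS]"
INST, INST_END = "[INST]", "[/INST]"

def build_tool_prompt(tools: list, user_message: str) -> str:
    """Sketch of a v0.3-style prompt that lists available tools
    before the user's instruction."""
    return (
        f"{AVAILABLE_TOOLS}{json.dumps(tools)}{AVAILABLE_TOOLS_END}"
        f"{INST} {user_message} {INST_END}"
    )

# Hypothetical tool definition in the common JSON-schema style.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

prompt = build_tool_prompt([weather_tool], "What's the weather in Paris?")
print(prompt)
```

For real use, prefer the tokenizer's own `apply_chat_template` (or Mistral's SDK), which handles these tokens and their exact placement for you.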
- Testing the Model:
  - A notebook is provided for testing the instruct model.
  - Several ways to run it: Hugging Face Transformers directly, the Transformers pipeline, and Mistral's own SDK.
  - The new tokenizer has 768 additional tokens, possibly covering other languages.
  - The model keeps the alternating user/assistant message format and still has no system role.
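The user/assistant-only format noted above can be enforced with a small validator before sending a conversation to the model. This helper is illustrative, not part of any Mistral library:

```python
def validate_mistral_messages(messages: list) -> None:
    """Check the chat format Mistral instruct models expect:
    strictly alternating user/assistant turns, starting with a
    user turn, and no 'system' role (unlike many chat templates)."""
    if not messages:
        raise ValueError("conversation must contain at least one message")
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role == "system":
            raise ValueError("Mistral instruct templates have no system role")
        expected = "user" if i % 2 == 0 else "assistant"
        if role != expected:
            raise ValueError(f"message {i}: expected {expected!r}, got {role!r}")

# Valid: alternating turns, starting with the user.
validate_mistral_messages([
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "What is 2 + 2?"},
])
```

A common workaround for the missing system role is to fold system-style instructions into the first user message.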
- Performance Observations:
  - Llama 3 is generally stronger, but Mistral is less censored.
  - Mistral models favor bullet-point output formats.
  - Companies are building fine-tuning datasets with specific output styles.
- Functionality:
  - The model handles math problems well when they are broken down into steps.
  - Shows improvement in function calling, especially with ReAct-style prompts.
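A ReAct-style tool loop can be sketched as follows. Here `fake_model` is a scripted stand-in for the LLM, and the Thought/Action/Observation layout is the generic ReAct convention rather than anything Mistral-specific; everything in this block is a hypothetical illustration.

```python
import json
import re

def get_weather(city: str) -> str:
    # Stub tool; a real implementation would call a weather API.
    return f"Sunny, 22C in {city}"

TOOLS = {"get_weather": get_weather}

def fake_model(prompt: str) -> str:
    """Scripted stand-in for the LLM's ReAct-style outputs."""
    if "Observation:" not in prompt:
        return ('Thought: I need the weather.\n'
                'Action: get_weather\n'
                'Action Input: {"city": "Paris"}')
    return "Final Answer: It is sunny and 22C in Paris."

def react_loop(question: str, max_steps: int = 3) -> str:
    """Alternate model calls and tool calls until a final answer."""
    prompt = f"Question: {question}"
    for _ in range(max_steps):
        output = fake_model(prompt)
        if "Final Answer:" in output:
            return output.split("Final Answer:", 1)[1].strip()
        action = re.search(r"Action: (\w+)", output).group(1)
        args = json.loads(re.search(r"Action Input: (.+)", output).group(1))
        observation = TOOLS[action](**args)
        prompt += f"\n{output}\nObservation: {observation}"
    raise RuntimeError("no final answer within step budget")

print(react_loop("What's the weather in Paris?"))
# → It is sunny and 22C in Paris.
```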
- Mistral Inference Package:
  - Allows downloading the model weights from Hugging Face.
  - Includes a command-line interface and example code for chat completion.
  - Function calling uses the special tokens for tool-related interactions.
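To illustrate the tool-call convention on the output side, here is a minimal, unofficial parser for decoded text that begins with the `[TOOL_CALLS]` special token. It assumes the output is decoded with special tokens kept and that the payload is a JSON list, which matches the released tokenizer's convention but is a sketch, not the official parser.

```python
import json

TOOL_CALLS = "[TOOL_CALLS]"  # special token the v0.3 model emits before tool calls

def parse_tool_calls(decoded_output: str):
    """Extract tool calls from decoded model output, or return None
    for a plain-text answer. Assumes a JSON list follows the
    [TOOL_CALLS] token (a sketch of the convention)."""
    if TOOL_CALLS not in decoded_output:
        return None
    payload = decoded_output.split(TOOL_CALLS, 1)[1]
    # The model may append an end-of-sequence token after the JSON.
    payload = payload.replace("</s>", "").strip()
    return json.loads(payload)

out = '[TOOL_CALLS][{"name": "get_weather", "arguments": {"city": "Paris"}}]</s>'
calls = parse_tool_calls(out)
print(calls[0]["name"])  # → get_weather
```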
- Licensing and Access:
  - The model is under the Apache 2.0 license, but downloading it from Hugging Face requires opting in with personal information.
- Conclusion:
  - Mistral models are a good base for fine-tuning and specific output styles.
  - Llama 3 8B may be the better choice for general use.
  - The community is expected to explore and fine-tune the new Mistral model further.
For more details, users can refer to the provided notebook and test the model themselves.