OpenThinker (Fully Tested) - This NEW REASONING MODEL is QUITE CRAZY!



AI Summary

Summary of Video Transcript

  • Introduction to a new model called OpenThinker.
  • OpenThinker is fully open, with model weights, dataset, and training code available.
  • It is a state-of-the-art open-data reasoning model.
  • OpenThinker comes in two sizes: 7B and 32B variants.
  • The models are based on Qwen2.5; the 32B variant builds on Qwen2.5-32B.
  • OpenThinker 32B was trained on the OpenThoughts-114k dataset.
  • The data pipeline collected reasoning traces and solution attempts for 173k questions; the 114k in the dataset name refers to the verified subset.
  • Training on the verified dataset scores higher than training on the unverified one.
  • OpenThinker performs well on benchmarks, with fewer tokens than DeepSeek’s R1 model.
  • The 7B model is also available and performs similarly to the DeepSeek distill variant.
  • The model is accessible on Ollama for local use.
  • The video demonstrates the model’s capabilities with 13 questions.
  • The 32B model performs well on general tasks and is the first <70B model to answer a word question correctly.
  • The 32B model can run on an RTX 4090.
  • Open-source research and open data are highlighted as beneficial for future research.
  • The video expresses a desire for more model variants, such as F4, to be available in the future.
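
The claim that the 32B model runs on an RTX 4090 can be sanity-checked with rough arithmetic. The sketch below estimates weight memory only, assuming a nominal 32 billion parameters and a quantization level of about 4 bits per weight (both are assumptions, not figures from the video); the KV cache and runtime overhead come on top of this. The helper name weight_gb is hypothetical.

```python
# Rough estimate of the memory needed for model weights alone.
# Assumptions (not from the video): nominal parameter count, uniform bits per weight.

RTX_4090_VRAM_GB = 24  # the RTX 4090 ships with 24 GB of VRAM


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Return weight size in decimal gigabytes: parameters * bits / 8 bits per byte."""
    return params_billion * bits_per_weight / 8


fp16 = weight_gb(32, 16)  # unquantized half precision
q4 = weight_gb(32, 4)     # hypothetical 4-bit quantization

print(f"16-bit weights: {fp16} GB, fits in 24 GB: {fp16 < RTX_4090_VRAM_GB}")
print(f"4-bit weights:  {q4} GB, fits in 24 GB: {q4 < RTX_4090_VRAM_GB}")
```

Under these assumptions the half-precision weights are far too large for a single card, while a 4-bit quantization leaves some headroom, which is consistent with the bullet above.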

Detailed Instructions and URLs

  • No specific CLI commands, website URLs, or detailed instructions were provided in the transcript.
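
Since the summary notes that the model is available for local use but gives no commands, here is a minimal sketch of querying a locally running Ollama server over its HTTP API. It assumes Ollama is listening on its default port 11434 and that the model has already been pulled under the tag openthinker:32b; the tag and the helper names build_request and ask are assumptions for illustration, not taken from the transcript.

```python
import json
import urllib.request

# Default endpoint of a local Ollama server (assumed to be running).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "openthinker:32b") -> urllib.request.Request:
    """Build a non-streaming generate request; the model tag is an assumption."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),  # a request body makes this a POST
        headers={"Content-Type": "application/json"},
    )


def ask(prompt: str, model: str = "openthinker:32b") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, ask("How many r's are in the word strawberry?") would return the model's full answer as a string. Swapping the tag for a 7B variant is the obvious adjustment on smaller GPUs.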