ThirdBrAIn.tech

o3 & o4-Mini NEW SOTA LLMs! BEST Coding Model Ever + Tool Use (Fully Tested)

Apr 17, 2025 · 1 min read

AI Summary

OpenAI Model Updates

Launch of two new models: O3 and O4 Mini.

O3: Most powerful reasoning model, excels in coding, math, and visual analysis.

Pricing: $10/1M input tokens, $2.50/1M cached input tokens, $40/1M output tokens.

20% fewer major errors than O1 on difficult, real-world tasks.

Ideal for programming, business, and creative tasks.

O4 Mini: Cost-efficient, high throughput model.

Pricing: $1.10/1M input tokens, $0.275/1M cached input tokens, $4.40/1M output tokens.

Surpasses O3 Mini and offers great pricing/value.
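To make the price gap concrete, here is a minimal sketch of a per-request cost estimate using the per-million-token rates listed above ($10 / $2.50 / $40 for O3 and $1.10 / $0.275 / $4.40 for O4 Mini). The names `Pricing`, `PRICES`, and `estimate_cost` are hypothetical helpers written for this illustration and are not part of any OpenAI SDK. The sketch also assumes that cached input tokens are a subset of the total input tokens and are billed at the cached rate instead of the regular input rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1M tokens (rates taken from the summary above)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


# Hypothetical lookup table; keys are labels chosen for this example.
PRICES = {
    "o3": Pricing(input_per_m=10.00, cached_input_per_m=2.50, output_per_m=40.00),
    "o4-mini": Pricing(input_per_m=1.10, cached_input_per_m=0.275, output_per_m=4.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_input_tokens is counted inside input_tokens and is
    billed at the cached rate; the remainder is billed at the full input rate.
    """
    if cached_input_tokens < 0 or cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    fresh_input = input_tokens - cached_input_tokens
    total = (
        fresh_input * price.input_per_m
        + cached_input_tokens * price.cached_input_per_m
        + output_tokens * price.output_per_m
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Same workload on both models: 10,000 input tokens, 2,000 output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 10_000, 2_000):.4f}")
```

Under these rates, the same workload costs roughly nine times less on O4 Mini than on O3, which is consistent with the summary's recommendation of O4 Mini for cost-sensitive coding tasks.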

Key benchmark scores:

O3: 69.1% on SWE-bench Verified, leading in reasoning tasks.

O4 Mini: 93.4% on the AIME 2024 and 2025 benchmarks; excellent at math, coding, and reasoning.

Recommendations:

O4 Mini is suggested for coding tasks due to its efficiency and cost benefits.

Further developments are anticipated, including O3 Pro and GPT-5, expected in July.

Testing the models across a range of prompts demonstrated strengths in coding and reasoning tasks.

Tasks included creating a modern note-taking app, the Game of Life, SVG design, math problem-solving, and logical deduction scenarios.

Overall, the models are powerful and offer significant advancements over previous generations, at prices that are competitive for the capabilities offered.

openai-o4-mini
03
openai-o3
new-openai-ai-coder
openai-codex
new-openai-coder-free
new-free-openai-coder
new-ai-coder
openai-o4-mini-coder
openai-o3-coder-free
new-ai-coder-fully-free
latest-ai-news
artificial-intelligence
o4-mini
ai-news
worldofai
chatgpt
chatgpt-o3
chatgpt-o4-mini
gpt-4-mini
ai-benchmarks
ai-model-comparison
google-gemini-2.5-pro
gemini-vs-openai
ai-tool-use
agentic-ai
ai-web-search
ai-performance
new-ai-models
codex-cli
