Beyond Chatbots: AI Models That Think More…

Jan 21

How to Understand, Prompt, and Run Your Own Local Reasoning Models (Like OpenAI's o1)

2 Comments

A very interesting read. Note that DeepSeek V3 does not rank in the top 5 for coding on LMSYS (my go-to for how users perceive the efficacy of different LLMs). Unfortunately, a lot of these 'benchmarks' are manipulated.

Sjoerd Tiemensma

Jan 21

I don't really think that's the case here. DeepSeek V3 is their regular LLM, not their reasoning model. It ranks right around Sonnet 3.5, GPT-4o, and Gemini-exp 1206, so it is up there with the other biggest and best models out there while being significantly cheaper.

I personally use aider, and on its leaderboard DeepSeek's reasoning model sits right between o1 and Sonnet 3.5, at a fraction of the cost:

https://aider.chat/docs/leaderboards/

I get the sense that this is probably a very fair placement for R1, and time will tell whether that is actually the case. I have been using DeepSeek V3 extensively, and while it's not as good as Sonnet, it's still really strong and up there with the absolute best models available for coding.
