2 Comments
Marenco Kemp

A very interesting read. Note that DeepSeek V3 does not rank in the top 5 for coding on LMSYS (my go-to for how users perceive the efficacy of different LLMs). Unfortunately, a lot of these 'benchmarks' are manipulated.

Sjoerd Tiemensma

I don't really think that's the case here. DeepSeek V3 is their regular LLM, not their reasoning model. It ranks right around Sonnet 3.5, GPT-4o and Gemini-exp 1206, so it is up there with the other biggest and best models out there while being significantly cheaper.

I personally use aider, and on its leaderboard the reasoning model from DeepSeek sits right between o1 and Sonnet 3.5, at a fraction of the cost:

https://aider.chat/docs/leaderboards/

I get the sense that this is probably a very fair placement for R1, and time will tell whether that is actually the case. I have been using DeepSeek V3 extensively, and it's not as good as Sonnet, but it's still really strong and up there with the absolute best models available for coding.
