Are the various LLM leaderboards still worth using as a reference?

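There are all kinds of LLM leaderboards now, but people don't seem to trust them much, and the rankings sometimes don't match hands-on experience.
So if I'm looking for a model with strong coding ability, which leaderboard is the more useful reference?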

Here are the leaderboards I currently follow:

artificialanalysis.ai

AI Model & API Providers Analysis | Artificial Analysis

Comparison and analysis of AI models and API hosting providers. Independent benchmarks across key performance metrics including quality, price, output speed & latency.

WebDev AI Leaderboard - Best AI Models for Web Development

View overall rankings across AI models on front-end web development tasks, including agentic coding workflows that require multi-step reasoning and tool use.

DeepSWE

DeepSWE measures frontier coding agents on original, long-horizon software engineering tasks.

5 posts - 4 participants

Source: LinuxDo latest topics