Qwen2 large model architecture diagram

Hands-on test: Tencent Hunyuan T1 (official release) vs. DeepSeek vs. Qwen2.5-Max: which has the strongest reasoning?

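The contenders in this test are the well-known DeepSeek R1 and Qwen2.5-Max, plus Tencent's newly released Hunyuan T1 official version. We start with a simple reasoning problem as a warm-up. Test question 1: Who is lying? There are three people ...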

ByteDance advances DeepSeek work in AI reasoning with open-source project led by intern

2024 using Alibaba Group Holding’s Qwen2.5-32B base model, compared with 47 points attained by R1 when applying the same Alibaba model, the paper showed. Alibaba owns the South China Morning Post.

站长之家 · 3d

Fin-R1: a financial large model trained with reinforcement learning on Qwen2.5-7B, beating industry giants with 7B parameters

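This finance-specific large model, built on Qwen2.5-7B and trained with reinforcement learning, reaches leading results on multiple financial benchmarks. Notably, with only 7B parameters, Fin-R1 outperforms most competitors of the same size and even those tens of times larger.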

51CTO · 3d

Inside DeepSeek R1-Zero's training method, plus a minimalist improvement to GRPO

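Analysis shows that DeepSeek-V3-Base already exhibits an "Aha moment", while Qwen2.5 base models show strong reasoning ability even without a prompt template, which suggests a potential pretraining bias. In addition, the authors, in Group Relative Policy Optimization (GRPO ...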

GitHub · 2d

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

Your access to and use of this dataset are at your own risk. We do not guarantee the accuracy of this dataset. The dataset is provided “as is” and we make no warranty or representation to you with ...

GitHub · 6d

modelscope/awesome-deep-reasoning

Qwen-QwQ - Qwen 2.5 official repository, with QwQ. S1 from Stanford - From Fei-Fei Li's team, a distillation and test-time compute implementation which can match the performance of O1 and R1.
