Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling

Published
Submitted by Mikhail Burtsev
Authors: Ivan Rodkin, Daniil Orel, Konstantin Smirnov, Arman Bolatov, Bilal Elbouardi, Besher Hassan, Yuri Kuratov, Aydar Bulatov, Preslav Nakov, Timothy Baldwin, Artem Shelmanov, Mikhail Burtsev

Abstract

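Reasoning is a core capability of large language models, yet understanding how they learn and perform multi-step reasoning remains an open problem. In this study, we explore how different architectures and training methods affect a model's **multi-step reasoning capabilities** within a cellular automata framework. By training on **state sequences** generated with random Boolean functions from random initial conditions, which excludes memorization, we demonstrate that most neural architectures learn to abstract the underlying rules. While models achieve high accuracy in **next-state prediction**, their performance declines sharply when **multi-step reasoning** is required. We confirm that increasing model **depth** plays a crucial role in sequential computation. We demonstrate that extending the effective model depth with **recurrence, memory, and test-time compute scaling** **substantially enhances reasoning capabilities**.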
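The data setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes a one-dimensional binary automaton with periodic boundaries, and the names `random_rule`, `step`, `rollout`, and `make_example`, as well as the parameters `width`, `radius`, `history`, and `k`, are hypothetical choices made for this example. It shows how a fresh random Boolean rule and a random initial state yield a sequence that cannot be memorized, and how a target `k` steps ahead differs from a next-state target.

```python
import random

def random_rule(radius, rng):
    """Draw a random Boolean function over neighborhoods of 2*radius + 1 cells.

    The rule is a lookup table with one output bit per possible neighborhood.
    """
    table_size = 2 ** (2 * radius + 1)
    return [rng.randint(0, 1) for _ in range(table_size)]

def step(state, rule, radius):
    """Apply the rule once to every cell (assumed periodic boundaries)."""
    n = len(state)
    next_state = []
    for i in range(n):
        index = 0
        # Read the neighborhood left to right as a binary number.
        for offset in range(-radius, radius + 1):
            index = (index << 1) | state[(i + offset) % n]
        next_state.append(rule[index])
    return next_state

def rollout(state, rule, radius, k):
    """Apply the rule k times; k = 1 is next-state prediction."""
    for _ in range(k):
        state = step(state, rule, radius)
    return state

def make_example(width, radius, history, k, rng):
    """Build one sample: `history` observed states plus the state k steps later."""
    rule = random_rule(radius, rng)
    state = [rng.randint(0, 1) for _ in range(width)]
    states = [state]
    for _ in range(history - 1):
        states.append(step(states[-1], rule, radius))
    target = rollout(states[-1], rule, radius, k)
    return states, target, rule

if __name__ == "__main__":
    rng = random.Random(0)
    states, target, rule = make_example(width=16, radius=1, history=5, k=3, rng=rng)
    for s in states:
        print("".join(map(str, s)))
    print("target (3 steps ahead):", "".join(map(str, target)))
```

Because the rule is redrawn for every sample, a model has to infer it from the observed states. With `k = 1` the target needs one application of the inferred rule, while larger `k` requires composing it sequentially, which is where the abstract reports the sharp drop in accuracy.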

Comments

Mikhail Burtsev (paper author, paper submitter) posted eight screenshots of the paper PDF (Beyond Memorization Extending Reasoning Depth with Recurrence Memory and Test-Time Compute Scaling - 2508.16745v1.pdf), timestamped 2025-08-26 08-36-20 through 08-39-48.