Pre-trained Large Language Models Learn Hidden Markov Models In-Context
Published
Submitted by Zhaolin Gao

Authors:
Yijia Dai,
Zhaolin Gao,
Yahya Sattar,
Sarah Dean, Jennifer J. Sun

Abstract
Hidden Markov Models (HMMs) are foundational tools for modeling sequential
data with latent Markovian structure, yet fitting them to real-world data
remains computationally challenging. In this work, we show that pre-trained
large language models (LLMs) can effectively model data generated by HMMs via
in-context learning (ICL)—their ability to infer patterns from
examples within a prompt. On a diverse set of synthetic HMMs, LLMs achieve
predictive accuracy approaching the theoretical optimum. We uncover novel
scaling trends influenced by HMM properties, and offer theoretical conjectures
for these empirical observations. We also provide practical guidelines for
scientists on using ICL as a diagnostic tool for complex data. On real-world
animal decision-making tasks, ICL achieves competitive performance with models
designed by human experts. To our knowledge, this is the first demonstration
that ICL can learn and predict HMM-generated sequences—an
advance that deepens our understanding of in-context learning in LLMs and
establishes its potential as a powerful tool for uncovering hidden structure in
complex scientific data.
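
For readers curious about the kind of synthetic comparison the abstract describes, the sketch below is not the authors' code; the state count, alphabet size, and Dirichlet parameters are illustrative assumptions. It samples a random HMM, generates an observation sequence, and uses the forward algorithm to compute the Bayes-optimal next-observation predictor, i.e. the "theoretical optimum" baseline that in-context LLM predictions would be measured against.

```python
# Minimal sketch (assumed setup, not the paper's code): sample a synthetic HMM
# and score the Bayes-optimal next-observation predictor via the forward
# algorithm. Dimensions and priors below are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_obs, T = 4, 6, 256

# Random row-stochastic transition (A) and emission (B) matrices.
A = rng.dirichlet(np.ones(n_states), size=n_states)   # A[i, j] = P(s_{t+1}=j | s_t=i)
B = rng.dirichlet(np.ones(n_obs), size=n_states)      # B[i, k] = P(o_t=k | s_t=i)
pi = rng.dirichlet(np.ones(n_states))                 # initial state distribution

# Sample a length-T observation sequence from the HMM.
obs = []
s = rng.choice(n_states, p=pi)
for _ in range(T):
    obs.append(rng.choice(n_obs, p=B[s]))
    s = rng.choice(n_states, p=A[s])

# Forward algorithm: maintain the belief P(s_t | o_1..o_{t-1}); combining it
# with the emission matrix gives the optimal predictive distribution over o_t.
belief = pi.copy()
correct = 0
for o in obs:
    pred = belief @ B                      # P(o_t = k | o_1..o_{t-1})
    correct += int(np.argmax(pred) == o)   # greedy accuracy of the oracle predictor
    belief = belief * B[:, o]              # condition on the observed symbol
    belief = belief @ A                    # propagate one step through the dynamics
    belief /= belief.sum()                 # renormalize to a probability vector

print(f"Oracle greedy accuracy over {T} steps: {correct / T:.3f}")
```

In a setup like the one the abstract describes, the same observation symbols would be serialized into a prompt so that a pre-trained LLM's in-context next-token predictions can be scored against this forward-algorithm oracle.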