Pre-trained Large Language Models Learn Hidden Markov Models In-Context
Published
Submitted by Zhaolin Gao

Authors:
Yijia Dai,
Zhaolin Gao,
Yahya Sattar,
Sarah Dean, Jennifer J. Sun

Abstract
Hidden Markov Models (HMMs) are foundational tools for modeling sequential
data with latent Markovian structure, yet fitting them to real-world data
remains computationally challenging. In this work, we show that pre-trained
large language models (LLMs) can effectively model data generated by HMMs via
in-context learning (ICL)—their ability to infer patterns from
examples within a prompt. On a diverse set of synthetic HMMs, LLMs achieve
predictive accuracy approaching the theoretical optimum. We uncover novel
scaling trends influenced by HMM properties, and offer theoretical conjectures
for these empirical observations. We also provide practical guidelines for
scientists on using ICL as a diagnostic tool for complex data. On real-world
animal decision-making tasks, ICL achieves competitive performance with models
designed by human experts. To our knowledge, this is the first demonstration
that ICL can learn and predict HMM-generated sequences—an
advance that deepens our understanding of in-context learning in LLMs and
establishes its potential as a powerful tool for uncovering hidden structure in
complex scientific data.
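
For readers curious about the kind of synthetic comparison the abstract describes, the sketch below is not the authors' code; the state count, alphabet size, and Dirichlet parameters are illustrative assumptions. It samples a random HMM, generates an observation sequence, and uses the forward algorithm to compute the Bayes-optimal next-observation predictor, i.e. the "theoretical optimum" baseline that in-context LLM predictions would be measured against.

```python
# Minimal sketch (assumed setup, not the paper's code): sample a synthetic HMM
# and score the Bayes-optimal next-observation predictor via the forward
# algorithm. Dimensions and priors below are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_obs, T = 4, 6, 256

# Random row-stochastic transition (A) and emission (B) matrices.
A = rng.dirichlet(np.ones(n_states), size=n_states)   # A[i, j] = P(s_{t+1}=j | s_t=i)
B = rng.dirichlet(np.ones(n_obs), size=n_states)      # B[i, k] = P(o_t=k | s_t=i)
pi = rng.dirichlet(np.ones(n_states))                 # initial state distribution

# Sample a length-T observation sequence from the HMM.
obs = []
s = rng.choice(n_states, p=pi)
for _ in range(T):
    obs.append(rng.choice(n_obs, p=B[s]))
    s = rng.choice(n_states, p=A[s])

# Forward algorithm: maintain the belief P(s_t | o_1..o_{t-1}); combining it
# with the emission matrix gives the optimal predictive distribution over o_t.
belief = pi.copy()
correct = 0
for o in obs:
    pred = belief @ B                      # P(o_t = k | o_1..o_{t-1})
    correct += int(np.argmax(pred) == o)   # greedy accuracy of the oracle predictor
    belief = belief * B[:, o]              # condition on the observed symbol
    belief = belief @ A                    # propagate one step through the dynamics
    belief /= belief.sum()                 # renormalize to a probability vector

print(f"Oracle greedy accuracy over {T} steps: {correct / T:.3f}")
```

In a setup like the one the abstract describes, the same observation symbols would be serialized into a prompt so that a pre-trained LLM's in-context next-token predictions can be scored against this forward-algorithm oracle.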