Paper title

Truly Bayesian Entropy Estimation

Authors

Ioannis Papageorgiou, Ioannis Kontoyiannis

Abstract

Estimating the entropy rate of discrete time series is a challenging problem with important applications in numerous areas including neuroscience, genomics, image processing and natural language processing. A number of approaches have been developed for this task, typically based either on universal data compression algorithms, or on statistical estimators of the underlying process distribution. In this work, we propose a fully-Bayesian approach for entropy estimation. Building on the recently introduced Bayesian Context Trees (BCT) framework for modelling discrete time series as variable-memory Markov chains, we show that it is possible to sample directly from the induced posterior on the entropy rate. This can be used to estimate the entire posterior distribution, providing much richer information than point estimates. We develop theoretical results for the posterior distribution of the entropy rate, including proofs of consistency and asymptotic normality. The practical utility of the method is illustrated on both simulated and real-world data, where it is found to outperform state-of-the-art alternatives.
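The idea of sampling from the induced posterior on the entropy rate can be illustrated with a much simpler model than the one used in the paper. The sketch below assumes a fixed first-order Markov chain with an independent Dirichlet prior on each row of the transition matrix. It is not the BCT framework, which additionally places a posterior over variable-memory context-tree models. All function names, the prior value of 0.5, and the power-iteration shortcut are assumptions made for illustration only.

```python
import math
import random


def simulate_chain(P, n, rng):
    """Simulate n symbols from a first-order Markov chain with transition matrix P."""
    xs, x = [], 0
    for _ in range(n):
        xs.append(x)
        u, acc = rng.random(), 0.0
        nxt = len(P) - 1  # fallback guards against rounding in the cumulative sum
        for b, p in enumerate(P[x]):
            acc += p
            if u < acc:
                nxt = b
                break
        x = nxt
    return xs


def transition_counts(xs, m):
    """Count first-order transitions a -> b in a sequence over {0, ..., m-1}."""
    counts = [[0] * m for _ in range(m)]
    for a, b in zip(xs, xs[1:]):
        counts[a][b] += 1
    return counts


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    gs = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gs)
    return [g / total for g in gs]


def stationary(P, iters=500):
    """Approximate the stationary distribution by power iteration."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(iters):
        pi = [sum(pi[a] * P[a][b] for a in range(m)) for b in range(m)]
    return pi


def entropy_rate(P):
    """Entropy rate in bits: H = -sum_a pi(a) sum_b P(a,b) log2 P(a,b)."""
    pi = stationary(P)
    return -sum(
        pi[a] * p * math.log2(p)
        for a in range(len(P))
        for p in P[a]
        if p > 0
    )


def posterior_entropy_samples(xs, m, n_samples=1000, prior=0.5, seed=0):
    """Sample the entropy rate under a Dirichlet posterior on each transition row.

    Assumption: fixed first-order model, symmetric Dirichlet(prior) on every row.
    """
    rng = random.Random(seed)
    counts = transition_counts(xs, m)
    samples = []
    for _ in range(n_samples):
        P = [sample_dirichlet([c + prior for c in row], rng) for row in counts]
        samples.append(entropy_rate(P))
    return samples


# Usage: simulate a binary chain, then summarise the posterior of its entropy rate.
data = simulate_chain([[0.9, 0.1], [0.2, 0.8]], 5000, random.Random(1))
draws = sorted(posterior_entropy_samples(data, 2, n_samples=200, seed=2))
print("posterior mean:", sum(draws) / len(draws))
print("approx. 95% interval:", (draws[4], draws[194]))
```

Each posterior draw of the transition matrix yields one draw of the entropy rate, so the collection of draws approximates the whole posterior rather than a single point estimate. That is the kind of richer summary the abstract refers to, here under far stronger modelling assumptions.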
