Convergence of Discrete MDL for Sequential Prediction
Keywords: Minimum Description Length, Sequence Prediction,
Convergence, Discrete Model Classes, Universal Induction,
Stabilization, Algorithmic Information Theory.
Abstract: We study the properties of the Minimum Description Length principle for
sequence prediction, considering a two-part MDL estimator which is chosen from
a countable class of models. This applies in particular to the important case
of universal sequence prediction, where the model class corresponds to all
algorithms for some fixed universal Turing machine (this correspondence is by
enumerable semimeasures, hence the resulting models are stochastic). We prove
convergence theorems similar to Solomonoff's theorem of universal induction,
which also holds for general Bayes mixtures. The bound characterizing the
convergence speed for MDL predictions is exponentially larger than the
corresponding bound for Bayes mixtures. We observe that there are at least
three different ways of using MDL for prediction. One of these has worse
prediction properties: its predictions converge only if the MDL estimator
stabilizes. We establish
sufficient conditions for this to occur. Finally, some immediate consequences
for complexity relations and randomness criteria are proven.
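For orientation, the following is a minimal sketch of the two estimators being compared, in generic notation; the symbols M, K(nu), nu^x and xi are illustrative and need not match the paper's exact definitions.

  % Countable model class M = {nu_1, nu_2, ...} of (semi)measures, each with
  % a prefix description length K(nu), i.e. prior weight 2^{-K(nu)}.
  % Two-part MDL estimator for observed data x: shortest code for a model
  % plus the code for the data given that model.
  \nu^{x} := \arg\min_{\nu \in \mathcal{M}} \bigl\{ K(\nu) - \log \nu(x) \bigr\}
           = \arg\max_{\nu \in \mathcal{M}} \bigl\{ 2^{-K(\nu)} \, \nu(x) \bigr\}
  % Bayes mixture over the same class:
  \xi(x) := \sum_{\nu \in \mathcal{M}} 2^{-K(\nu)} \, \nu(x)

Predicting with xi gives Solomonoff-style cumulative error bounds of order K(mu) when the data are sampled from some mu in M; the corresponding bound for predicting with nu^x scales like 2^{K(mu)}, which is the exponential gap referred to in the abstract.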
BibTeX Entry
@InProceedings{Hutter:04mdl2p,
author = "Marcus Hutter",
title = "Convergence of Discrete {MDL} for Sequential Prediction",
booktitle = "Proc. 17th Annual Conf. on Learning Theory ({COLT-2003})",
series = "Lecture Notes in Artificial Intelligence",
editor = "J. Shawe-Taylor and Y. Singer",
publisher = "Springer",
address = "Berlin",
pages = "300--314",
year = "2004",
http = "http://www.hutter1.net/ai/mdl2p.htm",
url = "http://arxiv.org/abs/cs.LG/0404057",
ftp = "ftp://ftp.idsia.ch/pub/techrep/IDSIA-03-04.pdf",
keywords = "Minimum Description Length, Sequence Prediction,
Convergence, Discrete Model Classes, Universal Induction,
Stabilization, Algorithmic Information Theory.",
abstract = "We study the properties of the Minimum Description Length principle for
sequence prediction, considering a two-part MDL estimator which is chosen from
a countable class of models. This applies in particular to the important case
of universal sequence prediction, where the model class corresponds to all
algorithms for some fixed universal Turing machine (this correspondence is by
enumerable semimeasures, hence the resulting models are stochastic). We prove
convergence theorems similar to Solomonoff's theorem of universal induction,
which also holds for general Bayes mixtures. The bound characterizing the
convergence speed for MDL predictions is exponentially larger than the
corresponding bound for Bayes mixtures. We observe that there are at least
three different ways of using MDL for prediction. One of these has worse
prediction properties: its predictions converge only if the MDL estimator
stabilizes. We establish
sufficient conditions for this to occur. Finally, some immediate consequences
for complexity relations and randomness criteria are proven.",
}