On the Convergence Speed of MDL Predictions for Bernoulli Sequences
Keywords: MDL, Minimum Description Length, Convergence Rate,
Prediction, Bernoulli, Discrete Model Class.
Abstract: We consider the Minimum Description Length principle for online
sequence prediction. If the underlying model class is discrete,
then the total expected square loss is a particularly interesting
performance measure: (a) this quantity is bounded, implying
convergence with probability one, and (b) it additionally
specifies a `rate of convergence'. Generally, for MDL only
exponential loss bounds hold, as opposed to the linear bounds for
a Bayes mixture. We show that this is the case even if the model
class contains only Bernoulli distributions. We derive a new upper
bound on the prediction error for countable Bernoulli classes.
This implies a small bound (comparable to the one for Bayes
mixtures) for certain important model classes. The results apply
to many Machine Learning tasks including classification and
hypothesis testing. We provide arguments that our theorems
generalize to countable classes of i.i.d. models.
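
Illustration (not from the paper): the following minimal Python sketch simulates
the setting for a small finite Bernoulli class. The MDL predictor selects the
single parameter minimizing the two-part code length -log w_i - log P_{theta_i}(x_1..t)
and predicts with it, while the Bayes mixture averages predictions under the
posterior; both cumulative square losses are measured against the true parameter.
The class {0.1,...,0.9}, the uniform prior, and the choice of loss are illustrative
assumptions, not the paper's exact construction.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative discrete Bernoulli class with prior weights w_i;
# the true parameter theta_star is assumed to lie in the class.
thetas = np.array([0.1, 0.3, 0.5, 0.7, 0.9])        # candidate Bernoulli parameters
weights = np.full(len(thetas), 1.0 / len(thetas))    # prior weights w_i
theta_star = 0.7                                     # true data-generating parameter

T = 1000
x = rng.random(T) < theta_star                       # Bernoulli(theta_star) sequence

log_w = np.log(weights)
log_lik = np.zeros(len(thetas))                      # log P_theta(x_1..t), updated online
sq_loss_mdl = 0.0
sq_loss_bayes = 0.0

for t in range(T):
    # MDL: pick the single model minimizing the code length
    # -log w_i - log P_theta_i(x_1..t), i.e. maximizing log w_i + log-likelihood.
    best = np.argmax(log_w + log_lik)
    pred_mdl = thetas[best]

    # Bayes mixture: posterior-weighted average prediction.
    post = np.exp(log_w + log_lik - np.max(log_w + log_lik))
    post /= post.sum()
    pred_bayes = float(post @ thetas)

    # Instantaneous square loss against the true parameter,
    # accumulated over time (a proxy for the total expected square loss).
    sq_loss_mdl += (pred_mdl - theta_star) ** 2
    sq_loss_bayes += (pred_bayes - theta_star) ** 2

    # Online likelihood update with the next observation.
    log_lik += np.where(x[t], np.log(thetas), np.log(1 - thetas))

print(f"cumulative square loss  MDL: {sq_loss_mdl:.3f}   Bayes mixture: {sq_loss_bayes:.3f}")

Run as-is, both totals stay bounded; the paper's point is that in general the MDL
total can be exponentially larger than the Bayes-mixture total, even for Bernoulli
classes, while for certain important classes a small bound still holds.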
BibTeX Entry
@InProceedings{Hutter:04mdlspeed,
author = "J. Poland and M. Hutter",
title = "On the convergence speed of {MDL} predictions for {B}ernoulli sequences",
booktitle = "Proc. 15th International Conf. on Algorithmic Learning Theory ({ALT-2004})",
address = "Padova",
series = "LNAI",
volume = "3244",
editor = "S. Ben-David and J. Case and A. Maruoka",
publisher = "Springer, Berlin",
pages = "294--308",
year = "2004",
url = "http://www.hutter1.net/ai/mdlspeed.htm",
http = "http://arxiv.org/abs/cs.LG/0407039",
ftp = "ftp://ftp.idsia.ch/pub/techrep/IDSIA-13-04.pdf",
keywords = "MDL, Minimum Description Length, Convergence Rate,
Prediction, Bernoulli, Discrete Model Class.",
abstract = "We consider the Minimum Description Length principle for online
sequence prediction. If the underlying model class is discrete,
then the total expected square loss is a particularly interesting
performance measure: (a) this quantity is bounded, implying
convergence with probability one, and (b) it additionally
specifies a `rate of convergence'. Generally, for MDL only
exponential loss bounds hold, as opposed to the linear bounds for
a Bayes mixture. We show that this is the case even if the model
class contains only Bernoulli distributions. We derive a new upper
bound on the prediction error for countable Bernoulli classes.
This implies a small bound (comparable to the one for Bayes
mixtures) for certain important model classes. The results apply
to many Machine Learning tasks including classification and
hypothesis testing. We provide arguments that our theorems
generalize to countable classes of i.i.d. models.",
}