This strictly mathematical and information-theoretic book seeks to solve the quest for the optimal and universal algorithm of intelligent behavior and, to the extent that I have been able to verify and check this bold attempt, succeeds.
The book basically derives this result by cleverly unifying two well-known but different realms, sequential decision theory and Solomonoff's theory of universal induction, to arrive at a parameter free model of a certain universal agent that, as is argued, behaves optimally in any environment.
So, the case is closed from a mathematician's point of view, but not for any pragmatist trying to actually implement the theory (for example, the mathematical model) in software. The real drawback (I have to credit the author for openly admitting this) consists of, first, the fact that the theory turns out to be incomputable (in the usual Turing/partial recursive sense), and, second, that asymptotic approximations to the true theory, while they exist, converge awfully slowly, with no clear criterion available for measuring the degree of convergence.
The book provides a clear exposition for nonspecialists interested in the foundations of AI. The author succeeds in this remarkable feat by providing a short tour through the whole book, right at the beginning (chapter 1); concisely and carefully introducing all theoretical concepts, guaranteeing a fairly self-contained work, with ample room given to references in the particular field and the history thereof; and concluding the book (chapter 8) with a through discussion of what has been achieved, what problems, assumptions, or limitations of the chosen approach are worth noting, and what open questions still remain to be answered.
Before touching on the contents of the individual chapters of the book, I want to highlight the semantics of the three basic definitions underlying the theory: intelligence, universality, and optimality. "Intelligence" or "intelligent behavior" is modeled as goal driven, in the sense of maximizing some utility function. An algorithm is "universal" if it is free of any parameters, and does not depend on the specific environment in which it (or the respective agent) operates. The algorithm is "optimal" with respect to a suitable intelligence order relation introduced by the author, which, as he argues and believes, is reasonable and true.
Chapter 2 introduces the necessary ingredients and constructs from algorithmic information theory (namely, Kolmogorov complexity), algorithmic probability, and universal induction (namely, Solomonoff's universal prior and Martin-Löf randomness).
Chapter 3 focuses on universal sequence prediction, based on a suitable (weighted) approximation of the true but unknown prior probability distribution (in the sense of Bayes). Error and loss bounds are derived for predictors based on this approximation, and some relations to games of chance are shown.
Chapter 4 introduces the concept of an agent, as a system that cyclically interacts with an environment, where the action (output) of the agent in a given cycle is determined by a suitable policy that only depends on the input/output history of the preceding cycles. In addition, the policy depends on the prior probability distribution.
Chapter 5 generalizes the agent model to the approximated prior probability distribution introduced in chapter 3, and argues that this agent represents a universal optimum with regard to several reasonable concepts of optimality, namely, consistency, self-tunability, self-optimization, efficiency, unbiasedness, and various others.
Chapter 6 expands the scope of the optimality proof for the agent model, by applying it to a number of problem classes, such as sequence prediction, strategic games, function minimization problems, and supervised learning from examples.
Chapter 7 addresses the computational aspects of the still noncomputable theoretical model (for instance, introducing the notion of the fastest algorithm), and refines it further to develop a time- and length-bounded, but optimally fast, algorithm for implementing the intelligent agent model.
Chapter 8 reflects on and critically discusses the achievements, as well as some possibly problematic issues and controversial assumptions, and presents an outlook on still open questions. An extensive bibliography and index conclude this remarkable book.
This work is a rare example of a book that really fulfills the goals stated on its blurb: "to excite a broader AI audience about abstract algorithmic information theory concepts, and conversely to inform theorists about exciting applications to AI."
Reviewer: Christoph F. Strnadl
Review #: CR131175