Artificial Intelligence Page of Marcus Hutter

Marcus' AI Recommendation Page

Short AI Reader's Guide

Long AI Reader's Guide

Recommended Textbooks

Undergraduate Studies Guide
Undergraduate Books and Courses
AGI Job Roles

Update: The study recommendations below are tailored to students interested in deeply understanding what artificial intelligence is, and/or who want to become great AGI Research Scientists. Since DeepMind (*2010), OpenAI (*2015), and ChatGPT (*2022), there is now a proliferation of diverse AGI job opportunities with different skill requirements than outlined below.

Short AI Reader's Guide

Please find below suggestions for some textbooks which I found most relevant for understanding and modeling intelligent behavior in general, and for developing the AIXI model in particular. If you are confused by the amount, diversity or complexity of the references below, I suggest you to start with the Reinforcement Learning book by Sutton and Barto. It requires no background knowledge, describes the key ideas, open problems, and great applications of this field. Don't be surprised about the ease of the book, it teaches understanding, not proofs. It gets really tough to make things work in practice and to prove things. The Artificial Intelligence book by Russell and Norvig gives a comprehensive overview over AI in general. The Kolmogorov Complexity book by Li and Vitanyi is an excellent introduction to algorithmic information theory. If you have some background knowledge in decision theory and algorithmic information theory you may be interested in the Theory of Universal Artificial Intelligence.

Long AI Reader's Guide

For the impatient. If you are the sort of impatient student who wants to build super intelligent machines right away without "wasting" time reading or learning too much, well, others have tried in the last 50 years and failed, and so will you. If you can't hold back, at least read Legg (2008) [Leg08]. This is an excellently written non-technical thesis on the necessary ingredients for super intelligent machines.
     It will not help you much building one, since in order to properly understand the general theory and to bridge the gap to "narrow" but practical existing AI algorithms, you need a lot more background. Nevertheless, [Leg08] might motivate you to consider reading the books I'll recommend now.

Artificial Intelligence. Russell and Norvig (2020) [RN20] is the textbook to learn about Artificial Intelligence. The book gives a broad introduction, survey, and solid background of all aspects of AI. There is no real alternative. Whatever subarea of AI you specialize later, you should understand all introduced concepts, and have implemented and solved at least some of the exercises.
     The textbooks below are relevant for understanding and modeling general intelligent behavior. If you already got attracted to some specific AI applications, they may not be relevant for you. One axis of categorizing AI is into (1) logical (2) planning and (3) learning aspects. CSL@ANU has experts in all 3 areas. Historically, AI research started with (1) in the 1950s, which is still relevant for many concrete practical applications. Since at least in humans, high-level logical reasoning seems to emerge from the more basic learning and planning aspects, it is conceivable that (1) will play no fundamental role in a general AI system. So I will concentrate on (2) and (3). If put together, learning+planning under uncertainty is mainly the domain of reinforcement learning (RL), also called adaptive control or sequential decision theory in other fields.

Reinforcement Learning. Sutton and Barto (2018) [SB18] is the excellent default RL textbook. It requires no background knowledge, describes the key ideas, open problems, and great applications of this field. Don't be surprised about the ease of the book, it teaches understanding, not proofs. It gets really tough to make things work in practice or to prove things [BT96].
     If you want to bring order into the bunch of methods and ideas you've learned so far, and want to understand more deeply their connection either for curiosity or to extend the existing systems to more general and powerful ones, you need to learn about some concepts that at first seem quite disconnected and theoretical.

Information theory. Intelligence has a lot to do with information processing. Algorithmic information theory (AIT) is a branch of information theory that is powerful enough to serve as a foundation for intelligent information processing. It can deal with key aspects of intelligence, like similarity, creativity, analogical reasoning, and generalization, which are fundamentally connected to the induction problem and Ockham's razor principle. Li and Vitanyi's (2019) AIT book [LV97] provides an excellent introduction. Kolmogorov complexity, Minimal Description Length, universal Solomonoff induction, universal Levin search, and all that. It requires a background in theoretical computer science in general and computability theory in particular, which can be obtained from the classic textbook [HMU06].

Universal AI. Now you are in a position to read [Hut05]. This book develops a sound and complete mathematical theory of an optimal intelligent general-purpose learning agent. The theory is complete in the sense that it gives a complete description of this agent, not just an incomplete framework with gaps to be filled. But be warned, it is only a theory. Like it is a long way from e.g. the minimax theory of optimally playing games like chess to real chess programs, it is a long way from this theory to a practical general purpose intelligent agent. This journey you can find in the prequel and sequel [HQC24]. It gives a more gentle introduction to all relevant mathematical concepts, but also the theoretical and practical developments since the first book, though it omits much of the more advanced work you can find in [Hut05] and papers since then.

Machine/Deep Learning. Bishop (2006) [Bis06] is an excellent default textbook in statistical machine learning, and should be put on your reading list. Deep Neural Networks and especially Transformers have revolutionized the field of AI and brought us closer to the holy grail of Artificial General Intelligence. My book and course recommendation focusses on foundations and theoretical understanding, since these survive AI winters and trends. But there may be no further AI winter and Neural Networks may really be all we need to get to AGI, so a Deep Learning book recommendation is in order. There are many great DL coding books and courses and some not so good monographs, but Prince (2023) [Pri23] is the (only excellent) textbook on deep learning.

Peripheral Areas. The other recommended books below can be regarded as further readings that provide more background and deepen your understanding of various important aspects in AI research. Some Bayesian probability book will be useful [Pre02, Jay03]. How multiple rational agents interact [SLB08] is the domain of game theory [OR96]. Computer vision [FP02], natural language understanding [JJ08], and robotics [TBF05] interfaces abstract agents with the real world. Bostrom (2014) [Bos14] is a broad philosophical discussion of the paths towards and dangers of super-intelligent machines and mitigation strategies. Alchin (2006) [Alc06] gently and broadly introduces you to philosophy of science in general and Earman (1992) [Ear92] to the induction problem in particular. Reading enough hard science fiction is essential to broaden your horizon and keep an open mind about possible futures. Despite there being only one past but many potential futures, schools and universities only offer history classes but no futurology classes. This is absurd. Absent the latter, hard SciFi has to fill this gap, even if many ideas feel unrealistic and only a small fraction will be realized. Below is just a tiny selection.

S. Legg. Machine Super Intelligence (pdf)
Lulu, PhD Thesis (2008) [for the philosophically inclined. Won the $10000 Singularity Prize]
M. Hutter and D. Quarel and E. Catt. Introduction to Universal Artificial Intelligence (2024)
Chapman & Hall/CRC, 500 pages (2024) [prequel and sequel to ...]
M. Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability
Springer, Berlin, 300 pages (2005) [for the mathematically inclined]
J. Veness and K. S. Ng and M. Hutter and W. Uther and D. Silver. A Monte Carlo AIXI Approximation
Journal of Artificial Intelligence Research, 40 (2011) 95-142 [for the practically inclined]

Introduction to theoretical computer science

J. E. Hopcroft and R. Motwani and J. D. Ullman (2006) Introduction to Automata Theory, Language, and Computation
M. Sipser (2012) Introduction to the Theory of Computation

Introduction to and survey of AI [ANU Course COMP3620-s1]

S. J. Russell and P. Norvig (2020) Artificial Intelligence. A Modern Approach

Information Theory (Kolmogorov, Solomonoff, Levin, algorithmic, Shannon, MML, MDL, data compression, ...)

S. Rathmanner and M. Hutter (2011) A Philosophical Treatise of Universal Induction
M. Li and P. M. B. Vitanyi (2019) An introduction to Kolmogorov complexity and its applications
Cover & Thomas (2006) Elements of Information Theory
Matt Mahoney (2012) Data Compression Explained
C. S. Wallace (2005) Statistical and Inductive Inference by Minimum Message Length

Interface: Vision, Language, Robotics

David A. Forsyth and Jean Ponce (2002) Computer Vision: A Modern Approach
Daniel Jurafsky and James H. Martin (2008) Speech and Language Processing
Sebastian Thrun and Wolfram Burgard and Dieter Fox (2005) Probabilistic Robotics

Philosophy of AGI, Science, Logic, and Induction

N. Bostrom (2014) Superintelligence: Paths, Dangers, Strategies
N. Alchin (2006) Theory of Knowledge (2nd ed. not 3rd|4th!)
G. Restall (2006) Logic: An Introduction
P. Godfrey-Smith (2003) Theory and Reality: An Introduction to the Philosophy of Science
J. Earman (1992) Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory

Science Fiction

I. Asimov (1950) I, Robot
S. Lem (1957-1971) The Star Diaries and Memoirs of a Space Traveller
D. F. Jones (1966) Colossus (The 1970 movie adaptation is still the most realistic SciFi movie of AI taking over the world)
D. Adams (1979-1980) The Hitchhiker's Guide to the Galaxy and The Restaurant at the End of the Universe
P. Domingos (2024) 2040: A Silicon Valley Satire

Undergraduate Studies Guide

Fields. The study and construction of intelligent agents that learn from experience draws from many fields, including

computer science (artificial intelligence, machine learning, computation),
engineering (information theory, adaptive control),
economics (rational agents, game theory),
mathematics (statistics, probability),
psychology (behaviorism, motivation, incentives, perception, control),
biology (neuroscience, evolution),
philosophy (mind, reasoning, language, induction, knowledge),

This is particularly true for the information-theoretic reinforcement learning approach to universally intelligent agents (Universal AI). This means that you have to acquire a lot of background knowledge first. Textbooks on an advanced undergraduate or graduate level to most of these subjects have been recommended above. A background in the following subjects should be sufficient to understand them.

Topics. The following 11 items list the most relevant undergraduate background subjects: Logic and philosophy are cornerstones of reasoning. Analysis and linear algebra handle real-valued information. Probability and statistics are required for dealing with uncertainty and learning aspects. Information theory deals with data and knowledge. Programming, numerics, algorithms, and computability are needed for the algorithmic, implementation, and applied aspects of AI.

Philosophy (of mind, knowledge, science, reasoning, induction, deduction)
Most engineering grows out of science, and all science grew out of philosophy. The mind, knowledge, intelligence, rationality, reasoning, induction, etc. pose a plethora of philosophical questions. Also, the emergence of human-level AIs will have deep social, ethical, and economic consequences, and raises deep philosophical questions. Finally, some exposure to philosophy sharpens your analytical thinking, and trains you to think out of the box by questioning common assumptions which may be wrong.
Logic (predicate logic, reasoning, deduction, proof, completeness, soundness)
Critical thinking and proper argumentation is fundamental to science and a key trait of intelligence. Since logic formalizes rational arguments and deductive reasoning, it plays a vital role in the field of (good old fashioned) artificial intelligence. Even when you pursue an approach to AI where formal logic plays no direct role (like machine learning and Universal AI), some training in formal logic sharpens your analytical thinking, and the AI problem is difficult and abstract enough that you will definitely profit from it.
Analysis (real numbers and functions, inequalities, limits, differentiation, integration)
Real numbers and functions are fundamental to describing our (physical) world. Since intelligent agents usually interact with this world, some of their observations, knowledge, and states are naturally represented by real numbers or functions, e.g. battery level, degrees of belief, or trajectory of objects. Limits and derivatives are e.g. needed for dealing with temporal or spatial change and probability densities.
Linear Algebra (linear equations, vectors, matrices, determinants, eigenvectors, quadratic forms)
Most problems in linear algebra have efficient algorithmic solutions. Therefore many real-world problems in machine learning (ML) and other fields are attempted to formalize in terms of vectors and matrices and often are or can be reduced to or (locally) approximated by linear problems. Also, data is often naturally represented as (e.g. feature) vectors or (e.g. dependency) matrices.
Probability (random variable, conditional probability, expectation, measure theory, densities)
An agent's subjective uncertainty about our world (e.g. do ETs exist?) and objective random processes (e.g. dice or nuclear decay) can both be modeled by probabilities. Probability theory allows to predict the likelihood of (future) events, so is crucial for inductive reasoning.
Statistics (estimation, likelihood, prior, Bayes rule, hypotheses, central limit theorem)
Statistics estimates probabilities or models or hypotheses or related quantities from an agent's past observations (e.g. number of heads versus tails), so is also crucial for inductive reasoning. Bayes rule allows to update an agent's belief about our world given new evidence. Note that the Bayesian approach quite blurs the distinction between probability and statistics.
Programming / C (data types/structure, operators, expressions, control flow, functions, I/O)
Any theory or model or algorithm of an intelligent system has to be implemented before it can be applied in practice (analytical solutions or by hand are out of question). C has among the programming languages a status similar to English among the natural languages. Both are communication defaults. Real academics/programmers are able to read and write English/C. Note that each (GOF)AI paradigm has spawned its own special-purpose programming language (Prolog, Lisp, Scheme, Smalltalk, Haskell, ...). Currently you should add Python to your list of AI/ML programming languages.
Numerics (interpolation, integration, function evaluation, root finding, optimization, linear problems)
Numerical algorithms approximate the solution of problems that involve (functions of) real numbers. Most problems in machine learning involve maximizing or minimizing some functions, which requires optimization algorithms or, after differentiation, solving (non)linear problems.
Information Theory (entropy, information, data compression, channel capacity)
Intelligent agents are information processing systems. Agents perceive, process, and store information (e.g. bits from a video camera) from the environment and transmit information (e.g. angles for a robotic arm) to their environment. Therefore information theory plays a key role in (Universal) AI.
Algorithms (sorting, data structures, graph algorithms)
Algorithms are (finite) sequences of elementary (mathematical,logical,branching) operations. Effective solutions of virtually all AI problems involve many different and complex algorithms and data structures.
Computability (languages, automata, Turing machines, complexity classes, randomization)
Computability and complexity theory classify problems according to their inherent difficulty. They can be used to determine the computational complexity of a particular (AI) problem, i.e. whether there exists an (efficient) algorithmic solution to it. Universal Turing machines play a prominent role in Universal AI.

Undergraduate Books and Courses

Below I recommend some books and courses which take you from high school level to a college bachelor level, and should enable you to read the textbooks above. Note that just reading websites or short (Wikipedia) articles about these topic is not a substitute for working through a textbook. You need to do a fair amount of exercises, e.g. those in the books suggested below. The list below is quite comprehensive. Of course the depth, breadth, and focus can depend on what you want to do: Just implement and apply some AI or ML algorithms, understand the algorithms, understand the mathematics behind them, or even further develop the theory.
ANU. The Australian National University (ANU) offers a Bachelor of Advanced Computing. The Computational Foundations Major and the Intelligent Systems Major both have a good overlap with my book recommendations. In case you are doing one of these majors and are seriously interested in Universal AI, add as many mathematics and statistics courses/books as possible to this curriculum.
Notation. For most books I indicate level, prerequisites, and related courses at the ANU: The indicated year gives you a rough indication about the level of the subject and amount of presumed background knowledge and year in which this course could be taken, presuming a degree that follows the book selection below. Whenever available, I indicate for a book the closest matching ANU course. The course might cover only certain aspects or not be that close after all. The bold-face title books are recommended, the others are optional further reading.

Nicolas Alchin (2006) Theory of Knowledge
(0th year Uni. ANU PHIL1004-s1/PHIL2057. Prerequisites: none. Avoid 3rd&later ed. They're worse!)
This book is a fantastic introduction into (western) philosophy. If you care at all about philosophy, then this is the book to start with. It is immensely broad without being superficial, avoids obscure philosophical jargon, seems very balanced, and contains a wealth of contemporary popular material from other resources. It covers philosophies of science, arts, math, reason, history, empiricism, paradigms, culture, language, ethics, politics, religion, emotions, truth, and more. Just the collection of quotes at the beginning of each chapter are worth the book. The book makes you question everything you know. This book needs to be read very slowly. It's important to reflect on the material and discuss with others. Maybe form a reading club. This book is actually used for the International Baccalaureate. Peter Godfrey-Smith (2003) Theory and Reality is a clear and thorough undergraduate textbook introduction to the philosophy of science.

Greg Restall (2006) Logic: An Introduction
(1st year Uni. ANU COMP2620-s1/PHIL2080-s1/MATH3343-s1. Prerequisites: none.)
This beautiful little book is the ideal first introduction to logic. It is a rare (the only?) book which covers the informal philosophical aspects as well as the formal mathematical aspects of predicate logic. The former is used to discuss, motivate, and justify the latter. It even includes very accessible completeness and soundness proofs of first-order logic, and all that in just 200 pages. A more serious book, ideal for CS students, is Boolos&al (2007) Computability and Logic
Here are some further logic books: There are many good elementary introductions to logic, e.g. Nancy Rodgers (2000) Learning to Reason: An Introduction to Logic, Sets and Relations. There are also many books on formal logic. E.g. Joseph Shoenfield (1967) Mathematical Logic is compact and complete but too dry and lacks motivation. Good as a reference or second book. There are also various good informal books on the philosophical aspects of logic, critical thinking and reasoning like Michael Scriven (1977) Reasoning.

Ghorpade & Limaye (2006) A Course in Calculus and Real Analysis
(0th year Uni. ANU MATH1115-s1/MATH1116-s2. Prerequisites: none.)
There are plenty of books on Analysis or its light-weight version Calculus, and most would do. The major choice is to find one at the right level. The suggested book is very elementary. On the other hand, J.M.Dudley (2002) Real Analysis and Probability (ANU MATH2320-s1/MATH3320-s1/MATH3325-s2) is an advanced text on Analysis with the (for our purpose) appropriate emphasis on measure and probability theory.

Shilov (1977) Linear Algebra or Strang (2009) Linear Algebra and Its Applications
(0th year Uni. ANU MATH1115-s1/MATH1116-s2. Prerequisites: none.)
There are plenty of books on Linear Algebra, and most would do. The two above are classics and are still among the most popular ones.

Grinstead & Snell (2006) Introduction to Probability
(1st year Uni. ANU STAT2001-s1. Pre: calculus. free online)
This book can serve as a first introduction to probability theory. "It contains lots of examples and an easy development of theory without any sacrifice of rigor, keeping the abstraction to a minimal level. It is indeed a valuable addition to the study of probability theory."

DeGroot & Schervish (2011) Probability and Statistics
(2nd year Uni. ANU STAT3013-s2/STAT3056-s2. Pre: calculus.
3rd year Uni. MATH3029-s1 Pre: statistics & analysis)
This book is another elementary introduction to probability but also properly covers statistics.

Kernighan & Ritchie (1988) C Programming Language
(1st year Uni. ANU COMP1730-s2. Pre: access to a computer with a C compiler.)
This is a classic unexcelled introduction to (ANSI) C, written by their developers: slim, complete, and to the point. It includes an introduction to programming, to C, a reference manual, and explanation of all standard libraries, and many tips and tricks.
There's one programming language you should master (completely). And that is C. All others are optional and can be learned on demand to any desired degree. C is not very forgiving when you make errors, but this is good. It educates you to become a careful programmer. Currently Python is the most popular AI/ML programming language.

Press & al.(1992) Numerical Recipes in C
(2nd year Uni. ANU MATH3511-s1/MATH3512-s2/MATH3514. Pre: programming experience in C. higher math for some chapters. free online)
This is a classic book covering a broad range of numerical algorithms, i.e. those involving real numbers, for interpolation, integration, function evaluation, root finding, optimization, fourier transform, differential equations, and many linear algebra problems. The complete C-code is in the book. Emphasis is on comprehensibility rather than optimized black-box libraries. Indeed, the book contains and hence can serve as a compact hands-on introduction to the various mathematical fields. Chapters 1,2,3,7,8,9,10,11,14,15,20 are an absolute must, and Chapters 4,5,6 recommended. If you can't afford whole courses or books on linear algebra, analysis, statistics, etc, this single book may even serve as a poor-man's substitute, although some chapters presuppose the corresponding mathematical knowledge.

David MacKay (2003) Information Theory, Inference and Learning Algorithms
(2nd year Uni. ANU COMP2610-s2. Pre: some background in elementary statistics and probabilities and programming experience.)
Part I-III contain a gentle introduction to information theory with ample of motivation and examples. Cover & Thomas (2006) Elements of Information Theory provides a more advanced, comprehensive, and deeper treatment.

Cormen & al.(2009) Introduction to Algorithms
(2nd year Uni. ANU COMP3600-s2/COMP4600-s2. Pre: some programming experience, elementary calculus, proof by induction.)
This comprehensive book is the default textbook on data structures, efficient algorithms, and their analysis, and can also serve as a reference book.

Hopcroft & Motwani & Ullman (2006) Introduction to Automata Theory, Language, and Computation
(3rd year Uni. ANU COMP3630-s1. Pre: introduction to algorithms.)
The 3rd edition of this textbook is heavily simplified compared to its classical originals. While an introduction to algorithms focusses on concrete problems and their efficient algorithmic solutions, computability and complexity theory studies the classes of problems that can(not) be solved by various (restricted) computational devices. (Unrestricted) Turing machine computability and (non)deterministic polynomial-time computability being just the three most famous classes. The last chapter gives a brief glimpse into computational complexity theory. Arora & Barak (2009) Computational Complexity: A Modern Approach can serve as a first tour through the vast complexity zoo.

[previous] [home] [search] [science] [calculators] [personal] [contact] [up] [next]

... Marcus Hutter

Short AI Reader's Guide

Long AI Reader's Guide

Recommended Textbooks

Universal Artificial Intelligence

Introduction to theoretical computer science

Introduction to and survey of AI [ANU Course COMP3620-s1]

Information Theory (Kolmogorov, Solomonoff, Levin, algorithmic, Shannon, MML, MDL, data compression, ...)

Reinforcement Learning = Sequential Decision Theory = Adaptive control theory

Machine/deep learning [ANU Course COMP4670-s1]

Probability theory (from easy to harder?) and statistics

Game Theory = Multi-Agent Theory [ANU Course ECON2142-s2]

Interface: Vision, Language, Robotics

Philosophy of AGI, Science, Logic, and Induction

Science Fiction

Undergraduate Studies Guide

Undergraduate Books and Courses