Marcus' AI Recommendation Page

It will not help you much in building one, since properly understanding the general theory and bridging the gap to "narrow" but practical existing AI algorithms requires a lot more background. Nevertheless, [Leg08] might motivate you to consider reading the books I'll recommend now.

The textbooks below are relevant for understanding and modeling

If you want to bring order into the bunch of methods and ideas you've learned so far, and want to understand their connections more deeply, either out of curiosity or to extend existing systems to more general and powerful ones, you need to learn about some concepts that at first seem quite disconnected and theoretical.

- S. Legg. Machine Super Intelligence
  *Lulu, PhD Thesis (2008) [for the philosophically inclined. Won the $10000 Singularity Prize]*
- M. Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability
  *Springer, Berlin, 300 pages (2005) [for the mathematically inclined]*
- J. Veness, K. S. Ng, M. Hutter, W. Uther, and D. Silver. A Monte Carlo AIXI Approximation
  *Journal of Artificial Intelligence Research, 40 (2011) 95-142 [for the practically inclined]*

- J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation
  *Addison-Wesley, 3rd Edition (2006)*
- M. Sipser. Introduction to the Theory of Computation
  *Cengage Learning, 3rd Edition (2012)*

- S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach
  *Prentice-Hall, Englewood Cliffs, 4th Edition (2020)*

- M. Li and P. M. B. Vitanyi (2008) An Introduction to Kolmogorov Complexity and Its Applications
- Cover & Thomas (2006) Elements of Information Theory
- Matt Mahoney (2012) Data Compression Explained
- C. S. Wallace (2005) Statistical and Inductive Inference by Minimum Message Length

- R. Sutton and A. Barto. Reinforcement Learning: An Introduction
  *Cambridge, MA, MIT Press (2018)*
- D. P. Bertsekas and J. N. Tsitsiklis. Neuro-Dynamic Programming
  *Athena Scientific, Belmont, MA (1996)*
- P. R. Kumar and P. P. Varaiya. Stochastic Systems: Estimation, Identification, and Adaptive Control
  *Prentice Hall, Englewood Cliffs, NJ (1986)*

- C. M. Bishop. Pattern Recognition and Machine Learning
  *Springer (2006)*
- S. Prince. Understanding Deep Learning
  *The MIT Press (2023)*

- E. T. Jaynes. Probability Theory: The Logic of Science
  *Cambridge University Press (2003)*
- S. J. Press. Subjective and Objective Bayesian Statistics
  *Wiley-Interscience, 2nd edition (2002)*
- R. C. Jeffrey. The Logic of Decision
  *University of Chicago Press, Chicago, Illinois, 2nd edition (1983)*
- W. Feller. An Introduction to Probability Theory and Its Applications
  *Volume 1, John Wiley and Sons, New York (1970)*
- T. S. Ferguson. Mathematical Statistics: A Decision Theoretic Approach
  *Academic Press, New York, 3rd edition (1967)*

- M. J. Osborne and A. Rubinstein. A Course in Game Theory
  *MIT Press, Cambridge, MA (1996)*
- Y. Shoham and K. Leyton-Brown. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations
  *Cambridge University Press (2008)*

- David A. Forsyth and Jean Ponce. Computer Vision: A Modern Approach
  *Prentice Hall, USA (2002). [ANU Course ENGN4528-s1]*
- Daniel Jurafsky and James H. Martin. Speech and Language Processing
  *Prentice Hall, 2nd edition (2008). [ANU Course COMP4650-s2]*
- Sebastian Thrun, Wolfram Burgard, and Dieter Fox. Probabilistic Robotics
  *The MIT Press (2005). [ANU Course ENGN4627-s2]*

- N. Bostrom. Superintelligence: Paths, Dangers, Strategies
  *Oxford University Press (2014)*
- N. Alchin. Theory of Knowledge
  *Hodder Murray Press, 2nd (not 3rd!) edition (2006)*
- G. Restall. Logic: An Introduction
  *Fundamentals of Philosophy, Routledge (2006)*
- P. Godfrey-Smith. Theory and Reality: An Introduction to the Philosophy of Science
  *University of Chicago Press (2003)*
- J. Earman. Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory
  *MIT Press, Cambridge, MA (1992)*

**computer science** (artificial intelligence, machine learning, computation), **engineering** (information theory, adaptive control), **economics** (rational agents, game theory), **mathematics** (statistics, probability), **psychology** (behaviorism, motivation, incentives, perception, control), **biology** (neuroscience, evolution), **philosophy** (mind, reasoning, language, induction, knowledge).

**Philosophy** (*of mind, knowledge, science, reasoning, induction, deduction*)

Most engineering grows out of science, and all science grew out of philosophy. The mind, knowledge, intelligence, rationality, reasoning, induction, etc. pose a plethora of philosophical questions. Also, the emergence of human-level AIs will have deep social, ethical, and economic consequences, and will raise deep philosophical questions. Finally, some exposure to philosophy sharpens your analytical thinking, and trains you to think outside the box by questioning common assumptions which may be wrong.

**Logic** (*predicate logic, reasoning, deduction, proof, completeness, soundness*)

Critical thinking and proper argumentation are fundamental to science and a key trait of intelligence. Since logic formalizes rational arguments and deductive reasoning, it plays a vital role in the field of (good old-fashioned) artificial intelligence. Even when you pursue an approach to AI where formal logic plays no direct role (like machine learning and Universal AI), some training in formal logic sharpens your analytical thinking, and the AI problem is difficult and abstract enough that you will definitely profit from it.

**Analysis** (*real numbers and functions, inequalities, limits, differentiation, integration*)

Real numbers and functions are fundamental to describing our (physical) world. Since intelligent agents usually interact with this world, some of their observations, knowledge, and states are naturally represented by real numbers or functions, e.g. battery level, degrees of belief, or trajectories of objects. Limits and derivatives are needed, e.g., for dealing with temporal or spatial change and probability densities.

**Linear Algebra** (*linear equations, vectors, matrices, determinants, eigenvectors, quadratic forms*)

Most problems in linear algebra have efficient algorithmic solutions. Therefore one tries to formalize many real-world problems in machine learning (ML) and other fields in terms of vectors and matrices; often these problems are linear, or can be reduced to or (locally) approximated by linear problems. Also, data is often naturally represented as (e.g. feature) vectors or (e.g. dependency) matrices.
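To make "efficient algorithmic solutions" concrete, here is a minimal sketch in C of Gaussian elimination with partial pivoting for a small dense system A x = b. The function name `solve_linear` and the fixed size `N` are illustrative assumptions, not taken from any of the books listed here; a serious implementation would use a tested library.

```c
#include <assert.h>

#define N 3  /* illustrative fixed system size (assumption) */

/* Solve A x = b by Gaussian elimination with partial pivoting.
   a and b are overwritten. Returns 0 on success, -1 if a pivot is exactly zero. */
int solve_linear(double a[N][N], double b[N], double x[N])
{
    assert(a && b && x);

    for (int col = 0; col < N; col++) {
        /* choose the row with the largest pivot magnitude in this column */
        int pivot = col;
        for (int row = col + 1; row < N; row++) {
            double cand = a[row][col] < 0 ? -a[row][col] : a[row][col];
            double best = a[pivot][col] < 0 ? -a[pivot][col] : a[pivot][col];
            if (cand > best)
                pivot = row;
        }
        if (a[pivot][col] == 0.0)
            return -1;

        /* swap the pivot row into place */
        if (pivot != col) {
            for (int k = 0; k < N; k++) {
                double tmp = a[col][k];
                a[col][k] = a[pivot][k];
                a[pivot][k] = tmp;
            }
            double tmp = b[col];
            b[col] = b[pivot];
            b[pivot] = tmp;
        }

        /* eliminate the entries below the pivot */
        for (int row = col + 1; row < N; row++) {
            double factor = a[row][col] / a[col][col];
            for (int k = col; k < N; k++)
                a[row][k] -= factor * a[col][k];
            b[row] -= factor * b[col];
        }
    }

    /* back substitution on the upper triangular system */
    for (int row = N - 1; row >= 0; row--) {
        double sum = b[row];
        for (int k = row + 1; k < N; k++)
            sum -= a[row][k] * x[k];
        x[row] = sum / a[row][row];
    }
    return 0;
}
```

The cost grows only cubically with the number of unknowns, which is the sense in which such problems count as efficiently solvable.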

**Probability** (*random variable, conditional probability, expectation, measure theory, densities*)

An agent's subjective uncertainty about our world (e.g. do ETs exist?) and objective random processes (e.g. dice or nuclear decay) can both be modeled by probabilities. Probability theory allows one to predict the likelihood of (future) events, so it is crucial for inductive reasoning.

**Statistics** (*estimation, likelihood, prior, Bayes rule, hypotheses, central limit theorem*)

Statistics estimates probabilities or models or hypotheses or related quantities from an agent's past observations (e.g. number of heads versus tails), so it is also crucial for inductive reasoning. Bayes' rule allows one to update an agent's belief about our world given new evidence. Note that the Bayesian approach largely blurs the distinction between probability and statistics.
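A belief update of this kind can be sketched in a few lines of C. The example assumes a finite set of hypotheses about a coin's bias and a single observed flip; the function name `bayes_update` and the array layout are illustrative assumptions, not something prescribed by the books above.

```c
#include <assert.h>
#include <stddef.h>

/* Update a discrete belief in place after observing one coin flip.
   belief[i]  : prior probability of hypothesis i (entries sum to 1)
   p_heads[i] : probability of heads under hypothesis i
   heads      : nonzero if the flip came up heads, zero for tails */
void bayes_update(double *belief, const double *p_heads, size_t n, int heads)
{
    double evidence = 0.0;

    /* numerator of Bayes' rule: prior times likelihood */
    for (size_t i = 0; i < n; i++) {
        double likelihood = heads ? p_heads[i] : 1.0 - p_heads[i];
        belief[i] *= likelihood;
        evidence += belief[i];
    }

    /* the observation must be possible under at least one hypothesis */
    assert(evidence > 0.0);

    /* normalize so that the posterior sums to 1 again */
    for (size_t i = 0; i < n; i++)
        belief[i] /= evidence;
}
```

For instance, with two equally likely hypotheses "fair coin" (heads probability 0.5) and "biased coin" (heads probability 0.9), observing one head shifts the belief to roughly 0.36 and 0.64. Calling the function once per observation accumulates the evidence from a whole sequence of flips.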

**Programming / C** (*data types/structure, operators, expressions, control flow, functions, I/O*)

Any theory or model or algorithm of an intelligent system has to be implemented before it can be applied in practice (analytical solutions or working things out by hand are out of the question). Among programming languages, C has a status similar to that of English among natural languages: both are communication defaults. Real academics/programmers are able to read and write English/C. Note that each (GOF)AI paradigm has spawned its own special-purpose programming language (Prolog, Lisp, Scheme, Smalltalk, Haskell, ...).

**Numerics** (*interpolation, integration, function evaluation, root finding, optimization, linear problems*)

Numerical algorithms approximate the solution of problems that involve (functions of) real numbers. Most problems in machine learning involve maximizing or minimizing some function, which requires optimization algorithms or, after differentiation, solving (non)linear problems.
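As a small illustration of "after differentiation, solving a (non)linear problem", the following C sketch minimizes a function by finding a root of its derivative with bisection. The names `bisect_root`, `example_f`, and `example_df` are hypothetical, and bisection is just one simple root finder chosen here for robustness.

```c
#include <assert.h>

typedef double (*real_fn)(double);

/* Find a root of g in [lo, hi] by bisection.
   g(lo) and g(hi) must not have the same sign. */
double bisect_root(real_fn g, double lo, double hi, double tol)
{
    double g_lo = g(lo);
    assert(g_lo * g(hi) <= 0.0);

    /* the iteration cap guards against tolerances below floating-point resolution */
    for (int i = 0; i < 200 && hi - lo > tol; i++) {
        double mid = 0.5 * (lo + hi);
        double g_mid = g(mid);
        if (g_lo * g_mid <= 0.0) {
            hi = mid;          /* sign change is in the left half */
        } else {
            lo = mid;          /* sign change is in the right half */
            g_lo = g_mid;
        }
    }
    return 0.5 * (lo + hi);
}

/* Hypothetical objective f(x) = (x - 2)^2 + 1 and its derivative f'(x) = 2 (x - 2). */
double example_f(double x)  { return (x - 2.0) * (x - 2.0) + 1.0; }
double example_df(double x) { return 2.0 * (x - 2.0); }
```

Calling `bisect_root(example_df, 0.0, 5.0, 1e-9)` locates the stationary point of `example_f` near x = 2, which is its minimum. Each iteration halves the interval, so the error shrinks by one bit per step.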

**Information Theory** (*entropy, information, data compression, channel capacity*)

Intelligent agents are information processing systems. Agents perceive, process, and store information (e.g. bits from a video camera) from the environment and transmit information (e.g. angles for a robotic arm) to their environment. Therefore information theory plays a key role in (Universal) AI.

**Algorithms** (*sorting, data structures, graph algorithms*)

Algorithms are (finite) sequences of elementary (mathematical, logical, branching) operations. Effective solutions of virtually all AI problems involve many different and complex algorithms and data structures.

**Computability** (*languages, automata, Turing machines, complexity classes, randomization*)

Computability and complexity theory classify problems according to their inherent difficulty. They can be used to determine the computational complexity of a particular (AI) problem, i.e. whether there exists an (efficient) algorithmic solution to it. Universal Turing machines play a prominent role in Universal AI.

This book is a fantastic introduction to (western) philosophy. If you care at all about philosophy, then this is the book to start with. It is immensely broad without being superficial, avoids obscure philosophical jargon, seems very balanced, and contains a wealth of contemporary popular material from other resources. It covers philosophies of science, arts, math, reason, history, empiricism, paradigms, culture, language, ethics, politics, religion, emotions, truth, and more. The collection of quotes at the beginning of each chapter alone is worth the price of the book. The book makes you question everything you know. This book needs to be read very slowly. It's important to reflect on the material and discuss it with others. Maybe form a reading club. This book is actually used for the International Baccalaureate.

Peter Godfrey-Smith (2003) Theory and Reality is a clear and thorough undergraduate textbook introduction to the philosophy of science.

This beautiful little book is the ideal first introduction to logic. It is a rare (the only?) book which covers the informal philosophical aspects as well as the formal mathematical aspects of predicate logic. The former is used to discuss, motivate, and justify the latter. It even includes very accessible completeness and soundness proofs of first-order logic, and all that in just 200 pages. A more serious book, ideal for CS students, is Boolos et al. (2007) Computability and Logic.

Here are some further logic books: There are many good elementary introductions to logic, e.g. Nancy Rodgers (2000) Learning to Reason: An Introduction to Logic, Sets and Relations. There are also many books on formal logic. E.g. Joseph Shoenfield (1967) Mathematical Logic is compact and complete but too dry and lacks motivation. Good as a reference or second book. There are also various good informal books on the philosophical aspects of logic, critical thinking and reasoning like Michael Scriven (1977) Reasoning.

There are plenty of books on Analysis or its light-weight version Calculus, and most would do. The major choice is to find one at the right level. The suggested book is very elementary. On the other hand, R. M. Dudley (2002) Real Analysis and Probability (ANU MATH2320-s1/MATH3320-s1/MATH3325-s2) is an advanced text on Analysis with the (for our purpose) appropriate emphasis on measure and probability theory.

There are plenty of books on Linear Algebra, and most would do. The two above are classics and are still among the most popular ones.

This book can serve as a first introduction to probability theory. "It contains lots of examples and an easy development of theory without any sacrifice of rigor, keeping the abstraction to a minimal level. It is indeed a valuable addition to the study of probability theory."

3rd year Uni. MATH3029-s1 Pre: statistics & analysis

This book is another elementary introduction to probability but also properly covers statistics.

This is a classic unexcelled introduction to (ANSI) C, written by its developers: slim, complete, and to the point. It includes an introduction to programming, an introduction to C, a reference manual, an explanation of all standard libraries, and many tips and tricks.

There's one programming language you should master (completely). And that is C. All others are optional and can be learned on demand to any desired degree. C is not very forgiving when you make errors, but this is *good*. It educates you to become a careful programmer.

This is a classic book covering a broad range of numerical algorithms, i.e. those involving real numbers, for interpolation, integration, function evaluation, root finding, optimization, Fourier transform, differential equations, and many linear algebra problems. The complete C-code is in the book. Emphasis is on comprehensibility rather than optimized black-box libraries. Indeed, the book contains, and hence can serve as, a compact hands-on introduction to the various mathematical fields. Chapters 1,2,3,7,8,9,10,11,14,15,20 are an absolute must, and Chapters 4,5,6 are recommended. If you can't afford whole courses or books on linear algebra, analysis, statistics, etc., this single book may even serve as a poor-man's substitute, although some chapters presuppose the corresponding mathematical knowledge.

Parts I-III contain a gentle introduction to information theory with ample motivation and examples. Cover & Thomas (2006) Elements of Information Theory provides a more advanced, comprehensive, and deeper treatment.

This comprehensive book is the default textbook on data structures, efficient algorithms, and their analysis, and can also serve as a reference book.

The 3rd edition of this textbook is heavily simplified compared to its classical originals. While an introduction to algorithms focuses on concrete problems and their efficient algorithmic solutions, computability and complexity theory studies the classes of problems that can(not) be solved by various (restricted) computational devices. (Unrestricted) Turing machine computability and (non)deterministic polynomial-time computability are just the three most famous classes. The last chapter gives a brief glimpse into computational complexity theory. Arora & Barak (2009) Computational Complexity: A Modern Approach can serve as a first tour through the vast complexity zoo.

© 2000 by ... Marcus Hutter