Market-Based Reinforcement Learning in Partially Observable Worlds
Authors: Ivo Kwee, Marcus Hutter, Juergen Schmidhuber (2001)
Comments: 8 LaTeX pages, 2 PostScript figures
Subj-class: Artificial Intelligence; Learning; Multiagent Systems; Neural and Evolutionary Computing
ACM-class: I.2
Reference: Proceedings of the 11th International Conference on Artificial Neural Networks (ICANN-2001) 865-873
Report-no: IDSIA-10-01 and cs.AI/0105025
Keywords: Hayek system; reinforcement learning; partially observable environment
Abstract:
Unlike traditional reinforcement learning (RL), market-based
RL is in principle applicable to worlds described by partially
observable Markov Decision Processes (POMDPs), where an agent needs
to learn short-term memories of relevant previous events in order to
execute optimal actions. Most previous work, however, has focused
on reactive settings (MDPs) instead of POMDPs. Here we reimplement
a recent approach to market-based RL and for the first time evaluate
it in a toy POMDP setting.
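
To make the mechanism concrete, below is a minimal Python sketch of such a market loop run on a small aliased toy task. Everything in it is an illustrative assumption of this summary, not the paper's Hayek4 implementation: the ToyPOMDP task, the Agent class, the fixed bid of 10% of wealth, and the hand-built rule set are invented for exposition, and the real system's agent creation, removal, and property-rights enforcement are omitted.

    import random

    class ToyPOMDP:
        """Tiny aliased task (hypothetical): a flag shown only at the first
        step decides which final action pays off; every later observation is
        the same symbol 2, so the flag must be stored in internal memory."""

        def reset(self):
            self.t = 0
            self.flag = random.randint(0, 1)
            return self.flag                      # the flag is visible only now

        def step(self, action):
            self.t += 1
            if self.t < 2:
                return 2, 0.0, False              # aliased observation, no reward yet
            reward = 1.0 if action == "go_%d" % self.flag else 0.0
            return 2, reward, True

    class Agent:
        """One condition-action rule with a wealth account."""

        def __init__(self, condition, action):
            self.condition = condition            # (observation, memory) pair to match
            self.action = action                  # motor action or memory-writing action
            self.wealth = 1.0

        def matches(self, obs, mem):
            return self.condition == (obs, mem)

        def bid(self):
            return 0.1 * self.wealth              # illustrative rule: bid 10% of wealth

    def run_episode(env, agents, max_steps=20):
        obs, mem, owner = env.reset(), 0, None    # mem is the internal memory register
        for _ in range(max_steps):
            bidders = [a for a in agents if a.matches(obs, mem) and a.wealth > 0]
            if not bidders:
                break
            winner = max(bidders, key=lambda a: a.bid())
            price = winner.bid()
            winner.wealth -= price                # the winner pays its bid to the
            if owner is not None:                 # previous owner (the very first
                owner.wealth += price             # payment is lost to the "world")
            owner = winner
            if winner.action.startswith("set_mem_"):
                mem = int(winner.action[-1])      # internal action: write memory only
            else:
                obs, reward, done = env.step(winner.action)
                owner.wealth += reward            # external reward goes to the owner
                if done:
                    break

    # Hand-built rule chain: store the flag (as mem 1 or 2), walk forward,
    # then choose the final action that matches the stored flag.
    agents = [Agent((0, 0), "set_mem_1"), Agent((1, 0), "set_mem_2"),
              Agent((0, 1), "forward"),   Agent((1, 2), "forward"),
              Agent((2, 1), "go_0"),      Agent((2, 2), "go_1")]
    env = ToyPOMDP()
    for _ in range(100):
        run_episode(env, agents)
    print(["%.2f" % a.wealth for a in agents])

Running this prints each rule's accumulated wealth. Because every winner pays its bid to the agent that sold it the world, the final reward propagates backward along the chain, a bucket-brigade-style credit assignment that lets the memory-setting rules profit even though they never receive external reward directly.
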
Contents:
- Introduction
- Market-Based RL: History & State of the Art
- The Hayek4 System
- Implementation
- Adding Memory to Hayek
- Conclusion
BibTeX Entry
@InProceedings{Hutter:01market,
author = "Ivo Kwee and Marcus Hutter and Juergen Schmidhuber",
title = "Market-Based Reinforcement Learning in Partially Observable Worlds",
number = "IDSIA-10-01",
institution = "Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA)",
address = "Manno (Lugano), CH",
month = aug,
year = "2001",
pages = "865--873",
booktitle = "Proceedings of the 11th International Conference on Artificial Neural Networks (ICANN-2001)",
editor = "Georg Dorffner and Horst Bischof and Kurt Hornik",
publisher = "Springer",
series = "Lecture Notes in Computer Science (LNCS 2130)",
url = "http://www.hutter1.net/ai/pmarket.htm",
url2 = "http://arxiv.org/abs/cs.AI/0105025",
ftp = "ftp://ftp.idsia.ch/pub/techrep/IDSIA-10-01.ps.gz",
categories = "I.2. [Artificial Intelligence]",
keywords = "Hayek system; reinforcement learning; partially observable environment",
abstract = "Unlike traditional reinforcement learning (RL), market-based
RL is in principle applicable to worlds described by partially
observable Markov Decision Processes (POMDPs), where an agent needs
to learn short-term memories of relevant previous events in order to
execute optimal actions. Most previous work, however, has focused
on reactive settings (MDPs) instead of POMDPs. Here we reimplement
a recent approach to market-based RL and for the first time evaluate
it in a toy POMDP setting.",
}