The accuracy-based XCS classifier system has been shown to solve typical
data mining problems competitively with other machine learning methods. However,
successful applications in multistep problems, modeled by a Markov
decision process, were restricted to very small problems. Until now, the
temporal difference learning technique in XCS was based on deterministic
updates. However, since a prediction in XCS, and in Learning Classifier
Systems in general, is actually generated by a set of rules, gradient-based
update methods are applicable. Extending XCS with gradient-based update
methods yields a classifier system that is more robust, less dependent on
parameter settings, and able to solve large and difficult maze problems reliably.
Additionally, the paper highlights the relation of XCS to other
function approximation methods in reinforcement learning.
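
As a rough illustration of why a gradient term arises when a set of rules jointly generates a prediction, the sketch below assumes that the system prediction is the fitness-weighted average of the rule predictions. Under that assumption, the partial derivative of the system prediction with respect to one rule's prediction is that rule's share of the total fitness, and this share can scale the usual temporal difference step. The class and function names, the learning rate `beta`, and the exact form of the error term are assumptions made for illustration; they are not taken from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Classifier:
    prediction: float  # payoff prediction p of this rule
    fitness: float     # accuracy-based fitness F of this rule


def system_prediction(action_set: List[Classifier]) -> float:
    """Fitness-weighted average of the rule predictions (assumed form)."""
    total_fitness = sum(cl.fitness for cl in action_set)
    return sum(cl.prediction * cl.fitness for cl in action_set) / total_fitness


def plain_update(action_set: List[Classifier], target: float, beta: float = 0.2) -> None:
    """Delta-rule update that treats every rule independently of the others."""
    for cl in action_set:
        cl.prediction += beta * (target - cl.prediction)


def gradient_update(action_set: List[Classifier], target: float, beta: float = 0.2) -> None:
    """Delta-rule update scaled by each rule's gradient term F / sum(F).

    The gradient term is the derivative of the assumed system prediction
    with respect to this rule's prediction.
    """
    total_fitness = sum(cl.fitness for cl in action_set)
    for cl in action_set:
        gradient = cl.fitness / total_fitness
        cl.prediction += beta * (target - cl.prediction) * gradient


if __name__ == "__main__":
    # Hypothetical action set: one accurate rule and one less accurate rule.
    rules = [Classifier(prediction=0.0, fitness=0.75),
             Classifier(prediction=0.0, fitness=0.25)]
    # In a multistep problem the target would be reward + gamma * max next prediction.
    gradient_update(rules, target=100.0)
    print([cl.prediction for cl in rules], system_prediction(rules))
```

With this scaling, rules that contribute little to the combined prediction also move little, whereas the plain update moves every rule by the same fraction of its error regardless of its influence.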
Gradient Descent Methods in Learning Classifier Systems