Knowledge Base Optimization of the HFRIQ-Learning
Tompa, Tamás
Kovács, Szilveszter
2025-08-19T08:43:20Z
2024
1785-8860
http://hdl.handle.net/20.500.14044/32424
The learning process of conventional reinforcement learning methods, such as Q-learning and SARSA, typically starts with an empty knowledge base. In each iteration step, the initially empty knowledge base is gradually constructed from the reinforcement signals obtained from the environment. Even if only a fragment of knowledge about the system behavior is available and can be injected into the learning process, the learning performance can be improved. In Heuristically Accelerated Fuzzy Rule Interpolation-based Q-learning (HFRIQ-learning), this external knowledge can be represented in the form of state-action fuzzy rules defined by human experts. If the expert knowledge base contains inaccuracies, i.e., incorrect state-action rules, it can negatively impact the learning performance. The main goal of this paper is to introduce a methodology for correcting (optimizing) the inaccurate a priori expert knowledge and, as an additional benefit of the optimization, for reducing the size of the fuzzy rule base representing the Q-function during the learning phase. The paper also presents examples of how the quality of the expert knowledge influences the HFRIQ-learning performance on a well-known reinforcement learning benchmark problem.
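For illustration only, the sketch below shows the conventional tabular Q-learning baseline referred to above, starting from an all-zero (empty) knowledge base that is filled in solely by the reinforcement signals received from the environment; the toy chain environment, hyper-parameters, and function names are illustrative assumptions and do not reproduce the HFRIQ-learning method or its fuzzy rule base.

# Minimal sketch of conventional tabular Q-learning (not HFRIQ-learning).
# The chain environment and all parameter values below are assumptions.
import random
from collections import defaultdict

N_STATES = 5          # states 0..4, state 4 is the goal
ACTIONS = (-1, +1)    # step left or right along the chain

def step(state, action):
    # Deterministic chain walk: reward 1.0 only when the goal is reached.
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

q = defaultdict(float)            # empty knowledge base: Q(s, a) = 0 everywhere
alpha, gamma, epsilon = 0.1, 0.9, 0.2

for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection over the current Q estimates
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state, reward, done = step(state, action)
        # Q-learning update: the knowledge base is built step by step
        # from the reinforcement signal obtained from the environment.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state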
PDF
en