Nakamura Yutaka; Mori Takeshi; Ishii Shin
(2005)
A long-standing problem in reinforcement learning is the "exploration-exploitation problem". An agent must decide whether to explore in search of a better action, which may not exist, or to exploit many rewards ...