A reinforcement learning framework for guiding the agent to perform exploration based on clustering

05.21.25 | Higher Education Press

Comparison of clustering-based bonus rewards that use novelty alone (η = 1.0) with clustering-based bonus rewards that use both novelty and quality (η = 0.5). Here, the collected states (blue dots) are grouped into 5 clusters, and the agent receives a reward of 1 in the orange area and no reward elsewhere. Credit: Xiao MA, Shen-Yi ZHAO, Zhao-Heng YIN, Wu-Jun LI

Exploration strategy design is a challenging problem in reinforcement learning (RL), especially when the environment has a large state space or sparse rewards. During exploration, the agent tries to discover unexplored (novel) areas or high-reward (quality) areas. However, most existing methods guide exploration using only the novelty of states.

To address this limitation, a research team led by Prof. Wu-Jun LI published new research on 15 Apr 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

The team proposed a novel reinforcement learning framework, clustered reinforcement learning (CRL), for efficient exploration in RL. The framework was evaluated on four continuous control tasks and six hard-exploration Atari-2600 games. According to the team, the proposed method guides the agent to explore more efficiently than existing methods.

In the research, the team analyzes the limited effectiveness of existing exploration strategies, which use only the novelty of states to guide exploration. To use the novelty and quality of states simultaneously, they apply clustering to divide the collected states into several clusters. Based on these clusters, the agent is given a bonus reward that reflects both novelty and quality in the neighboring area (cluster) of the current state. Because the bonus rewards used by existing exploration strategies capture only the novelty of states, the proposed method can also be combined with those strategies to boost their performance. The experimental results on the continuous control tasks and Atari-2600 games show that the proposed method can perform better than the existing exploration strategies.
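The idea of a cluster-level bonus can be illustrated with a minimal Python sketch. It is not the authors' implementation: the plain k-means routine, the function names, the scale `beta`, and the specific novelty term (inverse square root of the cluster's visit count) and quality term (mean reward observed in the cluster) are all assumptions chosen for illustration. Only the role of η, where η = 1.0 uses novelty alone, is taken from the figure caption.

```python
import numpy as np


def kmeans(states, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative); returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers from k distinct collected states.
    centers = states[rng.choice(len(states), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(states[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = states[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = np.linalg.norm(states[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)


def cluster_bonus(labels, rewards, k, eta=0.5, beta=1.0):
    """Per-state bonus from cluster statistics (assumed form, not the paper's).

    novelty: 1 / sqrt(number of states in the cluster)
    quality: mean environment reward observed in the cluster
    eta = 1.0 uses novelty alone; smaller eta gives more weight to quality.
    """
    counts = np.bincount(labels, minlength=k).astype(float)
    reward_sums = np.bincount(labels, weights=rewards, minlength=k)
    safe_counts = np.maximum(counts, 1.0)  # avoid division by zero
    novelty = 1.0 / np.sqrt(safe_counts)
    quality = reward_sums / safe_counts
    per_cluster = beta * (eta * novelty + (1.0 - eta) * quality)
    per_cluster[counts == 0] = 0.0  # empty clusters contribute no bonus
    return per_cluster[labels]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    states = rng.uniform(0.0, 1.0, size=(200, 2))
    # Hypothetical sparse reward: 1 inside a small corner region, 0 elsewhere.
    rewards = ((states[:, 0] > 0.8) & (states[:, 1] > 0.8)).astype(float)
    _, labels = kmeans(states, k=5)
    print(cluster_bonus(labels, rewards, k=5, eta=1.0)[:5])
    print(cluster_bonus(labels, rewards, k=5, eta=0.5)[:5])
```

Under these assumptions, setting `eta=1.0` favors sparsely visited clusters regardless of reward, while `eta=0.5` also raises the bonus for states whose cluster has yielded reward, which mirrors the contrast described in the figure caption.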

Article Information

Journal

Frontiers of Computer Science

DOI

10.1007/s11704-024-3194-1

Method of Research

Experimental study

Subject of Research

Not applicable

Article Publication Date

2025-04-15

Article Title

Clustered reinforcement learning

Contact Information

Rong Xie

Higher Education Press

xierong@hep.com.cn

How to Cite This Article

A reinforcement learning framework for guiding the agent to perform exploration based on clustering
