Ensuring AI works with the right dose of curiosity

MIT News, November 10, 2022

Rewarding an agent with an exploration bonus for visiting novel states is one way to address the challenge of exploration. It can lead to excellent results on hard exploration tasks, but the agent can also suffer from intrinsic reward bias and underperform an agent trained using only task rewards. An international team of researchers (USA – MIT, Finland) has proposed a principled constrained policy optimization procedure that automatically tunes the importance of the intrinsic reward: it suppresses the intrinsic reward when exploration is unnecessary and increases it when exploration is required. According to the researchers, this resulted in superior […]
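
The idea of tuning the weight of the intrinsic reward can be illustrated with a toy sketch. This is not the researchers' procedure: the function names, the update rule, the step size, and the [0, 1] bounds are all assumptions chosen for illustration. The sketch only shows the general pattern of lowering the bonus weight when exploration seems to hurt task performance and raising it when exploration seems to help.

```python
def combined_reward(extrinsic, intrinsic, coef):
    """Task (extrinsic) reward plus a weighted exploration bonus."""
    return extrinsic + coef * intrinsic


def update_coef(coef, return_mixed, return_task_only, lr=0.1, lo=0.0, hi=1.0):
    """Hypothetical update rule (an assumption, not the published method).

    return_mixed:     task return of an agent trained on the combined reward
    return_task_only: task return of an agent trained on task reward alone

    If the mixed-reward agent earns less task return, the bonus is likely
    biasing it, so the weight shrinks; if it earns more, the weight grows.
    The result is clamped to the interval [lo, hi].
    """
    gap = return_mixed - return_task_only
    return max(lo, min(hi, coef + lr * gap))


if __name__ == "__main__":
    coef = 0.5
    # Exploration is hurting task performance here, so the weight drops.
    coef = update_coef(coef, return_mixed=10.0, return_task_only=12.0)
    print(round(coef, 3))
    print(combined_reward(extrinsic=1.0, intrinsic=2.0, coef=coef))
```

In a real training loop the two returns would come from evaluating policies, and the weight would be updated alongside the policy rather than in a separate step.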