New Memory Model Boosts Learning Efficiency and Performance in Sparse-Reward Environments
Researchers have developed a new reward model for learning agents that explore their environment. Instead of rewarding an agent for surprise itself, the model rewards it for the novelty of that surprise: how different the current surprise is from the surprises the agent has already encountered. A memory network keeps track of past surprises so that the novelty of each new one can be measured. This keeps exploring agents interested in genuinely new situations without being distracted by random or noisy inputs. In tests, the model, combined with several different methods for predicting surprise, helped agents learn faster and perform better in difficult settings such as sparse-reward environments and challenging games.
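The idea of rewarding the novelty of a surprise, rather than the surprise itself, can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the summary does not describe the memory network's architecture, so a simple fixed-size buffer with a nearest-neighbor lookup stands in for it, surprise is taken to be the prediction error (observed minus predicted), and the names `SurpriseMemory` and `intrinsic_reward` are hypothetical.

```python
from collections import deque

import numpy as np


class SurpriseMemory:
    """Hypothetical stand-in for the memory network: a FIFO buffer of past surprises."""

    def __init__(self, capacity=128):
        # Oldest surprises are dropped automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def novelty(self, surprise):
        """Distance from this surprise to the closest one already stored."""
        surprise = np.asarray(surprise, dtype=float)
        if not self.buffer:
            # Nothing remembered yet: fall back to the size of the surprise.
            return float(np.linalg.norm(surprise))
        stored = np.stack(list(self.buffer))
        distances = np.linalg.norm(stored - surprise, axis=1)
        return float(distances.min())

    def write(self, surprise):
        self.buffer.append(np.asarray(surprise, dtype=float).copy())


def intrinsic_reward(memory, predicted, observed):
    """Reward the novelty of the surprise, then remember the surprise."""
    # Assumption: surprise is the prediction error vector.
    surprise = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    reward = memory.novelty(surprise)
    memory.write(surprise)
    return reward


if __name__ == "__main__":
    memory = SurpriseMemory(capacity=4)
    print(intrinsic_reward(memory, [0, 0], [3, 4]))  # first surprise: 5.0
    print(intrinsic_reward(memory, [0, 0], [3, 4]))  # same surprise again: 0.0
    print(intrinsic_reward(memory, [0, 0], [0, 4]))  # a different surprise: 3.0
```

In this toy version, a surprise that recurs in the same form stops earning reward once it is in memory, while a surprise unlike anything stored still does. It is only meant to show the shape of the reward signal and does not reproduce the learned memory or the reported results.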