Similar to the traditional form of temporal-difference RL, on which the dopamine theory was based, distributional RL assumes that reward-based learning is driven by an RPE, which signals the difference ...
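To make the role of the RPE concrete, the following is a minimal Python sketch of a standard temporal-difference prediction error and the value update it drives. The function names, the discount factor `gamma`, and the learning rates are illustrative assumptions, not taken from the text. The last function shows one common way to obtain a distributional variant, by scaling positive and negative errors with different learning rates; it is included as an assumption about how such a scheme might look, not as the method described here.

```python
def td_rpe(reward, value_next, value_current, gamma=0.9):
    """Standard TD reward prediction error: r + gamma * V(s') - V(s).

    gamma is a hypothetical discount factor chosen for illustration.
    """
    return reward + gamma * value_next - value_current


def td_update(value_current, rpe, alpha=0.1):
    """Classical TD update: move the value estimate a fraction alpha toward the target."""
    return value_current + alpha * rpe


def asymmetric_update(value_current, rpe, alpha_pos, alpha_neg):
    """Illustrative distributional-style update (an assumption, not from the text).

    Positive and negative prediction errors are scaled by different learning
    rates, so predictors with different (alpha_pos, alpha_neg) pairs settle on
    different, more optimistic or more pessimistic, value estimates.
    """
    alpha = alpha_pos if rpe > 0 else alpha_neg
    return value_current + alpha * rpe


if __name__ == "__main__":
    delta = td_rpe(reward=1.0, value_next=0.0, value_current=0.5)
    print("RPE:", delta)
    print("classical update:", td_update(0.5, delta))
    print("optimistic update:", asymmetric_update(0.5, delta, alpha_pos=0.2, alpha_neg=0.05))
```

With a single symmetric learning rate, the estimate tracks the mean of the reward. With asymmetric rates, each predictor is biased toward a different part of the reward distribution, which is the intuition behind maintaining a population of such predictors.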
Understanding intelligence and creating intelligent machines are grand scientific challenges of our times. The ability to learn from experience is a cornerstone of intelligence for machines and living ...