In this paper, we propose a new deep neural network architecture, called NMD net, that has been specifically designed to learn adaptive behaviours. The architecture exploits neuromodulation, a mechanism that sustains adaptation in biological organisms. We embed it in a deep-RL agent that interacts with MDPs in a meta-reinforcement learning setting where the action space is continuous. The agent is trained using an advantage actor-critic algorithm. Experiments are carried out on several test problems. Results show that the neural network architecture with neuromodulation provides significantly better results than state-of-the-art recurrent neural networks which do not exploit this mechanism.
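The abstract above does not spell out how neuromodulation enters the network, so here is only a rough sketch of one way such a mechanism could look, not the implementation from the paper. It assumes that a small neuromodulatory layer reads a context vector (for example, a summary of recent experience) and outputs a per-neuron scale and shift that alter the activations of the main layer. All names here (`build_params`, `modulated_forward`, `context`, `state`) are hypothetical, and the sizes are arbitrary.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: computes W x + b for a list-of-rows weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_params(context_dim, state_dim, hidden_dim, action_dim, seed=0):
    """Hypothetical parameter container for the sketch."""
    rng = random.Random(seed)
    return {
        # Neuromodulatory layer: emits one scale and one shift per hidden neuron.
        "mod": init_layer(2 * hidden_dim, context_dim, rng),
        # Main layer: processes the current state.
        "main": init_layer(hidden_dim, state_dim, rng),
        # Output layer: maps hidden activations to a continuous action.
        "out": init_layer(action_dim, hidden_dim, rng),
    }


def modulated_forward(params, context, state):
    """Forward pass where the context modulates the main layer's activations.

    Assumption: modulation is a per-neuron scale and shift applied to the
    pre-activation before the tanh non-linearity.
    """
    signal = linear(*params["mod"], context)
    half = len(signal) // 2
    scale, shift = signal[:half], signal[half:]

    pre_activation = linear(*params["main"], state)
    hidden = [math.tanh(s * p + t)
              for s, p, t in zip(scale, pre_activation, shift)]
    return linear(*params["out"], hidden)


if __name__ == "__main__":
    params = build_params(context_dim=2, state_dim=3, hidden_dim=4, action_dim=1)
    state = [0.2, -0.1, 0.7]
    # Same state, two different contexts: the modulation changes the action.
    print(modulated_forward(params, [1.0, 0.0], state))
    print(modulated_forward(params, [0.0, 1.0], state))
```

In this sketch the main layer's weights stay fixed while the context changes how each neuron responds, which is one simple way to get behaviour that adapts across tasks without retraining. In an actor-critic setup, the output would typically parameterise a distribution over continuous actions; that part is left out here.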
To end this post, here is an animated GIF (click on it to play) that illustrates the architecture and the results it achieves on a benchmark problem. More results are in the paper 🙂