AI Safe Exploration: Reinforced learning with a blocker in unsafe environments

University essay from Göteborgs universitet/Institutionen för data- och informationsteknik

Abstract: Artificial intelligence can be trained with a trial and error based approach. In an environment where a catastrophe can not be accepted a human overseer can be used, but this might lower the efficiency of the learning. The study includes implementation of an artifact meant to replace the human overseer when training an AI in simulated unsafe environments. The results of testing the implemented blocker shows that it can be used for avoiding catastrophes and finding a path to reach the goal in 17 out of 18 runs. The single failed execution shows that the implemented blocker is in need of improvement in terms of data efficiency. Shaping rewards solely to reduce number of steps and catastrophes for a reinforcement learning agent has been done successfully to some degree, but further steps can be taken to lower the number of catastrophes and steps.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)