Seminar: Andreas Krause - ETH Zurich

Date: August 27, 2020
Author: Hrvoje Stojic

Safe and Efficient Exploration in Reinforcement Learning

Abstract

At the heart of Reinforcement Learning lies the challenge of trading off exploration -- collecting data to identify better models -- against exploitation -- using the current estimates to make decisions. In simulated environments (e.g., games), exploration is primarily a computational concern. In real-world settings, exploration is costly and potentially dangerous, as it requires experimenting with actions that have unknown consequences. In this talk, I will present our work towards rigorously reasoning about the safety of exploration in reinforcement learning. I will discuss a model-free approach, in which we seek to optimize an unknown reward function subject to unknown constraints. Both the reward and the constraints are revealed through noisy experiments, and safety requires that no infeasible action is chosen at any point. I will also discuss model-based approaches, where we learn about the system dynamics through exploration, yet need to verify the safety of the estimated policy. Our approaches use Bayesian inference over the objective, constraints and dynamics, and -- under some regularity conditions -- are guaranteed to be both safe and complete, i.e., to converge to a natural notion of reachable optimum. I will also present recent results on harnessing model uncertainty to improve the efficiency of exploration, and show experiments on safely and efficiently tuning cyber-physical systems in a data-driven manner.
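
The model-free approach described above -- optimizing an unknown reward subject to unknown safety constraints, with Gaussian-process models of both -- is in the spirit of SafeOpt (Sui et al., ICML 2015), developed in Krause's group. The following is a minimal, illustrative Python sketch of that idea under simplifying assumptions, not the authors' implementation: the 1-D domain, the RBF kernel, the fixed confidence parameter beta, the toy reward and constraint functions, and the single-heuristic acquisition rule are all choices made for this example.

```python
import numpy as np

# Illustrative sketch only: names, kernel, and acquisition rule are
# assumptions for this example, not the published SafeOpt code.

def rbf(A, B, lengthscale=0.3):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X, y, Xs, noise=1e-3):
    """GP posterior mean and standard deviation at test inputs Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v**2, axis=0)  # k(x, x) = 1 for this kernel
    return mu, np.sqrt(np.maximum(var, 1e-12))

# Unknown reward f and constraint g (an input x is safe iff g(x) >= 0);
# the learner only ever sees noisy evaluations of both.
f = lambda x: np.sin(3 * x)
g = lambda x: 0.8 - np.abs(x)   # true safe region is [-0.8, 0.8]

rng = np.random.default_rng(0)
domain = np.linspace(-2.0, 2.0, 200)
beta = 2.0                      # fixed confidence width; the theory
                                # uses a growing schedule instead
X = np.array([0.0])             # start from a known safe seed input
yf = f(X) + 0.01 * rng.standard_normal(len(X))
yg = g(X) + 0.01 * rng.standard_normal(len(X))

for t in range(30):
    mu_f, sd_f = gp_posterior(X, yf, domain)
    mu_g, sd_g = gp_posterior(X, yg, domain)

    # Pessimistic safe set: constraint lower confidence bound >= 0.
    safe = mu_g - beta * sd_g >= 0.0
    if not safe.any():
        break

    # Among safe inputs, keep those whose reward upper bound could still
    # beat the best reward lower bound, then query the most uncertain one.
    best_lcb = np.max((mu_f - beta * sd_f)[safe])
    cand = safe & (mu_f + beta * sd_f >= best_lcb)
    width = sd_f + sd_g
    x_next = domain[cand][np.argmax(width[cand])]

    X = np.append(X, x_next)
    yf = np.append(yf, f(x_next) + 0.01 * rng.standard_normal())
    yg = np.append(yg, g(x_next) + 0.01 * rng.standard_normal())

print("best observed safe input:", X[np.argmax(yf)])
```

The safety mechanism is that candidates are drawn only from the pessimistic safe set (points whose constraint lower confidence bound is non-negative), which exploration gradually expands outward from the known-safe seed, so no infeasible action is ever queried under the confidence bounds. SafeOpt proper maintains separate "expander" and "maximizer" candidate sets; the sketch collapses these into a single heuristic.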

Notes

  • Andreas Krause is a Professor of Computer Science and Director of the Learning & Adaptive Systems Group at ETH Zurich. His personal website can be found here.

