Deisenroth & Rasmussen (ICML, 2011): PILCO
Deisenroth et al. (PAMI): Gaussian processes for data-efficient learning in robotics and control
How do you control systems for which you do not have a good model? Answer: Bayesian machine learning.
We want to learn policies fully autonomously:
- Infinite number of combinations of states and controls
- Millions of experiments are impractical
We could use three approaches to learn a controller:
- Reinforcement learning
- Imitation learning
  - Inverse RL (Ng & Russell, 2000)
  - Behavioral cloning (Pomerleau, 1989)
  - Probabilistic imitation learning; interestingly, the "guidance" can actually affect the learning
- Bayesian optimization
  - Jones (2001); Brochu et al. (2010); Hennig & Schuler (2012)
  - Similar to active learning
  - Calandra, Seyfarth, et al.: bipedal robot
  - Limited to roughly 10-20 policy parameters
  - Build a model of the objective function
  - Find the minimum of the model
  - Evaluate the true objective function at that point
  - Update the model of the objective function, and repeat
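The four-step loop above can be sketched in code. This is an illustrative toy, not from the lecture: the 1-D objective, the RBF kernel with its lengthscale, and the lower-confidence-bound acquisition rule (standing in for "find the minimum" of the model) are all assumptions.

```python
import numpy as np

def rbf(a, b, ell=0.3):
    """Squared-exponential covariance function (assumed lengthscale)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """GP posterior mean and variance of the objective at test inputs Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)   # prior variance is rbf(x, x) = 1
    return mu, var

def objective(x):
    """The true (expensive-to-evaluate) objective -- a toy stand-in."""
    return np.sin(3 * x) + x ** 2

X = np.array([0.0, 1.0])                 # a few initial evaluations
y = objective(X)
grid = np.linspace(-1.0, 2.0, 200)       # candidate inputs
for _ in range(15):
    # 1. build/refresh the model of the objective function
    mu, var = gp_posterior(X, y, grid)
    # 2. find the model's most promising point (lower confidence bound)
    x_next = grid[np.argmin(mu - 2.0 * np.sqrt(np.maximum(var, 0.0)))]
    # 3. evaluate the true objective function there
    # 4. update the model's data set
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

print("best x:", X[np.argmin(y)], "best y:", y.min())
```

After a handful of true-function evaluations, the surrogate concentrates samples near the global minimum of the objective, which is the point of the method: far fewer experiments than a grid or random search.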
Reinforcement learning (the PILCO approach)
- Transition dynamics: x_{t+1} = f(x_t, u_t) + w, where w is process noise
- Controller: u_t = p(x_t, z), with policy parameters z
- Goal: minimize the expected long-term cost J(z)
- Use a probabilistic model of the transition function f to be robust to model errors
  - Gaussian process: fully specified by a mean function and a covariance function
- Compute long-term predictions of the state under the controller p(x, z)
- Improve the policy parameters z based on the predicted cost
- Apply the controller to the real system
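The loop above can be sketched on a toy 1-D system. This is a hypothetical illustration, not the authors' code: the linear system, the linear controller u = -z x, the random-search policy improvement, and the use of the GP posterior mean for rollouts (instead of PILCO's moment-matched propagation of full state distributions) are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(x, u):
    """Unknown-to-the-learner system: x' = f(x, u) + w."""
    return 0.9 * x + 0.5 * u + 0.01 * rng.standard_normal()

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel over (state, control) inputs."""
    d2 = ((A[:, None, :] - B[None, :, :]) / ell) ** 2
    return np.exp(-0.5 * d2.sum(-1))

# 1. collect a small batch of random experience (x, u) -> x'
X = np.column_stack([rng.uniform(-2, 2, 40), rng.uniform(-4, 4, 40)])
y = np.array([true_dynamics(s, u) for s, u in X])

# 2. probabilistic model of the transition function f (GP; mean used here)
K = rbf(X, X) + 1e-4 * np.eye(len(X))
alpha = np.linalg.solve(K, y)

def predict(x, u):
    """GP posterior mean of the next state."""
    return rbf(np.array([[x, u]]), X) @ alpha

def model_cost(z, x0=1.5, T=15):
    """3. long-term prediction: simulate u = -z*x on the learned model."""
    x, cost = x0, 0.0
    for _ in range(T):
        x = predict(x, -z * x).item()
        cost += x ** 2
    return cost

# 4. policy improvement: pick the gain minimizing predicted cost
gains = np.linspace(0.0, 3.0, 31)
z_best = gains[np.argmin([model_cost(z) for z in gains])]

# 5. apply the controller to the real system
x = 1.5
for _ in range(15):
    x = true_dynamics(x, -z_best * x)
print("learned gain:", z_best, "final state:", x)
```

In the full method this loop repeats: the real rollout provides new data, the GP model is refit, and the policy is improved again, which is what makes the approach data-efficient.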
