Kaige Yang

Researcher at Amazon

I earned my BE in Computer & Communication Engineering from the American University of Beirut (AUB) in 2020 and my MSc in Human Robotics from Imperial College London in 2021. I also completed a research internship in the Systems Group at ETH Zurich in summer 2019. In January 2022, I joined the LASP Group as a PhD student. My current research interest is the design of reinforcement learning algorithms that keep energy costs reasonable while retaining high performance. My goal is to develop novel data-efficient and generalisable learning strategies.

I joined LASP as a PhD student with a UCL Overseas Research Scholarship in April 2018. Prior to joining the group, I earned my BEng and MSc in Electrical Engineering from UCL between 2012 and 2017.

My research interests lie in the field of machine learning. In particular, I am interested in sequential decision making and reinforcement learning.

Research

Graph Bandit

We consider the stochastic linear bandit problem with multiple users, where a user graph characterizing the affinity between users is available. The goal is to design an asymptotically optimal and computationally light algorithm with an improved finite-time regret guarantee. The question we ask is: building on existing provably asymptotically optimal algorithms, can the user graph be exploited to improve their finite-time behaviour while keeping the computational complexity low?

Link to arXiv

Laplacian-regularized Estimator Error Analysis

We provide a theoretical analysis of the representation learning problem of learning the latent variables (design matrix) Θ of observations Y, given knowledge of the coefficient matrix X. The design matrix is learned under the assumption that the latent variables Θ are smooth with respect to a (known) topological structure G. To learn such latent variables, we study a graph Laplacian regularized estimator, which is the penalized least squares estimator with a penalty term proportional to a Laplacian quadratic form.
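
To make the estimator concrete, here is a minimal sketch in Python with NumPy. It assumes a particular form of the objective that the text above does not spell out: the model Y ≈ ΘX with one row of Θ per graph node, and the penalty α·tr(ΘᵀLΘ), where L is the Laplacian of G. The function name, the shapes, and the weight `alpha` are illustrative assumptions, not the notation of the paper.

```python
import numpy as np


def laplacian_regularized_estimate(Y, X, L, alpha):
    """Minimise ||Y - Theta @ X||_F^2 + alpha * trace(Theta.T @ L @ Theta).

    Assumed shapes (not given in the text): Y is (n, m), X is (d, m),
    L is the (n, n) symmetric graph Laplacian, Theta is (n, d).
    X @ X.T must be invertible, because L always has a zero eigenvalue.
    """
    # Setting the gradient to zero gives the Sylvester equation
    #   Theta @ (X @ X.T) + alpha * L @ Theta = Y @ X.T.
    # Diagonalising L = U diag(lam) U.T decouples it into one small
    # ridge-like linear system per Laplacian eigenvalue.
    lam, U = np.linalg.eigh(L)
    gram = X @ X.T                    # (d, d)
    rhs = U.T @ (Y @ X.T)             # (n, d), rows in the spectral basis
    eye = np.eye(gram.shape[0])
    theta_spec = np.empty_like(rhs)
    for i, lam_i in enumerate(lam):
        # Row i satisfies theta_i @ (gram + alpha * lam_i * I) = rhs_i;
        # the matrix is symmetric, so a standard solve applies.
        theta_spec[i] = np.linalg.solve(gram + alpha * lam_i * eye, rhs[i])
    # Map back from the spectral basis to the node domain.
    return U @ theta_spec
```

Under these assumptions, `alpha = 0` reduces to ordinary least squares, while larger values of `alpha` pull the rows of Θ belonging to neighbouring nodes towards each other.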

Link to arXiv

Interests
  • Reinforcement Learning

Latest