# Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

@article{Jing2021AsynchronousDR, title={Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent}, author={Gangshan Jing and He Bai and Jemin George and Aranya Chakrabortty and Piyush Kumar Sharma}, journal={ArXiv}, year={2021}, volume={abs/2107.12416} }

Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which may induce high estimation variance for large-scale networks. In this paper, we propose a novel distributed zeroth-order algorithm by leveraging the…
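To make the dimension issue in the abstract concrete, the sketch below shows a generic two-point zeroth-order gradient estimate in which the random perturbation is drawn only in the dimension of one block of coordinates rather than in the dimension of the full variable. It is an illustrative, centralized toy and not the algorithm proposed in the paper: it still evaluates a global cost, and the function names (`block_zo_gradient`, `zo_block_coordinate_descent`), the quadratic test cost, the block partition, and the step size and smoothing radius are all assumptions chosen for the example.

```python
import math
import random


def block_zo_gradient(f, x, block, radius, rng):
    """Two-point zeroth-order estimate of the gradient of f w.r.t. one block.

    Only the coordinates listed in `block` are perturbed, so the random
    sample has the block's dimension, not the dimension of x.
    """
    # Random direction, uniform on the unit sphere of the block's dimension.
    u = [rng.gauss(0.0, 1.0) for _ in block]
    norm = math.sqrt(sum(v * v for v in u))
    u = [v / norm for v in u]

    x_plus, x_minus = list(x), list(x)
    for i, v in zip(block, u):
        x_plus[i] += radius * v
        x_minus[i] -= radius * v

    # Finite difference along u, scaled by the block dimension.
    scale = len(block) * (f(x_plus) - f(x_minus)) / (2.0 * radius)
    return [scale * v for v in u]


def zo_block_coordinate_descent(f, x0, blocks, step, radius, iters, seed=0):
    """Update one randomly chosen block per iteration using only f-values."""
    rng = random.Random(seed)
    x = [float(v) for v in x0]
    for _ in range(iters):
        block = rng.choice(blocks)
        g = block_zo_gradient(f, x, block, radius, rng)
        for i, gi in zip(block, g):
            x[i] -= step * gi
    return x


def quadratic(x):
    """Hypothetical stand-in cost; an LQR cost would replace this."""
    return sum(v * v for v in x)


x_final = zo_block_coordinate_descent(
    quadratic,
    x0=[1.0, -2.0, 3.0, 0.5],
    blocks=[[0, 1], [2, 3]],  # assumed partition of the variable
    step=0.05,
    radius=1e-3,
    iters=2000,
)
```

In this toy each update touches a single block and needs two cost evaluations. A distributed variant would additionally have to replace the global cost with locally available information, which this sketch does not attempt.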

#### References


Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents

- Computer Science, Mathematics
- ICML
- 2018

This work appears to be the first study of fully decentralized MARL algorithms for networked agents with function approximation, with provable convergence guarantees, and can be implemented in an online fashion.

Zeroth-Order Stochastic Block Coordinate Type Methods for Nonconvex Optimization

- Computer Science, Mathematics
- 2019

This work proposes classes of zeroth-order stochastic block coordinate type methods, including what is described as the first two-phase BCCG method developed to achieve an $(\epsilon, \Lambda)$-solution of nonconvex composite optimization problems.

Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator

- Computer Science, Mathematics
- ICML
- 2018

This work bridges the gap by showing that (model-free) policy gradient methods globally converge to the optimal solution and are efficient (polynomially so in relevant problem-dependent quantities) with regard to their sample and computational complexities.

A Globally Convergent Algorithm for Nonconvex Optimization Based on Block Coordinate Update

- Mathematics, Computer Science
- J. Sci. Comput.
- 2017

An algorithm for nonconvex optimization is proposed, its global convergence (of the whole sequence) to a critical point is established, and its asymptotic convergence rate is given and numerically demonstrated.

On the Exponential Number of Connected Components for the Feasible Set of Optimal Decentralized Control Problems

- Computer Science
- 2019 American Control Conference (ACC)
- 2019

This paper considers a measure of problem complexity in terms of connectivity and shows that there is no polynomial upper bound on the number of connected components for the set of static stabilizing decentralized controllers.

ZONE: Zeroth-Order Nonconvex Multiagent Optimization Over Networks

- Computer Science, Mathematics
- IEEE Transactions on Automatic Control
- 2019

This paper develops efficient distributed algorithms for optimizing a class of nonconvex problems in the challenging setting where each agent can only access the zeroth-order information of its local functions.

Distributed LQR Design for Identical Dynamically Decoupled Systems

- Engineering, Computer Science
- IEEE Transactions on Automatic Control
- 2008

The design procedure proposed in this paper illustrates how stability of the large-scale system is related to the robustness of local controllers and the spectrum of a matrix representing the desired sparsity pattern of the distributed controller design problem.

Improving the Convergence Rate of One-Point Zeroth-Order Optimization using Residual Feedback

- Computer Science
- ArXiv
- 2020

This paper proposes a novel one-point feedback scheme that queries the function value only once at each iteration and estimates the gradient using the residual between two consecutive feedback points, and shows that this scheme achieves the same convergence rate as that of ZO with two-point feedback with uncontrollable data samples.

Computing Stabilizing Linear Controllers via Policy Iteration

- Computer Science
- 2020 59th IEEE Conference on Decision and Control (CDC)
- 2020

This paper gives a model-free, off-policy reinforcement learning algorithm for computing a stabilizing controller for deterministic LQR problems with unknown dynamics and cost matrices.

On the Linear Convergence of Random Search for Discrete-Time LQR

- Mathematics
- IEEE Control Systems Letters
- 2021

Model-free reinforcement learning techniques directly search over the parameter space of controllers. Although this often amounts to solving a nonconvex optimization problem, for benchmark control…