Verifiable and Compositional Reinforcement Learning Systems

Cyrus Neary, Christos Verginis, Murat Cubuktepe, and Ufuk Topcu
The International Conference on Automated Planning and Scheduling (ICAPS) 2022.

We propose a framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL sub-systems – each of which learns to accomplish a separate sub-task – is composed to achieve an overall task. By defining interfaces between the sub-systems, the framework enables automatic decompositions of task specifications, e.g., reach a target set of states with a probability of at least 0.95, into individual sub-task specifications, i.e., achieve the sub-system's exit conditions with at least some minimum probability, given that its entry conditions are met. This in turn allows for the independent training and testing of the sub-systems; if they each learn a policy satisfying the appropriate sub-task specification, then their composition is guaranteed to satisfy the overall task specification.
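As a toy illustration of the compositional guarantee (a minimal sketch, not the paper's algorithm: the uniform split below is one simple decomposition, and `composed_guarantee` assumes a purely sequential composition):

```python
# If each sub-system i satisfies its exit condition with probability at
# least p_i whenever its entry condition holds, a sequential composition
# reaches the target with probability at least the product of the p_i.

def composed_guarantee(subtask_probs):
    """Lower bound on the overall task probability for a sequential
    composition of sub-systems with per-sub-task guarantees."""
    bound = 1.0
    for p in subtask_probs:
        bound *= p
    return bound

def decompose_uniformly(target_prob, num_subtasks):
    """One simple way to split an overall specification (e.g., reach the
    target with probability >= 0.95) into identical sub-task
    specifications whose composition still meets it."""
    return target_prob ** (1.0 / num_subtasks)

per_subtask = decompose_uniformly(0.95, 3)   # ~0.983 per sub-task
assert composed_guarantee([per_subtask] * 3) >= 0.95 - 1e-9
```

Each sub-system can then be trained and tested against its own `per_subtask` threshold in isolation.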

Read More | Paper | Code

Taylor-Lagrange Neural Ordinary Differential Equations: Toward Fast Training and Evaluation of Neural ODEs

Franck Djeumou*, Cyrus Neary*, Eric Goubault, Sylvie Putot, and Ufuk Topcu
Under review at the International Joint Conference on Artificial Intelligence (IJCAI) 2022.
* indicates equal contribution.

We accelerate the training and evaluation of Neural Ordinary Differential Equations (NODEs) by proposing a data-driven approach to their numerical integration. A suite of numerical experiments – including modeling dynamical systems, image classification, and density estimation – demonstrates that Taylor-Lagrange NODEs can be trained more than an order of magnitude faster than state-of-the-art approaches, without any loss in performance.
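To give a flavor of the underlying idea, here is a sketch of a fixed-order Taylor-series integration step, the building block such schemes are based on (the paper additionally learns an estimate of the Lagrange remainder to control the truncation error; that part is omitted, and the linear dynamics `A` below are a stand-in for a learned vector field):

```python
import numpy as np

# For x' = A x, the time derivatives are x' = A x, x'' = A^2 x, ...,
# so a truncated Taylor step of any fixed order is cheap to evaluate.

def taylor_step(A, x, dt, order=4):
    """One explicit Taylor-series step of the given order for x' = A x."""
    term = x.copy()
    out = x.copy()
    fact = 1.0
    for k in range(1, order + 1):
        term = A @ term                    # k-th time derivative at x
        fact *= k
        out = out + (dt ** k / fact) * term
    return out

A = np.array([[0.0, 1.0], [-1.0, 0.0]])    # harmonic oscillator
x = np.array([1.0, 0.0])
for _ in range(100):
    x = taylor_step(A, x, 0.01, order=4)
# After t = 1, the exact solution is (cos 1, -sin 1) ~ (0.5403, -0.8415).
```

Because the step uses a fixed expansion order, its cost per step is constant, unlike adaptive solvers whose cost grows with the stiffness of the learned dynamics.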

Read More | Paper | Code

Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss

Mustafa O. Karabag*, Cyrus Neary*, and Ufuk Topcu
The International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2022.
* indicates equal contribution.

In a cooperative multiagent system, a collection of agents executes a joint policy in order to achieve some common objective. The successful deployment of such systems hinges on the availability of reliable inter-agent communication. However, many sources of potential disruption to communication exist in practice – such as radio interference, hardware failure, and adversarial attacks. In this work, we develop joint policies for cooperative multiagent systems that are robust to potential losses in communication. Numerical experiments show that the proposed minimum-dependency policies require minimal coordination between the agents while incurring little to no loss in performance.

Read More | Paper | Code

Neural Networks with Physics-Informed Architectures and Constraints for Dynamical Systems Modeling

Franck Djeumou*, Cyrus Neary*, Eric Goubault, Sylvie Putot, and Ufuk Topcu
The Learning for Dynamics and Control Conference (L4DC) 2022.
* indicates equal contribution.

Effective inclusion of physics-based knowledge into deep neural network models of dynamical systems can greatly improve data efficiency and generalization. We accordingly develop a framework to learn dynamics models from trajectory data while incorporating a priori system knowledge as inductive bias. More specifically, the proposed framework uses physics-based side information to inform the structure of the neural network itself, and to place constraints on the values of the outputs and the internal states of the model. By exploiting a priori system knowledge during training, the proposed approach learns to predict the system dynamics two orders of magnitude more accurately than a baseline approach that does not include prior knowledge, given the same training dataset.
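One simple example of such architectural side information (a hedged sketch, not the paper's models: `learned_accel` stands in for a trained neural network): for a mechanical system with state (position q, velocity v), the kinematic relation q' = v is known exactly and can be hard-coded, so only the acceleration must be learned.

```python
import numpy as np

def learned_accel(q, v):
    # Placeholder for a neural network; here, a damped spring for demo.
    return -q - 0.1 * v

def structured_dynamics(state):
    """Predicted time derivative with the known kinematics built in:
    the model can never violate q' = v, no matter how it is trained."""
    q, v = state
    return np.array([v, learned_accel(q, v)])   # q' = v holds by construction

state = np.array([1.0, 0.0])
deriv = structured_dynamics(state)
# deriv[0] equals the velocity component of the state, exactly.
```

Constraints on outputs and internal states (e.g., known bounds or conserved quantities) can be enforced in the same spirit, restricting the hypothesis class rather than penalizing violations after the fact.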

Read More | Paper | Code

Reward Machines for Cooperative Multi-Agent Reinforcement Learning

Cyrus Neary, Zhe Xu, Bo Wu, and Ufuk Topcu
The International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2021.

In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal. We propose the use of reward machines (RMs) to encode the team’s task, which allows the team-level task to be automatically decomposed into sub-tasks for individual agents. We present algorithmically verifiable conditions guaranteeing that distributed completion of the sub-tasks leads to team behavior accomplishing the original task. This framework for task decomposition provides a natural approach to decentralized learning: agents may learn to accomplish their sub-tasks while observing only their local state and abstracted representations of their teammates. We accordingly propose a decentralized Q-learning algorithm and demonstrate its superior sample complexity through two numerical experiments involving three agents and ten agents, respectively.
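Concretely, a reward machine is a finite-state machine whose transitions are triggered by high-level events and which emits the reward. The sketch below is a made-up two-event task for illustration (the states, events, and rewards are not one of the paper's benchmarks):

```python
RM_TRANSITIONS = {
    # (rm_state, event) -> (next_rm_state, reward)
    ("u0", "a_pressed"): ("u1", 0.0),
    ("u1", "b_pressed"): ("u_acc", 1.0),   # task complete
}

def rm_step(rm_state, event):
    """Advance the reward machine on an observed high-level event;
    undefined (state, event) pairs are self-loops with zero reward."""
    return RM_TRANSITIONS.get((rm_state, event), (rm_state, 0.0))

state, total = "u0", 0.0
for event in ["b_pressed", "a_pressed", "b_pressed"]:
    state, r = rm_step(state, event)
    total += r
# state == "u_acc", total == 1.0: button a must precede button b.
```

Decomposition then amounts to projecting such a machine onto the events each agent can observe or control, giving every agent its own smaller reward machine to learn against.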

Read More | Paper | Presentation | Slides | Code

Smooth Convex Optimization Using Sub-Zeroth-Order Oracles

Mustafa O. Karabag, Cyrus Neary, and Ufuk Topcu
The AAAI Conference on Artificial Intelligence (AAAI) 2021.

We present an algorithm to minimize smooth, Lipschitz, convex functions over a compact, convex set using sub-zeroth-order oracles: an oracle that outputs the sign of the directional derivative for a given point and a given direction, an oracle that compares the function values for a given pair of points, and an oracle that outputs a noisy function value for a given point. We show that the sample complexity of optimization using these oracles is polynomial in the relevant parameters. We also give an algorithm for the noisy-value oracle whose regret is sublinear in the number of queries and polynomial in the number of dimensions.
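To see why such weak oracles can suffice, consider the one-dimensional case (an illustrative sketch only, not the paper's algorithm, which handles higher dimensions): the sign of the derivative alone supports bisection to the minimizer of a smooth convex function.

```python
def minimize_with_sign_oracle(sign_of_derivative, lo, hi, tol=1e-8):
    """Bisection over [lo, hi] using only the sign of the derivative
    at queried points."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        s = sign_of_derivative(mid)
        if s > 0:        # function increasing: minimizer lies to the left
            hi = mid
        elif s < 0:      # function decreasing: minimizer lies to the right
            lo = mid
        else:            # zero derivative: minimizer found
            return mid
    return 0.5 * (lo + hi)

# f(x) = (x - 0.3)^2 on [-1, 1]; the oracle reveals only sign(f'(x)).
oracle = lambda x: 1 if x > 0.3 else (-1 if x < 0.3 else 0)
x_star = minimize_with_sign_oracle(oracle, -1.0, 1.0)
# x_star is within tol of the true minimizer 0.3.
```

Each query halves the search interval, so the number of oracle calls is logarithmic in the desired accuracy.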

Read More | Paper