EurekAlert August 10, 2020
Existing reinforcement learning schemes can only be applied in a centralized manner, which requires pooling the state information of the entire swarm at a central learner and so increases computational complexity and communication requirements. To address this problem, a team of researchers in the US (Oklahoma State University, Army Research Laboratory, North Carolina State University) is developing a theoretical foundation for data-driven optimal control of large-scale swarm networks, in which control actions are taken based on low-dimensional measurement data instead of dynamic models. Their approach decomposes the global control objective into multiple hierarchies and a broad swarm-level macroscopic control. With each hierarchy having its own learning loop, with respective local and global reward functions, they were able to significantly reduce the learning time…