
Centralized Optimization for Dec-POMDPs Under the Expected Average Reward Criterion

  • DOI number:10.1109/TAC.2017.2702203

  • Journal:IEEE Trans. on Automatic Control

  • Key Words:Centralized optimization, decentralized partially observable Markov decision process (Dec-POMDP), large-scale system, sensitivity analysis, stochastic approximation.

  • Abstract:In this paper, decentralized partially observable Markov decision process (Dec-POMDP) systems with discrete state and action spaces are studied from a gradient point of view. Dec-POMDPs have recently emerged as a promising approach to optimizing multiagent decision making in partially observable stochastic environments. However, the decentralized nature of the Dec-POMDP framework results in the lack of a shared belief state, which makes it impossible for a decision maker to estimate the system state from local information alone. In contrast to belief-based policies, this paper focuses on optimizing the decentralized observation-based policy, which is easy to apply and avoids the belief-sharing problem. By analyzing the gradient of the objective function, we develop a centralized stochastic gradient policy iteration algorithm that finds the optimal policy on the basis of gradient estimates obtained from a single sample path. The algorithm does not require any restrictive assumptions and can be applied to most practical Dec-POMDP problems. A numerical example is provided to demonstrate the effectiveness of the algorithm. (An illustrative sketch of a single-sample-path policy-gradient update is given after this record.)

  • First Author:Xiaofeng Jiang (姜晓枫)

  • Co-author:Xiaodong Wang, Hongsheng Xi, Falin Liu

  • Discipline:Engineering

  • Document Type:J

  • Volume:62

  • Issue:11

  • Page Number:6032-6038

  • Translation or Not:no

  • Date of Publication:2017-11-01

  • Included Journals:SCI, EI


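The abstract describes the algorithm only at a high level. The sketch below is not the paper's method; it is a minimal, hedged illustration of the general idea the abstract names: decentralized observation-based softmax policies updated centrally by a stochastic-gradient rule (eligibility trace with a running average-reward baseline) estimated along a single sample path, under the expected average reward criterion. The toy two-agent Dec-POMDP, all model tables, sizes, and step sizes are hypothetical and chosen only for illustration.

# Illustrative sketch (hypothetical model; not the paper's exact algorithm):
# centralized single-sample-path policy-gradient update of decentralized
# observation-based policies under the average-reward criterion.
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical toy Dec-POMDP: 3 states, 2 agents, 2 actions and 2 observations each ---
N_STATES, N_AGENTS, N_ACTIONS, N_OBS = 3, 2, 2, 2

# P[s, a1, a2, s'] : joint-action transition kernel (rows normalized)
P = rng.random((N_STATES, N_ACTIONS, N_ACTIONS, N_STATES))
P /= P.sum(axis=-1, keepdims=True)

# O[i, s, o] : per-agent observation kernel (each agent sees only its own observation)
O = rng.random((N_AGENTS, N_STATES, N_OBS))
O /= O.sum(axis=-1, keepdims=True)

# R[s, a1, a2] : bounded team reward shared by both agents
R = rng.random((N_STATES, N_ACTIONS, N_ACTIONS))

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def act(theta, obs):
    """Sample each agent's action from its observation-based softmax policy;
    also return the score function grad log pi_i(a | o_i)."""
    actions, grads = [], []
    for i in range(N_AGENTS):
        probs = softmax(theta[i, obs[i]])
        a = rng.choice(N_ACTIONS, p=probs)
        g = -probs          # d/dtheta[i, o_i, :] of log pi_i(a | o_i)
        g[a] += 1.0
        actions.append(a)
        grads.append(g)
    return actions, grads

def run(theta, T=200_000, alpha=0.01, beta=0.005, lam=0.9):
    """Centralized stochastic-gradient update along one sample path:
    eligibility-trace actor with a running average-reward baseline rho."""
    s = 0
    rho = 0.0                    # running estimate of the average reward
    z = np.zeros_like(theta)     # eligibility trace of score functions
    avg = 0.0
    for t in range(1, T + 1):
        obs = [rng.choice(N_OBS, p=O[i, s]) for i in range(N_AGENTS)]
        actions, grads = act(theta, obs)
        r = R[s, actions[0], actions[1]]
        z *= lam
        for i in range(N_AGENTS):
            z[i, obs[i]] += grads[i]
        theta += alpha * (r - rho) * z     # centralized policy update
        rho += beta * (r - rho)            # baseline tracks the average reward
        avg += (r - avg) / t
        s = rng.choice(N_STATES, p=P[s, actions[0], actions[1]])
    return theta, avg

theta = np.zeros((N_AGENTS, N_OBS, N_ACTIONS))   # decentralized observation-based policies
theta, avg_reward = run(theta)
print(f"estimated average reward after training: {avg_reward:.4f}")

The per-agent policies depend only on each agent's local observation, while the parameter update itself is computed centrally from the single shared trajectory, which is the division of labor the title refers to.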
