Liu Falin (刘发林)

Doctoral supervisor
Master's supervisor
Name: 刘发林
English name: LIU Falin
Pinyin name: Liu Falin
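Education: Doctoral graduate
Contact: 0551-63601922
Degree: Doctor of Engineering
Professional title: Research Fellow (研究员)
Alma mater: University of Science and Technology of China
School: School of Information Science and Technology
Disciplines: Electronic Science and Technology; Information and Communication Engineering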

Publications
Finding Optimal Observation-Based Policies for Constrained POMDPs Under the Expected Average Reward Criterion
Posted: 2022-07-04

DOI: 10.1109/TAC.2015.2497904

Journal: IEEE Transactions on Automatic Control

Keywords: POMDPs, Constraints, Performance derivative, Simulation-based optimization, Observation-based policy.

Abstract: In this technical note, constrained partially observable Markov decision processes with discrete state and action spaces under the average reward criterion are studied from a sensitivity point of view. By analyzing the derivatives of performance criteria, we develop a simulation-based optimization algorithm to find the optimal observation-based policy on the basis of a single sample path. This algorithm does not need any overly strict assumption and can be applied to the general ergodic Markov systems with the imperfect state information. The performance is proved to converge to the optimum with probability 1. One numerical example is provided to illustrate the applicability of the algorithm.

Co-authors: Hongsheng Xi, Xiaodong Wang, Falin Liu

First author: Xiaofeng Jiang (姜晓枫)

Paper type: Journal article

Discipline category: Engineering

Document type: J

Volume: 61

Issue: 10

Pages: 3070-3075

Publication date: 2016-10-01

Indexed in: SCI, EI