Talk title: Mean-Field Markov Decision Process (MF-MDP) with Mixed Agents and Its Application to Ride-Sourcing Markets
Invited speaker: Dr. Zheng Zhu, Research Assistant Professor, The Hong Kong University of Science and Technology
Time and venue: July 23, 2020, 14:00-16:00, Conference Room A326, Anzhong Building, Zijingang Campus, Zhejiang University
Host: Prof. Dianhai Wang, Director of the Institute of Intelligent Transportation Systems, Zhejiang University
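Talk summary: The convenience and efficiency of ride-sourcing have made it an indispensable travel mode for daily trips. Because passenger demand and driver supply are imbalanced in space and time, system resources are wasted when drivers wait for orders or travel long distances to pick up passengers. To improve the supply-demand balance, platforms often give drivers idle-cruising suggestions that direct them to areas with more passengers; however, drivers have their own objectives and idle-cruising strategies and do not necessarily follow the platform's suggestions. The platform may therefore need to offer spatial-temporal subsidies to induce drivers to change their original idle-cruising decisions and thereby balance supply and demand. This research applies the mean-field Markov decision process (MF-MDP) model to the study of spatial-temporal subsidies and idle cruising in ride-sourcing markets: the platform uses a spatial-temporal subsidy policy to increase the number of dispatched orders and its profit, while drivers use their own idle-cruising strategies to increase their individual income. Using the mean-field (MF) approximation, the researchers greatly reduce the number of decision-making agents in a complex environment, which improves computational efficiency and the likelihood of convergence. Simple numerical examples demonstrate that MF-MDP converges faster than a standard MDP, and are used to analyze the benefits of the platform's spatial-temporal subsidies and their influence on drivers' idle-cruising decisions. The MF-MDP approach is also applicable to decision analysis in other mixed, multi-agent transportation systems, and is expected to play a pivotal role in the dynamic operation and management of multimodal intelligent transportation systems.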
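Speaker bio: Dr. Zheng Zhu (homepage: https://www.ce.ust.hk/people/zheng-zhu-zhuzheng) is a Research Assistant Professor at The Hong Kong University of Science and Technology. He graduated from the University of Maryland (Ph.D. in civil engineering, 2018, and M.S. in civil engineering, 2014, both advised by Prof. Lei Zhang; master's degree in statistics, 2017, advised by Prof. Benjamin Kedem) and from Tsinghua University (B.S. in hydraulic engineering, 2012). Dr. Zhu has received a number of academic honors and awards, including the 2017 National Award for Outstanding Self-financed Chinese Students Studying Abroad, the University of Maryland Outstanding Graduate Assistant Award, and the 2012 Tsinghua University Excellent Graduate award. He has published 21 SCI papers to date. In his two years at HKUST, he has served as principal investigator of one Hong Kong government-funded project (General Research Fund), co-led two other research projects, and completed 13 academic papers (12 as first or corresponding author), 4 of which have been accepted by core SCI transportation journals (Transportation Research Part A, C, E, etc.).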
Abstract: Ride-sourcing services are increasingly popular because of their ability to accommodate on-demand travel needs. A difficulty faced by ride-sourcing platforms is supply-demand imbalance, which wastes drivers' time while they cruise idly or travel en route to pick up remote passengers. Some platforms attempt to mitigate the imbalance by providing relocation guidance for idle drivers, who, however, may not follow the suggestions. Platforms then seek to provide spatial-temporal subsidies for drivers with certain self-relocation strategies in order to induce their relocation and, in turn, mitigate the supply-demand imbalance. This research proposes a mean-field Markov decision process (MF-MDP) model to describe the dynamics in ride-sourcing markets with mixed players, whereby the platform aims to optimize system-level objectives by leveraging a spatial-temporal subsidy policy, and a number of drivers aim to maximize their income by following certain self-relocation strategies. We simplify the MF-MDP model with multiple drivers by only considering the decision-making processes of the platform and one driver, who represents multiple players. We present a representative-agent reinforcement learning algorithm to solve the model with significant computational advantages, fast convergence, and better performance. In a set of numerical studies, we demonstrate that by providing spatial-temporal subsidies, the platform is able to balance the short-term objective of maximizing immediate revenue against the long-term objective of maximizing service rate, while drivers can earn higher income.
Bio: Dr. Zheng Zhu is a Research Assistant Professor in the Department of Civil and Environmental Engineering at The Hong Kong University of Science and Technology (HKUST). As an undergraduate at Tsinghua University (B.S., 2012), he excelled in mathematics, earning a 96.4/100 GPA in math and physics courses (first in his department). He later received an M.S. degree (2014) and a Ph.D. degree (2018) in Civil and Environmental Engineering and an M.A. degree (2017) in Mathematics and Statistics at the University of Maryland. He has won numerous honors and awards, including the 2017 National Award for Outstanding Self-financed Chinese Students Studying Abroad, the 2017 University of Maryland Outstanding Graduate Assistant Award, and the 2012 Tsinghua University Excellent Undergraduate Student award. Dr. Zhu has published 21 SCI journal articles. During his two years at HKUST, he has been the PI of one Hong Kong General Research Fund (GRF) project and finished 13 journal manuscripts, 4 of which have been accepted by top transportation engineering journals (e.g., Transportation Research Part A, C, and E).