Title: Cascading Bandits with Costs
Time: Wednesday, November 22, 2017, 12:00 am - 1:00 pm
Venue: Room 1A-200, SIST Building
Speaker: Prof. Cong Shen
Inviter: Prof. Xiliang Luo
We propose a cost-aware cascading bandits model, a new variant of multi-armed bandits with cascading feedback. Under this model, the player is presented with a list of arms in each time frame and is allowed to pull them sequentially. Each arm has two possible states, ON and OFF, which evolve according to an i.i.d. Bernoulli process. The reward that a player receives within one time frame equals one if at least one of the arms pulled in that frame is in the ON state; otherwise, it equals zero. Unlike the classical cascading bandits model, where the user is presented with L arms in each time frame, our model imposes a "soft" constraint by assigning a cost to each arm pull. Therefore, an arm may not be pulled if its cost outweighs the expected reward gain from pulling it. Our objective is then to maximize the net reward, i.e., reward minus cost, by deciding the list of arms to be pulled in each time frame, the order in which to pull them, and the stopping condition for quitting the frame. We first study the optimal offline policy with a priori knowledge of the state and cost statistics. Then, we propose a cost-aware cascading UCB algorithm for the online setting, where both the reward and cost statistics are unknown beforehand, and show that it is order-optimal. Finally, we evaluate our online algorithms with synthetic data.
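The offline setting described in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the policy, so everything here is an assumption made for illustration: the function names, the example numbers, the use of mean costs that are strictly positive, and the candidate rule itself (keep only arms whose ON probability exceeds their mean cost, pull them in decreasing order of probability-to-cost ratio, and stop at the first ON arm). The sketch computes the expected net reward of an ordered list and compares the candidate rule against an exhaustive search on a small instance.

```python
from itertools import permutations


def expected_net_reward(order, theta, cost):
    """Expected reward-minus-cost for one frame when arms are pulled in
    `order` and pulling stops as soon as one arm is found ON.

    theta[i] is the ON probability of arm i; cost[i] is its mean pulling cost.
    """
    total = 0.0
    p_reach = 1.0  # probability that every earlier arm in the list was OFF
    for i in order:
        # Arm i is pulled only if reached: pay cost[i], gain 1 w.p. theta[i].
        total += p_reach * (theta[i] - cost[i])
        p_reach *= 1.0 - theta[i]
    return total


def ratio_threshold_policy(theta, cost):
    """Hypothetical offline rule (an assumption, not taken from the abstract):
    keep arms with theta > cost, ordered by theta / cost, largest first.
    Assumes every cost is strictly positive."""
    keep = [i for i in range(len(theta)) if theta[i] > cost[i]]
    return sorted(keep, key=lambda i: theta[i] / cost[i], reverse=True)


def brute_force_best(theta, cost):
    """Best expected net reward over all ordered sub-lists (small n only)."""
    n = len(theta)
    best = 0.0  # pulling no arm at all yields zero net reward
    for k in range(1, n + 1):
        for order in permutations(range(n), k):
            best = max(best, expected_net_reward(order, theta, cost))
    return best


# Illustrative numbers only; they do not come from the talk.
theta = [0.6, 0.3, 0.5, 0.1]
cost = [0.2, 0.25, 0.4, 0.3]

order = ratio_threshold_policy(theta, cost)
value = expected_net_reward(order, theta, cost)
best = brute_force_best(theta, cost)
print(order, value, best)
```

On this instance the candidate rule matches the exhaustive search. An online variant would have to replace the unknown `theta` and `cost` with estimates, for example confidence bounds as in UCB-style algorithms; that part is not sketched here.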
Cong Shen received his B.S. and M.S. degrees, in 2002 and 2004 respectively, from the Department of Electronic Engineering, Tsinghua University, China. He obtained his Ph.D. degree from the Electrical Engineering Department, UCLA, in 2009. From 2009 to 2014, he worked for Qualcomm Research in San Diego, CA, focusing on 4G research. In 2015, he returned to academia and joined the University of Science and Technology of China (USTC) as a 100 Talents Program Professor in the School of Information Science and Technology. His research interests include wireless communications, wireless networks, and machine learning. He currently serves as an editor for the IEEE Transactions on Wireless Communications.