
Combinatorial bandits for maximum value reward function under value-index feedback

Wang, Yiliu, Chen, Wei and Vojnovic, Milan ORCID: 0000-0003-1382-022X (2024) Combinatorial bandits for maximum value reward function under value-index feedback. In: ICLR 2024 The Twelfth International Conference on Learning Representations, 2024-05-07 - 2024-05-11, Messe Wien Exhibition and Congress Center, Vienna, Austria, AUT.

Text (Combinatorial Bandits for Maximum Value Reward Function under Value-Index Feedback) - Published Version (662kB)

Abstract

We investigate the combinatorial multi-armed bandit problem where an action is to select $k$ arms from a set of base arms, and its reward is the maximum of the sample values of these $k$ arms, under a weak feedback structure that only returns the value and index of the arm with the maximum value. This novel feedback structure is much weaker than the semi-bandit feedback previously studied and is only slightly stronger than the full-bandit feedback, and thus it presents a new challenge for the online learning task. We propose an algorithm and derive a regret bound for instances where arm outcomes follow distributions with finite supports. Our algorithm introduces a novel concept of biased arm replacement to address the weak feedback challenge, and it achieves a distribution-dependent regret bound of $O((k/\Delta)\log(T))$ and a distribution-independent regret bound of $\tilde{O}(\sqrt{T})$, where $\Delta$ is the reward gap and $T$ is the time horizon. Notably, our regret bound is comparable to the bounds obtained under the more informative semi-bandit feedback. We demonstrate the effectiveness of our algorithm through experimental results.
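The value-index feedback described in the abstract can be illustrated with a small simulation of one round. This is only a sketch of the feedback model, not the authors' algorithm: the function name `pull`, the representation of each arm as a pair of (support values, probabilities), the example instance, and the rule that ties go to the first arm listed in the action are all assumptions made for illustration.

```python
import random


def pull(arms, action, rng):
    """Play one round: sample every chosen arm, reveal only the maximum
    sampled value and the index of the arm that produced it.

    arms   -- list of (values, probs) pairs, one finite-support distribution per arm
    action -- iterable of k arm indices selected this round
    rng    -- random.Random instance used for sampling
    """
    best_value, best_index = None, None
    for i in action:
        values, probs = arms[i]
        sample = rng.choices(values, weights=probs, k=1)[0]
        # Assumption: ties are broken in favour of the arm listed first.
        if best_value is None or sample > best_value:
            best_value, best_index = sample, i
    # The samples of the other k - 1 arms are never shown to the learner.
    return best_value, best_index


# Hypothetical instance with four base arms, each with a two-point support.
arms = [
    ([0.0, 0.5], [0.5, 0.5]),
    ([0.0, 1.0], [0.8, 0.2]),
    ([0.2, 0.4], [0.5, 0.5]),
    ([0.0, 0.9], [0.7, 0.3]),
]
rng = random.Random(0)
value, index = pull(arms, action=(0, 1, 3), rng=rng)
print(f"reward = {value}, winning arm = {index}")
```

Under semi-bandit feedback the learner would instead observe all $k$ sampled values, and under full-bandit feedback only the reward `value`; the returned pair sits between the two.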

Item Type: Conference or Workshop Item (Paper)
Additional Information: © 2024 The Author(s)
Divisions: Statistics
Subjects: H Social Sciences > HA Statistics
Date Deposited: 19 Jun 2024 10:39
Last Modified: 01 Oct 2024 04:02
URI: http://eprints.lse.ac.uk/id/eprint/123919
