{"id":1889,"date":"2025-02-17T07:03:16","date_gmt":"2025-02-17T07:03:16","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/02\/17\/2502-10020\/"},"modified":"2025-02-17T07:03:16","modified_gmt":"2025-02-17T07:03:16","slug":"2502-10020","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/02\/17\/2502-10020\/","title":{"rendered":"Improved Online Confidence Bounds for Multinomial Logistic Bandits"},"content":{"rendered":"<p>Improved Online Confidence Bounds for Multinomial Logistic Bandits<\/p>\n<div>arXiv:2502.10020v1 Announce Type: new<br \/>\nAbstract: In this paper, we propose an improved online confidence bound for multinomial logistic (MNL) models and apply this result to MNL bandits, achieving variance-dependent optimal regret. Recently, Lee &amp; Oh (2024) established an online confidence bound for MNL models and achieved nearly minimax-optimal regret in MNL bandits. However, their results still depend on the norm-boundedness of the unknown parameter $B$ and the maximum size of possible outcomes $K$. To address this, we first derive an online confidence bound of $O\\left(\\sqrt{d \\log t} + B \\right)$, which is a significant improvement over the previous bound of $O (B \\sqrt{d} \\log t \\log K )$ (Lee &amp; Oh, 2024). This is mainly achieved by establishing tighter self-concordant properties of the MNL loss and introducing a novel intermediary term to bound the estimation error. Using this new online confidence bound, we propose a constant-time algorithm, OFU-MNL++, which achieves a variance-dependent regret bound of $O \\Big( d \\log T \\sqrt{ \\smash[b]{\\sum_{t=1}^T} \\sigma_t^2 } \\Big) $ for sufficiently large $T$, where $\\sigma_t^2$ denotes the variance of the rewards at round $t$, $d$ is the dimension of the contexts, and $T$ is the total number of rounds. 
Furthermore, we introduce a Maximum Likelihood Estimation (MLE)-based algorithm, OFU-MN$^2$L, which achieves an anytime, poly($B$)-free regret of $O \\Big( d \\log (BT) \\sqrt{ \\smash[b]{\\sum_{t=1}^T} \\sigma_t^2 } \\Big) $.<\/div>\n<p>Joongkyu Lee, Min-hwan Oh<br \/>\n<a href=\"https:\/\/arxiv.org\/abs\/2502.10020\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Improved Online Confidence Bounds for Multinomial Logistic Bandits arXiv:2502.10020v1 Announce Type: new Abstract: In this paper, we propose an improved online confidence bound for multinomial logistic (MNL) models and apply this result to MNL bandits, achieving variance-dependent optimal regret. Recently, Lee &amp; Oh (2024) established an online confidence bound for MNL models and achieved nearly [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,113,112],"tags":[1760,1722,1674],"class_list":["post-1889","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-cs-lg","category-stat-ml","tag-bound","tag-confidence","tag-online"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1889"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=1889"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/1889\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json
\/wp\/v2\/media?parent=1889"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=1889"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=1889"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}