https://hagino3000.blogspot.com/2015/07/thompson-sampling.html https://hagino3000.blogspot.com/2016/12/linear-bandit.html
#Reinforcement Learning
This page is auto-translated from /nishio/トンプソンサンプリング using DeepL. If you find something interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.