This problem is called the &#39;<a href="/en/exploration-exploitation%20tradeoff">exploration-exploitation tradeoff</a>&#39; in the field of <a href="/en/reinforcement%20learning">reinforcement learning</a>. If you always choose the option that looks best based on your experience so far, you cannot discover better options. This is a lack of <a href="/en/exploration">exploration</a>. (*1)
On the other hand, if you keep searching for better options and choose only options you have not tried, your experience goes unused. This is a lack of <a href="/en/exploitation">exploitation</a>.
Since exploration and exploitation are in a trade-off relationship, it is necessary to do both in a well-balanced manner rather than leaning to one side. So how can we make well-balanced choices?
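As a concrete illustration of the tradeoff, here is a minimal sketch of an epsilon-greedy policy on a multi-armed bandit. It is one simple, well-known way to mix the two behaviors, not necessarily the answer this text arrives at. The function name <code>epsilon_greedy</code>, its parameters, and the Bernoulli (0 or 1) reward model are assumptions made for this example.

```python
import random


def epsilon_greedy(true_rates, epsilon, steps, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy policy (illustrative sketch).

    true_rates: hypothetical success probability of each option (hidden from the policy)
    epsilon:    probability of exploring (picking an option at random)
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms    # how many times each option has been tried
    values = [0.0] * n_arms  # average reward observed so far for each option
    total = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try an option regardless of how it looks so far.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the option that looks best from experience
            # (ties go to the lowest-numbered option).
            arm = max(range(n_arms), key=lambda a: values[a])

        reward = 1 if rng.random() < true_rates[arm] else 0
        counts[arm] += 1
        # Incremental update of the running average reward.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward

    return counts, values, total


if __name__ == "__main__":
    rates = [0.2, 0.5, 0.8]  # assumed success rates, for illustration only
    for eps in (0.0, 0.1, 1.0):
        counts, values, total = epsilon_greedy(rates, eps, steps=1000)
        print(eps, counts, total)
```

With <code>epsilon=0</code> the policy never explores: every estimate starts at zero and ties go to the first option, so it keeps choosing option 0 and never learns about the others. With <code>epsilon=1</code> it only explores and never uses what it has learned. Values in between mix the two.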
<hr>
Footnote *1:
<ul>
<li>This tradeoff has been discussed in detail in the field of <a href="/en/reinforcement%20learning">reinforcement learning</a>.<ul>
<li><a href="https://en.wikipedia.org/wiki/Multi-armed_bandit">https://en.wikipedia.org/wiki/Multi-armed_bandit</a></li>
</ul>
</li>
<li>However, its origin is unclear. The concept is used across a wide range of domains.<ul>
<li><a href="/en/Box%2C%20G.%20E.">Box, G. E.</a>, 1954. The exploration and exploitation of response surfaces: some general considerations and examples. Biometrics, 10(1), pp.16-60.</li>
<li>March, J. G., 1991. Exploration and exploitation in <a href="/en/organizational%20learning">organizational learning</a>. Organization Science, 2(1), pp.71-87.</li>
</ul>
</li>
</ul>

(2.2.3.1) Exploration-exploitation tradeoff