Ing. David Milec

Všechny publikace

Autoři: Ing. David Milec, Ing. Ondřej Kubíček, doc. Mgr. Viliam Lisý, MSc., Ph.D.,
Publikace: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems. County of Richland: IFAAMAS, 2024. p. 2393-2395. vol. 2024-May. ISSN 1558-2914.
Rok: 2024

Pracoviště: Katedra počítačů, Centrum umělé inteligence
Anotace:
In zero-sum games, the optimal strategy is well-defined by the Nash equilibrium. However, it is overly conservative when playing against suboptimal opponents and it can not exploit their weaknesses. Limited look-ahead game solving in imperfect-information games allows superhuman play in massive real-world games such as Poker, Liar's Dice, and Scotland Yard. However, since they approximate Nash equilibrium, they tend to only win slightly against weak opponents. We propose theoretically sound methods combining limited look-ahead solving with an opponent model, in order to 1) approximate a best response in large games or 2) compute a robust response with control over the robustness of the response. Both methods can compute the response in real time to previously unseen strategies. We present theoretical guarantees of our methods. We show that existing robust response methods do not work combined with limited look-ahead solving of the shelf, and we propose a novel solution for the issue. Our algorithm performs significantly better than multiple baselines in smaller games and outperforms state-of-the-art methods against SlumBot.

Autoři: Ing. David Milec, Černý, J., doc. Mgr. Viliam Lisý, MSc., Ph.D., An, B.
Publikace: Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence. Palo Alto, California: Association for the Advancement of Artificial Intelligence (AAAI), 2021. p. 5575-5583. 35. ISSN 2374-3468. ISBN 978-1-57735-866-4.
Rok: 2021

Pracoviště: Katedra počítačů, Centrum umělé inteligence
Anotace:
Solution concepts of traditional game theory assume entirely rational players; therefore, their ability to exploit subrational opponents is limited. One type of subrationality that describes human behavior well is the quantal response. While there exist algorithms for computing solutions against quantal opponents, they either do not scale or may provide strategies that are even worse than the entirely-rational Nash strategies. This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games. Our contributions are: (1) we define two different solution concepts related to exploiting quantal opponents and analyze their properties; (2) we prove that computing these solutions is computationally hard;(3) therefore, we evaluate several heuristic approximations based on scalable counterfactual regret minimization (CFR); and (4) we identify a CFR variant that exploits the bounded opponents better than the previously used variants while being less exploitable by the worst-case perfectly-rational opponent.