
Ing. Karel Horák, Ph.D.

All publications

Solving Partially Observable Stochastic Shortest-Path Games

  • DOI: 10.24963/ijcai.2021/575
  • Link: https://doi.org/10.24963/ijcai.2021/575
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    We study the two-player zero-sum extension of the partially observable stochastic shortest-path problem where one agent has only partial information about the environment. We formulate this problem as a partially observable stochastic game (POSG): given a set of target states and negative rewards for each transition, the player with imperfect information maximizes the expected undiscounted total reward until a target state is reached. The second player, who has perfect information, aims for the opposite. We base our formalism on POSGs with one-sided observability (OS-POSGs) and give the following contributions: (1) we introduce a novel heuristic search value iteration algorithm that iteratively solves depth-limited variants of the game, (2) we derive a bound on the depth that guarantees arbitrary precision, (3) we propose a novel upper-bound estimation that allows early terminations, and (4) we experimentally evaluate the algorithm on a pursuit-evasion game.
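
    A hedged illustration of the objective above (the notation G, T, r, b_0, sigma_1, sigma_2 is ours, not the paper's): writing G for the set of target states, T for the first time a state in G is reached, r(s, a^1, a^2) < 0 for the per-transition reward, and b_0 for the initial belief of the imperfectly informed player, the imperfectly informed player maximizes and the fully informed player minimizes

        \[
          \sup_{\sigma_1}\,\inf_{\sigma_2}\;
          \mathbb{E}^{\,b_0,\sigma_1,\sigma_2}\!\left[\,\sum_{t=0}^{T-1} r\big(s_t, a^1_t, a^2_t\big)\right],
        \]

    where \sigma_1 and \sigma_2 range over the behavioral strategies of the imperfectly and perfectly informed player, respectively.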

Compact Representation of Value Function in Partially Observable Stochastic Games

  • Authors: Ing. Karel Horák, Ph.D., doc. Mgr. Branislav Bošanský, Ph.D., Kiekintveld, Ch., Kamhoua, Ch.
  • Publication: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2019. p. 350-356. ISSN 1045-0823. ISBN 978-0-9992411-4-1.
  • Year: 2019
  • DOI: 10.24963/ijcai.2019/50
  • Link: https://doi.org/10.24963/ijcai.2019/50
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Value methods for solving stochastic games with partial observability model the uncertainty of the players as a probability distribution over possible states, where the dimension of the belief space is the number of states. For many practical problems, there are exponentially many states, which causes scalability problems. We propose an abstraction technique that addresses this curse of dimensionality by projecting the high-dimensional beliefs onto characteristic vectors of significantly lower dimension (e.g., marginal probabilities). Our main contributions are (1) a novel compact representation of the uncertainty in partially observable stochastic games and (2) a novel algorithm using this representation that is based on existing state-of-the-art algorithms for solving stochastic games with partial observability. Experimental evaluation confirms that the new algorithm using the compact representation dramatically increases scalability compared to the state of the art.
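
    A minimal sketch of the projection idea above (the binary-feature state encoding and the function are illustrative, not the paper's implementation), assuming each joint state is a tuple of n binary features so that a belief over 2^n states collapses to n marginal probabilities:

        from itertools import product

        import numpy as np


        def marginal_projection(belief):
            """Project a joint belief (state tuple -> probability) onto the
            vector of per-feature marginal probabilities."""
            n = len(next(iter(belief)))
            marginals = np.zeros(n)
            for state, prob in belief.items():
                marginals += prob * np.array(state, dtype=float)
            return marginals


        if __name__ == "__main__":
            # Uniform belief over all 2**3 joint states of three binary features.
            states = list(product([0, 1], repeat=3))
            belief = {s: 1.0 / len(states) for s in states}
            print(marginal_projection(belief))  # -> [0.5 0.5 0.5]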

Optimizing Honeypot Strategies Against Dynamic Lateral Movement Using Partially Observable Stochastic Games

  • DOI: 10.1016/j.cose.2019.101579
  • Link: https://doi.org/10.1016/j.cose.2019.101579
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Partially observable stochastic games (POSGs) are a general game-theoretic model for capturing dynamic interactions where players have partial information. The existing algorithms for solving subclasses of POSGs have theoretical guarantees of convergence to approximately optimal strategies; however, their scalability is limited and they cannot be directly used to solve games of realistic sizes. In our problem, the attacker uses lateral movement through the network in order to reach a specific host, while the defender wants to discover the attacker by dynamically reallocating honeypots. We demonstrate that restricting to a specific domain allows us to substantially improve existing algorithms: (1) we formulate a compact representation of the uncertainty the defender faces, and (2) we exploit an incremental strategy-generation method that expands the players' available actions over iterations. The experimental evaluation shows that our novel algorithms scale several orders of magnitude better than the existing state of the art.
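
    A minimal sketch of incremental strategy generation in general, shown on a small zero-sum matrix game rather than the paper's honeypot POSG (the payoff matrix and all names are illustrative): each iteration solves the game restricted to the actions generated so far and adds each player's best response against the restricted solution.

        import numpy as np
        from scipy.optimize import linprog


        def solve_zero_sum(A):
            """Maximin strategy and value of the row player for payoff matrix A."""
            m, n = A.shape
            c = np.zeros(m + 1)
            c[-1] = -1.0                                  # minimize -v, i.e. maximize v
            A_ub = np.hstack([-A.T, np.ones((n, 1))])     # for each column j: v <= sum_i A[i, j] x_i
            b_ub = np.zeros(n)
            A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
            b_eq = np.array([1.0])                        # x is a probability distribution
            bounds = [(0, None)] * m + [(None, None)]
            res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
            return res.x[:m], res.x[-1]


        def double_oracle(A, eps=1e-6):
            """Expand each player's action set with best responses until the
            restricted-game value can no longer be improved."""
            rows, cols = [0], [0]
            while True:
                sub = A[np.ix_(rows, cols)]
                x, value = solve_zero_sum(sub)            # row player's restricted equilibrium
                y, _ = solve_zero_sum(-sub.T)             # column player's restricted equilibrium
                row_payoffs = A[:, cols] @ y              # full-game best-response candidates
                col_payoffs = x @ A[rows, :]
                br_row = int(np.argmax(row_payoffs))
                br_col = int(np.argmin(col_payoffs))
                improved = False
                if row_payoffs[br_row] > value + eps and br_row not in rows:
                    rows.append(br_row)
                    improved = True
                if col_payoffs[br_col] < value - eps and br_col not in cols:
                    cols.append(br_col)
                    improved = True
                if not improved:
                    return value, rows, cols


        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            print(double_oracle(rng.uniform(-1.0, 1.0, size=(6, 6))))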

Solving Partially Observable Stochastic Games with Public Observations

  • DOI: 10.1609/aaai.v33i01.33012029
  • Link: https://doi.org/10.1609/aaai.v33i01.33012029
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    In many real-world problems, there is a dynamic interaction between competitive agents. Partially observable stochastic games (POSGs) are among the most general formal models that capture such dynamic scenarios. The model captures stochastic events, partial information of players about the environment, and the scenario does not have a fixed horizon. Solving POSGs in the most general setting is intractable. Therefore, the research has been focused on subclasses of POSGs that have a value of the game and admit designing (approximate) optimal algorithms. We propose such a subclass for two-player zero-sum games with discounted-sum objective function—POSGs with public observations (PO-POSGs)—where each player is able to reconstruct beliefs of the other player over the unobserved states. Our results include: (1) theoretical analysis of PO-POSGs and their value functions showing convexity (concavity) in beliefs of maximizing (minimizing) player, (2) a novel algorithm for approximating the value of the game, and (3) a practical demonstration of scalability of our algorithm. Experimental results show that our algorithm can closely approximate the value of non-trivial games with hundreds of states.
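
    Stated in symbols (the notation is ours, restating property (1) from the abstract): writing V* for the value function, b for the belief of the maximizing player, and c for the belief of the minimizing player, for every lambda in [0, 1]

        \[
        \begin{aligned}
          V^*\big(\lambda b + (1-\lambda) b',\, c\big) &\le \lambda V^*(b, c) + (1-\lambda) V^*(b', c), \\
          V^*\big(b,\, \lambda c + (1-\lambda) c'\big) &\ge \lambda V^*(b, c) + (1-\lambda) V^*(b, c'),
        \end{aligned}
        \]

    i.e., V* is convex in the maximizing player's belief and concave in the minimizing player's belief.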

Approximating Maxmin Strategies in Imperfect Recall Games Using A-loss Recall Property

  • DOI: 10.1016/j.ijar.2017.11.010
  • Link: https://doi.org/10.1016/j.ijar.2017.11.010
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Extensive-form games with imperfect recall are an important model of dynamic games where the players are allowed to forget previously known information. Often, imperfect recall games result from an abstraction algorithm that simplifies a large game with perfect recall. Solving imperfect recall games is known to be a hard problem, and thus it is useful to search for a subclass of imperfect recall games which offers sufficient memory savings while being efficiently solvable. The abstraction process can then be guided to result in a game from this class. We focus on a subclass of imperfect recall games called A-loss recall games. First, we provide a complete picture of the complexity of solving imperfect recall and A-loss recall games. We show that the A-loss recall property allows us to compute a best response in polynomial time (computing a best response is NP-hard in imperfect recall games). This allows us to create a practical algorithm for approximating maxmin strategies in two-player games where the maximizing player has imperfect recall and the minimizing player has A-loss recall. This algorithm is capable of solving some games with up to 5·10⁹ states in approximately 1 hour. Finally, we demonstrate that the use of imperfect recall abstraction can reduce the size of the strategy representation to as low as 0.03% of the size of the strategy representation in the original perfect recall game without sacrificing the quality of the maxmin strategy obtained by solving this abstraction.

Goal-HSVI: Heuristic Search Value Iteration for Goal POMDPs

  • Authors: Ing. Karel Horák, Ph.D., doc. Mgr. Branislav Bošanský, Ph.D., Chatterjee, K.
  • Publication: Proceedings of the International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2018. p. 4764-4770. ISSN 1045-0823. ISBN 978-0-9992411-2-7.
  • Year: 2018
  • DOI: 10.24963/ijcai.2018/662
  • Link: https://doi.org/10.24963/ijcai.2018/662
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Partially observable Markov decision processes (POMDPs) are the standard models for planning under uncertainty with both finite and infinite horizon. Besides the well-known discounted-sum objective, the indefinite-horizon objective (a.k.a. Goal-POMDPs) is another classical objective for POMDPs. In this case, given a set of target states and a positive cost for each transition, the optimization objective is to minimize the expected total cost until a target state is reached. In the literature, RTDP-Bel or heuristic search value iteration (HSVI) have been used for solving Goal-POMDPs. Neither of these algorithms has theoretical convergence guarantees, and HSVI may even fail to terminate its trials. We give the following contributions: (1) We discuss the challenges introduced in Goal-POMDPs and illustrate how they prevent the original HSVI from converging. (2) We present a novel algorithm inspired by HSVI, termed Goal-HSVI, and show that our algorithm has convergence guarantees. (3) We show that Goal-HSVI outperforms RTDP-Bel on a set of well-known examples.
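
    For reference, the indefinite-horizon objective above corresponds to the standard Goal-POMDP optimality equation over beliefs (generic textbook form, not a result specific to the paper): with per-transition cost c(s, a) > 0, target set G, observation probabilities Pr[o | b, a], and Bayesian belief update τ(b, a, o),

        \[
          V^*(b) \;=\; \min_{a}\Big[\sum_{s} b(s)\, c(s,a)
            \;+\; \sum_{o} \Pr[o \mid b, a]\; V^*\big(\tau(b, a, o)\big)\Big],
          \qquad V^*(b) = 0 \ \text{for beliefs supported on } G.
        \]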

Computing Maxmin Strategies in Extensive-form Zero-sum Games with Imperfect Recall

  • DOI: 10.5220/0006121200630074
  • Link: https://doi.org/10.5220/0006121200630074
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Extensive-form games with imperfect recall are an important game-theoretic model that allows a compact representation of strategies in dynamic strategic interactions. Practical use of imperfect recall games is limited due to negative theoretical results: a Nash equilibrium does not have to exist, computing maxmin strategies is NP-hard, and they may require irrational numbers. We present the first algorithm for approximating maxmin strategies in two-player zero-sum imperfect recall games without absentmindedness. We modify the well-known sequence-form linear program to model strategies in imperfect recall games, resulting in a bilinear program, and use a recent technique to approximate the bilinear terms. Our main algorithm is a branch-and-bound search that provably reaches the desired approximation after an exponential number of steps in the size of the game. Experimental evaluation shows that the proposed algorithm can approximate maxmin strategies of randomly generated imperfect recall games of sizes beyond toy problems within a few minutes.
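
    For intuition about the bilinear terms mentioned above (not necessarily the technique used in the paper; the inequalities below are the generic McCormick envelope of a single bilinear term): a product w = xy with x in [x^L, x^U] and y in [y^L, y^U] can be relaxed by the linear constraints

        \[
        \begin{aligned}
          w &\ge x^{L} y + x\, y^{L} - x^{L} y^{L}, &\qquad
          w &\ge x^{U} y + x\, y^{U} - x^{U} y^{U}, \\
          w &\le x^{U} y + x\, y^{L} - x^{U} y^{L}, &\qquad
          w &\le x^{L} y + x\, y^{U} - x^{L} y^{U},
        \end{aligned}
        \]

    and a branch-and-bound search can then split the box [x^L, x^U] × [y^L, y^U] to tighten such a relaxation.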

Dynamic Programming for One-sided Partially Observable Pursuit-evasion Games

  • DOI: 10.5220/0006190605030510
  • Link: https://doi.org/10.5220/0006190605030510
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Pursuit-evasion scenarios appear widely in robotics, security domains, and many other real-world situations. We focus on two-player pursuit-evasion games with concurrent moves, infinite horizon, and discounted rewards. We assume that the players have partial observability; however, the evader has the advantage of knowing the current position of the pursuer’s units. This setting is particularly interesting for security domains where a robust strategy, maximizing the utility in the worst-case scenario, is often desirable. We provide, to the best of our knowledge, the first algorithm that provably converges to the value of a partially observable pursuit-evasion game with infinite horizon. Our algorithm extends the well-known value iteration algorithm by exploiting the facts that (1) the value functions of our game depend only on the position of the pursuer and the belief he has about the position of the evader, and (2) these functions are piecewise linear and convex in the belief space.
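
    A minimal sketch of point (2) above (the alpha-vectors below are made up): for a fixed position of the pursuer's units, a piecewise linear and convex value function over beliefs about the evader's position is the pointwise maximum of finitely many linear functions.

        import numpy as np


        def pwlc_value(alpha_vectors, belief):
            """Evaluate V(b) = max over alpha-vectors of <alpha, b>
            for one fixed position of the pursuer's units."""
            return max(float(np.dot(alpha, belief)) for alpha in alpha_vectors)


        if __name__ == "__main__":
            # Two illustrative alpha-vectors over three possible evader positions.
            alphas = [np.array([-1.0, -4.0, -2.0]), np.array([-3.0, -1.0, -1.5])]
            print(pwlc_value(alphas, np.array([0.2, 0.5, 0.3])))  # -> -1.55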

Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games

  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Security problems can be modeled as two-player partially observable stochastic games with one-sided partial observability and infinite horizon (one-sided POSGs). We seek optimal strategies for player 1 that correspond to robust strategies against the worst-case opponent (player 2), who is assumed to have perfect information about the game. We present a novel algorithm for approximately solving one-sided POSGs based on heuristic search value iteration (HSVI) for POMDPs. Our results include (1) theoretical properties of one-sided POSGs and their value functions, (2) guarantees showing the convergence of our algorithm to optimal strategies, and (3) a practical demonstration of the applicability and scalability of our algorithm on three different domains: pursuit-evasion, patrolling, and search games.
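
    In HSVI-style algorithms such as the one above, approximation quality is tracked by a pair of bounds on the value function (generic HSVI bookkeeping, not a claim about the paper's internals): lower and upper bounds on V* are refined by forward exploration until the gap at the initial belief b_0 drops below the required precision,

        \[
          \overline{V}(b_0) - \underline{V}(b_0) \;\le\; \varepsilon,
        \]

    at which point approximately optimal behavior can be extracted from the bounds.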

Manipulating Adversary’s Belief: A Dynamic Game Approach to Deception by Design for Proactive Network Security

  • Authors: Ing. Karel Horák, Ph.D., Zhu, Q., doc. Mgr. Branislav Bošanský, Ph.D.
  • Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Düsseldorf: Springer VDI Verlag, 2017. p. 273-294. ISSN 0302-9743. ISBN 978-3-319-68710-0.
  • Year: 2017
  • DOI: 10.1007/978-3-319-68711-7_15
  • Link: https://doi.org/10.1007/978-3-319-68711-7_15
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Due to the sophisticated nature of current computer systems, traditional defense measures, such as firewalls, malware scanners, and intrusion detection/prevention systems, have been found inadequate. These technological systems suffer from the fact that a sophisticated attacker can study them, identify their weaknesses, and thus gain an advantage over the defender. To prevent this, proactive cyber defense is a new defense mechanism in which we strategically engage the attacker using cyber deception techniques and influence his actions by creating and reinforcing his view of the computer system. We apply cyber deception techniques in the field of network security and study the impact of the deception on the attacker’s beliefs using the quantitative framework of game theory. We account for the sequential nature of an attack and investigate how the attacker’s belief evolves and influences his actions. We show how the defender should manipulate this belief to prevent the attacker from achieving his goals and thus minimize the damage inflicted on the network. To design a successful defense based on cyber deception, it is crucial to employ strategic thinking and to account explicitly for the attacker’s belief that he is being exposed to deceptive attempts. By doing so, we can make the deception more believable from the perspective of the attacker.
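
    The belief evolution discussed above follows the usual Bayesian update for partially observable models (generic form, our notation): after an action a is taken and the attacker observes o, his belief b over the states of the system becomes

        \[
          b'(s') \;=\; \frac{\sum_{s} b(s)\, \Pr[s', o \mid s, a]}
                            {\sum_{\tilde{s}} \sum_{s} b(s)\, \Pr[\tilde{s}, o \mid s, a]},
        \]

    and deception by design then amounts to choosing defender actions that steer this update toward beliefs favorable to the defender.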

Optimal Strategies for Detecting Data Exfiltration by Internal and External Attackers

  • DOI: 10.1007/978-3-319-68711-7_10
  • Link: https://doi.org/10.1007/978-3-319-68711-7_10
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    We study the problem of detecting data exfiltration in computer networks. We focus on the performance of optimal defense strategies with respect to an attacker’s knowledge about typical network behavior and his ability to influence the standard traffic. Internal attackers know the typical upload behavior of the compromised host and may be able to discontinue standard uploads in favor of the exfiltration. External attackers do not immediately know the behavior of the compromised host, but they can learn it from observations. We model the problem as a sequential game of imperfect information, where the network administrator selects the thresholds for the detector, while the attacker chooses how much data to exfiltrate in each time step. We present novel algorithms for approximating the optimal defense strategies in the form of Stackelberg equilibria. We analyze the scalability of the algorithms and the efficiency of the produced strategies in a case study based on real-world uploads of almost six thousand users to Google Drive. We show that with the computed defense strategies, the attacker exfiltrates 2–3 times less data than with simple heuristics; randomized defense strategies are up to 30% more effective than deterministic ones, and substantially more effective defense strategies are possible if the defense is customized for groups of hosts with similar behavior.
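
    A toy numeric sketch of the threshold interaction above (illustrative numbers only, unrelated to the paper's model or the Google Drive data): the defender draws a per-step upload threshold from a distribution, an alert fires when the observed upload exceeds it, and the attacker picks a per-step exfiltration volume x to maximize the expected amount exfiltrated without an alert.

        import numpy as np


        def expected_undetected(x, thresholds):
            """Expected volume exfiltrated without an alert in a single step,
            when the threshold is drawn from the given empirical sample."""
            thresholds = np.asarray(thresholds)
            return x * np.mean(thresholds >= x)


        if __name__ == "__main__":
            rng = np.random.default_rng(1)
            thresholds = rng.uniform(1.0, 5.0, size=10_000)  # randomized defense
            candidates = np.linspace(0.5, 5.0, 46)
            best = max(candidates, key=lambda x: expected_undetected(x, thresholds))
            print(best, expected_undetected(best, thresholds))  # approx. 2.5, 1.56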

A Point-Based Approximate Algorithm for One-Sided Partially Observable Pursuit-Evasion Games

  • DOI: 10.1007/978-3-319-47413-7_25
  • Link: https://doi.org/10.1007/978-3-319-47413-7_25
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Abstract:
    Pursuit-evasion games model many security problems where an evader is trying to escape a group of pursuing units. We consider a variant with partial observability and simultaneous moves of all units, and assume the worst-case setup, where the evader knows the location of the pursuer’s units, but the pursuer does not know the location of the evader. Recent work has shown that the solution of such games is compactly representable as a collection of finite-dimensional value functions. We extend this result and propose the first practical algorithm for approximating optimal policies in pursuit-evasion games with one-sided partial observability. Our approach extends the point-based updates that exist for POMDPs to one-sided partially observable stochastic games. The experimental evaluation on multiple graphs shows significant improvements over approximate algorithms that operate on finite game trees.
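
    A hedged sketch of the standard POMDP point-based backup that the approach above generalizes (the tiny transition, observation, and reward arrays are made up; this is the single-agent primitive, not the paper's game version):

        import numpy as np


        def point_based_backup(b, Gamma, T, Z, R, gamma):
            """One point-based backup of belief b.

            T[a, s, s'] -- transition probabilities
            Z[a, s', o] -- observation probabilities
            R[a, s]     -- immediate rewards
            Gamma       -- current set of alpha-vectors (arrays over states)
            """
            n_actions, n_states, _ = T.shape
            n_obs = Z.shape[2]
            best_alpha, best_value = None, -np.inf
            for a in range(n_actions):
                alpha_a = np.array(R[a], dtype=float)
                for o in range(n_obs):
                    # For this (a, o) pair, take the alpha-vector that is best
                    # at the corresponding (unnormalized) updated belief.
                    vectors = [T[a] @ (Z[a][:, o] * g) for g in Gamma]
                    alpha_a += gamma * max(vectors, key=lambda v: float(b @ v))
                value = float(b @ alpha_a)
                if value > best_value:
                    best_alpha, best_value = alpha_a, value
            return best_alpha


        if __name__ == "__main__":
            # Toy 2-state, 2-action, 2-observation model (purely illustrative).
            T = np.array([[[0.9, 0.1], [0.2, 0.8]],
                          [[0.5, 0.5], [0.5, 0.5]]])
            Z = np.array([[[0.8, 0.2], [0.3, 0.7]],
                          [[0.5, 0.5], [0.5, 0.5]]])
            R = np.array([[1.0, 0.0], [0.2, 0.2]])
            Gamma = [np.zeros(2)]
            b = np.array([0.6, 0.4])
            for _ in range(30):              # repeated backups at a single belief
                Gamma.append(point_based_backup(b, Gamma, T, Z, R, gamma=0.9))
            print(max(float(b @ g) for g in Gamma))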
