Lidé

Ing. Petr Tomášek

Všechny publikace

Solving Partially Observable Stochastic Shortest-Path Games

  • DOI: 10.24963/ijcai.2021/575
  • Odkaz: https://doi.org/10.24963/ijcai.2021/575
  • Pracoviště: Katedra počítačů, Centrum umělé inteligence
  • Anotace:
    We study the two-player zero-sum extension of the partially observable stochastic shortest-path problem where one agent has only partial information about the environment. We formulate this problem as a partially observable stochastic game (POSG): given a set of target states and negative rewards for each transition, the player with imperfect information maximizes the expected undiscounted total reward until a target state is reached. The second player with the perfect information aims for the opposite. We base our formalism on POSGs with one-sided observability (OS-POSGs) and give the following contributions: (1) we introduce a novel heuristic search value iteration algorithm that iteratively solves depth-limited variants of the game, (2) we derive the bound on the depth guaranteeing an arbitrary precision, (3) we propose a novel upper-bound estimation that allows early terminations, and (4) we experimentally evaluate the algorithm on a pursuit-evasion game.

Using One-Sided Partially Observable Stochastic Games for Solving Zero-Sum Security Games with Sequential Attacks

  • DOI: 10.1007/978-3-030-64793-3_21
  • Odkaz: https://doi.org/10.1007/978-3-030-64793-3_21
  • Pracoviště: Katedra počítačů, Centrum umělé inteligence
  • Anotace:
    Security games are a defender-attacker game-theoretic model where the defender determines how to allocate scarce resources to protect valuable targets against the attacker. A majority of existing work has focused on the one-shot game setting in which the attacker only attacks once. However, in many real-world scenarios, the attacker can perform multiple attacks in a sequential manner and leverage observable e_ects of these attacks for better attack decisions in the future. Recent work shows that in order to provide e_ective protection over targets, the defender has to take the prospect of sequential attacks into consideration. The algorithm proposed by existing work to handle sequential attacks, however, can only scale up to two attacks at most. We extend this line of work and focus on developing new scalable algorithms for solving the zero-sum variant of security games.We formulate security games with sequential attacks as a one-sided partially observable stochastic games. We show that the uncertainty about the state in the game can be modeled compactly and we can use variants of heuristic search value iteration algorithm for solving these games. We give two variants of the algorithm { an exact one and a heuristic formulation where the resource reallocation possibilities of the defender are simpli_ed. We experimentally compare these two variants of the algorithm and show that the heuristic variant is typically capable of _nding high-quality strategies while scaling to larger scenarios compared to the exact variant.

Optimizing Honeypot Strategies Against Dynamic Lateral Movement Using Partially Observable Stochastic Games

  • DOI: 10.1016/j.cose.2019.101579
  • Odkaz: https://doi.org/10.1016/j.cose.2019.101579
  • Pracoviště: Katedra počítačů, Centrum umělé inteligence
  • Anotace:
    Partially observable stochastic games (POSGs) are a general game-theoretic model for capturing dynamic interactions where players have partial information. The existing algorithms for solving subclasses of POSGs have theoretical guarantees for converging to approximate optimal strategies, however, their scalability is limited and they cannot be directly used to solve games of realistic sizes. In our problem, the attacker uses lateral movement through the network in order to reach a specific host, while the defender wants to discover the attacker by dynamically reallocating honeypots. We demonstrate that restricting to a specific domain allows us to substantially improve existing algorithms: (1) we formulate a compact representation of uncertainty the defender faces, (2) we exploit the incremental strategy-generation method that over iterations expands the possible actions for players. The experimental evaluation shows that our novel algorithms scale several orders of magnitude better compared to the existing state of the art.

Za stránku zodpovídá: Ing. Mgr. Radovan Suk