People
Ing. Jiří Němeček
All publications
Bias Detection via Maximum Subgroup Discrepancy
- Authors: Ing. Jiří Němeček, Kozdoba, M., Kryvoviaz, I., doc. Ing. Tomáš Pevný, Ph.D., Mgr. Jakub Mareček, Ph.D.
- Publication: KDD '25: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2. New York: Association for Computing Machinery, 2025. p. 2174-2185. ISBN 979-8-4007-1454-2.
- Year: 2025
- DOI: 10.1145/3711896.3736857
- Link: https://doi.org/10.1145/3711896.3736857
- Department: Artificial Intelligence Center
Abstract:
Bias evaluation is fundamental to trustworthy AI, both in terms of checking data quality and in terms of checking the outputs of AI systems. In testing data quality, for example, one may study the distance of a given dataset, viewed as a distribution, to a given ground-truth reference dataset. However, classical metrics, such as the Total Variation and the Wasserstein distances, are known to have high sample complexities and, therefore, may fail to provide a meaningful distinction in many practical scenarios. In this paper, we propose a new notion of distance, the Maximum Subgroup Discrepancy (MSD). In this metric, two distributions are close if, roughly, discrepancies are low for all feature subgroups. While the number of subgroups may be exponential, we show that the sample complexity is linear in the number of features, thus making it feasible for practical applications. Moreover, we provide a practical algorithm for evaluating the distance based on Mixed-integer optimization (MIO). We also note that the proposed distance is easily interpretable, thus providing clearer paths to fixing the biases once they have been identified. Finally, we describe a natural general bias detection framework, termed MSDD distances, and show that MSD aligns well with this framework. We empirically evaluate MSD by comparing it with other metrics and by demonstrating the above properties of MSD on real-world datasets.
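As a rough illustration of the idea in the abstract above, the sketch below computes a subgroup discrepancy by brute force. It assumes that a subgroup is a conjunction of feature-value conditions and that the discrepancy of a subgroup is the absolute difference between its relative frequencies in the two samples; both are assumptions made for this example, and the function names are hypothetical. The paper evaluates the distance with mixed-integer optimization, not by enumeration, so this is only meant to show what is being maximized.

```python
from itertools import combinations, product


def subgroup_mass(rows, conditions):
    """Fraction of rows that satisfy every (feature_index, value) condition."""
    if not rows:
        return 0.0
    hits = sum(all(row[i] == v for i, v in conditions) for row in rows)
    return hits / len(rows)


def max_subgroup_discrepancy(sample_p, sample_q):
    """Enumerate all conjunction subgroups and return the largest gap found.

    Returns (gap, conditions), where conditions is a tuple of
    (feature_index, value) pairs describing the worst subgroup.
    Exponential in the number of features: for illustration only.
    """
    n_features = len(sample_p[0])
    # Observed values of each feature across both samples.
    values = [
        sorted({row[i] for row in sample_p + sample_q})
        for i in range(n_features)
    ]
    best_gap, best_subgroup = 0.0, ()
    for k in range(1, n_features + 1):
        for feats in combinations(range(n_features), k):
            for vals in product(*(values[i] for i in feats)):
                conditions = tuple(zip(feats, vals))
                gap = abs(
                    subgroup_mass(sample_p, conditions)
                    - subgroup_mass(sample_q, conditions)
                )
                if gap > best_gap:
                    best_gap, best_subgroup = gap, conditions
    return best_gap, best_subgroup


# Two toy samples whose single-feature marginals agree exactly,
# but whose intersectional subgroups do not.
sample_p = [(0, 0), (0, 1), (1, 0), (1, 1)]
sample_q = [(0, 0), (0, 0), (1, 1), (1, 1)]
gap, subgroup = max_subgroup_discrepancy(sample_p, sample_q)
print(gap, subgroup)
```

In this toy case every single-feature subgroup has zero gap, so the difference between the samples only shows up in a two-feature subgroup. The returned conditions name the subgroup directly, which is the kind of interpretability the abstract refers to.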
Generating Likely Counterfactuals Using Sum-Product Networks
- Authors: Ing. Jiří Němeček, doc. Ing. Tomáš Pevný, Ph.D., Mgr. Jakub Mareček, Ph.D.
- Publication: LEARNING REPRESENTATIONS. INTERNATIONAL CONFERENCE. 13TH 2025. (ICLR 2025). International Conference on Learning Representations, 2025. p. 74233-74264. ISBN 9798331320850.
- Year: 2025
- Department: Artificial Intelligence Center
Abstract:
The need to explain decisions made by AI systems is driven by both recent regulation and user demand. The decisions are often explainable only post hoc. In counterfactual explanations, one may ask what constitutes the best counterfactual explanation. Clearly, multiple criteria must be taken into account, although "distance from the sample" is a key criterion. Recent methods that consider the plausibility of a counterfactual seem to sacrifice this original objective. Here, we present a system that provides high-likelihood explanations that are, at the same time, close and sparse. We show that the search for the most likely explanations satisfying many common desiderata for counterfactual explanations can be modeled using Mixed-Integer Optimization (MIO). We use a Sum-Product Network (SPN) to estimate the likelihood of a counterfactual. To achieve that, we propose an MIO formulation of an SPN, which can be of independent interest. The source code with examples is available at https://github.com/Epanemu/LiCE.
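To make the combination of likelihood, closeness, and sparsity in the abstract above concrete, here is a minimal sketch. Everything in it is hypothetical: a tiny hand-written SPN (one sum node over two product nodes with Bernoulli leaves), a toy classifier, and an exhaustive search standing in for the MIO formulation. It does not use or reproduce the LiCE code; it only shows the objective of picking the most likely candidate among those that flip the decision within a sparsity budget.

```python
from itertools import product

# Hypothetical SPN over three binary features: a sum node (mixture weights)
# over two product nodes, each a product of independent Bernoulli leaves.
WEIGHTS = [0.6, 0.4]
LEAF_P1 = [  # P(x_i = 1) for each feature, one row per product node
    [0.9, 0.8, 0.1],
    [0.2, 0.3, 0.7],
]


def spn_likelihood(x):
    """Evaluate the SPN bottom-up for a complete binary assignment x."""
    total = 0.0
    for weight, probs in zip(WEIGHTS, LEAF_P1):
        component = 1.0
        for xi, p in zip(x, probs):
            component *= p if xi == 1 else 1.0 - p  # leaf value
        total += weight * component  # sum node
    return total


def classifier(x):
    """Hypothetical model to explain: outputs 1 when at least two features are 1."""
    return int(sum(x) >= 2)


def likely_counterfactual(factual, max_changes):
    """Most likely input that flips the decision while changing few features.

    Exhaustive search, used here only as a stand-in for an optimization solver.
    """
    target = 1 - classifier(factual)
    best, best_likelihood = None, -1.0
    for cand in product([0, 1], repeat=len(factual)):
        changes = sum(a != b for a, b in zip(cand, factual))
        if changes == 0 or changes > max_changes:
            continue  # sparsity and closeness constraint
        if classifier(cand) != target:
            continue  # validity: the decision must change
        likelihood = spn_likelihood(cand)
        if likelihood > best_likelihood:
            best, best_likelihood = cand, likelihood
    return best, best_likelihood


factual = (0, 0, 1)
counterfactual, likelihood = likely_counterfactual(factual, max_changes=1)
print(counterfactual, likelihood)
```

Both single-feature changes that flip the toy classifier are equally close to the factual input, so the SPN likelihood is what separates them. This mirrors the trade-off described in the abstract, under the stated assumptions.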