People

Ing. Ondřej Kuželka, Ph.D.

All publications

Beyond Graph Neural Networks with Lifted Relational Neural Networks

  • DOI: 10.48448/4eps-hs54
  • Link: https://doi.org/10.48448/4eps-hs54
  • Department: Intelligent Data Analysis
  • Abstract:
    We introduce a declarative differentiable programming framework, based on the language of Lifted Relational Neural Networks, where small parameterized logic programs are used to encode deep relational learning scenarios through the underlying symmetries. When presented with relational data, such as various forms of graphs, the logic program interpreter dynamically unfolds differentiable computational graphs to be used for the program parameter optimization by standard means. Following from the declarative, relational logic-based encoding, this results in a unified representation of a wide range of neural models in the form of compact and elegant learning programs, in contrast to the existing procedural approaches operating directly on the computational graph level. We illustrate how this idea can be used for a concise encoding of existing advanced neural architectures, with the main focus on Graph Neural Networks (GNNs). Importantly, using the framework, we also show how the contemporary GNN models can be easily extended towards higher expressiveness in various ways. In the experiments, we demonstrate correctness and computation efficiency through comparison against specialized GNN frameworks, while shedding some light on the learning performance of the existing GNN models.

Counting and Sampling Models in First-Order Logic

  • Authors: Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2023. p. 7020-7025. ISBN 978-1-956792-03-4.
  • Year: 2023
  • DOI: 10.24963/ijcai.2023/801
  • Link: https://doi.org/10.24963/ijcai.2023/801
  • Department: Intelligent Data Analysis
  • Abstract:
    First-order model counting (FOMC) is the task of counting models of a first-order logic sentence over a given set of domain elements. Its weighted variant, WFOMC, generalizes FOMC by assigning weights to the models and has many applications in statistical relational learning. More than ten years of research by various authors has led to the identification of non-trivial classes of WFOMC problems that can be solved in time polynomial in the number of domain elements. In this paper, we describe recent works on WFOMC and the related problem of weighted first-order model sampling (WFOMS). We also discuss possible applications of WFOMC and WFOMS within statistical relational learning and beyond, e.g., automated solving of problems from enumerative combinatorics and elementary probability theory. Finally, we mention research problems that still need to be tackled in order to make applications of these methods broadly practical.
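For readers unfamiliar with the task, the symmetric WFOMC problem recurring throughout these abstracts is usually defined as follows (a standard formulation sketched here for context, with notation of our own choosing, not copied from the paper):

```latex
\mathrm{WFOMC}(\varphi, n, w, \bar{w})
  \;=\; \sum_{\substack{\mu \,\models\, \varphi \\ \mathrm{dom}(\mu) = \{1,\dots,n\}}}
        \;\prod_{a \in \mathrm{True}(\mu)} w\bigl(\mathrm{pred}(a)\bigr)
        \;\prod_{a \in \mathrm{False}(\mu)} \bar{w}\bigl(\mathrm{pred}(a)\bigr)
```

Here the sum ranges over all models of the sentence φ on the domain {1, ..., n}, and w, w̄ assign a weight to each predicate symbol, applied to the ground atoms that are true and false in the model, respectively. Setting all weights to 1 recovers plain first-order model counting.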

First-Order Context-Specific Likelihood Weighting in Hybrid Probabilistic Logic Programs

  • Authors: Kumar, N., Ing. Ondřej Kuželka, Ph.D., De Raedt, L.
  • Publication: Journal of Artificial Intelligence Research. 2023, 77, 683-735. ISSN 1076-9757.
  • Year: 2023
  • DOI: 10.1613/jair.1.13657
  • Link: https://doi.org/10.1613/jair.1.13657
  • Department: Intelligent Data Analysis
  • Abstract:
    Statistical relational AI and probabilistic logic programming have so far mostly focused on discrete probabilistic models. The reason for this is that one needs to provide constructs to succinctly model the independencies in such models, and also to provide efficient inference. Three types of independencies are important to represent and exploit for scalable inference in hybrid models: conditional independencies, elegantly modeled in Bayesian networks; context-specific independencies, naturally represented by logical rules; and independencies amongst attributes of related objects in relational models, succinctly expressed by combining rules. This paper introduces a hybrid probabilistic logic programming language, DC#, which integrates the syntax and semantic principles of distributional clauses and Bayesian logic programs. It represents all three types of independencies qualitatively. More importantly, we also introduce the scalable inference algorithm FO-CS-LW for DC#. FO-CS-LW is a first-order extension of the context-specific likelihood weighting algorithm (CS-LW), a novel sampling method that exploits conditional independencies and context-specific independencies in ground models. The FO-CS-LW algorithm upgrades CS-LW to the first-order case with unification and combining rules.

Hoeffding–Serfling Inequality for U-Statistics Without Replacement

  • DOI: 10.1007/s10959-022-01169-x
  • Link: https://doi.org/10.1007/s10959-022-01169-x
  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    Concentration inequalities quantify random fluctuations of functions of random variables, typically by bounding the probability that such a function differs from its expected value by more than a certain amount. In this paper we study one particular concentration inequality, the Hoeffding–Serfling inequality for U-statistics of random variables sampled without replacement from a finite set and extend recent results of Bardenet and Maillard (Bernoulli 21(3):1361–1385, 2015) to cover the U-statistics setting.
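For context, the classical Hoeffding inequality for the mean of n independent samples X₁, ..., Xₙ taking values in [a, b] reads (the standard textbook form, not the U-statistics-without-replacement variant proved in the paper):

```latex
\Pr\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i
  - \mathbb{E}\!\left[ \frac{1}{n}\sum_{i=1}^{n} X_i \right] \right| \ge t \right)
  \;\le\; 2 \exp\!\left( - \frac{2 n t^2}{(b-a)^2} \right)
```

The paper's setting replaces the sample mean by a U-statistic and independent sampling by sampling without replacement from a finite set, where a Serfling-style argument yields tighter bounds than naively applying the i.i.d. inequality.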

Lifted Inference with Linear Order Axiom

  • Authors: Ing. Jan Tóth, Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the 37th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2023. p. 12295-12304. ISSN 2374-3468. ISBN 978-1-57735-880-0.
  • Year: 2023
  • DOI: 10.1609/aaai.v37i10.26449
  • Link: https://doi.org/10.1609/aaai.v37i10.26449
  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    We consider the task of weighted first-order model counting (WFOMC) used for probabilistic inference in the area of statistical relational learning. Given a formula φ, domain size n and a pair of weight functions, what is the weighted sum of all models of φ over a domain of size n? It was shown that computing WFOMC of any logical sentence with at most two logical variables can be done in time polynomial in n. However, it was also shown that the task is #P₁-complete once we add the third variable, which inspired the search for extensions of the two-variable fragment that would still permit a running time polynomial in n. One such extension is the two-variable fragment with counting quantifiers. In this paper, we prove that adding a linear order axiom (which forces one of the predicates in φ to introduce a linear ordering of the domain elements in each model of φ) on top of the counting quantifiers still permits a computation time polynomial in the domain size. We present a new dynamic programming-based algorithm which can compute WFOMC with linear order in time polynomial in n, thus proving our primary claim.

Lifted Inference with Tree Axioms

  • DOI: 10.1016/j.artint.2023.103997
  • Link: https://doi.org/10.1016/j.artint.2023.103997
  • Department: Intelligent Data Analysis
  • Abstract:
    We consider the problem of weighted first-order model counting (WFOMC): given a first-order sentence ϕ and a domain size n, determine the weighted sum of models of ϕ over the domain {1, ..., n}. Past work has shown that any sentence using at most two logical variables admits an algorithm for WFOMC that runs in time polynomial in the given domain size [1], [2]. The same property was later also shown to hold for C², the two-variable fragment with counting quantifiers [3]. In this paper, we further expand this result to any C² sentence ϕ with the addition of a tree axiom, stating that some distinguished binary relation in ϕ forms a tree in the graph-theoretic sense.

Lifted Relational Neural Networks: from Graphs to Deep Relational Learning

  • DOI: 10.3233/FAIA230147
  • Link: https://doi.org/10.3233/FAIA230147
  • Department: Intelligent Data Analysis
  • Abstract:
    Lifted Relational Neural Networks (LRNNs) were introduced in 2015 as a framework for combining logic programming with neural networks for efficient learning of latent relational structures, such as various subgraph patterns in molecules. In this chapter, we will briefly re-introduce the framework and explain its current relevance in the context of contemporary Graph Neural Networks (GNNs). Particularly, we will detail how the declarative nature of differentiable logic programming in LRNNs can be used to elegantly capture various GNN variants and generalize to novel, even more expressive, deep relational learning concepts. Additionally, we will briefly demonstrate practical use and computation performance of the framework.

On Discovering Interesting Combinatorial Integer Sequences

  • DOI: 10.24963/ijcai.2023/372
  • Link: https://doi.org/10.24963/ijcai.2023/372
  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    We study the problem of generating interesting integer sequences with a combinatorial interpretation. For this we introduce a two-step approach. In the first step, we generate first-order logic sentences which define some combinatorial objects, e.g., undirected graphs, permutations, matchings, etc. In the second step, we use algorithms for lifted first-order model counting to generate integer sequences that count the objects encoded by the first-order logic formulas generated in the first step. For instance, if the first-order sentence defines permutations then the generated integer sequence is the sequence of factorial numbers n!. We demonstrate that our approach is able to generate interesting new sequences by showing that a non-negligible fraction of the automatically generated sequences can actually be found in the On-Line Encyclopedia of Integer Sequences (OEIS), while generating many other similar sequences which are not present in OEIS and which are potentially interesting. A key technical contribution of our work is a method for generating first-order logic sentences that is able to drastically prune the space of sentences by discarding a large fraction of sentences which would lead to redundant integer sequences.
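The counting step above can be illustrated with a tiny brute-force first-order model counter (our own sketch, not the paper's lifted algorithm, and only feasible for very small domains) for the sentence stating that a binary relation R is a permutation; its model counts over growing domains recover the factorial sequence n!:

```python
from itertools import product

def count_permutation_models(n):
    """Brute-force FOMC: count binary relations R on {0,...,n-1} satisfying
    "every element has exactly one R-image and exactly one R-preimage",
    i.e. the models of the sentence are exactly the permutations of the domain."""
    pairs = [(i, j) for i in range(n) for j in range(n)]
    count = 0
    for bits in product([False, True], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        row_ok = all(sum((i, j) in R for j in range(n)) == 1 for i in range(n))
        col_ok = all(sum((i, j) in R for i in range(n)) == 1 for j in range(n))
        count += row_ok and col_ok
    return count

print([count_permutation_models(n) for n in range(5)])  # [1, 1, 2, 6, 24]
```

The lifted algorithms discussed in the paper compute the same counts in time polynomial in n rather than enumerating all 2^(n²) relations, which is what makes searching over many candidate sentences feasible.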

On Exact Sampling in the Two-Variable Fragment of First-Order Logic

  • Authors: Wang, Y., Pu, J., Wang, Y., Ing. Ondřej Kuželka, Ph.D.
  • Publication: 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). IEEE Xplore, 2023. ISSN 2575-5528. ISBN 979-8-3503-3587-3.
  • Year: 2023
  • DOI: 10.1109/LICS56636.2023.10175742
  • Link: https://doi.org/10.1109/LICS56636.2023.10175742
  • Department: Intelligent Data Analysis
  • Abstract:
    In this paper, we study the sampling problem for first-order logic proposed recently by Wang et al.: how to efficiently sample a model of a given first-order sentence on a finite domain? We extend their result for the universally-quantified subfragment UFO² of two-variable logic FO² to the entire fragment of FO². Specifically, we prove the domain-liftability under sampling of FO², meaning that there exists a sampling algorithm for FO² that runs in time polynomial in the domain size. We then further show that this result continues to hold even in the presence of counting constraints, such as ∀x∃=k y: φ(x, y) and ∃=k x∀y: φ(x, y), for some quantifier-free formula φ(x, y). Our proposed method is constructive, and the resulting sampling algorithms have potential applications in various areas, including the uniform generation of combinatorial structures and sampling in statistical-relational models such as Markov logic networks and probabilistic logic programs.

Automatic Conjecturing of P-Recursions Using Lifted Inference

  • DOI: 10.1007/978-3-030-97454-1_2
  • Link: https://doi.org/10.1007/978-3-030-97454-1_2
  • Department: Intelligent Data Analysis
  • Abstract:
    Recent progress in lifted inference algorithms has made it possible to solve many non-trivial counting tasks from enumerative combinatorics in an automated fashion, by casting them as first-order model counting problems. Algorithms for this problem typically output a single number, which is the number of models of the first-order logic sentence in question on a given domain. However, in the combinatorics setting, we are more interested in obtaining a mathematical formula that holds for any given structure size. In this paper, we show that one can use lifted inference algorithms to conjecture linear recurrences with polynomial coefficients, one such class of formulas of interest.

Domain-Lifted Sampling for Universal Two-Variable Logic and Extensions

  • Authors: Wang, Y., van Bremen, T., Wang, Y., Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the 36th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2022. p. 10070-10079. ISSN 2159-5399. ISBN 978-1-57735-876-3.
  • Year: 2022
  • DOI: 10.1609/aaai.v36i9.21246
  • Link: https://doi.org/10.1609/aaai.v36i9.21246
  • Department: Intelligent Data Analysis
  • Abstract:
    Given a first-order sentence φ and a domain size n, how can one sample a model of φ on the domain {1, ..., n} efficiently as n scales? We consider two variants of this problem: the uniform sampling regime, in which the goal is to sample a model uniformly at random, and the symmetric weighted sampling regime, in which models are weighted according to the number of groundings of each predicate appearing in them. Solutions to this problem have applications to the scalable generation of combinatorial structures, as well as sampling in several statistical-relational models such as Markov logic networks and probabilistic logic programs. In this paper, we identify certain classes of sentences that are domain-liftable under sampling, in the sense that they admit a sampling algorithm that runs in time polynomial in n. In particular, we prove that every sentence of the form ∀x∀y: φ(x, y) for some quantifier-free formula φ(x, y) is domain-liftable under sampling. We then further show that this result continues to hold in the presence of one or more cardinality constraints as well as a single tree axiom constraint.

Graph Generation with Graphon Generative Adversarial Networks

  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    Graphons are limits of converging sequences of graphs with a particularly simple representation: a graphon is simply a symmetric function of two variables on [0, 1]². In this work, we develop an elegant GAN model, called GraphonGAN, which uses graphons implemented by neural networks as generators and graph neural networks as discriminators. We show that GraphonGAN is a decent model for modelling real-world networks. All source code will be available at https://github.com/kongzii/gangraphon.
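The generator side of this idea can be sketched in a few lines: to draw a W-random graph from a graphon W, sample one latent uniform value per vertex and connect each pair independently with probability W(u_i, u_j). This is the standard W-random graph construction; the function and parameter names below are our own:

```python
import random

def sample_w_random_graph(W, n, seed=0):
    """Sample an n-vertex W-random graph from a graphon W: [0,1]^2 -> [0,1].
    Each vertex i gets a latent u_i ~ Uniform[0,1]; the edge {i, j} is
    included independently with probability W(u_i, u_j)."""
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]
    return {(i, j)
            for i in range(n) for j in range(i + 1, n)
            if rng.random() < W(u[i], u[j])}

# The constant graphon W = 1/2 recovers the Erdos-Renyi model G(n, 1/2).
edges = sample_w_random_graph(lambda x, y: 0.5, 10)
```

In GraphonGAN the fixed function W would be replaced by a neural network trained adversarially against a GNN discriminator.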

Hoeffding and Bernstein Inequalities for U-statistics without Replacement

  • DOI: 10.1016/j.spl.2022.109528
  • Link: https://doi.org/10.1016/j.spl.2022.109528
  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    Concentration inequalities quantify random fluctuations of functions of random variables, typically by bounding the probability that such a function differs from its expected value by more than a certain amount. In this paper, we extend Hoeffding’s inequality and Bernstein’s inequality for U-statistics to the setting of sampling without replacement from a finite population.

Learning Distributional Programs for Relational Autocompletion

  • Authors: Kumar, N., Ing. Ondřej Kuželka, Ph.D., De Raedt, L.
  • Publication: Theory and Practice of Logic Programming. 2022, 22(1), 81-114. ISSN 1471-0684.
  • Year: 2022
  • DOI: 10.1017/S1471068421000144
  • Link: https://doi.org/10.1017/S1471068421000144
  • Department: Intelligent Data Analysis
  • Abstract:
    Relational autocompletion is the problem of automatically filling out some missing values in multi-relational data. We tackle this problem within the probabilistic logic programming framework of Distributional Clauses (DCs), which supports both discrete and continuous probability distributions. Within this framework, we introduce DiceML, an approach to learning both the structure and the parameters of DC programs from relational data (with possibly missing data). To realize this, DiceML integrates statistical modeling and DCs with rule learning. The distinguishing features of DiceML are that it (1) tackles autocompletion in relational data, (2) learns DCs extended with statistical models, (3) deals with both discrete and continuous distributions, (4) can exploit background knowledge, and (5) uses an expectation-maximization-based (EM) algorithm to cope with missing data. The empirical results show the promise of the approach, even when there is missing data.

Learning to Generate Molecules From Small Datasets Using Neural Markov Logic Networks

  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    Neural Markov Logic Networks are a statistical relational model capable of generating relational structures. In this paper, we investigate how this particular model behaves in the setup of few-shot learning and show that Neural Markov Logic Networks are able to learn to generate small molecules from a handful of training examples without any pre-training.

Learning with Molecules beyond Graph Neural Networks

  • Department: Intelligent Data Analysis
  • Abstract:
    We demonstrate a deep learning framework which is inherently based in the highly expressive language of relational logic, enabling one to, among other things, capture arbitrarily complex graph structures. We show how Graph Neural Networks and similar models can be easily covered in the framework by specifying the underlying propagation rules in the relational logic. The declarative nature of the used language then allows one to easily modify and extend the existing propagation schemes into more complex structures, such as atom rings in molecules, which we choose for a short demonstration in this work.

Beyond graph neural networks with lifted relational neural networks

  • DOI: 10.1007/s10994-021-06017-3
  • Link: https://doi.org/10.1007/s10994-021-06017-3
  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    We introduce a declarative differentiable programming framework, based on the language of Lifted Relational Neural Networks, where small parameterized logic programs are used to encode deep relational learning scenarios through the underlying symmetries. When presented with relational data, such as various forms of graphs, the logic program interpreter dynamically unfolds differentiable computation graphs to be used for the program parameter optimization by standard means. Following from the declarative, relational logic-based encoding, this results in a unified representation of a wide range of neural models in the form of compact and elegant learning programs, in contrast to the existing procedural approaches operating directly on the computational graph level. We illustrate how this idea can be used for a concise encoding of existing advanced neural architectures, with the main focus on Graph Neural Networks (GNNs). Importantly, using the framework, we also show how the contemporary GNN models can be easily extended towards higher expressiveness in various ways. In the experiments, we demonstrate correctness and computation efficiency through comparison against specialized GNN frameworks, while shedding some light on the learning performance of the existing GNN models.

Context-Specific Likelihood Weighting

  • Authors: Kumar, N., Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of Machine Learning Research, 2021. vol. 130. ISSN 2640-3498.
  • Year: 2021
  • Department: Intelligent Data Analysis
  • Abstract:
    Sampling is a popular method for approximate inference when exact inference is impractical. Generally, sampling algorithms do not exploit context-specific independence (CSI) properties of probability distributions. We introduce context-specific likelihood weighting (CS-LW), a new sampling methodology, which besides exploiting the classical conditional independence properties, also exploits CSI properties. Unlike the standard likelihood weighting, CS-LW is based on partial assignments of random variables and requires fewer samples for convergence due to the sampling variance reduction. Furthermore, the speed of generating samples increases. Our novel notion of contextual assignments theoretically justifies CS-LW. We empirically show that CS-LW is competitive with state-of-the-art algorithms for approximate inference in the presence of a significant amount of CSIs.
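For reference, standard likelihood weighting (the baseline that CS-LW refines) can be illustrated on a toy two-node network A → B; the network and the probabilities below are our own toy example, not taken from the paper:

```python
import random

def likelihood_weighting(n_samples, seed=0):
    """Standard likelihood weighting on the toy network A -> B with
    P(A=1) = 0.5, P(B=1 | A=1) = 0.9, P(B=1 | A=0) = 0.1.
    Estimates P(A=1 | B=1): sample the non-evidence variable A from its
    prior, then weight each sample by the likelihood of the evidence B=1."""
    rng = random.Random(seed)
    p_b1_given_a = {1: 0.9, 0: 0.1}
    total_w = weighted_a1 = 0.0
    for _ in range(n_samples):
        a = 1 if rng.random() < 0.5 else 0   # sample A ~ P(A)
        w = p_b1_given_a[a]                  # weight by P(B=1 | A=a)
        total_w += w
        weighted_a1 += w * a
    return weighted_a1 / total_w

# Exact posterior: P(A=1 | B=1) = 0.45 / (0.45 + 0.05) = 0.9
print(likelihood_weighting(20000))
```

CS-LW additionally works with partial assignments to exploit context-specific independencies, which reduces the sampling variance beyond what this classical scheme achieves.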

Fast Algorithms for Relational Marginal Polytopes

  • Authors: Wang, Y., van Bremen, T., Pu, J., Wang, Y., Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2021. p. 4266-4274. ISSN 1045-0823. ISBN 978-0-9992411-9-6.
  • Year: 2021
  • DOI: 10.24963/ijcai.2021/586
  • Link: https://doi.org/10.24963/ijcai.2021/586
  • Department: Intelligent Data Analysis
  • Abstract:
    We study the problem of constructing the relational marginal polytope (RMP) of a given set of first-order formulas. Past work has shown that the RMP construction problem can be reduced to weighted first-order model counting (WFOMC). However, existing reductions in the literature are intractable in practice, since they typically require an infeasibly large number of calls to a WFOMC oracle. In this paper, we propose an algorithm to construct RMPs using fewer oracle calls. As an application, we also show how to apply this new algorithm to improve an existing approximation scheme for WFOMC. We demonstrate the efficiency of the proposed approaches experimentally, and find that our method provides speed-ups of a full order of magnitude over the baseline for RMP construction.

Faster Lifting for Two-variable Logic Using Cell Graphs

  • Authors: van Bremen, T., Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence. ML Research Press, 2021. p. 1393-1402. ISSN 2640-3498.
  • Year: 2021
  • Department: Intelligent Data Analysis
  • Abstract:
    We consider the weighted first-order model counting (WFOMC) task, a problem with important applications to inference and learning in structured graphical models. Bringing together earlier work [Van den Broeck et al., 2011, 2014], a formal proof was given by Beame et al. [2015] showing that the two-variable fragment of first-order logic, FO^2, is domain-liftable, meaning it admits an algorithm for WFOMC whose runtime is polynomial in the given domain size. However, applying this theoretical upper bound is often impractical for real-world problem instances. We show how to adapt their proof into a fast algorithm for lifted inference in FO^2, using only off-the-shelf tools for knowledge compilation, and several careful optimizations involving the cell graph of the input sentence, a novel construct we define that encodes the interactions between the cells of the sentence. Experimental results show that, despite our approach being largely orthogonal to that of Forclift [Van den Broeck et al., 2011], our algorithm often outperforms it, scaling to larger domain sizes on more complex input sentences.

Lifted Inference with Tree Axioms

  • Authors: van Bremen, T., Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning. International Joint Conferences on Artificial Intelligence Organization, 2021. p. 599-608. ISSN 2334-1033. ISBN 978-1-956792-99-7.
  • Year: 2021
  • DOI: 10.24963/kr.2021/57
  • Link: https://doi.org/10.24963/kr.2021/57
  • Department: Intelligent Data Analysis
  • Abstract:
    We consider the problem of weighted first-order model counting (WFOMC): given a first-order sentence ϕ and domain size n ∈ ℕ, determine the weighted sum of models of ϕ over the domain {1, ..., n}. Past work has shown that any sentence using at most two logical variables admits an algorithm for WFOMC that runs in time polynomial in the given domain size (Van den Broeck 2011; Van den Broeck, Meert, and Darwiche 2014). In this paper, we extend this result to any two-variable sentence ϕ with the addition of a tree axiom, stating that some distinguished binary relation in ϕ forms a tree in the graph-theoretic sense.

Lossless Compression of Structured Convolutional Models via Lifting

  • Department: Intelligent Data Analysis
  • Abstract:
    Lifting is an efficient technique to scale up graphical models generalized to relational domains by exploiting the underlying symmetries. Concurrently, neural models are continuously expanding from grid-like tensor data into structured representations, such as various attributed graphs and relational databases. To address the irregular structure of the data, the models typically extrapolate on the idea of convolution, effectively introducing parameter sharing in their, dynamically unfolded, computation graphs. The computation graphs themselves then reflect the symmetries of the underlying data, similarly to the lifted graphical models. Inspired by lifting, we introduce a simple and efficient technique to detect the symmetries and compress the neural models without loss of any information. We demonstrate through experiments that such compression can lead to significant speedups of structured convolutional models, such as various Graph Neural Networks, across various tasks, such as molecule classification and knowledge-base completion.

Neural Markov Logic Networks

  • Authors: Marra, G., Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence. ML Research Press, 2021. p. 908-917. ISSN 2640-3498.
  • Year: 2021
  • Department: Intelligent Data Analysis
  • Abstract:
    We introduce neural Markov logic networks (NMLNs), a statistical relational learning system that borrows ideas from Markov logic. Like Markov logic networks (MLNs), NMLNs are an exponential-family model for modelling distributions over possible worlds, but unlike MLNs, they do not rely on explicitly specified first-order logic rules. Instead, NMLNs learn an implicit representation of such rules as a neural network that acts as a potential function on fragments of the relational structure. Similarly to many neural symbolic methods, NMLNs can exploit embeddings of constants but, unlike them, NMLNs work well also in their absence. This is extremely important for predicting in settings other than the transductive one. We showcase the potential of NMLNs on knowledge-base completion, triple classification and on generation of molecular (graph) data.

Weighted First-Order Model Counting in the Two-Variable Fragment With Counting Quantifiers

  • Authors: Ing. Ondřej Kuželka, Ph.D.
  • Publication: Journal of Artificial Intelligence Research. 2021, 70, 1281-1307. ISSN 1076-9757.
  • Year: 2021
  • DOI: 10.1613/jair.1.12320
  • Link: https://doi.org/10.1613/jair.1.12320
  • Department: Intelligent Data Analysis
  • Abstract:
    It is known due to the work of Van den Broeck, Meert and Darwiche that weighted first-order model counting (WFOMC) in the two-variable fragment of first-order logic can be solved in time polynomial in the number of domain elements. In this paper we extend this result to the two-variable fragment with counting quantifiers.

Approximate Weighted First-Order Model Counting: Exploiting Fast Approximate Model Counters and Symmetry

  • Authors: van Bremen, T., Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2020. p. 4252-4258. ISBN 978-0-9992411-6-5.
  • Year: 2020
  • DOI: 10.24963/ijcai.2020/587
  • Link: https://doi.org/10.24963/ijcai.2020/587
  • Department: Intelligent Data Analysis
  • Abstract:
    We study the symmetric weighted first-order model counting task and present ApproxWFOMC, a novel anytime method for efficiently bounding the weighted first-order model count of a sentence given an unweighted first-order model counting oracle. The algorithm has applications to inference in a variety of first-order probabilistic representations, such as Markov logic networks and probabilistic logic programs. Crucially for many applications, no assumptions are made on the form of the input sentence. Instead, the algorithm makes use of the symmetry inherent in the problem by imposing cardinality constraints on the number of possible true groundings of a sentence's literals. Realising the first-order model counting oracle in practice using the approximate hashing-based model counter ApproxMC3, we show how our algorithm is competitive with existing approximate and exact techniques for inference in first-order probabilistic models. We additionally provide PAC guarantees on the accuracy of the bounds generated.

Complex Markov Logic Networks: Expressivity and Liftability

  • Authors: Ing. Ondřej Kuželka, Ph.D.
  • Publication: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence. Proceedings of Machine Learning Research, 2020. p. 749-758. ISSN 2640-3498.
  • Year: 2020
  • Department: Intelligent Data Analysis
  • Abstract:
    We study expressivity of Markov logic networks (MLNs). We introduce complex MLNs, which use complex-valued weights, and show that, unlike standard MLNs with real-valued weights, complex MLNs are "fully expressive". We then observe that the discrete Fourier transform can be computed using weighted first-order model counting (WFOMC) with complex weights and use this observation to design an algorithm for computing "relational marginal polytopes" which needs substantially fewer calls to a WFOMC oracle than an existing recent algorithm.

Domain-Liftability of Relational Marginal Polytopes

  • Authors: Ing. Ondřej Kuželka, Ph.D., Wang, Y.
  • Publication: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, 2020. p. 2284-2291. ISSN 2640-3498.
  • Year: 2020
  • Department: Intelligent Data Analysis
  • Abstract:
    We study computational aspects of "relational marginal polytopes" which are statistical relational learning counterparts of marginal polytopes, well-known from probabilistic graphical models. Here, given some first-order logic formula, we can define its relational marginal statistic to be the fraction of groundings that make this formula true in a given possible world. For a list of first-order logic formulas, the relational marginal polytope is the set of all points that correspond to expected values of the relational marginal statistics that are realizable. In this paper we study the following two problems: (i) Do domain-liftability results for the partition functions of Markov logic networks (MLNs) carry over to the problem of relational marginal polytope construction? (ii) Is the relational marginal polytope containment problem hard under some plausible complexity-theoretic assumptions? Our positive results have consequences for lifted weight learning of MLNs. In particular, we show that weight learning of MLNs is domain-liftable whenever the computation of the partition function of the respective MLNs is domain-liftable (this result has not been rigorously proven before).

Learning with Molecules beyond Graph Neural Networks

  • Department: Katedra počítačů, Intelligent Data Analysis
  • Abstract:
    In this paper we demonstrate a deep learning framework which is inherently based in the highly expressive language of relational logic, enabling one to, among other things, capture arbitrarily complex graph structures. We show how GNNs and similar models can be easily covered in the framework by specifying the underlying propagation rules in the relational logic. The declarative nature of the used language then allows one to easily modify and extend the propagation schemes into complex structures, such as the molecular rings which we choose for a short demonstration in this paper.

Lifted Weight Learning of Markov Logic Networks (Revisited One More Time)

  • Pracoviště: Intelligent Data Analysis
  • Anotace:
    We revisit the problem of lifted weight learning of Markov logic networks (MLNs). We show that there is an algorithm for maximum-likelihood learning which runs in time polynomial in the size of the domain, whenever the partition function of the given MLN can be computed in polynomial time. This improves on our recent results, where we showed the same result with an additional dependency of the runtime on a parameter of the training data, called interiority, which measures how “extreme” the given training data are. In this work, we get rid of this dependency. The main new technical ingredients that we exploit are theoretical results obtained recently by Straszak and Vishnoi (Maximum Entropy Distributions: Bit Complexity and Stability, COLT 2019).

STRiKE: Rule-Driven Relational Learning Using Stratified k-Entailment

  • Autoři: Ing. Martin Svatoš, Schockaert, S., Davis, J., Ing. Ondřej Kuželka, Ph.D.,
  • Publikace: The proceedings of the 24th European Conference on Artificial Intelligence (ECAI 2020). Oxford: IOS Press, 2020. p. 1515-1522. vol. 325. ISSN 0922-6389. ISBN 978-1-64368-100-9.
  • Rok: 2020
  • DOI: 10.3233/FAIA200259
  • Odkaz: https://doi.org/10.3233/FAIA200259
  • Pracoviště: Katedra počítačů, Intelligent Data Analysis
  • Anotace:
    Relational learning for knowledge base completion has been receiving considerable attention. Intuitively, rule-based strategies are clearly appealing, given their transparency and their ability to capture complex relational dependencies. In practice, however, pure rule-based strategies are currently not competitive with state-of-the-art methods, which is a reflection of the fact that (i) learning high-quality rules is challenging, and (ii) classical entailment is too brittle to cope with the noisy nature of the learned rules and the given knowledge base. In this paper, we introduce STRiKE, a new approach for relational learning in knowledge bases which addresses these concerns. Our contribution is three-fold. First, we introduce a new method for learning stratified rule bases from relational data. Second, to use these rules in a noise-tolerant way, we propose a strategy which extends k-entailment, a recently introduced cautious entailment relation, to stratified rule bases. Finally, we introduce an efficient algorithm for reasoning based on k-entailment.

Lifted Weight Learning of Markov Logic Networks Revisited

  • Pracoviště: Intelligent Data Analysis
  • Anotace:
    We study lifted weight learning of Markov logic networks. We show that there is an algorithm for maximum-likelihood learning of 2-variable Markov logic networks which runs in time polynomial in the domain size. Our results are based on existing lifted-inference algorithms and recent algorithmic results on computing maximum entropy distributions.

Markov Logic Networks for Knowledge Base Completion: A Theoretical Analysis Under the MCAR Assumption

  • Autoři: Ing. Ondřej Kuželka, Ph.D., Davis, J.
  • Publikace: Proceedings of the Conference on Uncertainty in Artificial Intelligence. Tel-Aviv: Tel-Aviv University, 2019. ISSN 2640-3498.
  • Rok: 2019
  • Pracoviště: Intelligent Data Analysis
  • Anotace:
    We study the following question. We are given a knowledge base in which some facts are missing. We learn the weights of a Markov logic network using maximum likelihood estimation on this knowledge base and then use the learned Markov logic network to predict the missing facts. Assuming that the facts are missing independently and with the same probability, can we say that this approach is consistent in some precise sense? This is a non-trivial question because we are learning from only one training example. In this paper we show that the answer to this question is positive.
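The MCAR ("missing completely at random") setting from the abstract, where each fact of a complete knowledge base is dropped independently with the same probability, can be simulated with a short sketch (the function name and toy facts are hypothetical, for illustration only):

```python
import random

def apply_mcar(facts, p_missing, seed=0):
    """Drop each fact independently with probability p_missing (MCAR)."""
    rng = random.Random(seed)
    return {f for f in facts if rng.random() >= p_missing}

# A complete toy knowledge base of 1000 ground facts.
complete_kb = {("smokes", i) for i in range(1000)}
observed_kb = apply_mcar(complete_kb, p_missing=0.2)

# Roughly 80% of the facts survive; the learner only ever sees observed_kb.
print(len(observed_kb) / len(complete_kb))
```

The paper's consistency question is whether weights learned by maximum likelihood on `observed_kb` alone let the MLN recover the dropped facts as the domain grows.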

Lifted Relational Neural Networks: Efficient Learning of Latent Relational Structures

  • Pracoviště: Katedra počítačů, Intelligent Data Analysis
  • Anotace:
    We propose a method to combine the interpretability and expressive power of first-order logic with the effectiveness of neural network learning. In particular, we introduce a lifted framework in which first-order rules are used to describe the structure of a given problem setting. These rules are then used as a template for constructing a number of neural networks, one for each training and testing example. As the different networks corresponding to different examples share their weights, these weights can be efficiently learned using stochastic gradient descent. Our framework provides a flexible way for implementing and combining a wide variety of modelling constructs. In particular, the use of first-order logic allows for a declarative specification of latent relational structures, which can then be efficiently discovered in a given data set using neural network learning. Experiments on 78 relational learning benchmarks clearly demonstrate the effectiveness of the framework.

Pruning Hypothesis Spaces Using Learned Domain Theories

  • DOI: 10.1007/978-3-319-78090-0_11
  • Odkaz: https://doi.org/10.1007/978-3-319-78090-0_11
  • Pracoviště: Katedra počítačů, Intelligent Data Analysis
  • Anotace:
    We present a method to prune hypothesis spaces in the context of inductive logic programming. The main strategy of our method consists in removing hypotheses that are equivalent to already considered hypotheses. The distinguishing feature of our method is that we use learned domain theories to check for equivalence, in contrast to existing approaches which only prune isomorphic hypotheses. Specifically, we use such learned domain theories to saturate hypotheses and then check if these saturations are isomorphic. While conceptually simple, we experimentally show that the resulting pruning strategy can be surprisingly effective in reducing both computation time and memory consumption when searching for long clauses, compared to approaches that only consider isomorphism.

Stacked Structure Learning for Lifted Relational Neural Networks

  • DOI: 10.1007/978-3-319-78090-0_10
  • Odkaz: https://doi.org/10.1007/978-3-319-78090-0_10
  • Pracoviště: Katedra počítačů, Intelligent Data Analysis
  • Anotace:
    Lifted Relational Neural Networks (LRNNs) describe relational domains using weighted first-order rules which act as templates for constructing feed-forward neural networks. While previous work has shown that using LRNNs can lead to state-of-the-art results in various ILP tasks, these results depended on hand-crafted rules. In this paper, we extend the framework of LRNNs with structure learning, thus enabling a fully automated learning process. Similarly to many ILP methods, our structure learning algorithm proceeds in an iterative fashion by top-down searching through the hypothesis space of all possible Horn clauses, considering the predicates that occur in the training examples as well as invented soft concepts entailed by the best weighted rules found so far. In the experiments, we demonstrate the ability to automatically induce useful hierarchical soft concepts leading to deep LRNNs with a competitive predictive power.

Learning Predictive Categories Using Lifted Relational Neural Networks

  • DOI: 10.1007/978-3-319-63342-8_9
  • Odkaz: https://doi.org/10.1007/978-3-319-63342-8_9
  • Pracoviště: Katedra počítačů, Intelligent Data Analysis
  • Anotace:
    Lifted relational neural networks (LRNNs) are a flexible neural-symbolic framework based on the idea of lifted modelling. In this paper we show how LRNNs can be easily used to declaratively specify and solve a learning problem in which latent categories of entities and properties need to be jointly induced.

Lifted Relational Neural Networks

  • Pracoviště: Katedra počítačů
  • Anotace:
    We propose a method combining relational-logic representations with neural network learning. A general lifted architecture, possibly reflecting some background domain knowledge, is described through relational rules which may be handcrafted or learned. The relational rule-set serves as a template for unfolding possibly deep neural networks whose structures also reflect the structures of given training or testing relational examples. Different networks corresponding to different examples share their weights, which co-evolve during training by the stochastic gradient descent algorithm. The framework allows for hierarchical relational modeling constructs and learning of latent relational concepts through shared hidden-layer weights corresponding to the rules. Discovery of notable relational concepts and experiments on 78 relational learning benchmarks demonstrate favorable performance of the method.

Polynomial and Extensible Solutions in Lock-Chart Solving

  • DOI: 10.1080/08839514.2017.1279110
  • Odkaz: https://doi.org/10.1080/08839514.2017.1279110
  • Pracoviště: Katedra počítačů
  • Anotace:
    Lock-chart (also known as master-key system) solving is the process of designing mechanical keys and locks so that every key can open and be blocked in a user-defined set of locks. We present several algorithms resulting from our three-year industrial collaboration project, chosen for their asymptotic time complexity and for extensibility, i.e. the ability to add new keys and locks and to satisfy previously unforeseen blocking requirements. Our strongest result is a linear-time algorithm to find the largest diagonal lock-chart under a condition on the mechanical properties of keys. Without these restrictions, but given a solution for master keys, the problem is translated into a maximum independent set problem and a greedy approximation is proposed. We evaluate the algorithms on a generated dataset, using a non-backtracking procedure to simulate minimal extensions. Both approaches are suggested as heuristics or subroutines for a backtracking constraint satisfaction problem (CSP) based solver.
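The greedy approximation for the maximum independent set subproblem mentioned in the abstract can be sketched generically; the classic minimum-degree greedy heuristic below is an illustrative stand-in, not necessarily the exact subroutine used in the paper:

```python
def greedy_independent_set(adj):
    """Greedy independent-set approximation: repeatedly pick a vertex of
    minimum remaining degree and delete it together with its neighbours.

    `adj` maps each vertex to the set of its neighbours.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    chosen = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))  # minimum remaining degree
        chosen.append(v)
        removed = adj[v] | {v}
        for u in removed:
            adj.pop(u, None)
        for ns in adj.values():
            ns -= removed                        # drop edges to removed vertices
    return chosen

# 4-cycle a-b-c-d: an optimal independent set has size 2, e.g. {a, c}.
graph = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(sorted(greedy_independent_set(graph)))
```

In the lock-chart setting, vertices would stand for candidate key cuttings and edges for conflicting blocking requirements; the chosen set corresponds to mutually compatible keys.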

Learning to detect network intrusion from a few labeled events and background traffic

  • DOI: 10.1007/978-3-319-20034-7_9
  • Odkaz: https://doi.org/10.1007/978-3-319-20034-7_9
  • Pracoviště: Katedra počítačů
  • Anotace:
    Intrusion detection systems (IDS) analyse network traffic data with the goal of revealing malicious activities and incidents. A general problem with learning within this domain is a lack of relevant ground truth data, i.e. real attacks capturing malicious behaviors in their full variety. Most existing solutions thus rely, to a certain extent, on rules designed by network domain experts. Although there are advantages to the use of rules, they lack the basic ability to adapt to traffic data. We therefore propose an ensemble tree bagging classifier capable of learning from an extremely small number of true attack representatives, and demonstrate that, by incorporating general background traffic, we are able to generalize from those few representatives and achieve results competitive with the expert-designed rules used in the existing IDS Camnep.

Novel Gene Sets Improve Set-level Classification of Prokaryotic Gene Expression Data

  • DOI: 10.1186/s12859-015-0786-7
  • Odkaz: https://doi.org/10.1186/s12859-015-0786-7
  • Pracoviště: Katedra počítačů
  • Anotace:
    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable learning more accurate classifiers.
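The set-level conversion described in the abstract, from per-gene features to per-gene-set features, can be sketched as simple column aggregation (hypothetical data and names; the paper's actual aggregation scheme may differ):

```python
import numpy as np

def set_level_features(X, gene_names, gene_sets):
    """Convert a samples-by-genes matrix into samples-by-gene-set features
    by averaging the expression of each set's member genes."""
    idx = {g: j for j, g in enumerate(gene_names)}
    cols = []
    for members in gene_sets.values():
        member_idx = [idx[g] for g in members if g in idx]
        cols.append(X[:, member_idx].mean(axis=1))
    return np.column_stack(cols)

# Two samples, four genes, two (toy) regulatory gene sets.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [4.0, 3.0, 2.0, 1.0]])
genes = ["gA", "gB", "gC", "gD"]
sets = {"regulon1": ["gA", "gB"], "regulon2": ["gC", "gD"]}
print(set_level_features(X, genes, sets))  # 2 samples x 2 set-level features
```

A standard classifier is then trained on the lower-dimensional set-level matrix instead of the raw gene-level one.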

A Method for Reduction of Examples in Relational Learning

Predicting Top-k Trends on Twitter using Graphlets and Time Features

  • Pracoviště: Katedra počítačů
  • Anotace:
    We introduce a novel method for predicting trending keywords on Twitter. This new method exploits topology of the studied parts of the social network. It is based on a combination of graphlet spectra and so-called time features. We show experimentally that using graphlets and time features is beneficial for the accuracy of prediction.

Bounded Least General Generalization

  • Autoři: Ing. Ondřej Kuželka, Ph.D., Szabóová, A., prof. Ing. Filip Železný, Ph.D.,
  • Publikace: Inductive Logic Programming, 22nd International Conference, ILP 2012, Dubrovnik, Croatia, September 17-19, 2012, Revised Selected Papers. Berlin: Springer, 2013, pp. 116-129. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-38811-8. Available from: http://link.springer.com/chapter/10.1007/978-3-642-38812-5_9
  • Rok: 2013
  • DOI: 10.1007/978-3-642-38812-5_9
  • Odkaz: https://doi.org/10.1007/978-3-642-38812-5_9
  • Pracoviště: Katedra počítačů
  • Anotace:
    We study a generalization of Plotkin’s least general generalization. We introduce a novel concept called bounded least general generalization w.r.t. a set of clauses and show an instance of it for which polynomial-time reduction procedures exist. We demonstrate the practical utility of our approach in experiments on several relational learning datasets.

Formulating the template ILP consistency problem as a constraint satisfaction problem

Reducing Examples in Relational Learning with Bounded-Treewidth Hypotheses

  • Autoři: Ing. Ondřej Kuželka, Ph.D., Szabóová, A., prof. Ing. Filip Železný, Ph.D.,
  • Publikace: Proceedings of the Workshop on New Frontiers In Mining Complex Patterns. Naples: ICAR - National Research Council, 2013, pp. 17-32. ISSN 0302-9743. ISBN 978-3-642-37381-7. Available from: http://www.di.uniba.it/~ceci/micFiles/papers/NFMCP.pdf
  • Rok: 2013
  • DOI: 10.1007/978-3-642-37382-4_2
  • Odkaz: https://doi.org/10.1007/978-3-642-37382-4_2
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We study reducibility of learning examples in the learning from entailment setting. We start with an existing reduction method and improve it for the case when learned hypotheses are restricted to have bounded treewidth. We show that in such cases there is a polynomial-time reduction algorithm which reduces the learning examples at least as much as the existing exponential-time algorithm despite the fact that the examples which are reduced can have arbitrarily high treewidth.

Constraint Satisfaction for Learning Hypotheses in Inductive Logic Programming

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    The chapter is an introductory overview of constraint-satisfaction techniques for learning hypotheses in inductive logic programming.

Extending the Ball-Histogram Method with Continuous Distributions and an Application to Prediction of DNA-Binding Proteins

  • DOI: 10.1109/BIBM.2012.6392699
  • Odkaz: https://doi.org/10.1109/BIBM.2012.6392699
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We introduce a novel method for prediction of DNA-binding propensity of proteins which extends our recently introduced ball-histogram method (Szabóová et al. 2012). Unlike the original ball-histogram method, it allows handling of continuous properties of protein regions. In experiments on four datasets of proteins, we show that the method improves upon the original ball-histogram method as well as other existing methods in terms of predictive accuracy.

Prediction of Antimicrobial Activity of Peptides using Relational Machine Learning

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We apply relational machine learning techniques to predict the antimicrobial activity of peptides. We follow our successful strategy (Szabóová et al., MLSB 2010) for predicting DNA-binding propensity of proteins from structural features. We exploit structure prediction methods to obtain the peptides' spatial structures, and then construct structural relational features. We use these relational features as attributes in a regression model. We apply this methodology to antimicrobial activity prediction of peptides, achieving better predictive accuracies than a state-of-the-art approach.

Prediction of DNA-Binding Propensity of Proteins by the Ball-Histogram Method Using Automatic Template Search

  • DOI: 10.1186/1471-2105-13-S10-S3
  • Odkaz: https://doi.org/10.1186/1471-2105-13-S10-S3
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We contribute a novel, ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.

Prediction of DNA-Binding Proteins from Relational Features

  • DOI: 10.1186/1477-5956-10-66
  • Odkaz: https://doi.org/10.1186/1477-5956-10-66
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    The process of protein-DNA binding has an essential role in the biological processing of genetic information. We use relational machine learning to predict DNA-binding propensity of proteins from their structures. Automatically discovered structural features are able to capture some characteristic spatial configurations of amino acids in proteins. Prediction based only on structural relational features already achieves competitive results to existing methods based on physicochemical properties on several protein datasets. Predictive performance is further improved when structural features are combined with physicochemical features. Moreover, the structural features provide some insights not revealed by physicochemical features. Our method is able to detect common spatial substructures. We demonstrate this in experiments with zinc finger proteins.

Relational Learning with Polynomials

  • DOI: 10.1109/ICTAI.2012.163
  • Odkaz: https://doi.org/10.1109/ICTAI.2012.163
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We describe a conceptually simple framework for transformation-based learning in hybrid relational domains. The proposed approach is related to hybrid Markov logic and to the Gaussian logic framework. We evaluate the approach in three domains and show that it can achieve state-of-the-art performance while using only a limited amount of information.

An Experimental Evaluation of Lifted Gene Sets

  • Autoři: Ing. Ondřej Kuželka, Ph.D., prof. Ing. Filip Železný, Ph.D.,
  • Publikace: Workshop on Mining Complex Entities from Network and Biomedical Data. Aldo Moro: Department of Computer Science (DIB) University of Bari, 2011, pp. 14-23. Available from: http://www.di.uniba.it/~loglisci/MIND2011/ECML_PKDD_MIND_2011_Proceedings.pdf
  • Rok: 2011
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We introduce so-called lifted gene sets which are relational descriptions of sets of genes. We show that lifted gene sets are, to some extent, able to capture common structures in correlation matrices. In experiments with real-life microarray data, we show that, in fact, lifted gene sets can provide slightly better estimates of correlation than ground gene sets.

Block-Wise Construction of Tree-like Relational Features with Monotone Reducibility and Redundancy

Gaussian Logic and Its Applications in Bioinformatics

  • DOI: 10.1145/2147805.2147881
  • Odkaz: https://doi.org/10.1145/2147805.2147881
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We describe a novel statistical relational learning framework capable of working efficiently with combinations of relational and numerical data, which is especially valuable in bioinformatics applications. We show how this model can be applied to modelling of gene expression data and to problems from proteomics.

Gaussian Logic for Predictive Classification

  • Autoři: Ing. Ondřej Kuželka, Ph.D., Szabóová, A., Holec, M., prof. Ing. Filip Železný, Ph.D.,
  • Publikace: Machine Learning and Knowledge Discovery in Databases. Berlin: Springer, 2011, pp. 277-292. ISSN 0302-9743. ISBN 978-3-642-23782-9. Available from: http://www.springerlink.com/content/527048g350795uh0
  • Rok: 2011
  • DOI: 10.1007/978-3-642-23783-6_18
  • Odkaz: https://doi.org/10.1007/978-3-642-23783-6_18
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We describe a statistical relational learning framework called Gaussian Logic, capable of working efficiently with combinations of relational and numerical data. The framework assumes that, for a fixed relational structure, the numerical data can be modelled by a multivariate normal distribution. We demonstrate how the Gaussian Logic framework can be applied to predictive classification problems. In experiments, we first show an application of the framework for the prediction of DNA-binding propensity of proteins. Next, we show how the Gaussian Logic framework can be used to find motifs describing highly correlated gene groups in gene-expression data which are then used in a set-level-based classification method.

Gaussian Logic for Proteomics and Genomics

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We describe a statistical relational learning framework called Gaussian Logic, capable of working efficiently with combinations of relational and numerical data. The framework assumes that, for a fixed relational structure, the numerical data can be modelled by a multivariate normal distribution. We show how the Gaussian Logic framework can be used to predict DNA-binding propensity of proteins and to find motifs describing novel gene sets which are then used in set-level classification of gene expression samples.

Prediction of DNA-Binding Propensity of Proteins by the Ball Histogram Method

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We contribute a novel, ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing the charged patches of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of charged amino acids, capturing joint probabilities of specified amino acids occurring at certain distances from each other. This exploration yields a model for the prediction of DNA binding propensity. We validate our method in prediction experiments, achieving favorable accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.
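A minimal sketch of the Monte-Carlo exploration described in the abstract, with hypothetical toy data and a deliberately simplified notion of "charged" residues (only ARG/LYS counted), might look as follows:

```python
import random
from collections import Counter

def ball_histogram(atoms, radius, n_samples=1000, seed=1):
    """Monte-Carlo ball histogram: sample random atom-centred balls and
    count, for each ball, how many charged residues fall inside it.

    `atoms` is a list of (x, y, z, label) tuples; treating only ARG/LYS
    as charged is a simplification for illustration.
    """
    rng = random.Random(seed)
    hist = Counter()
    for _ in range(n_samples):
        cx, cy, cz, _ = rng.choice(atoms)        # centre the ball on a random atom
        inside = sum(
            1 for x, y, z, lab in atoms
            if lab in ("ARG", "LYS")
            and (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2
        )
        hist[inside] += 1
    return hist

# Three toy residues: two charged ones close together, one neutral far away.
toy = [(0, 0, 0, "ARG"), (1, 0, 0, "LYS"), (5, 5, 5, "ALA")]
print(ball_histogram(toy, radius=2.0, n_samples=100))
```

The resulting histogram over per-ball counts then serves as the feature vector fed to a classifier for DNA-binding propensity.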

Mining Frequent Spatial Docking Patterns in Zinc Finger DNA Complexes

  • Autoři: Szabóová, A., Ing. Ondřej Kuželka, Ph.D., prof. Ing. Filip Železný, Ph.D., Tolar, J.
  • Publikace: Proceedings of the 7th European Symposium on Biomedical Engineering. University of Patras, 2010, pp. !!!. ISBN 978-960-85715-5-6. Available from: http://www.medicon2010.org/MEDICON_2010_20100413_SK.htm
  • Rok: 2010
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We use techniques of logic-based relational machine learning to automatically detect spatial patterns common to 21 previously described examples of zinc finger DNA complexes. We demonstrate that such patterns can be found, and thus the proposed methodology may serve to achieve a better understanding of zinc finger DNA binding.

Prediction of DNA Binding Proteins from Structural Features

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We use logic-based machine learning to distinguish DNA-binding proteins from non-binding proteins. We combine previously suggested coarse-grained features (such as the dipole moment) with automatically constructed structural (spatial) features. Prediction based only on structural features already improves on the state-of-the-art predictive accuracies achieved in previous work with coarse-grained features. Accuracies are further improved when the combination of both feature categories is used. An important factor contributing to accurate prediction is that the structural features are not Boolean but are instead interpreted by counting the number of their occurrences in a learning example.

Seeing the World through Homomorphism: An Experimental Study on Reducibility of Examples

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We study reducibility of examples in several typical inductive logic programming benchmarks. The notion of reducibility that we use is related to theta-reduction, commonly used to reduce hypotheses in ILP. Whereas examples are usually not reducible on their own, they often become implicitly reducible when the language for constructing hypotheses is fixed. We show that the number of ground facts in a dataset can be almost halved for some real-world molecular datasets. Furthermore, we study the impact this has on the popular ILP system Aleph.

Shrinking Covariance Matrices using Biological Background Knowledge

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We propose a novel method for covariance matrix estimation based on shrinkage with a target inferred from biological background knowledge using methods of inductive logic programming. We show that our novel method improves on the state of the art when sample sets are small and some background knowledge expressed in a subset of first-order logic is available. As we show in the experiments with genetic data, this background knowledge can be even only indirectly relevant to the modeling problem at hand.
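The shrinkage idea in the abstract follows the standard linear form Sigma = lam * T + (1 - lam) * S, where S is the noisy sample covariance and T the shrinkage target (inferred from biological background knowledge in the paper; a plain diagonal target in this hypothetical sketch):

```python
import numpy as np

def shrink_covariance(X, target=None, lam=0.3):
    """Linear shrinkage of the sample covariance S towards a target T:
    Sigma = lam * T + (1 - lam) * S.  Defaults to a diagonal target,
    i.e. off-diagonal entries are shrunk towards zero."""
    S = np.cov(X, rowvar=False)
    if target is None:
        target = np.diag(np.diag(S))
    return lam * target + (1.0 - lam) * S

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))           # small sample: S is a noisy estimate
Sigma = shrink_covariance(X, lam=0.5)

# With a diagonal target, the variances are untouched and only the
# off-diagonal covariances are damped.
print(np.allclose(np.diag(Sigma), np.diag(np.cov(X, rowvar=False))))  # True
```

The paper's contribution lies in choosing T: instead of a generic diagonal target, the target structure is inferred from first-order background knowledge via inductive logic programming.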

Taming the Complexity of Inductive Logic Programming (invited talk)

Using Constraint Satisfaction for Learning Hypotheses in Inductive Logic Programming

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Inductive logic programming (ILP) is a subfield of machine learning which uses first-order logic as a uniform representation for examples, background knowledge and hypotheses (Muggleton and De Raedt, 1994). In this paper we deal with the so-called template consistency problem, which is one of the essential tasks in ILP (Gottlob et al. 1999). In particular, given learning examples and a template T, we look for a substitution s making Ts consistent with the examples, using methods from the field of constraint satisfaction.

Block-Wise Construction of Acyclic Relational Features with Monotone Irreducibility and Relevancy Properties

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We describe an algorithm for constructing a set of acyclic conjunctive relational features by combining smaller conjunctive blocks. Unlike traditional level-wise approaches which preserve the monotonicity of frequency, our block-wise approach preserves a form of monotonicity of the irreducibility and relevancy feature properties, which are important in propositionalization employed in the context of classification learning.

A Restarted Strategy for Efficient Subsumption Testing

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We design a randomized restarted subsumption testing algorithm ReSumEr2 and test it on generated graph data as well as on the predictive toxicology challenge (PTC) data set.

Fast Estimation of First Order Clause Coverage through Randomization and Maximum Likelihood

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    In relational learning, θ-subsumption and OI-subsumption are widely used coverage tests. Unfortunately, testing both θ-subsumption and OI-subsumption is NP-complete, which is one of the reasons for the poor performance of most relational learners. In this paper, we aim at a possible exploitation of randomized restarted search strategies for coverage estimation. We devise an algorithm that can, under a distributional assumption, estimate coverage between a given hypothesis and a set of examples without having to decide the subsumption test for all examples. We implement this algorithm in the programs ReCovEr and ReCovErOI. On artificial graph data and real-world datasets, we show that these algorithms can provide reasonably accurate estimates while achieving favorable runtimes compared to state-of-the-art algorithms.

HiFi: Tractable Propositionalization through Hierarchical Feature Construction

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We present a novel propositionalization algorithm HiFi based on constructing first order features with hierarchical structure.

Learning to Plan and Planning to Learn via Merging Relational Machine Learning with Constraint Satisfaction (extended abstract)

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    The purpose of this submission is to present our newly started research project entitled "LeCoS: Meeting Machine Learning with Constraint Satisfaction", concerned with exploring how to exploit relational machine learning (primarily inductive logic programming, ILP) to solve constraint satisfaction problems (primarily planning tasks), and vice versa.

Za stránku zodpovídá: Ing. Mgr. Radovan Suk