People

Dr. Vyacheslav Kungurtsev, Ph.D.

All publications

Time-Varying Semidefinite Programming: Path Following a Burer–Monteiro Factorization

  • DOI: 10.1137/22M1529762
  • Link: https://doi.org/10.1137/22M1529762
  • Department: Katedra počítačů, Centrum umělé inteligence, Intelligent Data Analysis
  • Abstract:
    We present an online algorithm for time-varying semidefinite programs (TV-SDPs), based on the tracking of the solution trajectory of a low-rank matrix factorization, also known as the Burer–Monteiro factorization, in a path-following procedure. There, a predictor-corrector algorithm solves a sequence of linearized systems. This requires the introduction of a horizontal-space constraint to ensure the local injectivity of the low-rank factorization. The method produces a sequence of approximate solutions for the original TV-SDP problem, which we show stay close to the optimal solution path if properly initialized. Numerical experiments for a time-varying max-cut SDP relaxation demonstrate the computational advantages of the proposed method for tracking TV-SDPs in terms of runtime compared to off-the-shelf interior-point methods.
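  • Background sketch: in generic notation (illustrative only; the paper's exact time-varying formulation and the horizontal-space constraint are as described above), the Burer–Monteiro substitution X = YY^T turns the conic problem into a low-dimensional nonconvex one:
    \min_{X \succeq 0} \; \langle C(t), X \rangle \;\; \text{s.t.} \;\; \mathcal{A}(t)[X] = b(t)
    \qquad \longrightarrow \qquad
    \min_{Y \in \mathbb{R}^{n \times r}} \; \langle C(t), Y Y^{\top} \rangle \;\; \text{s.t.} \;\; \mathcal{A}(t)[Y Y^{\top}] = b(t),
    where r bounds the rank of the sought solution; the factorized problem carries no semidefinite constraint, which is what makes fast path following possible.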

A Sensitivity Assisted Alternating Directions Method of Multipliers for Distributed Optimization

  • Authors: Krishnamoorthy, D., Dr. Vyacheslav Kungurtsev, Ph.D.
  • Publication: Proceedings of the 61st IEEE Conference on Decision and Control. Piscataway: IEEE, 2022. p. 295-300. ISSN 2576-2370. ISBN 978-1-6654-6761-2.
  • Year: 2022
  • DOI: 10.1109/CDC51059.2022.9993352
  • Link: https://doi.org/10.1109/CDC51059.2022.9993352
  • Department: Intelligent Data Analysis
  • Abstract:
    Alternating Directions Method of Multipliers (ADMM) is a form of decomposition-coordination method that typically requires several iterations/communication rounds between the subproblems and the master problem to converge. Repeatedly solving the subproblems over several iterations adds to the total computation time. Noting that the subproblems solved from one iteration to the next differ only by a few variables, this paper proposes a novel sensitivity-assisted ADMM framework for nonlinear programming (NLP) problems, where the subproblems are cheaply approximated using the parametric sensitivities. By exploiting the parametric sensitivities, the computation of the subproblems can be reduced to a single linear solve instead of solving the full NLP problem, thereby reducing the overall computation cost. Different algorithmic variations are discussed and demonstrated using two numerical examples.
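  • Background sketch: the standard scaled-form ADMM iteration for min_{x,z} f(x) + g(z) subject to Ax + Bz = c is shown below in generic notation (illustrative only); the contribution above is to replace the exact x-subproblem solves in later iterations by a single sensitivity-based linear solve:
    x^{k+1} = \arg\min_{x} \; f(x) + \tfrac{\rho}{2}\,\lVert Ax + Bz^{k} - c + u^{k} \rVert^{2}
    z^{k+1} = \arg\min_{z} \; g(z) + \tfrac{\rho}{2}\,\lVert Ax^{k+1} + Bz - c + u^{k} \rVert^{2}
    u^{k+1} = u^{k} + Ax^{k+1} + Bz^{k+1} - c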

A Stochastic Levenberg--Marquardt Method Using Random Models with Complexity Results

  • Authors: Bergou, E.H., Diouane, Y., Dr. Vyacheslav Kungurtsev, Ph.D., Royer, C.W.
  • Publication: SIAM/ASA Journal on Uncertainty Quantification. 2022, 10(1), 507-536. ISSN 2166-2525.
  • Year: 2022
  • DOI: 10.1137/20M1366253
  • Link: https://doi.org/10.1137/20M1366253
  • Department: Intelligent Data Analysis
  • Abstract:
    Globally convergent variants of the Gauss--Newton algorithm are often the methods of choice to tackle nonlinear least-squares problems. Among such frameworks, Levenberg--Marquardt and trust-region methods are two well-established, similar paradigms. Both schemes have been studied when the Gauss--Newton model is replaced by a random model that is only accurate with a given probability. Trust-region schemes have also been applied to problems where the objective value is subject to noise: this setting is of particular interest in fields such as data assimilation, where efficient methods that can adapt to noise are needed to account for the intrinsic uncertainty in the input data. In this paper, we describe a stochastic Levenberg--Marquardt algorithm that handles noisy objective function values and random models, provided sufficient accuracy is achieved in probability. Our method relies on a specific scaling of the regularization parameter that allows us to leverage existing results for trust-region algorithms. Moreover, we exploit the structure of our objective through the use of a family of stationarity criteria tailored to least-squares problems. Provided the probability of accurate function estimates and models is sufficiently large, we bound the expected number of iterations needed to reach an approximate stationary point, which generalizes results based on using deterministic models or noiseless function values. We illustrate the link between our approach and several applications related to inverse problems and machine learning.
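  • Background sketch: the classical Levenberg--Marquardt subproblem for min_x (1/2)||r(x)||^2, in generic notation (illustrative only; in the paper r and its Jacobian J are replaced by random estimates, and the regularization parameter is scaled so that trust-region arguments apply):
    s^{k} = \arg\min_{s} \; \tfrac{1}{2}\,\lVert r(x^{k}) + J(x^{k})\,s \rVert^{2} + \tfrac{\gamma_{k}}{2}\,\lVert s \rVert^{2}
    \quad\Longleftrightarrow\quad
    \big( J(x^{k})^{\top} J(x^{k}) + \gamma_{k} I \big)\, s^{k} = - J(x^{k})^{\top} r(x^{k}).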

A Subsampling Line-Search Method with Second-Order Results

  • Authors: Bergou, E., Diouane, Y., Kunc, V., Dr. Vyacheslav Kungurtsev, Ph.D., Royer, C.W.
  • Publication: INFORMS Journal on Optimization. 2022, 4(4), 403-425. ISSN 2575-1484.
  • Year: 2022
  • DOI: 10.1287/ijoo.2022.0072
  • Link: https://doi.org/10.1287/ijoo.2022.0072
  • Department: Intelligent Data Analysis
  • Abstract:
    In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization, such as a line search. Using subsampled function values is particularly challenging for the latter strategy, which relies upon multiple evaluations. For nonconvex data-related problems, such as training deep learning models, one aims at developing methods that converge to second-order stationary points quickly, that is, escape saddle points efficiently. This is particularly difficult to ensure when one only accesses subsampled approximations of the objective and its derivatives. In this paper, we describe a stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model; for a sufficiently large sample, a similar amount of reduction holds for the true objective. We then present worst-case complexity guarantees for a notion of stationarity tailored to the subsampling context. Our analysis encompasses the deterministic regime and allows us to identify sampling requirements for second-order line-search paradigms. As we illustrate through real data experiments, these worst-case estimates need not be satisfied for our method to be competitive with first-order strategies in practice.
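  • Background sketch: for a finite-sum objective f(x) = (1/n) \sum_i f_i(x), the subsampled quantities referred to above take the generic form (illustrative notation)
    f_{S}(x) = \frac{1}{|S|} \sum_{i \in S} f_i(x), \qquad
    \nabla f_{S}(x) = \frac{1}{|S|} \sum_{i \in S} \nabla f_i(x), \qquad
    \nabla^{2} f_{S}(x) = \frac{1}{|S|} \sum_{i \in S} \nabla^{2} f_i(x),
    with the Newton-type or negative-curvature direction and the backtracking line search computed on this subsampled model; for a sufficiently large sample |S|, the decrease achieved on the model transfers to the true objective with high probability.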

Distributed Stochastic Nonsmooth Nonconvex Optimization

  • DOI: 10.1016/j.orl.2022.09.001
  • Link: https://doi.org/10.1016/j.orl.2022.09.001
  • Department: Intelligent Data Analysis
  • Abstract:
    Distributed consensus optimization has received considerable attention in recent years, and several distributed consensus-based algorithms have been proposed for (nonsmooth) convex and (smooth) nonconvex objective functions. However, the behavior of these distributed algorithms on nonconvex, nonsmooth and stochastic objective functions is not understood. Such classes of functions and such distributed settings are motivated by several applications, including problems in machine learning and signal processing. This paper presents the first convergence analysis of the decentralized stochastic subgradient method for such classes of problems, over networks modeled as undirected, fixed graphs.
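  • Background sketch: the decentralized stochastic subgradient iteration analyzed here has the generic consensus-plus-subgradient form (illustrative notation): with W = (w_{ij}) a doubly stochastic mixing matrix matching the undirected graph and \alpha_k a stepsize,
    x_{i}^{k+1} = \sum_{j=1}^{n} w_{ij}\, x_{j}^{k} \; - \; \alpha_{k}\, g_{i}^{k},
    where g_{i}^{k} is a stochastic (Clarke) subgradient estimate of agent i's local objective evaluated at x_{i}^{k}.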

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks

  • Authors: Shevchenko, A., Dr. Vyacheslav Kungurtsev, Ph.D., Mondelli, M.
  • Publication: Journal of Machine Learning Research. 2022, 23(130), 1-55. ISSN 1532-4435.
  • Year: 2022
  • DOI: 10.48550/arXiv.2111.02278
  • Link: https://doi.org/10.48550/arXiv.2111.02278
  • Department: Intelligent Data Analysis
  • Abstract:
    Understanding the properties of neural networks trained via stochastic gradient descent (SGD) is at the heart of the theory of deep learning. In this work, we take a mean-field view, and consider a two-layer ReLU network trained via noisy-SGD for a univariate regularized regression problem. Our main result is that SGD with vanishingly small noise injected in the gradients is biased towards a simple solution: at convergence, the ReLU network implements a piecewise linear map of the inputs, and the number of "knot" points (i.e., points where the tangent of the ReLU network estimator changes) between two consecutive training inputs is at most three. In particular, as the number of neurons of the network grows, the SGD dynamics is captured by the solution of a gradient flow and, at convergence, the distribution of the weights approaches the unique minimizer of a related free energy, which has a Gibbs form. Our key technical contribution consists in the analysis of the estimator resulting from this minimizer: we show that its second derivative vanishes everywhere, except at some specific locations which represent the "knot" points. We also provide empirical evidence that knots at locations distinct from the data points might occur, as predicted by our theory.

Simultaneous Online Model Identification and Production Optimization Using Modifier Adaptation

  • DOI: 10.1016/j.jprocont.2021.12.009
  • Link: https://doi.org/10.1016/j.jprocont.2021.12.009
  • Department: Intelligent Data Analysis
  • Abstract:
    A key problem for many industrial processes is to limit exposure to system malfunction. The system health state can be represented by different models. However, it is often the case that control cost minimization is prioritized over model identification. Indeed, model identification is typically not considered in production optimization, which can lead to delayed awareness and alerting of malfunction. In this paper, we address the problem of simultaneous production optimization and system identification. We develop new algorithms based on modifier adaptation and reinforcement learning, which efficiently manage the tradeoff between cost minimization and identification. For two case studies based on a chemical reactor and subsea oil and gas exploration, we show that our algorithms yield control costs comparable to existing methods while providing rapid identification of system degradation.

A Nonmonotone Matrix-Free Algorithm for Nonlinear Equality-Constrained Least-Squares Problems

  • Authors: Bergou, E.H., Diouane, Y., Dr. Vyacheslav Kungurtsev, Ph.D., Royer, C.W.
  • Publication: SIAM Journal on Scientific Computing. 2021, 43(4), S743-S766. ISSN 1064-8275.
  • Year: 2021
  • DOI: 10.1137/20M1349138
  • Link: https://doi.org/10.1137/20M1349138
  • Department: Intelligent Data Analysis
  • Abstract:
    Least squares form one of the most prominent classes of optimization problems with numerous applications in scientific computing and data fitting. When such formulations aim at modeling complex systems, the optimization process must account for nonlinear dynamics by incorporating constraints. In addition, these systems often incorporate a large number of variables, which increases the difficulty of the problem and motivates the need for efficient algorithms amenable to large-scale implementations. In this paper, we propose and analyze a Levenberg--Marquardt algorithm for nonlinear least squares subject to nonlinear equality constraints. Our algorithm is based on inexact solves of linear least-squares problems that only require Jacobian-vector products. Global convergence is guaranteed by the combination of a composite step approach and a nonmonotone step acceptance rule. We illustrate the performance of our method on several test cases from data assimilation and inverse problems; our algorithm is able to reach the vicinity of a solution from an arbitrary starting point and can outperform the most natural alternatives for these classes of problems.
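  • Background sketch: in generic notation (illustrative only), the problem class is the equality-constrained nonlinear least-squares problem
    \min_{x \in \mathbb{R}^{n}} \; \tfrac{1}{2}\,\lVert F(x) \rVert^{2} \quad \text{s.t.} \quad c(x) = 0,
    and "matrix-free" means that the inner linear least-squares solves access the Jacobians of F and c only through Jacobian-vector and transposed-Jacobian-vector products, which is all that Krylov-type least-squares solvers require.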

A Zeroth Order Method for Stochastic Weakly Convex Optimization

  • DOI: 10.1007/s10589-021-00313-3
  • Link: https://doi.org/10.1007/s10589-021-00313-3
  • Department: Intelligent Data Analysis
  • Abstract:
    In this paper, we consider stochastic weakly convex optimization problems, but without access to a stochastic subgradient oracle. We present a derivative-free algorithm that uses a two-point approximation for computing a gradient estimate of the smoothed function. We prove convergence at a similar rate as state-of-the-art methods, albeit with a larger constant, and report some numerical results showing the effectiveness of the approach.
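  • Background sketch: a minimal Python illustration of a two-point (zeroth-order) gradient estimator for the smoothed function; this is the classical spherical-sampling estimator, assumed here only for illustration and not taken from the paper.
    import numpy as np

    def two_point_gradient_estimate(f, x, delta=1e-3, rng=None):
        """Two-point estimate of the gradient of the delta-smoothed objective.

        f     : callable returning a (possibly noisy) scalar objective value
        x     : current iterate, 1-D numpy array
        delta : smoothing radius

        The returned vector approximates the gradient of the ball-smoothed
        surrogate f_delta(x) = E_v[ f(x + delta * v) ], v uniform in the unit
        ball, and can be used in place of a subgradient by a derivative-free
        stochastic method.
        """
        rng = np.random.default_rng() if rng is None else rng
        d = x.size
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)          # uniform direction on the unit sphere
        return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u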

Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

  • Authors: Dr. Vyacheslav Kungurtsev, Ph.D., Egan, M., Chatterjee, B., Alistarh, D.
  • Publication: Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, California: Association for the Advancement of Artificial Intelligence (AAAI), 2021. p. 8209-8216. ISSN 2159-5399. ISBN 978-1-57735-866-4.
  • Year: 2021
  • Department: Intelligent Data Analysis
  • Abstract:
    Asynchronous distributed algorithms are a popular way to reduce synchronization costs in large-scale optimization, and in particular for neural network training. However, for nonsmooth and nonconvex objectives, few convergence guarantees exist beyond cases where closed-form proximal operator solutions are available. As training most popular deep neural networks corresponds to optimizing nonsmooth and nonconvex objectives, there is a pressing need for such convergence guarantees. In this paper, we analyze for the first time the convergence of stochastic asynchronous optimization for this general class of objectives. In particular, we focus on stochastic subgradient methods allowing for block variable partitioning, where the shared model is asynchronously updated by concurrent processes. To this end, we use a probabilistic model which captures key features of real asynchronous scheduling between concurrent processes. Under this model, we establish convergence with probability one to an invariant set for stochastic subgradient methods with momentum. From a practical perspective, one issue with the family of algorithms that we consider is that they are not efficiently supported by machine learning frameworks, which mostly focus on distributed data-parallel strategies. To address this, we propose a new implementation strategy for shared-memory based training of deep neural networks for a partitioned but shared model in single- and multi-GPU settings. Based on this implementation, we achieve on average about 1.2x speed-up in comparison to state-of-the-art training methods for popular image classification tasks, without compromising accuracy.

Complexity Iteration Analysis for Strongly Convex Multi-objective Optimization Using a Newton Path-following Procedure

  • DOI: 10.1007/s11590-020-01623-x
  • Link: https://doi.org/10.1007/s11590-020-01623-x
  • Department: Intelligent Data Analysis
  • Abstract:
    In this note, we consider the iteration complexity of solving strongly convex multi-objective optimization problems. We discuss the precise meaning of this problem, noting that its definition is ambiguous, and focus on the most natural notion of finding a set of Pareto optimal points across a grid of scalarized problems. We prove that, in most cases, performing sensitivity based path-following after obtaining one solution is the optimal strategy for this task in terms of iteration complexity.
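  • Background sketch: the grid of scalarized problems can be written in generic weighted-sum form (illustrative notation)
    \min_{x} \; \sum_{i=1}^{m} \lambda_{i}\, f_{i}(x), \qquad \lambda \in \Lambda_{\text{grid}} \subset \Delta_{m} = \Big\{ \lambda \ge 0 : \textstyle\sum_{i} \lambda_{i} = 1 \Big\},
    where each scalarization with positive weights yields a Pareto-optimal point; the strategy analyzed above solves one scalarization to optimality and then traces the remaining grid points by sensitivity-based (Newton) path following rather than solving each problem from scratch.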

Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent

  • Authors: Nadiradze, G., Markov, I., Chatterjee, B., Dr. Vyacheslav Kungurtsev, Ph.D., Alistarh, D.
  • Publication: Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, California: Association for the Advancement of Artificial Intelligence (AAAI), 2021. p. 9037-9045. ISSN 2374-3468. ISBN 978-1-57735-866-4.
  • Year: 2021
  • Department: Intelligent Data Analysis
  • Abstract:
    One key element behind the recent progress of machine learning has been the ability to train machine learning models in large-scale distributed shared-memory and message-passing environments. Most of these models are trained employing variants of stochastic gradient descent (SGD) based optimization, but most methods involve some type of consistency relaxation relative to sequential SGD, to mitigate its large communication or synchronization costs at scale. In this paper, we introduce a general consistency condition covering communication-reduced and asynchronous distributed SGD implementations. Our framework, called elastic consistency, decouples the system-specific aspects of the implementation from the SGD convergence requirements, giving a general way to obtain convergence bounds for a wide variety of distributed SGD methods used in practice. Elastic consistency can be used to re-derive or improve several previous convergence bounds in message-passing and shared-memory settings, but also to analyze new models and distribution schemes. As a direct application, we propose and analyze a new synchronization-avoiding scheduling scheme for distributed SGD, and show that it can be used to efficiently train deep convolutional models for image classification.

Ghost Penalties in Nonconvex Constrained Optimization: Diminishing Stepsizes and Iteration Complexity

  • Authors: Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Lampariello, L., Scutari, G.
  • Publication: Mathematics of Operations Research. 2021, 46(2), 595-627. ISSN 0364-765X.
  • Year: 2021
  • DOI: 10.1287/moor.2020.1079
  • Link: https://doi.org/10.1287/moor.2020.1079
  • Department: Intelligent Data Analysis
  • Abstract:
    We consider nonconvex constrained optimization problems and propose a new approach to the convergence analysis based on penalty functions. We make use of classical penalty functions in an unconventional way, in that penalty functions only enter in the theoretical analysis of convergence while the algorithm itself is penalty free. Based on this idea, we are able to establish several new results, including the first general analysis for diminishing stepsize methods in nonconvex, constrained optimization, showing convergence to generalized stationary points, and a complexity study for sequential quadratic programming–type algorithms.

A Shifted Primal-Dual Penalty-Barrier Method for Nonlinear Optimization

  • Authors: Gill, P.E., Dr. Vyacheslav Kungurtsev, Ph.D., Robinson, D.P.
  • Publication: SIAM Journal on Optimization. 2020, 30(2), 1067-1093. ISSN 1052-6234.
  • Year: 2020
  • DOI: 10.1137/19M1247425
  • Link: https://doi.org/10.1137/19M1247425
  • Department: Intelligent Data Analysis
  • Abstract:
    In nonlinearly constrained optimization, penalty methods provide an effective strategy for handling equality constraints, while barrier methods provide an effective approach for the treatment of inequality constraints. A new algorithm for nonlinear optimization is proposed based on minimizing a shifted primal-dual penalty-barrier function. Certain global convergence properties are established. In particular, it is shown that a limit point of the sequence of iterates may always be found that is either an infeasible stationary point or a complementary approximate Karush--Kuhn--Tucker point; i.e., it satisfies reasonable stopping criteria and is a Karush--Kuhn--Tucker point under a regularity condition that is the weakest constraint qualification associated with sequential optimality conditions. It is also shown that under suitable additional assumptions, the method is equivalent to a shifted variant of the primal-dual path-following method in the neighborhood of a solution. Numerical examples are provided that illustrate the performance of the method compared to a widely used conventional interior-point method.

A Two-Step Pre-Processing for Semidefinite Programming

  • DOI: 10.1109/CDC42340.2020.9304494
  • Link: https://doi.org/10.1109/CDC42340.2020.9304494
  • Department: Centrum umělé inteligence, Intelligent Data Analysis
  • Abstract:
    In semidefinite programming (SDP), a number of pre-processing techniques have been developed, including procedures based on chordal decomposition, which exploit sparsity in the semidefinite program in order to reduce the dimension of individual constraints, and procedures based on facial reduction, which reduce the dimension of the problem by removing redundant rows and columns. So far, these have been studied in isolation. We show that these techniques are, in fact, complementary. In computational experiments, we show that a two-step pre-processing followed by a standard interior-point method outperforms the interior-point method, with or without either of the pre-processing techniques, by a considerable margin.

Asynchronous Parallel Algorithms for Nonconvex Optimization

  • Authors: Cannelli, L., Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Scutari, G.
  • Publication: Mathematical Programming. 2020, 184(1-2), 121-154. ISSN 0025-5610.
  • Year: 2020
  • DOI: 10.1007/s10107-019-01408-w
  • Link: https://doi.org/10.1007/s10107-019-01408-w
  • Department: Intelligent Data Analysis
  • Abstract:
    We propose a new asynchronous parallel block-descent algorithmic framework for the minimization of the sum of a smooth nonconvex function and a nonsmooth convex one, subject to both convex and nonconvex constraints. The proposed framework hinges on successive convex approximation techniques and a novel probabilistic model that captures key elements of modern computational architectures and asynchronous implementations in a more faithful way than current state-of-the-art models. Other key features of the framework are: (1) it covers in a unified way several specific solution methods; (2) it accommodates a variety of possible parallel computing architectures; and (3) it can deal with nonconvex constraints. Almost sure convergence to stationary solutions is proved, and theoretical complexity results are provided, showing nearly ideal linear speedup when the number of workers is not too large.

Convergence and Complexity Analysis of a Levenberg-Marquardt Algorithm for Inverse Problems

  • Authors: Bergou, E.H., Diouane, Y., Dr. Vyacheslav Kungurtsev, Ph.D.
  • Publication: Journal of Optimization Theory and Applications. 2020, 185(3), 927-944. ISSN 0022-3239.
  • Year: 2020
  • DOI: 10.1007/s10957-020-01666-1
  • Link: https://doi.org/10.1007/s10957-020-01666-1
  • Department: Intelligent Data Analysis
  • Abstract:
    The Levenberg-Marquardt algorithm is one of the most popular algorithms for finding the solution of nonlinear least squares problems. Across different modified variations of the basic procedure, the algorithm enjoys global convergence, a competitive worst-case iteration complexity rate, and a guaranteed rate of local convergence for both zero and nonzero small residual problems, under suitable assumptions. We introduce a novel Levenberg-Marquardt method that matches, simultaneously, the state of the art in all of these convergence properties with a single seamless algorithm. Numerical experiments confirm the theoretical behavior of our proposed algorithm.

Convergence Rate for Diminishing Stepsize Methods in Nonconvex Constrained Optimization via Ghost Penalties

  • Authors: Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Lampariello, L., Scutari, G.
  • Publication: Atti della Accademia Peloritana dei Pericolanti. Classe di Scienze Fisiche, Matematiche e Naturali. 2020, 98(S2). ISSN 1825-1242.
  • Year: 2020
  • DOI: 10.1478/AAPP.98S2A8
  • Link: https://doi.org/10.1478/AAPP.98S2A8
  • Department: Intelligent Data Analysis
  • Abstract:
    This is a companion paper to "Ghost penalties in nonconvex constrained optimization: Diminishing stepsizes and iteration complexity" (to appear in Mathematics of Operations Research). We consider the ghost penalty scheme for nonconvex, constrained optimization introduced in that paper, coupled with a diminishing stepsize procedure. Under an extended Mangasarian-Fromovitz-type constraint qualification we give an expression for the maximum number of iterations needed to achieve a given solution accuracy according to a natural stationarity measure, thus establishing the first result of this kind for a diminishing stepsize method for nonconvex, constrained optimization problems.

Fast Sensitivity-Based Economic Model Predictive Control for Degenerate Systems

  • DOI: 10.1016/j.jprocont.2020.02.006
  • Link: https://doi.org/10.1016/j.jprocont.2020.02.006
  • Department: Intelligent Data Analysis
  • Abstract:
    We present a sensitivity-based nonlinear model predictive control (NMPC) algorithm and demonstrate it on a case study with an economic cost function. In contrast to existing sensitivity-based approaches that make strong assumptions on the underlying optimization problem (e.g. the linear independence constraint qualification, which implies unique multipliers), our method is designed to handle problems satisfying a weaker constraint qualification, namely the Mangasarian-Fromovitz constraint qualification (MFCQ). Our nonlinear programming (NLP) sensitivity update consists of three steps. The first step is a corrector step in which a system of linear equations is solved. Then a predictor step is computed by a quadratic program (QP). Finally, a linear program (LP) is solved to select the multipliers that give the correct sensitivity information. A path-following scheme containing these steps is embedded in the advanced-step NMPC (asNMPC) framework. We demonstrate our method on a large-scale case example consisting of a reactor and distillation process. We show that LICQ does not hold and that the path-following method is able to accurately approximate the ideal solutions generated by an NLP solver.

Lifted Weight Learning of Markov Logic Networks (Revisited One More Time)

  • Department: Intelligent Data Analysis
  • Abstract:
    We revisit the problem of lifted weight learning of Markov logic networks (MLNs). We show that there is an algorithm for maximum-likelihood learning which runs in time polynomial in the size of the domain, whenever the partition function of the given MLN can be computed in polynomial time. This improves on our recent results, where we showed the same result with an additional dependency of the runtime on a parameter of the training data, called interiority, which measures how “extreme” the given training data are. In this work, we get rid of this dependency. The main new technical ingredient that we exploit is a set of theoretical results obtained recently by Straszak and Vishnoi (Maximum Entropy Distributions: Bit Complexity and Stability, COLT 2019).

Reinforcement Learning Based on Real-Time Iteration NMPC

  • Authors: Zanon, M., Dr. Vyacheslav Kungurtsev, Ph.D., Gros, S.
  • Publication: Proceedings of the IFAC World Congress 2020. Laxenburg: IFAC, 2020. p. 5213-5218. IFAC-PapersOnLine. vol. 53. ISSN 2405-8963.
  • Year: 2020
  • DOI: 10.1016/j.ifacol.2020.12.1195
  • Link: https://doi.org/10.1016/j.ifacol.2020.12.1195
  • Department: Intelligent Data Analysis
  • Abstract:
    Reinforcement Learning (RL) has proven a stunning ability to learn optimal policies from data without any prior knowledge on the process. The main drawback of RL is that it is typically very difficult to guarantee stability and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an advanced model-based control technique which does guarantee safety and stability, but only yields optimality for the nominal model. Therefore, it has been recently proposed to use NMPC as a function approximator within RL. While the ability of this approach to yield good performance has been demonstrated, the main drawback hindering its applicability is related to the computational burden of NMPC, which has to be solved to full convergence. In practice, however, computationally efficient algorithms such as the Real-Time Iteration (RTI) scheme are deployed in order to return an approximate NMPC solution in very short time. In this paper we bridge this gap by extending the existing theoretical framework to also cover RL based on RTI NMPC. We demonstrate the effectiveness of this new RL approach with a nontrivial example modeling a challenging nonlinear system subject to stochastic perturbations with the objective of optimizing an economic cost.

Second-Order Guarantees of Distributed Gradient Algorithms

  • Authors: Daneshmand, A., Scutari, G., Dr. Vyacheslav Kungurtsev, Ph.D.
  • Publication: SIAM Journal on Optimization. 2020, 30(4), 3029-3068. ISSN 1095-7189.
  • Year: 2020
  • DOI: 10.1137/18M121784X
  • Link: https://doi.org/10.1137/18M121784X
  • Department: Intelligent Data Analysis
  • Abstract:
    We consider distributed smooth nonconvex unconstrained optimization over networks, modeled as a connected graph. We examine the behavior of distributed gradient-based algorithms near strict saddle points. Specifically, we establish that (i) the renowned distributed gradient descent algorithm likely converges to a neighborhood of a second-order stationary (SoS) solution; and (ii) the more recent class of distributed algorithms based on gradient tracking (implementable also over digraphs) likely converges to exact SoS solutions, thus avoiding (strict) saddle points. Furthermore, new convergence rate results for first-order critical points are established for the latter class of algorithms.
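  • Background sketch: one common form of the gradient-tracking iteration referred to in (ii), in illustrative notation (the exact scheme in the paper may differ):
    x_{i}^{k+1} = \sum_{j} w_{ij}\,\big( x_{j}^{k} - \alpha\, y_{j}^{k} \big), \qquad
    y_{i}^{k+1} = \sum_{j} w_{ij}\, y_{j}^{k} + \nabla f_{i}(x_{i}^{k+1}) - \nabla f_{i}(x_{i}^{k}),
    where the auxiliary variable y_{i}^{k} tracks the network-average gradient, so each agent asymptotically moves along the gradient of the global objective rather than only its local one.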

Lifted Weight Learning of Markov Logic Networks Revisited

  • Department: Intelligent Data Analysis
  • Abstract:
    We study lifted weight learning of Markov logic networks. We show that there is an algorithm for maximum-likelihood learning of 2-variable Markov logic networks which runs in time polynomial in the domain size. Our results are based on existing lifted-inference algorithms and recent algorithmic results on computing maximum entropy distributions.

A Predictor-Corrector Path-Following Algorithm for Dual-Degenerate Parametric Optimization Problems

  • DOI: 10.1137/16M1068736
  • Link: https://doi.org/10.1137/16M1068736
  • Department: Intelligent Data Analysis
  • Abstract:
    Most path-following algorithms for tracing a solution path of a parametric nonlinear optimization problem are only certifiably convergent under strong regularity assumptions about the problem functions. In particular, linear independence of the constraint gradients at the solutions is typically assumed, which implies unique multipliers. In this paper we propose a procedure designed to solve problems satisfying a weaker set of conditions, allowing for nonunique (but bounded) multipliers. Each iteration along the path consists of three parts: (1) a Newton corrector step for the primal and dual variables, which is obtained by solving a linear system of equations, (2) a predictor step for the primal and dual variables, which is found as the solution of a quadratic programming problem, and (3) a jump step for the dual variables, which is found as the solution of a linear programming problem. We present a convergence proof and demonstrate the successful solution tracking of the algorithm numerically on a couple of illustrative examples.
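  • Background sketch: in generic notation (illustrative only), the setting is a parametric nonlinear program
    \min_{x} \; f(x, t) \quad \text{s.t.} \quad c(x, t) = 0, \;\; g(x, t) \le 0, \qquad t \in [0, 1],
    whose primal-dual solution path (x^{*}(t), \lambda^{*}(t), \mu^{*}(t)) is traced as the parameter t varies; the three steps above update the primal-dual estimate at each new parameter value while allowing the multiplier set to be non-unique but bounded.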

A Stabilized SQP Method: Global Convergence

  • Authors: Gill, P.E., Dr. Vyacheslav Kungurtsev, Ph.D., Robinson, D.P.
  • Publication: IMA Journal of Numerical Analysis (IMAJNA). 2017, 37(1), 407-443. ISSN 0272-4979.
  • Year: 2017
  • DOI: 10.1093/imanum/drw004
  • Link: https://doi.org/10.1093/imanum/drw004
  • Department: Intelligent Data Analysis
  • Abstract:
    Stabilized sequential quadratic programming (SQP) methods for nonlinear optimization are designed to provide a sequence of iterates with fast local convergence even when the active-constraint gradients are linearly dependent. This paper concerns the global convergence properties of a stabilized SQP method with a primal–dual augmented Lagrangian merit function. The proposed method incorporates two novel features. First, a flexible line search is used based on a direction formed from an approximate solution of a strictly convex quadratic programming (QP) subproblem and, when one exists, a direction of negative curvature for the primal–dual merit function. Second, when certain conditions hold, an approximate QP solution is computed by solving a single linear system defined in terms of an estimate of the optimal active set. We also establish two desirable convergence results. (i) It is shown that with an appropriate choice of termination condition, the method terminates in a finite number of iterations without the assumption of a constraint qualification. The method may be interpreted as an SQP method with an augmented Lagrangian safeguarding strategy. This safeguarding becomes relevant only when the iterates are converging to an infeasible stationary point of the norm of the constraint violations. Otherwise, the method terminates with a point that approximately satisfies certain second-order necessary conditions for optimality. In this situation, if all termination conditions are removed, then the limit points either satisfy the same second-order necessary conditions exactly or fail to satisfy a weak second-order constraint qualification. (ii) The global convergence analysis concerns a specific algorithm that estimates the least curvature of the merit function at each step. If negative curvature directions are omitted, the analysis still applies and establishes convergence to either first-order solutions or infeasible stationary points.

A Stabilized SQP Method: Superlinear Convergence

  • DOI: 10.1007/s10107-016-1066-7
  • Link: https://doi.org/10.1007/s10107-016-1066-7
  • Department: Intelligent Data Analysis
  • Abstract:
    Stabilized sequential quadratic programming (sSQP) methods for nonlinear optimization generate a sequence of iterates with fast local convergence regardless of whether or not the active-constraint gradients are linearly dependent. This paper concerns the local convergence analysis of an sSQP method that uses a line search with a primal-dual augmented Lagrangian merit function to enforce global convergence. The method is provably well-defined and is based on solving a strictly convex quadratic programming subproblem at each iteration. It is shown that the method has superlinear local convergence under assumptions that are no stronger than those required by conventional stabilized SQP methods. The fast local convergence is obtained by allowing a small relaxation of the optimality conditions for the quadratic programming subproblem in the neighborhood of a solution. In the limit, the line search selects the unit step length, which implies that the method does not suffer from the Maratos effect. The analysis indicates that the method has the same strong first- and second-order global convergence properties that have been established for augmented Lagrangian methods, yet is able to transition seamlessly to sSQP with fast local convergence in the neighborhood of a solution. Numerical results on some degenerate problems are reported.

Input Shaper Optimization with a Constraint on the Spectrum Distribution

  • DOI: 10.1016/j.ifacol.2017.08.1893
  • Link: https://doi.org/10.1016/j.ifacol.2017.08.1893
  • Department: Katedra řídicí techniky, Intelligent Data Analysis
  • Abstract:
    This paper presents a procedure to parametrize input shapers with piecewise equally distributed time delays. The procedure seeks to minimize residual vibrations around an undesirable frequency but at the same time avoids introducing zeros in the right half plane. This ensures that the introduction of the inverse of the input shaper in a closed feedback loop does not introduce unstable poles in the system. The requirement of stable spectra for the input shaper appears as an additional constraint in a constrained optimization problem that includes minimizing the response time and residual vibrations while preserving required properties of the shaper. The novel introduction of a spectral constraint makes the optimization problem nonsmooth and nonconvex, necessitating special optimization algorithms for its reliable solution. The framework is outlined in detail and the results of the optimization are presented indicating the success of the procedure.

Sensitivity-Based Economic NMPC with a Path-Following Approach

  • DOI: 10.3390/pr5010008
  • Link: https://doi.org/10.3390/pr5010008
  • Department: Intelligent Data Analysis
  • Abstract:
    We present a sensitivity-based predictor-corrector path-following algorithm for fast nonlinear model predictive control (NMPC) and demonstrate it on a large case study with an economic cost function. The path-following method is applied within the advanced-step NMPC framework to obtain fast and accurate approximate solutions of the NMPC problem. In our approach, we solve a sequence of quadratic programs to trace the optimal NMPC solution along a parameter change. A distinguishing feature of the path-following algorithm in this paper is that the strongly-active inequality constraints are included as equality constraints in the quadratic programs, while the weakly-active constraints are left as inequalities. This leads to close tracking of the optimal solution. The approach is applied to an economic NMPC case study consisting of a process with a reactor, a distillation column and a recycler. We compare the path-following NMPC solution with an ideal NMPC solution, which is obtained by solving the full nonlinear programming problem. Our simulations show that the proposed algorithm effectively traces the exact solution.

Hybrid Random/Deterministic Parallel Algorithms for Convex and Nonconvex Big Data Optimization

  • Authors: Daneshmand, A., Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Scutari, G.
  • Publication: IEEE Transactions on Signal Processing. 2015, 63(15), 3914-3929. ISSN 1053-587X.
  • Year: 2015
  • DOI: 10.1109/TSP.2015.2436357
  • Link: https://doi.org/10.1109/TSP.2015.2436357
  • Department: Katedra počítačů
  • Abstract:
    We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a nonsmooth (possibly nonseparable), convex one. The latter term is usually employed to enforce structure in the solution, typically sparsity. The main contribution of this work is a novel parallel, hybrid random/deterministic decomposition scheme wherein, at each iteration, a subset of (block) variables is updated at the same time by minimizing a convex surrogate of the original nonconvex function. To tackle huge-scale problems, the (block) variables to be updated are chosen according to a mixed random and deterministic procedure, which captures the advantages of both pure deterministic and random update-based schemes. Almost sure convergence of the proposed scheme is established. Numerical results show that on huge-scale problems the proposed hybrid random/deterministic algorithm compares favorably to random and deterministic schemes on both convex and nonconvex problems.

Flexible Selective Parallel Algorithms for Big Data Optimization

  • Authors: Daneshmand, A., Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Scutari, G.
  • Publication: Conference Record of the 2014 Forty-Eighth Asilomar Conference on Signals, Systems & Computers. USA: IEEE Computer Society, 2014. p. 3-7. ISSN 1058-6393. ISBN 978-1-4799-8297-4.
  • Year: 2014
  • DOI: 10.1109/ACSSC.2014.7094384
  • Link: https://doi.org/10.1109/ACSSC.2014.7094384
  • Department: Intelligent Data Analysis
  • Abstract:
    We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a nonsmooth (separable), convex one. The latter term is usually employed to enforce structure in the solution, typically sparsity. The main contribution of this work is a novel parallel, hybrid random/deterministic decomposition scheme wherein, at each iteration, a subset of (block) variables is updated at the same time by minimizing local convex approximations of the original nonconvex function. To tackle huge-scale problems, the (block) variables to be updated are chosen according to a mixed random and deterministic procedure, which captures the advantages of both pure deterministic and random update-based schemes. Almost sure convergence of the proposed scheme is established. Numerical results on huge-scale problems show that the proposed algorithm outperforms current schemes.

Sequential Quadratic Programming Methods for Parametric Nonlinear Optimization

  • DOI: 10.1007/s10589-014-9696-2
  • Link: https://doi.org/10.1007/s10589-014-9696-2
  • Department: Katedra počítačů
  • Abstract:
    Sequential quadratic programming (SQP) methods are known to be efficient for solving a series of related nonlinear optimization problems because of desirable hot- and warm-start properties: a solution for one problem is a good estimate of the solution of the next. However, standard SQP solvers contain elements to enforce global convergence that can interfere with the potential to take full advantage of these theoretical local properties. We present two new predictor-corrector procedures for solving a nonlinear program given a sufficiently accurate estimate of the solution of a similar problem. The procedures attempt to trace a homotopy path between solutions of the two problems, staying within the local domain of convergence for the series of problems generated. We provide theoretical convergence and tracking results, as well as some numerical results demonstrating the robustness and performance of the methods.

Responsible for this page: Ing. Mgr. Radovan Suk