People

Dr. Vyacheslav Kungurtsev, Ph.D.

All publications

Time-Varying Semidefinite Programming: Path Following a Burer–Monteiro Factorization

  • DOI: 10.1137/22M1529762
  • Link: https://doi.org/10.1137/22M1529762
  • Department: Katedra počítačů, Centrum umělé inteligence, Intelligent Data Analysis
  • Abstract:
    We present an online algorithm for time-varying semidefinite programs (TV-SDPs), based on the tracking of the solution trajectory of a low-rank matrix factorization, also known as the Burer–Monteiro factorization, in a path-following procedure. There, a predictor-corrector algorithm solves a sequence of linearized systems. This requires the introduction of a horizontal-space constraint to ensure the local injectivity of the low-rank factorization. The method produces a sequence of approximate solutions for the original TV-SDP problem, which we show stay close to the optimal solution path if properly initialized. Numerical experiments for a time-varying max-cut SDP relaxation demonstrate the computational advantages of the proposed method for tracking TV-SDPs in terms of runtime compared to off-the-shelf interior-point methods.
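  • Background sketch: in generic notation (illustrative only; the paper's exact time-varying formulation and the horizontal-space constraint are as described above), the Burer–Monteiro substitution X = YY^T turns the conic problem into a low-dimensional nonconvex one:
    \min_{X \succeq 0} \; \langle C(t), X \rangle \;\; \text{s.t.} \;\; \mathcal{A}(t)[X] = b(t)
    \qquad \longrightarrow \qquad
    \min_{Y \in \mathbb{R}^{n \times r}} \; \langle C(t), Y Y^{\top} \rangle \;\; \text{s.t.} \;\; \mathcal{A}(t)[Y Y^{\top}] = b(t),
    where r bounds the rank of the sought solution; the factorized problem carries no semidefinite constraint, which is what makes fast path following possible.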

A Sensitivity Assisted Alternating Directions Method of Multipliers for Distributed Optimization

  • Authors: Krishnamoorthy, D., Dr. Vyacheslav Kungurtsev, Ph.D.
  • Publication: Proceedings of the 61st IEEE Conference on Decision and Control. Piscataway: IEEE, 2022. p. 295-300. ISSN 2576-2370. ISBN 978-1-6654-6761-2.
  • Year: 2022
  • DOI: 10.1109/CDC51059.2022.9993352
  • Link: https://doi.org/10.1109/CDC51059.2022.9993352
  • Department: Intelligent Data Analysis
  • Abstract:
    Alternating Directions Method of Multipliers (ADMM) is a form of decomposition-coordination method that typically requires several iterations/communication rounds between the subproblems and the master problem to converge. Repeatedly solving the subproblems over several iterations adds to the total computation time. Noting that the subproblems solved from one iteration to the next differ only by a few variables, this paper proposes a novel sensitivity-assisted ADMM framework for nonlinear programming (NLP) problems, where the subproblems are cheaply approximated using the parametric sensitivities. By exploiting the parametric sensitivities, the computation of the subproblems can be reduced to a single linear solve instead of solving the full NLP problem, thereby reducing the overall computation cost. Different algorithmic variations are discussed and demonstrated using two numerical examples.
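  • Background sketch: the standard scaled-form ADMM iteration for min_{x,z} f(x) + g(z) subject to Ax + Bz = c is shown below in generic notation (illustrative only); the contribution above is to replace the exact x-subproblem solves in later iterations by a single sensitivity-based linear solve:
    x^{k+1} = \arg\min_{x} \; f(x) + \tfrac{\rho}{2}\,\lVert Ax + Bz^{k} - c + u^{k} \rVert^{2}
    z^{k+1} = \arg\min_{z} \; g(z) + \tfrac{\rho}{2}\,\lVert Ax^{k+1} + Bz - c + u^{k} \rVert^{2}
    u^{k+1} = u^{k} + Ax^{k+1} + Bz^{k+1} - c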

A Stochastic Levenberg--Marquardt Method Using Random Models with Complexity Results

  • Authors: Bergou, E.H., Diouane, Y., Dr. Vyacheslav Kungurtsev, Ph.D., Royer, C.W.
  • Publication: SIAM/ASA Journal on Uncertainty Quantification. 2022, 10(1), 507-536. ISSN 2166-2525.
  • Year: 2022
  • DOI: 10.1137/20M1366253
  • Link: https://doi.org/10.1137/20M1366253
  • Department: Intelligent Data Analysis
  • Abstract:
    Globally convergent variants of the Gauss--Newton algorithm are often the methods of choice to tackle nonlinear least-squares problems. Among such frameworks, Levenberg--Marquardt and trust-region methods are two well-established, similar paradigms. Both schemes have been studied when the Gauss--Newton model is replaced by a random model that is only accurate with a given probability. Trust-region schemes have also been applied to problems where the objective value is subject to noise: this setting is of particular interest in fields such as data assimilation, where efficient methods that can adapt to noise are needed to account for the intrinsic uncertainty in the input data. In this paper, we describe a stochastic Levenberg--Marquardt algorithm that handles noisy objective function values and random models, provided sufficient accuracy is achieved in probability. Our method relies on a specific scaling of the regularization parameter that allows us to leverage existing results for trust-region algorithms. Moreover, we exploit the structure of our objective through the use of a family of stationarity criteria tailored to least-squares problems. Provided the probability of accurate function estimates and models is sufficiently large, we bound the expected number of iterations needed to reach an approximate stationary point, which generalizes results based on using deterministic models or noiseless function values. We illustrate the link between our approach and several applications related to inverse problems and machine learning.
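  • Background sketch: the classical Levenberg--Marquardt subproblem for min_x (1/2)||r(x)||^2, in generic notation (illustrative only; in the paper r and its Jacobian J are replaced by random estimates, and the regularization parameter is scaled so that trust-region arguments apply):
    s^{k} = \arg\min_{s} \; \tfrac{1}{2}\,\lVert r(x^{k}) + J(x^{k})\,s \rVert^{2} + \tfrac{\gamma_{k}}{2}\,\lVert s \rVert^{2}
    \quad\Longleftrightarrow\quad
    \big( J(x^{k})^{\top} J(x^{k}) + \gamma_{k} I \big)\, s^{k} = - J(x^{k})^{\top} r(x^{k}).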

A Subsampling Line-Search Method with Second-Order Results

  • Authors: Bergou, E., Diouane, Y., Kunc, V., Dr. Vyacheslav Kungurtsev, Ph.D., Royer, C.W.
  • Publication: INFORMS Journal on Optimization. 2022, 4(4), 403-425. ISSN 2575-1484.
  • Year: 2022
  • DOI: 10.1287/ijoo.2022.0072
  • Link: https://doi.org/10.1287/ijoo.2022.0072
  • Department: Intelligent Data Analysis
  • Abstract:
    In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization, such as a line search. Using subsampled function values is particularly challenging for the latter strategy, which relies upon multiple evaluations. For nonconvex data-related problems, such as training deep learning models, one aims at developing methods that converge to second-order stationary points quickly, that is, escape saddle points efficiently. This is particularly difficult to ensure when one only accesses subsampled approximations of the objective and its derivatives. In this paper, we describe a stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model; for a sufficiently large sample, a similar amount of reduction holds for the true objective. We then present worst-case complexity guarantees for a notion of stationarity tailored to the subsampling context. Our analysis encompasses the deterministic regime and allows us to identify sampling requirements for second-order line-search paradigms. As we illustrate through real data experiments, these worst-case estimates need not be satisfied for our method to be competitive with first-order strategies in practice.
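  • Background sketch: for a finite-sum objective f(x) = (1/n) \sum_i f_i(x), the subsampled quantities referred to above take the generic form (illustrative notation)
    f_{S}(x) = \frac{1}{|S|} \sum_{i \in S} f_i(x), \qquad
    \nabla f_{S}(x) = \frac{1}{|S|} \sum_{i \in S} \nabla f_i(x), \qquad
    \nabla^{2} f_{S}(x) = \frac{1}{|S|} \sum_{i \in S} \nabla^{2} f_i(x),
    with the Newton-type or negative-curvature direction and the backtracking line search computed on this subsampled model; for a sufficiently large sample |S|, the decrease achieved on the model transfers to the true objective with high probability.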

Distributed Stochastic Nonsmooth Nonconvex Optimization

  • DOI: 10.1016/j.orl.2022.09.001
  • Link: https://doi.org/10.1016/j.orl.2022.09.001
  • Department: Intelligent Data Analysis
  • Abstract:
    Distributed consensus optimization has received considerable attention in recent years, and several distributed consensus-based algorithms have been proposed for (nonsmooth) convex and (smooth) nonconvex objective functions. However, the behavior of these distributed algorithms on nonconvex, nonsmooth and stochastic objective functions is not understood. Such classes of functions and such distributed settings are motivated by several applications, including problems in machine learning and signal processing. This paper presents the first convergence analysis of the decentralized stochastic subgradient method for such classes of problems, over networks modeled as undirected, fixed graphs.
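  • Background sketch: the decentralized stochastic subgradient iteration analyzed here has the generic consensus-plus-subgradient form (illustrative notation): with W = (w_{ij}) a doubly stochastic mixing matrix matching the undirected graph and \alpha_k a stepsize,
    x_{i}^{k+1} = \sum_{j=1}^{n} w_{ij}\, x_{j}^{k} \; - \; \alpha_{k}\, g_{i}^{k},
    where g_{i}^{k} is a stochastic (Clarke) subgradient estimate of agent i's local objective evaluated at x_{i}^{k}.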

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks

  • Authors: Shevchenko, A., Dr. Vyacheslav Kungurtsev, Ph.D., Mondelli, M.
  • Publication: Journal of Machine Learning Research. 2022, 23(130), 1-55. ISSN 1532-4435.
  • Year: 2022
  • DOI: 10.48550/arXiv.2111.02278
  • Link: https://doi.org/10.48550/arXiv.2111.02278
  • Department: Intelligent Data Analysis
  • Abstract:
    Understanding the properties of neural networks trained via stochastic gradient descent (SGD) is at the heart of the theory of deep learning. In this work, we take a mean-field view, and consider a two-layer ReLU network trained via noisy-SGD for a univariate regularized regression problem. Our main result is that SGD with vanishingly small noise injected in the gradients is biased towards a simple solution: at convergence, the ReLU network implements a piecewise linear map of the inputs, and the number of "knot" points (i.e., points where the tangent of the ReLU network estimator changes) between two consecutive training inputs is at most three. In particular, as the number of neurons of the network grows, the SGD dynamics is captured by the solution of a gradient flow and, at convergence, the distribution of the weights approaches the unique minimizer of a related free energy, which has a Gibbs form. Our key technical contribution consists in the analysis of the estimator resulting from this minimizer: we show that its second derivative vanishes everywhere, except at some specific locations which represent the "knot" points. We also provide empirical evidence that knots at locations distinct from the data points might occur, as predicted by our theory.

Simultaneous Online Model Identification and Production Optimization Using Modifier Adaptation

  • DOI: 10.1016/j.jprocont.2021.12.009
  • Link: https://doi.org/10.1016/j.jprocont.2021.12.009
  • Department: Intelligent Data Analysis
  • Abstract:
    A key problem for many industrial processes is to limit exposure to system malfunction. The system health state can be represented by different models. However, it is often the case that control cost minimization is prioritized over model identification. Indeed, model identification is typically not considered in production optimization, which can lead to delayed awareness and alerting of malfunction. In this paper, we address the problem of simultaneous production optimization and system identification. We develop new algorithms based on modifier adaptation and reinforcement learning, which efficiently manage the tradeoff between cost minimization and identification. For two case studies based on a chemical reactor and subsea oil and gas exploration, we show that our algorithms yield control costs comparable to existing methods while providing rapid identification of system degradation.

A Nonmonotone Matrix-Free Algorithm for Nonlinear Equality-Constrained Least-Squares Problems

  • Authors: Bergou, E.H., Diouane, Y., Dr. Vyacheslav Kungurtsev, Ph.D., Royer, C.W.
  • Publication: SIAM Journal on Scientific Computing. 2021, 43(4), S743-S766. ISSN 1064-8275.
  • Year: 2021
  • DOI: 10.1137/20M1349138
  • Link: https://doi.org/10.1137/20M1349138
  • Department: Intelligent Data Analysis
  • Abstract:
    Least squares form one of the most prominent classes of optimization problems with numerous applications in scientific computing and data fitting. When such formulations aim at modeling complex systems, the optimization process must account for nonlinear dynamics by incorporating constraints. In addition, these systems often incorporate a large number of variables, which increases the difficulty of the problem and motivates the need for efficient algorithms amenable to large-scale implementations. In this paper, we propose and analyze a Levenberg--Marquardt algorithm for nonlinear least squares subject to nonlinear equality constraints. Our algorithm is based on inexact solves of linear least-squares problems that only require Jacobian-vector products. Global convergence is guaranteed by the combination of a composite step approach and a nonmonotone step acceptance rule. We illustrate the performance of our method on several test cases from data assimilation and inverse problems; our algorithm is able to reach the vicinity of a solution from an arbitrary starting point and can outperform the most natural alternatives for these classes of problems.
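  • Background sketch: in generic notation (illustrative only), the problem class is the equality-constrained nonlinear least-squares problem
    \min_{x \in \mathbb{R}^{n}} \; \tfrac{1}{2}\,\lVert F(x) \rVert^{2} \quad \text{s.t.} \quad c(x) = 0,
    and "matrix-free" means that the inner linear least-squares solves access the Jacobians of F and c only through Jacobian-vector and transposed-Jacobian-vector products, which is all that Krylov-type least-squares solvers require.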

A Zeroth Order Method for Stochastic Weakly Convex Optimization

  • DOI: 10.1007/s10589-021-00313-3
  • Link: https://doi.org/10.1007/s10589-021-00313-3
  • Department: Intelligent Data Analysis
  • Abstract:
    In this paper, we consider stochastic weakly convex optimization problems, but without access to a stochastic subgradient oracle. We present a derivative-free algorithm that uses a two-point approximation for computing a gradient estimate of the smoothed function. We prove convergence at a similar rate as state-of-the-art methods, albeit with a larger constant, and report some numerical results showing the effectiveness of the approach.
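  • Background sketch: a minimal Python illustration of a two-point (zeroth-order) gradient estimator for the smoothed function; this is the classical spherical-sampling estimator, assumed here only for illustration and not taken from the paper.
    import numpy as np

    def two_point_gradient_estimate(f, x, delta=1e-3, rng=None):
        """Two-point estimate of the gradient of the delta-smoothed objective.

        f     : callable returning a (possibly noisy) scalar objective value
        x     : current iterate, 1-D numpy array
        delta : smoothing radius

        The returned vector approximates the gradient of the ball-smoothed
        surrogate f_delta(x) = E_v[ f(x + delta * v) ], v uniform in the unit
        ball, and can be used in place of a subgradient by a derivative-free
        stochastic method.
        """
        rng = np.random.default_rng() if rng is None else rng
        d = x.size
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)          # uniform direction on the unit sphere
        return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u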

Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

  • Authors: Dr. Vyacheslav Kungurtsev, Ph.D., Egan, M., Chatterjee, B., Alistarh, D.
  • Publication: Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, California: Association for the Advancement of Artificial Intelligence (AAAI), 2021. p. 8209-8216. ISSN 2159-5399. ISBN 978-1-57735-866-4.
  • Year: 2021
  • Department: Intelligent Data Analysis
  • Abstract:
    Asynchronous distributed algorithms are a popular way to reduce synchronization costs in large-scale optimization, and in particular for neural network training. However, for nonsmooth and nonconvex objectives, few convergence guarantees exist beyond cases where closed-form proximal operator solutions are available. As training most popular deep neural networks corresponds to optimizing nonsmooth and nonconvex objectives, there is a pressing need for such convergence guarantees. In this paper, we analyze for the first time the convergence of stochastic asynchronous optimization for this general class of objectives. In particular, we focus on stochastic subgradient methods allowing for block variable partitioning, where the shared model is asynchronously updated by concurrent processes. To this end, we use a probabilistic model which captures key features of real asynchronous scheduling between concurrent processes. Under this model, we establish convergence with probability one to an invariant set for stochastic subgradient methods with momentum. From a practical perspective, one issue with the family of algorithms that we consider is that they are not efficiently supported by machine learning frameworks, which mostly focus on distributed data-parallel strategies. To address this, we propose a new implementation strategy for shared-memory based training of deep neural networks for a partitioned but shared model in single- and multi-GPU settings. Based on this implementation, we achieve on average about 1.2x speed-up in comparison to state-of-the-art training methods for popular image classification tasks, without compromising accuracy.

Complexity Iteration Analysis for Strongly Convex Multi-objective Optimization Using a Newton Path-following Procedure

  • DOI: 10.1007/s11590-020-01623-x
  • Link: https://doi.org/10.1007/s11590-020-01623-x
  • Department: Intelligent Data Analysis
  • Abstract:
    In this note, we consider the iteration complexity of solving strongly convex multi-objective optimization problems. We discuss the precise meaning of this problem, noting that its definition is ambiguous, and focus on the most natural notion of finding a set of Pareto optimal points across a grid of scalarized problems. We prove that, in most cases, performing sensitivity based path-following after obtaining one solution is the optimal strategy for this task in terms of iteration complexity.
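  • Background sketch: the grid of scalarized problems can be written in generic weighted-sum form (illustrative notation)
    \min_{x} \; \sum_{i=1}^{m} \lambda_{i}\, f_{i}(x), \qquad \lambda \in \Lambda_{\text{grid}} \subset \Delta_{m} = \Big\{ \lambda \ge 0 : \textstyle\sum_{i} \lambda_{i} = 1 \Big\},
    where each scalarization with positive weights yields a Pareto-optimal point; the strategy analyzed above solves one scalarization to optimality and then traces the remaining grid points by sensitivity-based (Newton) path following rather than solving each problem from scratch.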

Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent

  • Authors: Nadiradze, G., Markov, I., Chatterjee, B., Dr. Vyacheslav Kungurtsev, Ph.D., Alistarh, D.
  • Publication: Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, California: Association for the Advancement of Artificial Intelligence (AAAI), 2021. p. 9037-9045. ISSN 2374-3468. ISBN 978-1-57735-866-4.
  • Year: 2021
  • Department: Intelligent Data Analysis
  • Abstract:
    One key element behind the recent progress of machine learning has been the ability to train machine learning models in large-scale distributed shared-memory and message-passing environments. Most of these models are trained employing variants of stochastic gradient descent (SGD) based optimization, but most methods involve some type of consistency relaxation relative to sequential SGD, to mitigate its large communication or synchronization costs at scale. In this paper, we introduce a general consistency condition covering communication-reduced and asynchronous distributed SGD implementations. Our framework, called elastic consistency, decouples the system-specific aspects of the implementation from the SGD convergence requirements, giving a general way to obtain convergence bounds for a wide variety of distributed SGD methods used in practice. Elastic consistency can be used to re-derive or improve several previous convergence bounds in message-passing and shared-memory settings, but also to analyze new models and distribution schemes. As a direct application, we propose and analyze a new synchronization-avoiding scheduling scheme for distributed SGD, and show that it can be used to efficiently train deep convolutional models for image classification.

Ghost Penalties in Nonconvex Constrained Optimization: Diminishing Stepsizes and Iteration Complexity

  • Authors: Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Lampariello, L., Scutari, G.
  • Publication: Mathematics of Operations Research. 2021, 46(2), 595-627. ISSN 0364-765X.
  • Year: 2021
  • DOI: 10.1287/moor.2020.1079
  • Link: https://doi.org/10.1287/moor.2020.1079
  • Department: Intelligent Data Analysis
  • Abstract:
    We consider nonconvex constrained optimization problems and propose a new approach to the convergence analysis based on penalty functions. We make use of classical penalty functions in an unconventional way, in that penalty functions only enter in the theoretical analysis of convergence while the algorithm itself is penalty free. Based on this idea, we are able to establish several new results, including the first general analysis for diminishing stepsize methods in nonconvex, constrained optimization, showing convergence to generalized stationary points, and a complexity study for sequential quadratic programming–type algorithms.

A Shifted Primal-Dual Penalty-Barrier Method for Nonlinear Optimization

  • Authors: Gill, P.E., Dr. Vyacheslav Kungurtsev, Ph.D., Robinson, D.P.
  • Publication: SIAM Journal on Optimization. 2020, 30(2), 1067-1093. ISSN 1052-6234.
  • Year: 2020
  • DOI: 10.1137/19M1247425
  • Link: https://doi.org/10.1137/19M1247425
  • Department: Intelligent Data Analysis
  • Abstract:
    In nonlinearly constrained optimization, penalty methods provide an effective strategy for handling equality constraints, while barrier methods provide an effective approach for the treatment of inequality constraints. A new algorithm for nonlinear optimization is proposed based on minimizing a shifted primal-dual penalty-barrier function. Certain global convergence properties are established. In particular, it is shown that a limit point of the sequence of iterates may always be found that is either an infeasible stationary point or a complementary approximate Karush--Kuhn--Tucker point; i.e., it satisfies reasonable stopping criteria and is a Karush--Kuhn--Tucker point under a regularity condition that is the weakest constraint qualification associated with sequential optimality conditions. It is also shown that under suitable additional assumptions, the method is equivalent to a shifted variant of the primal-dual path-following method in the neighborhood of a solution. Numerical examples are provided that illustrate the performance of the method compared to a widely used conventional interior-point method.

A Two-Step Pre-Processing for Semidefinite Programming

  • DOI: 10.1109/CDC42340.2020.9304494
  • Link: https://doi.org/10.1109/CDC42340.2020.9304494
  • Department: Centrum umělé inteligence, Intelligent Data Analysis
  • Abstract:
    In semidefinite programming (SDP), a number of pre-processing techniques have been developed, including procedures based on chordal decomposition, which exploit sparsity in the semidefinite program in order to reduce the dimension of individual constraints, and procedures based on facial reduction, which reduce the dimension of the problem by removing redundant rows and columns. So far, these have been studied in isolation. We show that these techniques are, in fact, complementary. In computational experiments, we show that a two-step pre-processing followed by a standard interior-point method outperforms the interior-point method, with or without either of the pre-processing techniques, by a considerable margin.

Asynchronous Parallel Algorithms for Nonconvex Optimization

  • Authors: Cannelli, L., Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Scutari, G.
  • Publication: Mathematical Programming. 2020, 184(1-2), 121-154. ISSN 0025-5610.
  • Year: 2020
  • DOI: 10.1007/s10107-019-01408-w
  • Link: https://doi.org/10.1007/s10107-019-01408-w
  • Department: Intelligent Data Analysis
  • Abstract:
    We propose a new asynchronous parallel block-descent algorithmic framework for the minimization of the sum of a smooth nonconvex function and a nonsmooth convex one, subject to both convex and nonconvex constraints. The proposed framework hinges on successive convex approximation techniques and a novel probabilistic model that captures key elements of modern computational architectures and asynchronous implementations in a more faithful way than current state-of-the-art models. Other key features of the framework are: (1) it covers in a unified way several specific solution methods; (2) it accommodates a variety of possible parallel computing architectures; and (3) it can deal with nonconvex constraints. Almost sure convergence to stationary solutions is proved, and theoretical complexity results are provided, showing nearly ideal linear speedup when the number of workers is not too large.

Convergence and Complexity Analysis of a Levenberg-Marquardt Algorithm for Inverse Problems

  • Authors: Bergou, E.H., Diouane, Y., Dr. Vyacheslav Kungurtsev, Ph.D.
  • Publication: Journal of Optimization Theory and Applications. 2020, 185(3), 927-944. ISSN 0022-3239.
  • Year: 2020
  • DOI: 10.1007/s10957-020-01666-1
  • Link: https://doi.org/10.1007/s10957-020-01666-1
  • Department: Intelligent Data Analysis
  • Abstract:
    The Levenberg-Marquardt algorithm is one of the most popular algorithms for finding the solution of nonlinear least squares problems. Across different modified variations of the basic procedure, the algorithm enjoys global convergence, a competitive worst-case iteration complexity rate, and a guaranteed rate of local convergence for both zero and nonzero small residual problems, under suitable assumptions. We introduce a novel Levenberg-Marquardt method that matches, simultaneously, the state of the art in all of these convergence properties with a single seamless algorithm. Numerical experiments confirm the theoretical behavior of our proposed algorithm.

Convergence Rate for Diminishing Stepsize Methods in Nonconvex Constrained Optimization via Ghost Penalties

  • Authors: Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Lampariello, L., Scutari, G.
  • Publication: Atti della Accademia Peloritana dei Pericolanti. Classe di Scienze Fisiche, Matematiche e Naturali. 2020, 98(S2). ISSN 1825-1242.
  • Year: 2020
  • DOI: 10.1478/AAPP.98S2A8
  • Link: https://doi.org/10.1478/AAPP.98S2A8
  • Department: Intelligent Data Analysis
  • Abstract:
    This is a companion paper to "Ghost penalties in nonconvex constrained optimization: Diminishing stepsizes and iteration complexity" (to appear in Mathematics of Operations Research). We consider the ghost penalty scheme for nonconvex, constrained optimization introduced in that paper, coupled with a diminishing stepsize procedure. Under an extended Mangasarian-Fromovitz-type constraint qualification we give an expression for the maximum number of iterations needed to achieve a given solution accuracy according to a natural stationarity measure, thus establishing the first result of this kind for a diminishing stepsize method for nonconvex, constrained optimization problems.

Fast Sensitivity-Based Economic Model Predictive Control for Degenerate Systems

  • DOI: 10.1016/j.jprocont.2020.02.006
  • Link: https://doi.org/10.1016/j.jprocont.2020.02.006
  • Department: Intelligent Data Analysis
  • Abstract:
    We present a sensitivity-based nonlinear model predictive control (NMPC) algorithm and demonstrate it on a case study with an economic cost function. In contrast to existing sensitivity-based approaches that make strong assumptions on the underlying optimization problem (e.g. the linear independence constraint qualification, which implies unique multipliers), our method is designed to handle problems satisfying a weaker constraint qualification, namely the Mangasarian-Fromovitz constraint qualification (MFCQ). Our nonlinear programming (NLP) sensitivity update consists of three steps. The first step is a corrector step in which a system of linear equations is solved. Then a predictor step is computed by a quadratic program (QP). Finally, a linear program (LP) is solved to select the multipliers that give the correct sensitivity information. A path-following scheme containing these steps is embedded in the advanced-step NMPC (asNMPC) framework. We demonstrate our method on a large-scale case example consisting of a reactor and distillation process. We show that LICQ does not hold and that the path-following method is able to accurately approximate the ideal solutions generated by an NLP solver.

Lifted Weight Learning of Markov Logic Networks (Revisited One More Time)

  • Department: Intelligent Data Analysis
  • Abstract:
    We revisit the problem of lifted weight learning of Markov logic networks (MLNs). We show that there is an algorithm for maximum-likelihood learning which runs in time polynomial in the size of the domain, whenever the partition function of the given MLN can be computed in polynomial time. This improves on our recent results, where we showed the same result with an additional dependency of the runtime on a parameter of the training data, called interiority, which measures how “extreme” the given training data are. In this work, we get rid of this dependency. The main new technical ingredient that we exploit is a set of theoretical results obtained recently by Straszak and Vishnoi (Maximum Entropy Distributions: Bit Complexity and Stability, COLT 2019).

Reinforcement Learning Based on Real-Time Iteration NMPC

  • Authors: Zanon, M., Dr. Vyacheslav Kungurtsev, Ph.D., Gros, S.
  • Publication: Proceedings of the IFAC World Congress 2020. Laxenburg: IFAC, 2020. p. 5213-5218. IFAC-PapersOnLine. vol. 53. ISSN 2405-8963.
  • Year: 2020
  • DOI: 10.1016/j.ifacol.2020.12.1195
  • Link: https://doi.org/10.1016/j.ifacol.2020.12.1195
  • Department: Intelligent Data Analysis
  • Abstract:
    Reinforcement Learning (RL) has proven a stunning ability to learn optimal policies from data without any prior knowledge on the process. The main drawback of RL is that it is typically very difficult to guarantee stability and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an advanced model-based control technique which does guarantee safety and stability, but only yields optimality for the nominal model. Therefore, it has been recently proposed to use NMPC as a function approximator within RL. While the ability of this approach to yield good performance has been demonstrated, the main drawback hindering its applicability is related to the computational burden of NMPC, which has to be solved to full convergence. In practice, however, computationally efficient algorithms such as the Real-Time Iteration (RTI) scheme are deployed in order to return an approximate NMPC solution in very short time. In this paper we bridge this gap by extending the existing theoretical framework to also cover RL based on RTI NMPC. We demonstrate the effectiveness of this new RL approach with a nontrivial example modeling a challenging nonlinear system subject to stochastic perturbations with the objective of optimizing an economic cost.

Second-Order Guarantees of Distributed Gradient Algorithms

  • Authors: Daneshmand, A., Scutari, G., Dr. Vyacheslav Kungurtsev, Ph.D.
  • Publication: SIAM Journal on Optimization. 2020, 30(4), 3029-3068. ISSN 1095-7189.
  • Year: 2020
  • DOI: 10.1137/18M121784X
  • Link: https://doi.org/10.1137/18M121784X
  • Department: Intelligent Data Analysis
  • Abstract:
    We consider distributed smooth nonconvex unconstrained optimization over networks, modeled as a connected graph. We examine the behavior of distributed gradient-based algorithms near strict saddle points. Specifically, we establish that (i) the renowned distributed gradient descent algorithm likely converges to a neighborhood of a second-order stationary (SoS) solution; and (ii) the more recent class of distributed algorithms based on gradient tracking (implementable also over digraphs) likely converges to exact SoS solutions, thus avoiding (strict) saddle points. Furthermore, new convergence rate results for first-order critical points are established for the latter class of algorithms.
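  • Background sketch: one common form of the gradient-tracking iteration referred to in (ii), in illustrative notation (the exact scheme in the paper may differ):
    x_{i}^{k+1} = \sum_{j} w_{ij}\,\big( x_{j}^{k} - \alpha\, y_{j}^{k} \big), \qquad
    y_{i}^{k+1} = \sum_{j} w_{ij}\, y_{j}^{k} + \nabla f_{i}(x_{i}^{k+1}) - \nabla f_{i}(x_{i}^{k}),
    where the auxiliary variable y_{i}^{k} tracks the network-average gradient, so each agent asymptotically moves along the gradient of the global objective rather than only its local one.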

Lifted Weight Learning of Markov Logic Networks Revisited

  • Department: Intelligent Data Analysis
  • Abstract:
    We study lifted weight learning of Markov logic networks. We show that there is an algorithm for maximum-likelihood learning of 2-variable Markov logic networks which runs in time polynomial in the domain size. Our results are based on existing lifted-inference algorithms and recent algorithmic results on computing maximum entropy distributions.

A Predictor-Corrector Path-Following Algorithm for Dual-Degenerate Parametric Optimization Problems

  • DOI: 10.1137/16M1068736
  • Link: https://doi.org/10.1137/16M1068736
  • Department: Intelligent Data Analysis
  • Abstract:
    Most path-following algorithms for tracing a solution path of a parametric nonlinear optimization problem are only certifiably convergent under strong regularity assumptions about the problem functions. In particular, linear independence of the constraint gradients at the solutions is typically assumed, which implies unique multipliers. In this paper we propose a procedure designed to solve problems satisfying a weaker set of conditions, allowing for nonunique (but bounded) multipliers. Each iteration along the path consists of three parts: (1) a Newton corrector step for the primal and dual variables, which is obtained by solving a linear system of equations, (2) a predictor step for the primal and dual variables, which is found as the solution of a quadratic programming problem, and (3) a jump step for the dual variables, which is found as the solution of a linear programming problem. We present a convergence proof and demonstrate the successful solution tracking of the algorithm numerically on a couple of illustrative examples.
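  • Background sketch: in generic notation (illustrative only), the setting is a parametric nonlinear program
    \min_{x} \; f(x, t) \quad \text{s.t.} \quad c(x, t) = 0, \;\; g(x, t) \le 0, \qquad t \in [0, 1],
    whose primal-dual solution path (x^{*}(t), \lambda^{*}(t), \mu^{*}(t)) is traced as the parameter t varies; the three steps above update the primal-dual estimate at each new parameter value while allowing the multiplier set to be non-unique but bounded.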

A Stabilized SQP Method: Global Convergence

  • Authors: Gill, P.E., Dr. Vyacheslav Kungurtsev, Ph.D., Robinson, D.P.
  • Publication: IMA Journal of Numerical Analysis (IMAJNA). 2017, 37(1), 407-443. ISSN 0272-4979.
  • Year: 2017
  • DOI: 10.1093/imanum/drw004
  • Link: https://doi.org/10.1093/imanum/drw004
  • Department: Intelligent Data Analysis
  • Abstract:
    Stabilized sequential quadratic programming (SQP) methods for nonlinear optimization are designed to provide a sequence of iterates with fast local convergence even when the active-constraint gradients are linearly dependent. This paper concerns the global convergence properties of a stabilized SQP method with a primal–dual augmented Lagrangian merit function. The proposed method incorporates two novel features. First, a flexible line search is used based on a direction formed from an approximate solution of a strictly convex quadratic programming (QP) subproblem and, when one exists, a direction of negative curvature for the primal–dual merit function. Second, when certain conditions hold, an approximate QP solution is computed by solving a single linear system defined in terms of an estimate of the optimal active set. We also establish two desirable convergence results. (i) It is shown that with an appropriate choice of termination condition, the method terminates in a finite number of iterations without the assumption of a constraint qualification. The method may be interpreted as an SQP method with an augmented Lagrangian safeguarding strategy. This safeguarding becomes relevant only when the iterates are converging to an infeasible stationary point of the norm of the constraint violations. Otherwise, the method terminates with a point that approximately satisfies certain second-order necessary conditions for optimality. In this situation, if all termination conditions are removed, then the limit points either satisfy the same second-order necessary conditions exactly or fail to satisfy a weak second-order constraint qualification. (ii) The global convergence analysis concerns a specific algorithm that estimates the least curvature of the merit function at each step. If negative curvature directions are omitted, the analysis still applies and establishes convergence to either first-order solutions or infeasible stationary points.

A Stabilized SQP Method: Superlinear Convergence

  • DOI: 10.1007/s10107-016-1066-7
  • Link: https://doi.org/10.1007/s10107-016-1066-7
  • Department: Intelligent Data Analysis
  • Abstract:
    Stabilized sequential quadratic programming (sSQP) methods for nonlinear optimization generate a sequence of iterates with fast local convergence regardless of whether or not the active-constraint gradients are linearly dependent. This paper concerns the local convergence analysis of an sSQP method that uses a line search with a primal-dual augmented Lagrangian merit function to enforce global convergence. The method is provably well-defined and is based on solving a strictly convex quadratic programming subproblem at each iteration. It is shown that the method has superlinear local convergence under assumptions that are no stronger than those required by conventional stabilized SQP methods. The fast local convergence is obtained by allowing a small relaxation of the optimality conditions for the quadratic programming subproblem in the neighborhood of a solution. In the limit, the line search selects the unit step length, which implies that the method does not suffer from the Maratos effect. The analysis indicates that the method has the same strong first- and second-order global convergence properties that have been established for augmented Lagrangian methods, yet is able to transition seamlessly to sSQP with fast local convergence in the neighborhood of a solution. Numerical results on some degenerate problems are reported.

Input Shaper Optimization with a Constraint on the Spectrum Distribution

  • DOI: 10.1016/j.ifacol.2017.08.1893
  • Link: https://doi.org/10.1016/j.ifacol.2017.08.1893
  • Department: Katedra řídicí techniky, Intelligent Data Analysis
  • Abstract:
    This paper presents a procedure to parametrize input shapers with piecewise equally distributed time delays. The procedure seeks to minimize residual vibrations around an undesirable frequency but at the same time avoids introducing zeros in the right half plane. This ensures that the introduction of the inverse of the input shaper in a closed feedback loop does not introduce unstable poles in the system. The requirement of stable spectra for the input shaper appears as an additional constraint in a constrained optimization problem that includes minimizing the response time and residual vibrations while preserving required properties of the shaper. The novel introduction of a spectral constraint makes the optimization problem nonsmooth and nonconvex, necessitating special optimization algorithms for its reliable solution. The framework is outlined in detail and the results of the optimization are presented indicating the success of the procedure.

Sensitivity-Based Economic NMPC with a Path-Following Approach

  • DOI: 10.3390/pr5010008
  • Link: https://doi.org/10.3390/pr5010008
  • Department: Intelligent Data Analysis
  • Abstract:
    We present a sensitivity-based predictor-corrector path-following algorithm for fast nonlinear model predictive control (NMPC) and demonstrate it on a large case study with an economic cost function. The path-following method is applied within the advanced-step NMPC framework to obtain fast and accurate approximate solutions of the NMPC problem. In our approach, we solve a sequence of quadratic programs to trace the optimal NMPC solution along a parameter change. A distinguishing feature of the path-following algorithm in this paper is that the strongly-active inequality constraints are included as equality constraints in the quadratic programs, while the weakly-active constraints are left as inequalities. This leads to close tracking of the optimal solution. The approach is applied to an economic NMPC case study consisting of a process with a reactor, a distillation column and a recycler. We compare the path-following NMPC solution with an ideal NMPC solution, which is obtained by solving the full nonlinear programming problem. Our simulations show that the proposed algorithm effectively traces the exact solution.

Hybrid Random/Deterministic Parallel Algorithms for Convex and Nonconvex Big Data Optimization

  • Authors: Daneshmand, A., Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Scutari, G.
  • Publication: IEEE Transactions on Signal Processing. 2015, 63(15), 3914-3929. ISSN 1053-587X.
  • Year: 2015
  • DOI: 10.1109/TSP.2015.2436357
  • Link: https://doi.org/10.1109/TSP.2015.2436357
  • Department: Katedra počítačů
  • Abstract:
    We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a nonsmooth (possibly nonseparable), convex one. The latter term is usually employed to enforce structure in the solution, typically sparsity. The main contribution of this work is a novel parallel, hybrid random/deterministic decomposition scheme wherein, at each iteration, a subset of (block) variables is updated at the same time by minimizing a convex surrogate of the original nonconvex function. To tackle huge-scale problems, the (block) variables to be updated are chosen according to a mixed random and deterministic procedure, which captures the advantages of both pure deterministic and random update-based schemes. Almost sure convergence of the proposed scheme is established. Numerical results show that on huge-scale problems the proposed hybrid random/deterministic algorithm compares favorably to random and deterministic schemes on both convex and nonconvex problems.

Flexible Selective Parallel Algorithms for Big Data Optimization

  • Authors: Daneshmand, A., Facchinei, F., Dr. Vyacheslav Kungurtsev, Ph.D., Scutari, G.
  • Publication: Conference Record of the 2014 Forty-Eighth Asilomar Conference on Signals, Systems & Computers. USA: IEEE Computer Society, 2014. p. 3-7. ISSN 1058-6393. ISBN 978-1-4799-8297-4.
  • Year: 2014
  • DOI: 10.1109/ACSSC.2014.7094384
  • Link: https://doi.org/10.1109/ACSSC.2014.7094384
  • Department: Intelligent Data Analysis
  • Abstract:
    We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a nonsmooth (separable), convex one. The latter term is usually employed to enforce structure in the solution, typically sparsity. The main contribution of this work is a novel parallel, hybrid random/deterministic decomposition scheme wherein, at each iteration, a subset of (block) variables is updated at the same time by minimizing local convex approximations of the original nonconvex function. To tackle huge-scale problems, the (block) variables to be updated are chosen according to a mixed random and deterministic procedure, which captures the advantages of both pure deterministic and random update-based schemes. Almost sure convergence of the proposed scheme is established. Numerical results on huge-scale problems show that the proposed algorithm outperforms current schemes.

Sequential Quadratic Programming Methods for Parametric Nonlinear Optimization

  • DOI: 10.1007/s10589-014-9696-2
  • Link: https://doi.org/10.1007/s10589-014-9696-2
  • Department: Katedra počítačů
  • Abstract:
    Sequential quadratic programming (SQP) methods are known to be efficient for solving a series of related nonlinear optimization problems because of desirable hot- and warm-start properties: a solution for one problem is a good estimate of the solution of the next. However, standard SQP solvers contain elements to enforce global convergence that can interfere with the potential to take full advantage of these theoretical local properties. We present two new predictor-corrector procedures for solving a nonlinear program given a sufficiently accurate estimate of the solution of a similar problem. The procedures attempt to trace a homotopy path between solutions of the two problems, staying within the local domain of convergence for the series of problems generated. We provide theoretical convergence and tracking results, as well as some numerical results demonstrating the robustness and performance of the methods.

Responsible for this page: Ing. Mgr. Radovan Suk