The importance of Variational Autoencoders reaches far beyond standalone generative models -- the approach is also used for learning latent representations and can be generalized to semi-supervised learning. This requires a thorough analysis of their commonly known shortcomings: posterior collapse and approximation errors. This paper analyzes VAE approximation errors caused by the combination of the ELBO objective and encoder models from conditional exponential families, including, but not limited to, commonly used conditionally independent discrete and continuous models.
We characterize subclasses of generative models consistent with these encoder families. We show that the ELBO optimizer is pulled away from the likelihood optimizer towards the consistent subset and study this effect experimentally. Importantly, this subset cannot be enlarged, and the respective error cannot be decreased, by considering deeper encoder/decoder networks.
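The source of this error can be made explicit via the standard ELBO identity (restated here for context; the notation is generic, not taken from the paper):

```latex
\log p_\theta(x) =
\underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]}_{\text{ELBO}}
+ \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right).
```

Maximizing the ELBO over both networks therefore penalizes decoders whose true posterior lies outside the encoder family, which is why the optimizer is biased towards the consistent subclass.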
Coupling cell detection and tracking by temporal feedback
The tracking-by-detection strategy is the backbone of many methods for tracking living cells in time-lapse microscopy. An object detector is first applied to the input images, and the resulting detection candidates are then linked by a data association module. The performance of such methods strongly depends on the quality of the detector, because detection errors propagate to the linking step. To tackle this issue, we propose a joint model for segmentation, detection and tracking. The model is defined implicitly as the limiting distribution of a Markov chain Monte Carlo algorithm and incorporates temporal feedback, which makes it possible to dynamically alter detector parameters using hints from neighboring frames and, in this way, correct detection errors. The proposed method can integrate any detector and is therefore not restricted to a specific domain. The parameters of the model are learned using an objective based on empirical risk minimization. We use our method to conduct large-scale experiments on confluent cultures of endothelial cells and evaluate its performance in the ISBI Cell Tracking Challenge, where it consistently scored among the top three methods.
Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated because the model response is piecewise constant. We consider stochastic binary networks, obtained by adding noise in front of the activations. The expected model response then becomes a smooth function of the parameters; its gradient is well defined, but estimating it accurately is challenging. We propose a new method for this estimation problem that combines sampling and analytic approximation steps. The method has significantly reduced variance at the price of a small bias, which gives a very practical tradeoff compared with existing unbiased and biased estimators. We further show that one extra linearization step leads to a deep straight-through estimator, previously known only as an ad hoc heuristic. We experimentally show higher accuracy in gradient estimation and demonstrate more stable and better-performing training of deep convolutional models with both proposed methods.
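The smoothing effect of injected noise can be seen in a minimal sketch (an illustration of the setting only, not the paper's estimator): with logistic noise added before a step activation, the expected response has the closed form sigmoid(a), which plain Monte-Carlo sampling recovers only noisily.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_response(a, n_samples=200_000):
    """Monte-Carlo estimate of E_z[step(a + z)] with logistic noise z."""
    z = rng.logistic(size=n_samples)
    return np.mean((a + z) > 0)

def sigmoid(a):
    """Closed form of the same expectation: P(z > -a) = sigmoid(a)."""
    return 1.0 / (1.0 + np.exp(-a))

a = 0.7
print(expected_response(a), sigmoid(a))
```

The sampled and analytic values agree up to Monte-Carlo noise; combining the two is the kind of sample-analytic tradeoff the abstract refers to.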
Feed-forward Propagation in Probabilistic Neural Networks with Categorical and Max Layers
Probabilistic Neural Networks deal with various sources of stochasticity: input noise, dropout, stochastic neurons, parameter uncertainties modeled as random variables, etc.
In this paper we revisit a feed-forward propagation approach that allows one to estimate, for each neuron, its mean and variance with respect to all mentioned sources of stochasticity. In contrast, standard NNs propagate only point estimates, discarding the uncertainty.
Methods that also propagate the variance have been proposed by several authors in different contexts. The view presented here attempts to clarify the assumptions and derivations behind such methods, relate them to classical NNs and broaden their scope of applicability.
The main technical contributions are new approximations for the distributions of argmax and max-related transforms, which allow for fully analytic uncertainty propagation in networks with softmax and max-pooling layers as well as leaky ReLU activations.
We evaluate the accuracy of the approximation and suggest a simple calibration. Applying the method to networks with dropout allows for faster training and gives improved test likelihoods without the need for sampling.
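The Gaussian closed forms behind such a propagation can be sketched as follows (a simplified illustration under an independent-Gaussian assumption, not the paper's full scheme): a linear layer maps the first two moments exactly, and for a ReLU the output mean and variance of X ~ N(mu, var) are available analytically.

```python
import numpy as np
from math import erf, exp, pi, sqrt

def pdf(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def linear_moments(mu, var, w, b):
    """Exact mean/variance of w @ x + b for independent x_i ~ N(mu_i, var_i)."""
    return w @ mu + b, (w ** 2) @ var

def relu_moments(mu, var):
    """Closed-form mean/variance of max(0, X) for X ~ N(mu, var)."""
    s = sqrt(var)
    a = mu / s
    mean = mu * cdf(a) + s * pdf(a)
    second = (mu ** 2 + var) * cdf(a) + mu * s * pdf(a)
    return mean, second - mean ** 2
```

Chaining such layer-wise maps propagates the mean and variance of the input distribution through the whole network analytically, which is the kind of feed-forward pass the abstract describes.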
In this work we investigate the reasons why Batch Normalization (BN) improves the generalization performance of deep networks. We argue that one major reason, distinguishing it from data-independent normalization methods, is the randomness of batch statistics. This randomness appears in the parameters rather than in the activations and admits an interpretation as practical Bayesian learning. We apply this idea to other (deterministic) normalization techniques that are oblivious to the batch size. We show that their generalization performance can be improved significantly by Bayesian learning of the same form. We obtain test performance comparable to BN and, at the same time, better validation losses suitable for subsequent output uncertainty estimation through the approximate Bayesian posterior.
Finding a Given Number of Solutions to a System of Fuzzy Constraints
A minimax modification of a fuzzy constraint satisfaction problem is considered, where constraints determine not whether a given solution is feasible but a numerical degree of satisfiability. We propose an algorithm that finds a given number of solutions with the highest degree of satisfiability in polynomial time for the subclass of problems whose constraints are invariant under some majority operator. Importantly, the operator itself need not be known, and its existence need not be guaranteed. For any system of fuzzy constraints, the algorithm either finds a given number of best solutions or rejects the problem; the latter is possible only when no such operator exists.
Normalization of Neural Networks using Analytic Variance Propagation
We address the problem of estimating statistics of hidden units in a neural network using a method of analytic moment propagation. These statistics are useful for approximate whitening of the inputs in front of saturating non-linearities such as the sigmoid function. This is important for the initialization of training and for reducing the accumulated scale and bias dependencies (compensating covariate shift), which presumably eases learning. Batch normalization, currently a very widely applied technique, uses sample estimates of the statistics of hidden units over a batch. The proposed estimation instead propagates the mean and variance of the training set analytically through the network. The result depends on the network structure and its current weights, but not on the specific batch input. The estimates are suitable for initialization and normalization, efficient to compute and independent of the batch size. Our experiments support these claims well. However, the method does not share the generalization properties of BN, into which our experiments give some additional insight.
Polarized actin and VE-cadherin dynamics regulate junctional remodelling and cell migration during sprouting angiogenesis
Cao, J., Ehling, M., März, S., Seebach, J., Tarbashevich, K., Sixta, T., Pitulescu, M.E., Werner, A.-C., Flach, B., Montanez, E., Raz, E., Adams, R.H., Schnittler, H.
VEGFR-2/Notch signalling regulates angiogenesis in part by driving the remodelling of endothelial cell junctions and by inducing cell migration. Here, we show that VEGF-induced polarized cell elongation increases cell perimeter and decreases the relative VE-cadherin concentration at junctions, triggering polarized formation of actin-driven junction-associated intermittent lamellipodia (JAIL) under control of the WASP/WAVE/ARP2/3 complex. JAIL allow formation of new VE-cadherin adhesion sites that are critical for cell migration and monolayer integrity. Whereas at the leading edge of the cell, large JAIL drive cell migration with supportive contraction, lateral junctions show small JAIL that allow relative cell movement. VEGFR-2 activation initiates cell elongation through dephosphorylation of junctional myosin light chain II, which leads to a local loss of tension to induce JAIL-mediated junctional remodelling. These events require both microtubules and polarized Rac activity. Together, we propose a model where polarized JAIL formation drives directed cell migration and junctional remodelling during sprouting angiogenesis.
Multiple Object Segmentation and Tracking by Bayes Risk Minimization
Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, Part II. Cham: Springer International Publishing, 2016. pp. 607-615. Lecture Notes in Computer Science. vol. 9901. ISSN 0302-9743. ISBN 978-3-319-46722-1.
Motion analysis of cells and subcellular particles such as vesicles, microtubules or membrane receptors is essential for understanding various processes that take place in living tissue. Manual detection and tracking is usually infeasible due to the large number of particles. In addition, the images are often distorted by noise caused by the limited resolution of optical microscopes, which makes the analysis even more challenging. In this paper we formulate the task of detecting and tracking small objects as Bayes risk minimization. We introduce a novel spatio-temporal probabilistic graphical model which captures the dynamics of individual particles as well as their relations, and propose a loss function suitable for this task. The performance of our method is evaluated on artificial but highly realistic data from the 2012 ISBI Particle Tracking Challenge. We show that our approach is fully comparable to, or even outperforms, state-of-the-art methods.
Joint Segmentation and Registration Through the Duality of Congealing and Maximum Likelihood Estimate
In this paper we consider the task of joint registration and segmentation. A popular method which aligns images and simultaneously estimates a simple statistical shape model was proposed by E. Learned-Miller and is known as congealing. It uses the entropy of a simple, pixel-wise independent distribution as the objective function for searching for the unknown transformations. Besides being intuitive and appealing, this idea raises several theoretical and practical questions, which we try to answer in this paper. First, we analyse the approach theoretically and show that the original congealing is in fact the DC-dual task (difference of convex functions) of a properly formulated Maximum Likelihood estimation task. This interpretation immediately leads to a different choice of algorithm, which is substantially simpler than the known congealing algorithm. The second contribution is to show how to generalise the task to models in which the shape prior is formulated in terms of segmentation labellings and is related to the signal domain via a parametric appearance model. We call this generalisation unsupervised congealing. The new approach is applied to the task of aligning and segmenting imaginal discs of Drosophila melanogaster larvae.
Digital Terrain Modeling and Glacier Topographic Characterization
The Earth's topography results from dynamic interactions involving climate, tectonics, and surface processes. In this chapter our main interest is in describing and illustrating how satellite-derived DEMs (and other DEMs) can be used to derive information about glacier dynamical changes.
Dynamic Programming (DP) is a paradigm used in algorithms for solving optimization problems. It relies on the construction of nested subproblems such that the solution of the main problem can be obtained from the solutions of the subproblems.
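The nested-subproblem construction can be sketched on a textbook instance (chosen purely for illustration): the minimum-coin-change problem, where the subproblem best[a] (fewest coins summing to a) is solved from the smaller subproblems best[a - c].

```python
def min_coins(amount, coins):
    """Fewest coins summing to `amount`, via DP over subproblems 0..amount."""
    INF = float("inf")
    best = [0] + [INF] * amount          # best[a] = optimum for amount a
    for a in range(1, amount + 1):
        for c in coins:
            # Solution of the subproblem `a` from the subproblem `a - c`.
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount]

print(min_coins(11, [1, 4, 5]))  # 3, e.g. 5 + 5 + 1
```

Note how the main problem is answered directly from the table of subproblem solutions, which is exactly the structure described above.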
The Expectation Maximization algorithm iteratively maximizes the likelihood of a training sample with respect to unknown parameters of a probability model under the condition of missing information. The training sample is assumed to represent a set of independent realizations of a random variable defined on the underlying probability space.
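A minimal concrete instance (illustrative only; unit variances are fixed for brevity): fitting the means and weights of a two-component Gaussian mixture, where the missing information is the component label of each sample.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic sample; the component labels are the missing information.
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])

mu = np.array([-1.0, 1.0])   # initial means
pi_k = np.array([0.5, 0.5])  # initial mixture weights (unit variances fixed)

for _ in range(50):
    # E-step: posterior responsibility of each component for each point.
    dens = np.exp(-0.5 * (x[:, None] - mu[None, :]) ** 2) * pi_k
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: maximize the expected complete-data log-likelihood.
    pi_k = resp.mean(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)

print(mu)  # converges close to the true means -2 and 3
```

Each iteration provably does not decrease the sample likelihood, which is the property the abstract refers to.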
Minimax Problems of Discrete Optimization Invariant under Majority Operators
A special class of discrete optimization problems that are stated as a minimax modification of the constraint satisfaction problem is studied. The minimax formulation of the problem generalizes the classical problem to realistic situations where the constraints order the elements of the set by the degree of their feasibility, rather than defining a dichotomy between feasible and infeasible subsets. The invariance of this ordering under an operator is defined, and the discrete minimization of functions invariant under majority operators is proved to have polynomial complexity. A particular algorithm for this minimization is described.
A Class of Random Fields on Complete Graphs with Tractable Partition Function
The aim of this short note is to draw attention to a method by which the partition function and marginal probabilities for a certain class of random fields on complete graphs can be computed in polynomial time. This class includes Ising models with homogeneous pairwise potentials but arbitrary (inhomogeneous) unary potentials. Similarly, the partition function and marginal probabilities can be computed in polynomial time for random fields on complete bipartite graphs, provided they have homogeneous pairwise potentials. We expect that these tractable classes of large-scale random fields can be very useful for the evaluation of approximation algorithms by providing exact error estimates.
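One way such a computation can go (a sketch assuming ±1 spins, a single homogeneous coupling J and arbitrary fields h_i; the note's exact construction may differ in details): on the complete graph the pairwise energy depends only on the number k of +1 spins, so grouping states by k reduces the partition function to elementary symmetric polynomials, computable by dynamic programming in O(n^2) instead of O(2^n).

```python
import numpy as np

def partition_complete_ising(h, J):
    """Z = sum over s in {-1,+1}^n of exp(h.s + J * sum_{i<j} s_i s_j).

    With m = sum_i s_i, the pairwise sum equals (m**2 - n) / 2, so it depends
    only on the number k of +1 spins (m = 2k - n).  The unary part summed over
    all size-k subsets equals exp(-sum(h)) * e_k(exp(2 h_i)), where e_k is the
    k-th elementary symmetric polynomial, built by DP below.
    """
    h = np.asarray(h, dtype=float)
    n = len(h)
    e = np.zeros(n + 1)
    e[0] = 1.0
    for t in np.exp(2.0 * h):           # e_k <- e_k + t * e_{k-1}
        e[1:] = e[1:] + t * e[:-1]
    k = np.arange(n + 1)
    pair = np.exp(J * ((2 * k - n) ** 2 - n) / 2.0)
    return np.exp(-h.sum()) * float(np.sum(e * pair))
```

For small n the result can be checked against brute-force enumeration over all 2^n states.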
Unsupervised (parameter) learning for MRFs on bipartite graphs
We consider unsupervised (parameter) learning for general Markov random fields on bipartite graphs. This model class includes Restricted Boltzmann Machines. We show that besides the widely used stochastic gradient approximation (a.k.a. Persistent Contrastive Divergence) there is an alternative learning approach: a modified EM algorithm which is tractable because of the bipartiteness of the model graph. We compare the resulting double-loop algorithm with PCD learning experimentally and show that the former converges faster and more stably than the latter.
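The tractability comes from conditional independence across the bipartition. A minimal illustration on a tiny binary RBM (all parameter values below are made up for the example): given the visible units, the posterior over hidden units factorizes, so the exact marginals needed in the E-step cost one matrix product.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny binary RBM with made-up parameters: 3 visible, 2 hidden units.
W = np.array([[ 0.5, -0.2],
              [ 0.1,  0.3],
              [-0.4,  0.6]])      # visible-to-hidden weights
b = np.array([0.1, -0.1])        # hidden biases
v = np.array([1.0, 0.0, 1.0])    # an observed visible configuration

# Bipartiteness => p(h | v) factorizes over hidden units, so the exact
# posterior marginals are a single affine map followed by a sigmoid:
p_h = sigmoid(v @ W + b)
```

For a general (non-bipartite) graph this posterior would not factorize and the exact E-step would be intractable, which is why the bipartite structure matters here.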
CVPR 2011: Proceedings of the 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2011, pp. 2177-2182. IEEE Conference on Computer Vision and Pattern Recognition. ISSN 1063-6919. ISBN 978-1-4577-0393-5.
We analyse the potential of Gibbs Random Fields for shape prior modelling. We show that the expressive power of second order GRFs is already sufficient to express spatial relations between shape parts and simple shapes simultaneously. This makes it possible to model and recognise complex shapes as spatial compositions of simpler parts.
Modelling Distributed Shape Priors by Gibbs Random Fields of Second Order
We analyse the potential of Gibbs Random Fields for shape prior modelling. We show that the expressive power of second order GRFs is already sufficient to express simple shapes and spatial relations between them simultaneously. This makes it possible to model and recognise complex shapes as spatial compositions of simpler parts.
Structural, Syntactic, and Statistical Pattern Recognition. Berlin & Heidelberg: Springer, 2008. pp. 177-186. Lecture Notes in Computer Science. vol. 5342. ISSN 0302-9743. ISBN 978-3-540-89688-3.
We propose a combination of shape prior models with Markov Random Fields. The model makes it possible to integrate multiple shape priors and appearance models into MRF models for segmentation. We discuss a recognition task and introduce a general learning scheme. Both tasks are solved within the scope of the model and verified experimentally.
TRLFS: Analysing spectra with an expectation-maximization (EM) algorithm
Many image recognition tasks can be expressed in terms of searching for the maximum a posteriori labeling in some statistical model. We introduce a class of higher order Gibbs models, also known as Markov random fields, for which this task is solvable in polynomial time.
Analysis of Optimal Labelling Problems and Their Application to Image Segmentation and Binocular Stereovision