The Departement of Mathematical Engineering organizes a series of seminars. The seminars are usually held on Tuesday from 2:00pm to 3:00pm in the Euler lecture room, Building EULER, av. Georges Lemaître 4-6, Louvain-la-Neuve (Parking 13). Be mindful that exceptions may occur; see the talk annoucements.
If you wish to receive the seminar announcements by email, please send an email to Etienne Huens.
Master students can take this seminar for credit in either of the two semesters; see LINMA2120 for more information.
Seminars to come
|26/03/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Felix Lieder (Universität Düsseldorf)
Performance Estimation for Fixed Point Iterations
In recent years exact worst-case performance estimation of first-order methods has gained great attention: It is now well established (especially, but not solely, due to the work of de Klerk, Drori, Glineur, Hendrickx, Taylor and Teboulle) that we can find tight convergence rates for smooth (strongly) convex functions via semidefinite programming. Here the existing approach is transferred to the broader setting of fixed point iterations of non-expansive operators, again ensuring tight rates. Specifically we focus on two fixed point iterations, first the Halpern-Iteration and second the Krasnoselski-Mann-Iteration with constant step-size. In both cases we consider the convergence of the norm of the residuals as our performance criterion of interest. We are able to improve the existing bounds on the convergence rate and, more importantly, show tightness of both bounds. In particular, while the Krasnoselski-Mann-Iteration often performs better in practice, its (tight) rate is an order of magnitude worse than of the Halpern-Iteration, which possesses the fastest proven convergence rate among all fixed step methods. Finally we discuss applications and possible future extensions, such as different performance criteria and automated algorithm modeling.
|02/04/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Erling Andersen (CEO of MOSEK)
A primal-dual interior-point algorithm for nonsymmetric conic optimization
It is well known that primal-dual interior-point algorithms for linear optimization can easily be extended to the case of symmetric conic optimization, as shown by Nesterov and Todd (NT) in their 1997 paper about self-scaled barriers. Although many convex optimization problems can be expressed using symmetric cones then models involving for instance exponential functions do not belong to the class of symmetric conic optimization problems. Therefore, several authors have suggested generalizations of the NT primal-dual interior-point algorithm to handle nonsymmetric cones such as the exponential cone. Based on this work we will present a generalization of the NT algorithm to the case of nonsymmetric conic optimization. Although we have no polynomial complexity proof for the suggested algorithm then it performs extremely well in practice as will be documented with computational results. To summarize, this presentation should be interesting for users of convex optimization.
|23/04/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Benoit Delhaye (UCLouvain)
Restoring natural tactile feedback in bionic hands through a peripheral nerve interface
Tactile feedback is essential to achieve a dexterous bionic hand. Without touch, interactions with objects are slow and clumsy, and require constant attention. In this talk, I will describe recent advances toward restoring useful somatosensory feedback in bionic hands through electrical stimulation of the peripheral nerves using chronically implanted electrode arrays. First, I will show that the perceived intensity of a tactile stimulus can be systematically manipulated by modulating the aggregate response of tactile nerve fibers and discuss how this response can be systematically manipulated by controlling stimulation parameters. Second, I will discuss the advantages of the biomimetic approach to tactile restoration. In brief, the closer we can mimic the activity that would be produced in the nerve during natural tactile interactions with objects, the more natural and useful the evoked sensations will be. I describe an approach we have developed to convey such naturalistic tactile feedback.
|30/04/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Mohammad Golbabaee (The University of Bath)
|07/05/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Mihai Cucuringu (University of Oxford)
|03/12/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Tsiaflakis Paschalis (Nokia Bell Labs, Antwerp)
Signal processing and optimization for ultrabroadband internet access
|19/03/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
James Mathew (UCLouvain)
Investigating the theoretical and neural framework of predictive mechanisms in eye-hand coordination
The ability to coordinate efficiently eye and hand actions is central for humans in everyday activities. Furthermore it is argued that the ability to predict the sensory consequences of self-initiated movements is crucial for skilled motor behavior. Here by means of a task in which participants were asked to track with the eyes a visual target that was moved by their hand, we investigated the predictive mechanisms underlying eye-hand coordination. In a first study, using a protocol in which participants had to adapt to rotated hand visual feedback, we show that these predictive mechanisms can be updated independently of the ability to perform accurate hand movements. In a follow up study we tested the effect of hand dominance, and showed that, despite obvious differences in the accuracy of hand movement control, the ability to predict visual consequences of right and left hand actions was similar. Finally, by means of transcranial magnetic stimulation, we tested the hypothesis that those predictive mechanisms rely on hand efferent signals from the primary motor cortex (M1). However our results failed to support this view, and instead suggest that if such a contribution exists, it must be upstream of M1. Overall, we propose that eye-hand coordination relies on similar predictive mechanisms for both hands, possibly located upstream of M1, which can be updated independently of hand movement control.
|12/03/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Laurent Jacques (UCLouvain)
One-Bit Sensing of Low-Rank and Bisparse Matrices
In this talk, we will consider the worst-case recovery error of low-rank and bisparse matrices as a function of the number of one-bit measurements used to acquire them, i.e., when these matrices are randomly projected into a low-dimensional domain and only the signs of the resulting coordinates are recorded. Such matrices are of practical interest for instance in the problems of sparse phase retrieval, sparse PCA, or in optical applications involving far-field observations of sparse scenes. First, by way of the concept of consistency width, we will review precise estimates on how fast the recovery error can in theory decay. Next, we will analyze an idealized recovery method reaching the fourth-root of the optimal decay rate for Gaussian sensing schemes. This idealized method being impractical, an implementable recovery algorithm will be finally proposed in the context of factorized Gaussian sensing schemes. In this case, the recovery error provably decays as the sixth-root of the optimal rate. Joint work with Simon Foucart (Texas A&M University, USA)
|05/03/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Alexandre Bovet (UCLouvain)
Diffusion in complex systems: random walks, fast ions and fake news
After a quick overview of my past work on the anomalous diffusion of fast ions in turbulent plasmas I will present my recent work on the investigation of diffusion of news in Twitter. To understand the role of fake news in Twitter during the 2016 US election, we combined machine learning, network science, statistical physics and causality analysis to not only measure the importance of fake news compared to traditional news in Twitter but also understand their influence and the mechanisms of their diffusion. Using a dataset of more than 170 million tweets covering the five months preceding election day and concerning the two main candidates of the 2016 US presidential election, we find that 25% of the tweets linking to a news spread either fake or extremely biased news. We analyzed the networks of information flow and found the most important news spreaders by using the theory of optimal percolation and used a multivariate causal network reconstruction to uncover how fake news influenced the presidential election. I will end with the presentation of my current work and my future research plan about random walks to investigate spatio-temporal structures in temporal networks.
|26/02/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Anis Kacem (IMT Lille Douai)
Geometric Tools for Human Behavior Understanding
Developing intelligent systems dedicated to human behavior understanding has been a very hot research topic in the few recent decades. Indeed, it is crucial to understand human behavior in order to make machines able to interact with, assist, and help humans in their daily life. Recent breakthroughs in computer vision and machine learning have made this possible. For instance, human-related computer vision problems can be approached by first detecting and tracking 2D or 3D landmark points from visual data. Two relevant examples of this are given by the facial landmarks detected on the human face and the skeletons tracked along videos of human bodies. These techniques generate temporal sequences of landmark configurations, which exhibit several distortions in their analysis, especially in uncontrolled environments, due to view variations, inaccurate detection and tracking, missing data, etc. In this thesis, we propose two novel space-time representations of human landmark sequences along with suitable computational tools for human behavior understanding. Firstly, we propose a representation based on the trajectories of Gram matrices of human landmarks. Gram matrices are positive semi-definite matrices of fixed rank and lie on a nonlinear manifold where standard computational and machine learning techniques could not be applied in a straightforward way. To overcome this issue, we make use of some notions of the Riemannian geometry and derive suitable computational tools for analyzing Gram trajectories. We evaluate the proposed approach in several human-related applications involving 2D and 3D landmarks of human faces and bodies such us emotion recognition from facial expression and body movements and also action recognition from skeletons. Secondly, we propose another representation based on the barycentric coordinates of 2D facial landmarks. While being related to the Gram trajectory representation and robust to view variations, the barycentric representation allows to directly work with standard computational tools. The evaluation of this second approach is conducted on two face analysis tasks namely, facial expression recognition and depression severity level assessment. The obtained results with the two proposed approaches on real benchmarks are competitive with respect to recent state-of-the-art methods.
|19/02/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Jesús Angulo (CMM-Centre de Morphologie Mathématique, MINES ParisTech, PSL-Research University, France)
Minkowski Set Operations and Morphological Openings of Ellipsoids - Application for Processing Positive Definite Matrix-Valued Images
Morphological dilation and erosion for sets are based on (geometric) Minkowski sum and difference of two sets. In this talk we deal with the particular case of ellipsoids. Indeed ellipsoidal sets appear nowadays in different image and data representations, e.g., in diffusion tensor imaging each voxel is valued with a 3D ellipsoid; the dispersion of a scatter set of points can be described by a multivariate Gaussian distribution where the covariance matrix may be seen as ellipsoidal shape centered at the mean position. The Minkowski sum and difference of two ellipsoidal sets are in general not ellipsoidal. However, in many applications, we are interested in computing the ellipsoidal set which approximates in a certain sense the Minkowski operations.
A closed-form characterization of Minkowski operations of two ellipsoids has been recently introduced (Yan and Chirikjian, 2015), which provide implicit surface expression and parametric formulas for the boundary of the Minkowski sum and difference of two arbitrary oriented ellipsoids. We adopt in this study a different approach based on the so-called ellipsoidal calculus (Kurzhanski and Valyi, 1996). Ellipsoidal calculus is a method for solving problems in control and estimation theory having unknown but bounded errors in terms of sets of approximating ellipsoidal-value functions.
Using the theory and algorithms from ellipsoidal calculus, we consider parameterized families of external and internal ellipsoids that tightly approximate the Minkowski sum and difference of ellipsoids. Approximations are tight along a direction l in the sense that the support functions on l of the ellipsoids are equal to the support function on l of the sum and difference. External (resp. internal) support function-based approximation can be then selected according to minimal (resp. maximal) measures of volume or trace of the corresponding ellipsoid. We focus in particular on the ellipsoidal approximations to the morphological opening between two ellipsoids and its interest in practical examples.
We discuss also the connection between the approximations to the Minkowski sum and difference of two positive definite matrices and their mean using their Euclidean or Riemannian geometries, which is also related to their Bures-Wasserstein mean (Bhatia et al., 2018).
Bhatia R., Jain T., Lim Y. (2018) On the Bures-Wasserstein Distance Between Positive Definite Matrices. Expositiones Mathematicae, to appear.
Kurzhanski A.B, Valyi I (1996). Ellipsoidal Calculus for Estimation and Control. Birkhäuser, Boston, MA, 1996.
Mazure M.-L (1991) Equations de convolution et formes quadratiques. Annali di Maematica pura et applicata (IV), Vol. CLVIII, 75-97.
Yan Y, Chirikjian G.S (2015) Closed characterization of the Minkowski sum and difference of two ellipsoids. Geom. Dedicata 177:103-128.
|05/02/2019 (14:00) [Lieu : Bât. Euler (room A.002)]|
Silvia Villa (Università di Genova)
Iterative regularization using proximal methods
In the context of linear inverse problems, I will discuss iterative regularization methods allowing to consider large classes of data-fit terms and regularizers.In particular, I will investigate regularization properties of first order proximal splitting optimization techniques. Such methods are appealing since their computational complexity is tailored to the estimation accuracy allowed by the data, as I will show theoretically and numerically.
|18/12/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Zheming Wang (UCLouvain)
Computation of the maximal invariant set of discrete-time systems with quasi-smooth nonlinear constraints
Invariant set theory is an important tool for stability analysis and control design of constrained dynamical systems and it has been successfully used to solve various problems in system and control. In this talk, we study the computation of the maximal invariant set of linear systems with a class of nonlinear constraints that has quadratic relaxations. With these quadratic relaxations, we are able to determine a sufficient condition for the maximal invariant set. By using the sufficient condition, we propose an algorithm to compute the exact maximal invariant set. At each iteration, we will solve a set of linear matrix inequalities instead of a nonlinear optimization problem. Thanks to this feature, the proposed algorithm is computationally tractable and efficient in many cases. Under mild assumptions, the proposed algorithm will terminate in finite time. This algorithm can also be extended to a class of nonlinear systems that admit a state feedback linearization. The performance of this algorithm is demonstrated on several numerical examples.
|12/12/2018 (16:30) [Lieu : Bât. Euler (room A.002)]|
Colin Jones (EPFL)
Predictive Dispatch and Demand Response for Commercial Buildings
Note the unusual day (Wednesday).
Commercial buildings have significant flexibility in how and when they consume energy, and this freedom can be put to good use by grid operators managing the stability of the grid. This talk will discuss demand response for buildings, or the throttling of power consumption in reaction to signals sent by the grid operator. By shifting energy consumption in time, these techniques can be seen as utilizing the thermal inertia of buildings as a form of virtual electrical storage. In the first part of the talk, we will introduce and analyse a hybrid storage system that combines the best properties of electrical batteries, commercial buildings and generation re-scheduling to better provide balancing services at multiple time scales. We will then demonstrate the effectiveness of the proposed control framework via a set of 12 full day experimental results on the 20kV distribution feeder of the EPFL campus that is comprised of: a set of five uncontrollable office buildings (350kWp) and a roof-top PV installation (90kWp), and a set of controllable resources: a grid-connected BESS (720kVA-500kWh), and a small occupied office building (45 kWp).
|04/12/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Guillaume Berger and Benoît Legat (UCLouvain)
Two 20-minute talks
Speaker: Guillaume Berger
Title: Path-complete p-dominant switching linear systems
Abstract: In this presentation, we study the asymptotic properties of a class of dynamical systems called switched linear systems. Therefore, we introduce the notion of path-complete $p$-dominance for switched linear systems as a way to generalize the notion of dominant/slow modes for LTI systems. Path-dominance is characterized by the contraction property of a set of quadratic cones in the state space. We show that path-dominant systems have a low-dimensional asymptotic behavior, and hence allow for a simplified analysis of their dynamics. Finally, we present an algorithm for deciding the path-dominance of a given system.
Speaker: Benoît Legat
Title: Minimally Constrained Stable Switched Systems and Application to Co-simulation
Abstract: We propose an algorithm to restrict the switching signals of a constrained switched system in order to guarantee its stability, while at the same time attempting to keep the largest possible set of allowed switching signals. Our work is motivated by co-simulation of complex dynamical systems by multiple cores. There, numerical stability is a hard constraint, but should be attained by restricting as few as possible the allowed behaviours of the simulators. We apply our results to certify the stability of an adaptive co-simulation orchestration algorithm, which selects the optimal switching signal at run-time, as a function of (varying) performance and accuracy requirements.
|27/11/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Erik Bekkers (Eindhoven University of Technology)
An introduction to sub-Riemannian geometry in SE(2) for vessel analysis and roto-translation equivariant deep learning
This presentation focusses on the application of sub-Riemannian geometry and group theory in 2D medical image analysis. In the discussed image analysis applications, 2D images are lifted to 3D functions on the coupled space of positions and orientations. Lifting is done via a (learned) wavelet-type transformation, leading to a neat organization of image data based on position and orientation. In the extended domain, which we identify with the Lie group SE(2), a sub-Riemannian (SR) geometry is recognized. Here, "sub" refers to the restriction that tangent vectors of naturally lifted curves are contained in a sub-space of the full tangent space. A parallel can be drawn between these curves and the paths of moving cars: a car can only move forward and change direction (2 controls) in a 3D state-space (2D position and orientation), but is not able to move sideways.
In this presentation the following items are discussed:
- A introduction to the Lie group SE(2) (group product, algebra, exponential/logarithmic map and representations)
- An explanation of the sub-Riemannian geometry on SE(2)
- The benefit of using sub-Riemannian geometry for vessel tracking
- The benefit of using the Lie group structure of SE(2) for equivariant or invariant deep learning
|13/11/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Thanh Son Nguyen (UCLouvain)
Parametric model order reduction and interpolation on matrix manifolds as a tool
Large-scale control systems appear frequently as mathematical models for many practical fields such as heat conduction, electrical circuits. Simulation of such systems requires solving equations whose order can reach dozen of thousands. Besides, in many cases, these systems depend on parameter which makes the simulation more challenging. In this talk, first we will give an introduction to model order reduction of parameter-dependent linear time-invariant systems as well as a review of interpolation-based methods. Then we will explain why in some cases, we have to do it in the framework of interpolation on matrix manifolds. Finally, we will present some numerical examples for illustrative purpose.
|06/11/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
New PhD students in INMA
Welcome Seminar #2
In this seminar, new PhD students in INMA will introduce themselves and their research topic: "Compressive Learning : learning from 'too much' data" (Vincent Schellekens); "Signal processing for direct imaging of stellar systems" (Benoît Pairet); "Optimal interference nulling for large arrays of coupled antennas" (Valentin Hamaide); "Algorithms for smart content marketing" (Mridul Seth).
|23/10/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Laurent Demanet (Departments of Mathematics and EAPS, MIT)
Interferometry and convex programming
A number of hard estimation questions in science and engineering can be expressed as recovering a rank-1 matrix from linear measurements. One example that I will cover in this talk is "interferometry", a trick in signal processing and imaging consisting in taking cross-correlations to attenuate incoherent noise. I show that recovery for inverse problems involving interferometric combinations is sometimes possible with varying levels of convex relaxations, and depends on the underlying graph of measurements being an expander. I also show an example of "interferometric thinking" to blind deconvolution, where it enables the design of new sparse regularizers.
|16/10/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
New PhD students in INMA
Welcome Seminar #1
In this seminar, new PhD students in INMA will introduce themselves and their research topic: "Nonnegative matrix factorization in infinite-dimensional feature spaces : a parameterized approach" (Cécile Hautecoeur); "Analysis and Control of Interconnected Systems" (Ayoub Ben Ayed); "Decentralized optimization in open multi-agent systems" (Charles Monnoyer); "MILP-based Algorithm for the Global Solution of Economic Dispatch Problems with Valve-Point Effects" (Loïc Van Hoorebeeck).
|09/10/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Jean-François Cardoso and Pierre Ablin (CNRS - INRIA)
Fast and invariant learning for Independent Component Analysis
Independent Component Analysis (ICA) is a widely used unsupervised data exploration technique. It models a set of observed signals as a linear mixtures of statistically independent sources and aims at recovering blindly those underlying sources. Here `blind' means that sources are recovered from the signals by applying a separating matrix which is totally inconstrained, making ICA applicable in a large variety of tasks. After a brief introduction, we stress the multiplicative structure of the ICA problem: the parameter space is the multiplicative group of GL(n) of square invertible matrices. Learning algorithms which exploit this structure are `equivariant' and are `rewarded' for doing so in terms implementation, optimization and performance, as we shall see. In a second part, we introduce a fast quasi-Newton equivariant algorithm for ICA, the Preconditioned ICA for Real Data (Picard) algorithm. It exploits the specific structure of the problem to compute cheap Hessian approximations, and then refines them using L-BFGS, a classical optimization algorithm. It shows state of the art convergence speed when applied to real datasets. Interestingly, it can be straightforwardly constrained to work on the rotation manifold, a constraint often imposed in ICA.
|02/10/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Adrien Scheuer (UCLouvain)
Multi-scale modelling of fibre suspensions: Particle inertia, confined flows and data-driven approach
Suspensions of fibres and non-spherical particles are encountered in many fields ranging from engineering to biology, e.g. papermaking, composite manufacturing, pharmaceutical applications, red blood cells, food-processing and cosmetics industries, etc. Predicting the evolution of the orientation state of the particles is crucial to estimate the rheology of the suspension, that is its flow behaviour, as well as the final properties of the material. Jefferys theory, describing the kinematics of a single particle immersed in an homogeneous flow of Newtonian fluid, lays the foundation for almost every models used today. Coarser representations, built upon this work, have been introduced later to describe statistically the orientation state of the particles, either using a probability density function, or even moments of this function (Advani-Tucker orientation tensors). The assumptions underlying Jefferys model are however quite restrictive to predict reliably what happens in fibre suspensions flows encountered in industrial processes. In this thesis, we first revisit this model, studying the impact of particle inertia and of confinement (wall effects) on the particle kinematics. In each case, we propose a multi-scale approach, but given the challenges to upscale the microscopic description to the macroscopic scale, we then came up with an innovative approach based on data-driven simulations to circumvent upscaling issues and inaccuracies introduced by macroscopic closure approximations. Finally, we developed efficient numerical methods to simulate fluid flows in thin geometries, considering, within the Proper Generalized Decomposition (PGD) framework, an in-plane/out-of-plane separated representation of the solutions of the incompressible Navier-Stokes equations.
|25/09/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Elarbi Achhab (Université Chouaïb Doukkali)
Compensator Design for a class of Infinite-Dimensional Semilinear Systems
In this talk, we consider a class of partially observed infinite-dimensional semi-linear systems. First, we address the problem of the design of an exponential Luenberger-like observer for this class of systems. Then, using the state estimation result stated, a stabilizing compensator for this class of systems will be designed. A compensator is an auxiliary system which has as its input the output of the initial system and as its output the input of the original one. In our setting, the obtained control law is showed to exponentially stabilize around a desired equilibrium profile the system under consideration. Finally, the main result is applied to a non isothermal chemical plug flow reactor model. The approach is illustrated by numerical simulations.
|18/09/2018 (14:00) [Lieu : Bât. Euler (room A.002)]|
Antoine Godin (Agence Française de Développement)
The Stock-Flow Consistent approach or the importance of disequilibrium
Stock-Flow Consistent (SFC) models are sectoral macroeconomic models that combine two fundamental insights: first, the economy is ruled by imbalances and its reactions to these imbalances, and second, the economy is a dynamical multi-layered network of financial relationships. By combining these two aspects, SFC models are a powerful tool to understand modern financialised economies in modelling feedback loops between exchanges of good and services, financial flows, and wealth. The GEMMES research project at the French Agency for Development aims at contributing to the international debate related to climate change (both for the adaptation and mitigation aspects). Given the importance and the pervasive aspect of the transition to a low carbon economy, it is fundamental to be able to grasp how it will re-shape, create or steer imbalances and hence might lead to unsustainable dynamics such as financial crisis. This seminar will go through three different modelling approaches, all using the insight of Stock-Flow Consistency, to show the importance of disequilibrium and imbalances in economics
|11/09/2018 (16:30) [Lieu : Core(room b.135)]|
Ion Necoara (Univ.Pol.Bucharest)
Stochastic algorithms for convex feasibility and convex minimization
In this talk we present stochastic first-order methods for solving convex feasibility problems or convex minimization problems with many constraints. First, for the convex feasibility problem (that is finding a point in the (in)finite intersection of convex sets) we propose several equivalent stochastic reformulations, such as stochastic (non)smooth optimization problem, stochastic fixed point problem, or stochastic intersection problem. Based on these reformulations and on new characterization of the conditioning parameters we introduce a general random projection algorithmic framework with an over-relaxed stepsize, which generates either new algorithms or extends to random settings many existing alternating projection schemes. We also derive (sub)linear convergence rates for this general algorithm that depend explicitly on the conditioning parameters and on the number of projections computed at each iteration. Then, we extend this stochastic algorithmic framework to convex minimization problems subject to (in)fi nite intersection of constraints. For this algorithm we also derive convergence rates in terms of the expected quadratic distance from the iterates to the optimal solution for smooth strongly convex objective functions, which in the best case is of order O(1/k). We also provide necessary and sufficient conditions for linear convergence of this stochastic method. Finally, we give examples of several functional classes satisfying our new conditions and discuss several applications of these results.
|4/09/2018 (14:00) [Lieu : Bât. Euler (room A.207)]|
Sebastian Stich (EPFL)
Communication Efficient Variants of SGD for Distributed Computing
Nowadays machine learning applications require stochastic optimization algorithms that can be implemented on distributed systems. The communication overhead of the algorithms is a key bottleneck that hinders perfect scalability. In this talk we will discuss two techniques that aim reducing the communication costs. First, we discuss quantization and sparsification techniques that reduce the amount of data that needs to be communicated. We present a variant of SGD with k-sparsification (for instance top-k or random-k) and show that this scheme converges at the same rate as vanilla SGD. That is, the communication can be reduced by a factor of the dimension of the whilst still converging at the same rate. In the second (and shorter) half of the talk we discuss strategies that tackle the communication frequency instead of the communicated data. In particular, we compare local SGD (independent runs of SGD in parallel) with mini batch SGD.