
Seminar Details


2023-12-05 (14h): Two talks

At Euler building (room A.002)

Organized by Mathematical Engineering


Talk 1: Exact convergence rate of the last iterate in subgradient methods

Speaker: Zamani, Moslem
Abstract: Subgradient methods are widely employed for non-differentiable optimization problems. In this talk, we study the convergence of the last iterate in subgradient methods applied to the minimization of a nonsmooth convex function with bounded subgradients. We propose a novel proof technique that extends the conventional analysis of subgradient methods. We then derive convergence rates for two variants of the subgradient method, with either a fixed step size or a fixed step length. We show that these rates are exact by constructing functions for which the subgradient method matches the proven rate. Finally, we introduce an optimized subgradient method, based on a new sequence of stepsizes, which achieves a last-iterate convergence rate matching the established lower bounds for non-differentiable convex optimization problems.
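For readers unfamiliar with the two step-size rules compared in the talk, the sketch below illustrates them on a toy problem. It is a minimal illustration, not the speaker's code: the function names, the l1-norm example, and the parameter values are all assumptions.

```python
import numpy as np

def subgradient_method(subgrad, x0, n_iters, h, rule="step_size"):
    """Subgradient method returning the LAST iterate (no averaging).

    rule="step_size":   x_{k+1} = x_k - h * g_k
    rule="step_length": x_{k+1} = x_k - (h / ||g_k||) * g_k,
                        so every step has Euclidean length h.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = subgrad(x)
        if rule == "step_size":
            x = x - h * g
        elif np.linalg.norm(g) > 0:   # fixed step length; stay put if g = 0
            x = x - (h / np.linalg.norm(g)) * g
    return x

# Toy example: f(x) = ||x||_1 is nonsmooth and convex, with subgradients
# bounded in norm; sign(x) is a valid subgradient everywhere.
x0 = np.array([1.5, -2.0, 0.7])
for rule in ("step_size", "step_length"):
    x_last = subgradient_method(np.sign, x0, n_iters=500, h=0.01, rule=rule)
    print(rule, "-> f(last iterate) =", np.abs(x_last).sum())
```

With a fixed step size, the last iterate only reaches an O(h)-neighbourhood of the optimum; how fast it gets there, and whether a better stepsize sequence can match the known lower bounds, is precisely what the talk addresses.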

Talk 2: Optimization without retraction on the random generalized Stiefel manifold for canonical correlation analysis

Speaker: Vary, Simon
Abstract: Optimization over the set of matrices X satisfying X^T B X = I_p, referred to as the generalized Stiefel manifold, appears in many applications such as canonical correlation analysis (CCA) and the generalized eigenvalue problem (GEVP). Solving these problems for large-scale datasets is computationally expensive and is typically done either by computing a closed-form solution with subsampled data or by iterative methods such as Riemannian approaches. Building on the work of Ablin and Peyre (2022), we propose an inexpensive iterative method that does not enforce the constraint exactly at every iteration, but instead produces iterates that converge to the generalized Stiefel manifold. We also tackle the random case, where the matrix B is an expectation. Our method requires only efficient matrix multiplications and has the same sublinear convergence rate as its Riemannian counterpart. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA for measuring model representation similarity.
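To make the constraint concrete, here is a deliberately simplified penalty-gradient sketch on a synthetic GEVP-style instance. This is not the speakers' algorithm: the direction B X (X^T B X - I_p) used below is the Euclidean gradient of (1/4)||X^T B X - I_p||_F^2, so the iterates are only pulled toward the manifold rather than converging to it exactly. The data, objective, step size, and penalty weight are all illustrative assumptions; what the sketch does share with the method in the talk is that it uses only matrix multiplications, with no retraction or eigendecomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5

# Illustrative GEVP-style instance (synthetic data):
# minimize f(X) = -trace(X^T A X) / 2  subject to  X^T B X = I_p.
G = rng.standard_normal((n, n))
A = G @ G.T / n                    # symmetric positive semidefinite
H = rng.standard_normal((n, n))
B = np.eye(n) + H @ H.T / n        # symmetric positive definite

X = rng.standard_normal((n, p)) / np.sqrt(n)
eta, lam = 2e-4, 100.0             # illustrative step size / penalty weight

for _ in range(20_000):
    C = X.T @ B @ X - np.eye(p)    # constraint residual
    grad_f = -A @ X                # Euclidean gradient of f
    grad_pen = B @ X @ C           # gradient of (1/4) * ||X^T B X - I||_F^2
    X -= eta * (grad_f + lam * grad_pen)

print("constraint violation:", np.linalg.norm(X.T @ B @ X - np.eye(p)))
print("objective:", -np.trace(X.T @ A @ X) / 2)
```

A plain quadratic penalty like this only lands in an O(1/lam) neighbourhood of the constraint set; the appeal of the method presented in the talk is that its iterates converge to the generalized Stiefel manifold itself, at the same sublinear rate as the Riemannian alternatives, while retaining this matrix-multiplication-only structure.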