Foundations and Trends (R) in Machine Learning
2 total works
Submodular functions are relevant to machine learning for at least two reasons: (1) some problems may be expressed directly as the optimization of submodular functions, and (2) the Lovász extension of submodular functions provides a useful set of regularization functions for supervised and unsupervised learning.
In Learning with Submodular Functions, the theory of submodular functions is presented in a self-contained way from a convex analysis perspective, presenting tight links between certain polyhedra, combinatorial optimization and convex optimization problems. In particular, it describes how submodular function minimization is equivalent to solving a wide variety of convex optimization problems. This allows the derivation of new efficient algorithms for approximate and exact submodular function minimization with theoretical guarantees and good practical performance.
By listing many examples of submodular functions, it reviews various applications to machine learning, such as clustering, experimental design, sensor placement, graphical model structure learning or subset selection, as well as a family of structured sparsity-inducing norms that can be derived and used from submodular functions.
This is an ideal reference for researchers, scientists, or engineers with an interest in applying submodular functions to machine learning problems.
In Learning with Submodular Functions, the theory of submodular functions is presented in a self-contained way from a convex analysis perspective, presenting tight links between certain polyhedra, combinatorial optimization and convex optimization problems. In particular, it describes how submodular function minimization is equivalent to solving a wide variety of convex optimization problems. This allows the derivation of new efficient algorithms for approximate and exact submodular function minimization with theoretical guarantees and good practical performance.
By listing many examples of submodular functions, it reviews various applications to machine learning, such as clustering, experimental design, sensor placement, graphical model structure learning or subset selection, as well as a family of structured sparsity-inducing norms that can be derived and used from submodular functions.
This is an ideal reference for researchers, scientists, or engineers with an interest in applying submodular functions to machine learning problems.
Optimization with Sparsity-Inducing Penalties
by Francis Bach, Rodolph Jenatton, Julien Mairal, and Guillaume Obozinski
Published 4 January 2012
Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate nonsmooth norms.
Optimization with Sparsity-Inducing Penalties presents optimization tools and techniques dedicated to such sparsity-inducing penalties from a general perspective. It covers proximal methods, block-coordinate descent, reweighted ?2-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provides an extensive set of experiments to compare various algorithms from a computational point of view.
The presentation of this book is essentially based on existing literature, but the process of constructing a general framework leads naturally to new results, connections and points of view. It is an ideal reference on the topic for anyone working in machine learning and related areas.
Optimization with Sparsity-Inducing Penalties presents optimization tools and techniques dedicated to such sparsity-inducing penalties from a general perspective. It covers proximal methods, block-coordinate descent, reweighted ?2-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provides an extensive set of experiments to compare various algorithms from a computational point of view.
The presentation of this book is essentially based on existing literature, but the process of constructing a general framework leads naturally to new results, connections and points of view. It is an ideal reference on the topic for anyone working in machine learning and related areas.