Foundations and Trends® in Machine Learning
3 total works
Generalized Low Rank Models
by Madeleine Udell, Corinne Horn, Reza Zadeh, and Stephen Boyd
Published 23 June 2016
Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. Here, the authors extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types.
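In matrix form, PCA with k components can be written as the low rank approximation problem below (a standard formulation, included here for orientation):

\[
\operatorname*{minimize}_{X \in \mathbf{R}^{m \times k},\; Y \in \mathbf{R}^{k \times n}} \;\; \|A - XY\|_F^2,
\]

where $A \in \mathbf{R}^{m \times n}$ is the data matrix. By the Eckart–Young theorem, a solution is given by the truncated singular value decomposition of $A$.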
This framework encompasses many well-known techniques in data analysis, such as non-negative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. The authors propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results.
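In symbols, a generalized low rank model is fit by solving a problem of the form (notation as in the monograph: $\Omega$ is the set of observed entries, $x_i$ and $y_j$ are the row and column factors, and $L_{ij}$, $r_i$, $\tilde r_j$ are entrywise losses and regularizers):

\[
\operatorname*{minimize} \;\; \sum_{(i,j) \in \Omega} L_{ij}(x_i y_j, A_{ij}) \;+\; \sum_{i=1}^{m} r_i(x_i) \;+\; \sum_{j=1}^{n} \tilde r_j(y_j),
\]

with variables $x_i \in \mathbf{R}^{1 \times k}$ and $y_j \in \mathbf{R}^{k \times 1}$. Different choices of loss and regularizer recover the special cases listed above: quadratic loss with no regularization gives PCA, nonnegativity constraints give non-negative matrix factorization, and so on.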
Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
by Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein
Published 2011
Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets and the accompanying distributed solution methods are either necessary or at least highly desirable.
This book argues that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas-Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for ℓ1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, it discusses applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. It also discusses general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
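In its standard form, the algorithm addresses the problem of minimizing $f(x) + g(z)$ subject to $Ax + Bz = c$, with $f$ and $g$ convex, via the iterations

\[
\begin{aligned}
x^{k+1} &:= \operatorname*{argmin}_{x} \; L_\rho(x, z^k, y^k) \\
z^{k+1} &:= \operatorname*{argmin}_{z} \; L_\rho(x^{k+1}, z, y^k) \\
y^{k+1} &:= y^k + \rho \, (A x^{k+1} + B z^{k+1} - c),
\end{aligned}
\]

where $L_\rho(x, z, y) = f(x) + g(z) + y^T(Ax + Bz - c) + (\rho/2)\|Ax + Bz - c\|_2^2$ is the augmented Lagrangian and $\rho > 0$ is a penalty parameter. For the lasso, for example, the $x$-update is a ridge-regression solve and the $z$-update reduces to elementwise soft thresholding.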
Minimum-Distortion Embedding
by Akshay Agrawal, Alnur Ali, and Stephen Boyd
Published 8 September 2021
Embeddings provide concrete numerical representations of otherwise abstract items, for use in downstream tasks. For example, a biologist might look for subfamilies of related cells by clustering embedding vectors associated with individual cells, while a machine learning practitioner might use vector representations of words as features for a classification task.

In this monograph the authors present a general framework for faithful embedding, called minimum-distortion embedding (MDE), that generalizes the common cases in which similarities between items are described by weights or distances. The MDE framework is simple but general: it includes a wide variety of specific embedding methods, including spectral embedding, principal component analysis, multidimensional scaling, and Euclidean distance problems. The authors give a detailed description of the minimum-distortion embedding problem and the theory behind its solution, and describe in detail algorithms for computing minimum-distortion embeddings. Finally, they show how to approximately solve many MDE problems involving real datasets, including images, co-authorship networks, United States county demographics, population genetics, and single-cell mRNA transcriptomes.

An accompanying open-source software package, PyMDE, makes it easy for practitioners to experiment with different embeddings via different choices of distortion functions and constraint sets.

The theory and techniques described and illustrated in this book will be of interest to researchers and practitioners working on modern systems that aim to adopt cutting-edge artificial intelligence.
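As a quick illustration of the accompanying package, here is a minimal sketch following PyMDE's quickstart pattern (the synthetic data matrix is a stand-in for a real dataset, and exact signatures may differ across versions):

```python
import numpy as np
import pymde

# Synthetic stand-in for a real data table: 1,000 items, 50 features each.
data = np.random.default_rng(0).standard_normal((1000, 50)).astype(np.float32)

# Construct an MDE problem that preserves local neighborhood structure,
# then solve it for a 2-dimensional embedding.
mde = pymde.preserve_neighbors(data, embedding_dim=2)
embedding = mde.embed()  # tensor of shape (1000, 2), one row per item

print(embedding.shape)
```

Other distortion functions and constraint sets can be swapped in (for instance, pymde.preserve_distances for distance data), which is exactly the kind of experimentation the package is designed to support.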