This book presents the state of the art in distributed machine learning algorithms that are based on gradient optimization methods. In the big data era, large-scale datasets pose enormous challenges for the existing machine learning systems. As such, implementing machine learning algorithms in a distributed environment has become a key technology, and recent research has shown gradient-based iterative optimization to be an effective solution. Focusing on methods that can speed up large-scale gradient optimization through both algorithm optimizations and careful system implementations, the book introduces three essential techniques in designing a gradient optimization algorithm to train a distributed machine learning model: parallel strategy, data compression and synchronization protocol.

Written in a tutorial style, it covers a range of topics, from fundamental knowledge to a number of carefully designed algorithms and systems of distributed machine learning. It will appeal to a broad audience in the field of machine learning, artificial intelligence, big data and database management.



This book introduces readers to a workload-aware methodology for large-scale graph algorithm optimization in graph-computing systems, and proposes several optimization techniques that can enable these systems to handle advanced graph algorithms efficiently. More concretely, it proposes a workload-aware cost model to guide the development of high-performance algorithms. On the basis of the cost model, the book subsequently presents a system-level optimization resulting in a partition-aware graph-computing engine, PAGE. In addition, it presents three efficient and scalable advanced graph algorithms – the subgraph enumeration, cohesive subgraph detection, and graph extraction algorithms.

This book offers a valuable reference guide for junior researchers, covering the latest advances in large-scale graph analysis; and for senior researchers, sharing state-of-the-art solutions based on advanced graph algorithms. In addition, all readers will find a workload-aware methodology for designing efficient large-scale graph algorithms.