Book 818

This book presents the bi-partial approach to data analysis, which is both uniquely general and a basis for developing techniques, together with the related models and algorithms, for many data analysis problems. It is based on an adequate representation of the essential clustering problem: to group together the similar, and to separate the dissimilar. This leads to a general objective function and subsequently to a broad class of concrete implementations. On this basis, a suboptimising procedure can be developed, together with a variety of implementations.

This procedure has a striking affinity with the classical hierarchical merger algorithms, while also incorporating a stopping rule based on the objective function. The approach thereby resolves the cluster number issue, since the solutions obtained specify both the content and the number of clusters. Further, it is demonstrated how the bi-partial principle can be effectively applied to a wide variety of problems in data analysis.
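As a rough illustration of the idea described above, the following minimal Python sketch merges clusters greedily and stops as soon as no merger improves an objective that rewards similarity within clusters and distance between them. The function name `bipartial_merge`, the particular objective, and the toy `dist` and `sim` functions are assumptions made for this example only; they are not taken from the book.

```python
from itertools import combinations


def bipartial_merge(points, dist, sim):
    """Greedy merging that stops when no merger raises the objective.

    Assumed objective (illustrative only): the sum of similarities over
    pairs inside the same cluster plus the sum of distances over pairs
    in different clusters.
    """
    clusters = [[i] for i in range(len(points))]  # start from singletons

    def gain(a, b):
        # Merging a and b turns every cross pair from a "distance" term
        # into a "similarity" term, so the objective changes by sim - dist.
        return sum(sim(points[i], points[j]) - dist(points[i], points[j])
                   for i in a for j in b)

    while len(clusters) > 1:
        best_gain, a, b = max(
            (gain(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if best_gain <= 0:  # stopping rule derived from the objective
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return clusters


# Toy data: two well-separated groups on a line (hypothetical values).
points = [0.0, 0.5, 1.0, 10.0, 10.4, 11.0]
dist = lambda p, q: abs(p - q)
sim = lambda p, q: max(0.0, 4.0 - abs(p - q))
result = bipartial_merge(points, dist, sim)
print(result)
```

On this toy data the mergers stop at two clusters, so the number of clusters comes out of the procedure itself and is not supplied in advance.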

The book offers a valuable resource for all data scientists who wish to broaden their perspective on basic approaches and essential problems, and thus to find answers to questions that are often overlooked or have yet to be solved convincingly. It is also intended for graduate students in the computer and data sciences, and will complement their knowledge and skills with fresh insights on problems that are otherwise treated in the standard “academic” manner.


Book 957

This book presents a new perspective on, and a new approach to, a wide spectrum of situations related to data analysis; in effect, a kind of new paradigm. Namely, for a given data set and a given partition of it, whose origins may be of any kind, the authors try to reconstruct this partition on the basis of the data set alone, using a very broadly conceived clustering procedure. The main advantages of this new paradigm concern the substantive aspects of the particular cases considered, mainly in view of the variety of interpretations that can be assumed within the framework of the paradigm. Owing to the novel problem formulation and the flexibility in interpreting the problem and its components, the domains encompassed (or at least affected) by the potential use of the paradigm include cluster analysis, classification, outlier detection, feature selection, and even factor analysis, as well as the geometry of the data set. The book is useful for all those who look for new, nonconventional approaches to their data analysis problems.
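The reconstruction idea can be sketched in a few lines of Python: given data and a prior partition, try several settings of a clustering procedure and keep the one whose result agrees best with the given partition. Everything in this sketch is an assumption made for illustration, including the toy one-dimensional `gap_clustering` procedure, the use of the Rand index as the agreement measure, and the helper name `reconstruct`; the book's own procedures and criteria may differ.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of point pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


def gap_clustering(values, gap):
    """Toy 1-D clustering: a new cluster starts at every jump larger than gap."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    current = 0
    for prev, idx in zip(order, order[1:]):
        if values[idx] - values[prev] > gap:
            current += 1
        labels[idx] = current
    return labels


def reconstruct(values, given_labels, candidate_gaps):
    """Return the parameter whose clustering best matches the given partition."""
    return max(candidate_gaps,
               key=lambda g: rand_index(given_labels, gap_clustering(values, g)))


# Hypothetical data and a prior partition of unspecified origin.
values = [1.0, 1.2, 1.1, 5.0, 5.3, 9.0]
given = ["a", "a", "a", "b", "b", "b"]
best_gap = reconstruct(values, given, [0.05, 0.5, 2.0, 3.75, 5.0])
print(best_gap, rand_index(given, gap_clustering(values, best_gap)))
```

Where no setting reproduces the given partition exactly, the remaining disagreements are themselves informative, for instance as candidate outliers or as hints that the prior classification deserves a second look.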