Springer Theses
1 total work
Computational Reconstruction of Missing Data in Biological Research
by Feng Bao
Published 8 August 2021
Emerging biotechnologies have significantly advanced the study of biological mechanisms. However, biological data often contain a large amount of missing information, e.g. missing features, missing labels, or missing samples, which greatly limits how widely the data can be used. In this book, we introduce different missing-data scenarios in biological research and propose machine learning models to improve data analysis, including deep recurrent neural network recovery for missing features, robust information-theoretic learning for missing labels, and structure-aware rebalancing for missing minority samples. The models in the book span imbalanced learning, deep learning, recurrent neural networks, and statistical inference, providing a broad set of references on integrating artificial intelligence with biology. Using simulated and biological datasets, we apply these approaches to a variety of biological tasks, including single-cell characterization, genome-wide association studies, and medical image segmentation, and quantify their performance with a number of metrics.
The outline of this book is as follows. In Chapter 2, we introduce the statistical recovery of missing data features; in Chapter 3, we introduce the statistical recovery of missing labels; in Chapter 4, we introduce the statistical recovery of missing sample information; finally, in Chapter 5, we summarize the book and discuss future directions. This book can serve as a reference for researchers in computational biology, bioinformatics, and biostatistics. Readers are expected to have basic knowledge of statistics and machine learning.