2 High-Dimensional Space
2.1 Introduction
2.2 The Law of Large Numbers
2.3 The Geometry of High Dimensions
2.4 Properties of the Unit Ball
2.4.1 Volume of the Unit Ball
2.4.2 Volume Near the Equator
2.5 Generating Points Uniformly at Random from a Ball
2.6 Gaussians in High Dimension
2.7 Random Projection and Johnson-Lindenstrauss Lemma
2.8 Separating Gaussians
2.9 Fitting a Spherical Gaussian to Data
2.10 Bibliographic Notes
2.11 Exercises
3 Best-Fit Subspaces and Singular Value Decomposition (SVD)
3.1 Introduction
3.2 Preliminaries
3.3 Singular Vectors
3.4 Singular Value Decomposition (SVD)
3.5 Best Rank-k Approximations
3.6 Left Singular Vectors
3.7 Power Method for Singular Value Decomposition
3.8 Singular Vectors and Eigenvectors
3.9 Applications of Singular Value Decomposition
3.9.1 Centering Data
3.9.2 Principal Component Analysis
3.9.3 Clustering a Mixture of Spherical Gaussians
3.9.4 Ranking Documents and Web Pages
3.9.5 An Application of SVD to a Discrete Optimization Problem
3.10 Bibliographic Notes
3.11 Exercises
4 Random Walks and Markov Chains
4.1 Stationary Distribution
4.2 Markov Chain Monte Carlo
4.2.1 Metropolis-Hasting Algorithm
4.2.2 Gibbs Sampling
4.3 Areas and Volumes
4.4 Convergence of Random Walks on Undirected Graphs
4.5 Electrical Networks and Random Walks
4.6 Random Walks on Undirected Graphs with Unit Edge Weights
4.7 Random Walks in Euclidean Space
4.8 The Web as a Markov Chain
4.9 Bibliographic Notes
4.10 Exercises
5 Machine Learning
5.1 Introduction
5.2 Overfitting and Uniform Convergence
5.3 Illustrative Examples and Occams Razor
5.3.1 Learning Disjunctions
5.3.2 Occams Razor
5.3.3 Application: Learning Decision Trees
5.4 Regularization: Penalizing Complexity
5.5 Online Learning and the Perceptron Algorithm
……
6 Algorithms for Massive Data Problems: Streaming, Sketching, and Sampling
7 Clustering
8 Random Graphs
9 Topic Models, Non-Negative Matrix Factorization, Hidden Markov Models, and Graphical Models
10 Other Topics
11 Wavelets
12 Appendices