ISOMAP algorithm for Nonlinear Dimensionality Reduction

June 3, 2023

Step 3 – Multidimensional Scaling (MDS)

In step 3, we use Multidimensional Scaling (MDS) to embed our data into a lower-dimensional space in such a way that the pairwise distances between points (computed in Step 2) are preserved as much as possible.

To do this, MDS essentially transforms our geodesic distance matrix into a matrix of inner products and then applies an eigenvalue decomposition. The resulting eigenvectors (principal components) form the lower-dimensional representation of our data.

Let’s dive into these steps:

  1. Transform the distance matrix into a matrix B of inner products. From the geodesic distance matrix shortest_path_matrix (D), we can calculate B using double centering: B = -1/2 * J * D^2 * J. Here, D^2 is the element-wise square of D, and J is the centering matrix J = I - (1/n) * 1 * 1^T, where I is the identity matrix, 1 is a column vector of ones, and n is the number of data points.
  2. Compute the Eigen-decomposition of the matrix B. Eigen-decomposition will give us a set of eigenvectors and corresponding eigenvalues.
  3. Select the top k eigenvectors. The k eigenvectors corresponding to the k largest eigenvalues are selected. These eigenvectors are the coordinates of the data points in the new k-dimensional space. The corresponding eigenvalues indicate the amount of variance captured by each eigenvector.
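The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the original post's code; the function name classical_mds and the square example are my own choices:

```python
import numpy as np

def classical_mds(D, k):
    """Embed points into k dimensions from a pairwise distance matrix D
    via classical MDS: double centering followed by eigendecomposition."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix J = I - (1/n) * 1 * 1^T
    B = -0.5 * J @ (D ** 2) @ J                # double centering: B = -1/2 * J * D^2 * J
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]        # pick the k largest eigenvalues
    # Coordinates: top-k eigenvectors scaled by sqrt of their eigenvalues
    # (tiny negative eigenvalues from rounding are clipped to zero)
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Example: four corners of a unit square, described only by pairwise distances
r2 = np.sqrt(2.0)
D = np.array([[0., 1., r2, 1.],
              [1., 0., 1., r2],
              [r2, 1., 0., 1.],
              [1., r2, 1., 0.]])
Y = classical_mds(D, 2)  # recovers the square, up to rotation/reflection
```

Because the square's distances are genuinely two-dimensional Euclidean distances, the k = 2 embedding reproduces them exactly; for geodesic distances on a curved manifold the match is only approximate.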

Suppose we’re reducing the data to two dimensions. In that case, we would choose the two eigenvectors corresponding to the two largest eigenvalues, and each data point’s coordinates in the new space would be given by its projection onto these two eigenvectors.

This process effectively ‘unfolds’ the manifold, allowing us to see the underlying structure in the data.

Here is how it looks with an example:

Let’s continue with our example and suppose we want to reduce it to one dimension. We have:

shortest_path_matrix = [[0, 1, 2],
                        [1, 0, 1],
                        [2, 1, 0]]

We double center this to obtain B, perform an eigenvalue decomposition, and select the eigenvector corresponding to the largest eigenvalue. Carrying out the double centering gives:

B = [[ 1, 0, -1],
     [ 0, 0,  0],
     [-1, 0,  1]]

The largest eigenvalue of B is 2, with unit eigenvector proportional to [1, 0, -1]. Scaling the eigenvector by the square root of its eigenvalue yields the one-dimensional coordinates [1, 0, -1].

Thus, the first data point (A) gets mapped to 1, the second (B) to 0, and the third (C) to -1 in our new one-dimensional space. The geodesic distances between the points are preserved in this new space; in fact, they are preserved exactly here, since the points lie along a line. For instance, the distance between A and B in the new space is abs(1 - 0) = 1, matching the original geodesic distance of 1, and the distance between A and C is abs(1 - (-1)) = 2, matching the geodesic distance of 2.
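The double centering and eigendecomposition for this small matrix can be checked directly with NumPy; this is a quick sanity-check sketch, with variable names of my own choosing:

```python
import numpy as np

D = np.array([[0., 1., 2.],
              [1., 0., 1.],
              [2., 1., 0.]])                     # the shortest_path_matrix above

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n              # centering matrix
B = -0.5 * J @ (D ** 2) @ J                      # double-centered inner products

eigvals, eigvecs = np.linalg.eigh(B)             # eigenvalues in ascending order
coords = eigvecs[:, -1] * np.sqrt(eigvals[-1])   # 1-D coordinates from the top eigenvector

# Pairwise gaps in the 1-D embedding reproduce the geodesic distances exactly
gaps = np.abs(coords[:, None] - coords[None, :])
print(coords, np.allclose(gaps, D))
```

Note that eigenvectors are only determined up to sign, so the printed coordinates may come out negated; the pairwise distances are unaffected.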
