Overview
ISOMAP (Isometric Mapping) is a technique used for dimensionality reduction. It is one of the Non-linear dimensionality reduction methods that provides a low-dimensional embedding of high-dimensional data. Unlike PCA or SVD which are linear methods and look for linear correlations in the data, ISOMAP can handle non-linear correlations.
Here’s how it works:
- Step 1 – Neighborhood Graph Construction:First, ISOMAP constructs a neighborhood graph. For each data point, it computes the Euclidean distance to every other data point. If two data points are “close enough” (i.e., if the distance is less than a certain threshold, or if the data point is within the ‘k’ nearest neighbors), they are connected with an edge in the graph. The weight of that edge is the Euclidean distance between the points.
- Step 2 – Compute Geodesic Distances:In the second step, ISOMAP approximates geodesic distances between all pairs of data points. The geodesic distance between two data points is the shortest path in the graph that connects them. This can be computed efficiently using Dijkstra’s algorithm or the Floyd-Warshall algorithm.
- Step 3 – Multidimensional Scaling (MDS):The third step is where the actual dimensionality reduction happens. The algorithm applies classical multidimensional scaling (MDS) to the matrix of geodesic distances, computing a low-dimensional embedding of the data that preserves these distances as well as possible.MDS tries to place each object in N-dimensional space such that the between-object distances are preserved as well as possible. If the dimension of the space is less than the number of objects, then perfect reconstruction will usually be impossible.
The benefit of ISOMAP compared to PCA or SVD is that it can handle non-linear relationships between variables. PCA or SVD are linear techniques and they attempt to preserve the global structure of the data in the reduced space. ISOMAP, on the other hand, tries to preserve the geodesic (or non-linear) distances between all pairs of data points. This makes it capable of uncovering non-linear structure in the data that PCA or SVD may not find. For example, if your data lies on a curved manifold in high-dimensional space, ISOMAP may be able to “unroll” this manifold and find a low-dimensional embedding that preserves the intrinsic geometry of the data.