Spectral Clustering and GMM
How does Spectral Clustering work?
In spectral clustering, the data points are treated as nodes of a graph. Thus, clustering is treated as a graph partitioning problem. The nodes are then mapped to a low-dimensional space that can be easily segregated to form clusters. An important point to note is that no assumption is made about the shape/form of the clusters.
What are the steps for Spectral Clustering?
Spectral clustering involves 3 steps:
1. Compute a similarity graph
2. Project the data onto a low-dimensional space
3. Create clusters
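The three steps above can be sketched in plain Python for the simplest case of two clusters. This is a minimal illustration under several assumptions that are not from the text: a fully connected graph with Gaussian similarities, the unnormalized Laplacian L = D - W, power iteration to approximate the second eigenvector, and a sign split in place of a k-means step. All function names are hypothetical.

```python
import math

def similarity_graph(points, sigma=1.0):
    """Step 1: fully connected graph with Gaussian edge weights (assumed choice)."""
    n = len(points)
    return [
        [0.0 if i == j
         else math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]

def fiedler_vector(W, iters=200):
    """Step 2: one-dimensional embedding from the Laplacian L = D - W."""
    n = len(W)
    degree = [sum(row) for row in W]
    shift = 2 * max(degree)            # upper bound on the eigenvalues of L
    v = [float(i) for i in range(n)]   # deterministic starting vector
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]      # remove the trivial constant eigenvector
        # Power iteration step: w = (shift * I - L) v
        w = [(shift - degree[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def spectral_bisection(points, sigma=1.0):
    """Step 3: split the embedded points into two clusters by sign."""
    embedding = fiedler_vector(similarity_graph(points, sigma))
    return [1 if x > 0 else 0 for x in embedding]

points = [(0, 0), (0, 1), (1, 0), (8, 8), (8, 9), (9, 8)]
print(spectral_bisection(points))
```

For more than two clusters, one would typically keep several eigenvectors and run a clustering algorithm such as k-means on the embedded points; the sign split here only covers the two-cluster case.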
Step 1 — Compute a similarity graph:
METHODS FOR DETERMINING OPTIMAL NUMBER OF CLUSTERS:
A fundamental step for many clustering algorithms is to determine the optimal number of clusters, k, into which the data may be grouped. The Elbow Method is one of the most popular methods for choosing this value of k.
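The Elbow Method can be sketched as follows: run k-means for several values of k, record the within-cluster sum of squared distances (often called inertia), and look for the k after which the improvement flattens. The code below is a minimal sketch, not a reference implementation. The toy data, the deterministic farthest-point initialisation, and the "largest change in slope" rule for picking the elbow are all assumptions made for illustration; in practice the elbow is usually read off a plot.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_inertia(points, k, iters=50):
    """Run a basic k-means and return the within-cluster sum of squares."""
    # Deterministic farthest-point initialisation (assumption, for reproducibility).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def elbow(inertias):
    """Pick the k where the drop in inertia slows down the most (heuristic)."""
    ks = sorted(inertias)
    return max(
        ks[1:-1],
        key=lambda k: (inertias[k - 1] - inertias[k]) - (inertias[k] - inertias[k + 1]),
    )

# Toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 8), (5, 9), (6, 8), (6, 9)]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
best_k = elbow(inertias)
print(inertias)
print("elbow at k =", best_k)
```

On this toy data the inertia falls sharply up to k = 3 and barely changes afterwards, which is the "elbow" the method looks for.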
Support Vector Machines
HIERARCHICAL AGGLOMERATIVE CLUSTERING:
Implementation code in python:
Hierarchical agglomerative clustering (HAC) is also known as the bottom-up approach. It produces a structure that is more informative than the unstructured set of clusters returned by flat clustering, and it does not require us to prespecify the number of clusters. Bottom-up algorithms treat each data point as a singleton cluster at the outset and then successively agglomerate (merge) pairs of clusters until all clusters have been merged into a single cluster that contains all the data.
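The bottom-up procedure described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the article's own implementation: it assumes single linkage (the distance between two clusters is the distance between their closest members) and Euclidean distance, and the function name and toy data are hypothetical.

```python
import math

def agglomerative(points):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [[i] for i in range(len(points))]   # every point starts as a singleton
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return merges

points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
merges = agglomerative(points)
for left, right, distance in merges:
    print(left, "+", right, "at distance", round(distance, 2))
```

The merge history is the hierarchy: stopping the loop early (or cutting the history at a chosen distance) yields a flat clustering with however many clusters are wanted, which is why the number of clusters does not need to be fixed in advance.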
Word embeddings are texts converted into numbers, and the same text may have several different numerical representations.
“Word Embeddings are Word converted into numbers”
A dictionary may be the list of all unique words in the sentence. So, a dictionary may look like: ['Word', 'Embeddings', 'are', 'Converted', 'into', 'numbers']
A vector representation of a word may be a one-hot encoded vector, where 1 marks the position of the word in the dictionary and 0 appears everywhere else. According to the above dictionary, the vector representation of “numbers” in this format is [0,0,0,0,0,1], and that of “Converted” is [0,0,0,1,0,0].
This is just a…
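The one-hot encoding described above can be reproduced with a short Python sketch. The dictionary is copied from the text; the function name `one_hot` and the case-insensitive lookup are assumptions added for illustration.

```python
dictionary = ["Word", "Embeddings", "are", "Converted", "into", "numbers"]

def one_hot(word, vocabulary):
    """Return a vector with 1 at the word's position in the vocabulary, 0 elsewhere."""
    lowered = [w.lower() for w in vocabulary]
    index = lowered.index(word.lower())   # raises ValueError for unknown words
    return [1 if i == index else 0 for i in range(len(vocabulary))]

print(one_hot("numbers", dictionary))    # [0, 0, 0, 0, 0, 1]
print(one_hot("converted", dictionary))  # [0, 0, 0, 1, 0, 0]
```

Each vector is as long as the dictionary, so this representation grows with the vocabulary and carries no notion of similarity between words.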
DBSCAN Clustering Algorithm
Data analytics and Naval architecture. Undergraduate at IIT Kharagpur. Loves to spend time with football.