I have a dataset of workers that I want to divide into clusters based on several observables, one of which is categorical (industry). I am trying to use k-means clustering. Instead of using many dummy variables for the different categories, I built a similarity measure between each pair of industries and want to use it in the k-means algorithm. My question is: how can I use this pre-existing similarity matrix in the k-means computation, together with other continuous variables (e.g., education)? What I am doing right now is first collapsing the similarity matrix into 2 or 3 dimensions using multidimensional scaling and then using the resulting coordinates in k-means. Is there a way to use the similarity matrix directly?
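
To make the current two-step workflow concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not the code actually used. The 4x4 industry similarity matrix, the worker data, and the helper names (`classical_mds`, `standardize`, `kmeans`) are all hypothetical. It also assumes similarities lie in [0, 1] and can be turned into dissimilarities as 1 - similarity, which may not fit every similarity measure.

```python
import numpy as np

def classical_mds(similarity, n_dims=2):
    """Classical (Torgerson) MDS: similarity matrix -> n_dims coordinates per item."""
    sim = np.asarray(similarity, dtype=float)
    dist = 1.0 - sim  # assumption: similarity 1 means identical, 0 means unrelated
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]  # keep the largest eigenvalues
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against small negative values
    return eigvecs[:, order] * np.sqrt(top_vals)

def standardize(x):
    """Rescale a variable to mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from the data."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical similarity between 4 industries (symmetric, ones on the diagonal).
industry_similarity = np.array([
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
])

# Hypothetical workers: industry index and years of education.
worker_industry = np.array([0, 1, 0, 1, 2, 3, 2, 3])
worker_education = np.array([12, 13, 12, 14, 16, 18, 17, 16])

# Step 1: collapse the similarity matrix into 2 coordinates per industry.
industry_coords = classical_mds(industry_similarity, n_dims=2)

# Step 2: give each worker the coordinates of their industry, add education, cluster.
features = np.column_stack([
    industry_coords[worker_industry],
    standardize(worker_education),
])
labels, centers = kmeans(features, k=2)
print(labels)
```

Note that the relative scale of the MDS coordinates and the standardized education variable determines how much weight industry gets in the clustering, so that scaling is a modeling choice rather than something the sketch settles.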