Hello all,
How are you? I hope your week is going well!
I'm conducting a k-medoids cluster analysis using the clpam command from the clutils package, and I'm having some issues with my program. Here is a quick overview of my problem:
I have 26 waves (years) of data from the NLSY79, covering 1979-2014. In each wave, a respondent (identified by person ID "CASEID") has exactly one status from the following options:
1) Military
2) Education
3) Employment
4) Housework
5) Unemployment
6) Out of labor force (oolf)
Therefore, each respondent has a 26-year status sequence with no missing data.
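A quick way to confirm this structure before clustering, as a minimal sketch only: it assumes the data are in long format (one row per respondent per wave) and that the six statuses are stored as the 0/1 indicator variables named in the matrix dissim call further down. The names nstatus and nwaves are hypothetical helpers introduced here.

```stata
* Assumption: long format, one row per CASEID per wave, six 0/1 status indicators.

* Exactly one status should be switched on in every row
egen byte nstatus = rowtotal(military education employment housekeeping unemployment oolf)
assert nstatus == 1

* Every respondent should contribute exactly 26 rows (one per wave)
bysort CASEID: gen int nwaves = _N
assert nwaves == 26

drop nstatus nwaves
```

If either assert fails, Stata stops and reports how many rows contradict the claim, which points to respondents whose sequences are incomplete or ambiguous.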
Following Professor Halpin, the creator of the clutils package, and his 2017 presentation on cluster analysis, I want to conduct a k-medoids cluster analysis in Stata and sort all the sequences into five (5) distinct clusters, using clpam. My ideal clusters would be something like:
1) Lifetime military, including military all the way or military to further education (GI Bill) then back to military
2) Military to labor market without further education
3) Military to further education to the civilian labor market
4) Labor market all the way without higher education
5) Higher education to labor market
My code is:
sort CASEID
matrix dissim subcost = education military employment unemployment housekeeping oolf, variables matching dissim(oneminus) allbinary
matrix subA = subcost[1..6, 1..6]
clpam k5, dist(subcost) id(CASEID) medoids(5) many
However, an error message showed up, saying "variable CASEID does not uniquely identify the observation." If I understand the error correctly, it is because I have 26 waves, so each CASEID shows up 26 times and therefore does not uniquely identify the observations, which is true.
I'm supposed to cluster the status sequences, not the individual statuses from each year. I don't think my code is doing what I want to accomplish.
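One possible first step, sketched under assumptions rather than tested on the NLSY79 file, is to reshape the data so that each CASEID is a single row holding the whole sequence. The sketch assumes a wave identifier, hypothetically named year, alongside the six 0/1 indicators from the code above. The names status and wave are introduced here for illustration only.

```stata
* Assumption: long format with a wave identifier (hypothetically named year)
* and six 0/1 indicators, exactly one of which equals 1 in each row.

* Collapse the six indicators into one categorical status, coded 1-6
gen byte status = .
replace status = 1 if military == 1
replace status = 2 if education == 1
replace status = 3 if employment == 1
replace status = 4 if housekeeping == 1
replace status = 5 if unemployment == 1
replace status = 6 if oolf == 1

* Number the waves 1..26 within person so the wide variables are status1-status26
bysort CASEID (year): gen byte wave = _n

* One row per respondent, one column per wave
keep CASEID wave status
reshape wide status, i(CASEID) j(wave)

* isid exits with an error if CASEID still does not identify the rows
isid CASEID
```

With one row per respondent, CASEID would be unique. Note also that matrix dissim with the variables option compares the six variables with each other and so returns a 6x6 matrix, whereas clpam would presumably need pairwise distances between the respondents' 26-element sequences (for example from an optimal matching command).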
If possible, would anyone please point me to the right direction?
Any help would be much appreciated!
Thank you very much!
Have a great day!
Rachelle