In my dataset, each observation is an infrastructure project of type "infra" (A = building, B = bridge, etc.) in the village identified by "vID". Each village has its own unique ID. My goal is to create a new variable with unique infrastructure IDs, based on "vID" and "infra". As shown below, a single village can have multiple infrastructure projects of the same type.
Code:
----------------------------------------
       vID  duple  infra  desired
----------------------------------------
1118010006      0  A      1118010006A1
3203150004      0  A      3203150004A1
6110020012      2  A      6110020012A1
6110020012      2  A      6110020012A2
1118010002      3  A      1118010002A1
1118010002      3  A      1118010002A2
1118010002      3  A      1118010002A3
I believe one step in this process may involve concatenation, but I would first need to generate a running count (1, 2, 3, ...) of the projects of the same type within each village. This is the tricky part for me. I have tried strategies such as the one in the code box below. However, that gives me the value 2 for both projects when a village has two of them. What I want instead is 1 for the first project and 2 for the second project in the same village.
Code:
egen infraID = count(vID), by (vID)
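For comparison, here is a minimal sketch of one way to get a running count rather than a group total. It relies on `_n`, which under `by:` is the observation number within the current group. The variable names `order` and `seq` are made up for illustration, and the sketch assumes "vID" is stored as a double and "infra" as a string, which may not match the actual dataset.

```stata
* Illustrative data only: the seven rows from the listing above.
clear
input double vID str1 infra
1118010006 "A"
3203150004 "A"
6110020012 "A"
6110020012 "A"
1118010002 "A"
1118010002 "A"
1118010002 "A"
end

* Hypothetical helper: remember the original row order so that the
* numbering within each group is reproducible.
gen long order = _n

* Within each village-by-type group, _n runs 1, 2, 3, ...
bysort vID infra (order): gen seq = _n

list vID infra seq, sepby(vID)
```

Without a sort key in parentheses, the order of observations inside each group is not guaranteed, so which project gets 1 and which gets 2 could change between runs.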
If concatenation is a good way forward, I also have trouble getting the new string variable to show the full numeric segment. I would like to see "1118010006A1", but Stata displays something like "1.23e+09A1" instead. I have had no luck digging through the format guides for categorical variables. For the record, I am using Stata 15.1 on a Mac.
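A possible way around the scientific notation, sketched under a few assumptions: "vID" is numeric and stored as a double (some of the IDs above are too large for a long and too long for a float to hold exactly), "infra" is a string, and the names `order`, `seq` and `infraID` are only illustrative. The idea is to convert the number with an explicit format via `string()` before concatenating.

```stata
* Illustrative data only: the seven rows from the listing above.
clear
input double vID str1 infra
1118010006 "A"
3203150004 "A"
6110020012 "A"
6110020012 "A"
1118010002 "A"
1118010002 "A"
1118010002 "A"
end

* Running count within each village-by-type group (order keeps it stable).
gen long order = _n
bysort vID infra (order): gen seq = _n

* string() with a fixed format writes out every digit instead of
* falling back to scientific notation.
gen infraID = string(vID, "%12.0f") + infra + string(seq)

* Optional: show the numeric ID in full as well.
format vID %12.0f

* isid exits with an error if infraID does not uniquely identify rows.
isid infraID
list vID infra infraID, sepby(vID)
```

If "vID" were already a string variable, the `string()` call would be unnecessary and the pieces could be joined directly.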
Many thanks for your attention!