I want to randomly select 40 observations from a dataset for each value of INSURED (1 = yes, 0 = no) within each CITY. I also want to keep the whole dataset and only create a dummy variable (SELECT) that marks whether each observation was randomly selected or not.
I figure that the code for the random selection is:
sample 40, count by(INSURED CITY)
But I am having trouble keeping my complete dataset and only creating the SELECT variable, because sample drops the observations that were not selected.
(Simple sample of my dataset)
Unique_ID | CITY | INSURED | SELECT |
34 | TOR | 1 | 1 |
35 | BOS | 0 | 0 |
36 | BOS | 0 | 0 |
37 | BOS | 1 | 0 |
38 | BOS | 1 | 0 |
39 | LAX | 1 | 0 |
40 | LAX | 1 | 0 |
41 | LAX | 0 | 0 |
42 | LAX | 0 | 0 |
43 | LAX | 1 | 0 |
44 | TOR | 0 | 1 |
45 | TOR | 0 | 0 |
46 | TOR | 1 | 0 |
47 | TOR | 1 | 1 |
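
One possible way to get this without dropping anything is to skip sample altogether: give every observation a random number, sort on it within each CITY and INSURED cell, and flag the first 40 rows of each cell. The following is only a minimal sketch under some assumptions: the seed value and the helper variable u are arbitrary choices of mine, Unique_ID is assumed to identify rows uniquely (as in the sample table), and SELECT is assumed not to exist yet in the data.

```stata
* Sketch: flag up to 40 random observations per CITY x INSURED cell
* while keeping every observation in memory.
* Assumptions: the seed (12345) and the helper variable u are arbitrary;
* SELECT does not exist yet (drop it first if it does).

set seed 12345                    // any fixed seed makes the draw repeatable
sort Unique_ID                    // fix the row order before drawing random numbers
generate double u = runiform()    // one uniform random number per observation

* Within each CITY x INSURED cell, order rows by u and mark the first 40
bysort CITY INSURED (u): generate byte SELECT = (_n <= 40)

drop u
sort Unique_ID                    // restore the original order

* Check: number of selected observations in each cell
tabulate CITY INSURED if SELECT == 1
```

With this approach, a cell that has fewer than 40 observations ends up with all of its observations flagged, so the tabulate line at the end is worth checking before using SELECT.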