I want to randomly select 40 observations from a dataset for each value of INSURED (1=yes, 0=no) within each CITY. I also want to keep the whole dataset and only create a dummy variable (SELECT) that marks whether each observation was randomly selected.
I figure that the code for the random selection is:
sample 40, count by(INSURED CITY)
But I am having trouble keeping my complete dataset and only creating the SELECT variable, because sample drops the observations that are not selected.
(Simple sample of my dataset)
| Unique_ID | CITY | INSURED | SELECT |
| 34 | TOR | 1 | 1 |
| 35 | BOS | 0 | 0 |
| 36 | BOS | 0 | 0 |
| 37 | BOS | 1 | 0 |
| 38 | BOS | 1 | 0 |
| 39 | LAX | 1 | 0 |
| 40 | LAX | 1 | 0 |
| 41 | LAX | 0 | 0 |
| 42 | LAX | 0 | 0 |
| 43 | LAX | 1 | 0 |
| 44 | TOR | 0 | 1 |
| 45 | TOR | 0 | 0 |
| 46 | TOR | 1 | 0 |
| 47 | TOR | 1 | 1 |
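
One possible way to get the flag without dropping anything is to skip sample altogether: give every observation a random number, sort on it within each CITY-by-INSURED cell, and mark the first 40 rows of each cell. The following is only a sketch under a few assumptions: SELECT does not already exist in the data, Unique_ID uniquely identifies observations, and the seed value and the helper variable u are made up for illustration.

```stata
* Sketch: flag up to 40 randomly chosen observations per CITY-by-INSURED cell
* while keeping every observation in memory.
* Assumptions: SELECT does not exist yet; the seed (12345) and the helper
* variable u are illustrative choices, not taken from the original post.

isid Unique_ID                 // stop if Unique_ID is not a unique identifier
sort Unique_ID                 // fixed starting order so the draw is reproducible
set seed 12345                 // any seed works; it only makes the draw repeatable

generate double u = runiform() // one uniform random number per observation

* Within each cell, order observations by u and mark the first 40.
* Cells with fewer than 40 observations end up entirely marked.
bysort CITY INSURED (u): generate byte SELECT = (_n <= 40)

drop u                         // the helper variable is no longer needed
```

Running tabulate CITY INSURED if SELECT == 1 afterwards should show at most 40 flagged observations in each cell, which is a quick way to check the result.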