Hi All,
I have a large balanced panel dataset, and I want to draw a random sample by ID while preserving the panel structure of the data.
I know I should use the "sample" command, but I am not sure how I could keep the panel structure intact.
I want to sample individuals rather than observations: if a person is sampled, I want to keep all of their data across the different periods.
The only approach I can think of is cumbersome: drop duplicates by idcode, sample from the unique idcodes, save the result, and then merge it back with the original dataset, keeping only the matched observations. This could take a very long time with my large dataset, so I am wondering whether any of you have a better idea.
Thank you so much for your help,
Alex
For example, I want to sample 5% of the idcodes, and if an idcode is sampled, I would like to retain the data for all of its periods.
webuse nlswork, clear
sort idcode year
list idcode year in 1/50
idcode year
1. 1 70
2. 1 71
3. 1 72
4. 1 73
5. 1 75
6. 1 77
7. 1 78
8. 1 80
9. 1 83
10. 1 85
11. 1 87
12. 1 88
13. 2 71
14. 2 72
15. 2 73
16. 2 75
17. 2 77
18. 2 78
19. 2 80
20. 2 82
21. 2 83
22. 2 85
23. 2 87
24. 2 88
25. 3 68
26. 3 69
27. 3 70
28. 3 71
29. 3 72
30. 3 73
31. 3 75
32. 3 77
33. 3 78
34. 3 80
35. 3 82
36. 3 83
37. 3 85
38. 3 87
39. 3 88
40. 4 70
41. 4 71
42. 4 72
43. 4 73
44. 4 75
45. 4 80
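
One possible way to avoid the save-and-merge step is to draw a single random number per idcode and copy it to all of that person's rows, so that a person is either kept or dropped as a whole. The following is only a minimal sketch on the nlswork example. The variable names pick1 and u and the seed value are made up for illustration, and the result keeps approximately 5% of the idcodes, not exactly 5%.

```stata
* Sketch: keep roughly 5% of persons together with all of their observations.
webuse nlswork, clear
set seed 12345                  // arbitrary seed, only for reproducibility

* Flag exactly one observation per idcode (pick1 is a hypothetical name).
egen byte pick1 = tag(idcode)

* Draw one uniform random number per person, on the flagged row only.
generate double u = runiform() if pick1

* Copy that draw to every row of the same person.
* Missing values sort last, so u[1] is the person's single draw.
bysort idcode (u): replace u = u[1]

* Keep a person's full history when the draw falls below 0.05.
keep if u < 0.05

* Clean up and restore the panel sort order.
drop pick1 u
sort idcode year
```

Because every row of a person carries the same draw, the keep step never splits a person's history, and no second file or merge is needed. If exactly 5% of the idcodes is required, the per-person draws would have to be ranked and the lowest 5% kept, which this sketch does not do.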