I have a very large dataset that includes people ("personid") who could have been seen at multiple sites (distinguished by "siteid"). The variable "monthyear" is the month and year in which the visit occurred. I want to determine:
1) the counts of people who were seen at more than one site, and which combinations of sites these were (for example, in the dataex example below with 4 sites, how many people were seen at sites 19 and 32, 19 and 24, 19 and 47, 32 and 47, etc.)
2) how many people were seen at site 24 AFTER being seen at site 19 (essentially the same table as in my first question but incorporating time)
3) how many people were seen specifically at site 24 AFTER site 19 within 3 months
Any advice much appreciated!
Sarah
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long personid byte siteid str7 monthyear
2 19 "2021_03"
2 19 "2020_07"
3 19 "2019_11"
51 19 "2020_03"
51 19 "2019_01"
51 19 "2019_07"
51 19 "2019_01"
51 19 "2020_12"
52 32 "21-Jan"
52 19 "2020_03"
52 32 "21-Feb"
52 19 "2020_01"
52 19 "2020_03"
53 19 "2021_09"
53 19 "2020_02"
82 19 "2019_10"
82 19 "2021_04"
82 19 "2020_02"
83 19 "2022_02"
83 19 "2019_07"
83 19 "2022_02"
83 19 "2021_02"
83 19 "2022_02"
84 19 "2019_01"
84 47 "19-Oct"
84 19 "2019_04"
84 19 "2019_01"
84 19 "2019_05"
84 19 "2020_08"
84 19 "2019_01"
145 32 "21-Apr"
145 32 "21-Mar"
145 32 "21-Feb"
214 32 "21-Jan"
217 47 "20-Jan"
246 47 "20-Sep"
257 32 "21-Jan"
300 47 "20-May"
306 47 "18-Apr"
335 47 "18-Dec"
347 32 "21-Sep"
347 32 "21-Feb"
375 47 "18-Oct"
379 32 "21-Jan"
379 32 "20-Nov"
399 47 "19-Feb"
432 32 "20-Oct"
432 32 "21-Mar"
432 32 "21-Jan"
432 32 "21-Apr"
432 32 "20-Sep"
432 32 "21-Jan"
432 32 "20-Mar"
432 32 "20-Dec"
432 32 "22-Jan"
432 32 "21-Jan"
432 32 "20-Dec"
451 47 "18-Mar"
451 47 "18-Mar"
451 47 "20-Oct"
451 47 "19-Jul"
451 47 "18-Jul"
451 47 "18-Mar"
452 32 "19-Oct"
452 32 "20-Dec"
457 47 "19-Mar"
463 47 "18-Jun"
539 47 "19-Feb"
545 47 "20-Jan"
545 47 "20-Sep"
545 47 "20-Jan"
552 47 "20-Mar"
570 47 "18-Jan"
570 47 "18-Mar"
570 47 "18-Mar"
8775 47 "20-Jan"
8779 24 "17-Feb"
8779 24 "17-Sep"
8781 47 "19-Aug"
8781 47 "18-Oct"
8795 47 "20-Oct"
8795 47 "18-Oct"
8798 32 "21-Nov"
8806 47 "22-Feb"
8806 47 "21-Aug"
8806 47 "21-Oct"
8806 47 "21-Sep"
8806 47 "21-Mar"
8814 47 "18-Nov"
8815 32 "19-May"
8815 32 "19-Nov"
8815 32 "19-Dec"
8815 32 "20-Mar"
8815 32 "19-Nov"
8815 32 "19-Nov"
8815 32 "20-Mar"
8828 47 "19-Sep"
8828 47 "19-Sep"
8838 47 "18-Oct"
8838 47 "18-Oct"
end
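To make the three questions concrete, here is a rough sketch of one way they might be approached, assuming the example data above is in memory. It is not a tested solution. It assumes that "2021_03" means March 2021 and that "21-Jan" means January 2021 (two-digit year first); that "after" means strictly later than the person's first visit to site 19; and that "within 3 months" is measured from that first visit. The names mdate, siteid1, siteid2, npeople, first19, after19, within3, tag_after and tag_within3 are made up for illustration.

```stata
* Convert the two string formats to one Stata monthly date (assumed meanings above)
gen mdate = monthly(subinstr(monthyear, "_", "-", .), "YM") if strpos(monthyear, "_") > 0
replace mdate = monthly(monthyear, "20YM") if strpos(monthyear, "_") == 0
format mdate %tm

* 1) People per pair of sites: one row per person-site, then pair within person
preserve
keep personid siteid
duplicates drop
rename siteid siteid2
tempfile sites
save `sites'
rename siteid2 siteid1
joinby personid using `sites'
keep if siteid1 < siteid2            // keep each unordered pair once
contract siteid1 siteid2, freq(npeople)
list, noobs
restore

* 2) Seen at site 24 after the first visit to site 19
bysort personid: egen first19 = min(cond(siteid == 19, mdate, .))
gen byte after19 = siteid == 24 & !missing(mdate, first19) & mdate > first19
egen byte tag_after = tag(personid) if after19
count if tag_after == 1

* 3) As in 2), but no more than 3 months after that first site-19 visit
gen byte within3 = after19 & (mdate - first19) <= 3
egen byte tag_within3 = tag(personid) if within3
count if tag_within3 == 1
```

In the example data only persons 52 (sites 19 and 32) and 84 (sites 19 and 47) appear at more than one site, and nobody appears at both 19 and 24, so the counts for 2) and 3) would only be informative on the full dataset. If "within 3 months" should instead be measured from any site-19 visit rather than the first, pairing each site-19 visit with each site-24 visit for the same person (for example with joinby) would be needed.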