Dear all,

I have a dataset with 4 variables (time1, index1, time2, index2) of about 500 observations each, sorted by time1 and time2. I am trying to find:

(a) within each group of time1 == n, how many entries in index1 appear in index2 for time2 == n?
(b) and vice versa: within each group of time2 == n, how many entries in index2 appear in index1 for time1 == n?

Here is an example of my data:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int time1 byte index1 int time2 byte index2
   0 23    0 17
   0 11    0 17
1000  2    0  8
1000  3 1000  5
1000 11 1000  2
1000 19 1000 16
1000 13 1000 18
2000 16 1000 19
3000  . 1000 18
3000  . 1000  9
3000  . 2000  .
3000  . 2000  .
4000  . 3000 21
4000  6 3000  .
4000 13 3000 27
4000 12 4000 18
5000  . 4000 21
5000 12 4000 19
5000  . 4000 26
5000 12 5000 25
6000  6 5000 24
   .  . 5000  .
   .  . 5000  .
   .  . 6000  .
   .  . 6000 19
end

I am new to Stata and have done some research on this problem. It seems that, if each time1 group were the same size as its time2 counterpart, I could save a temporary file and use joinby to generate a count of the matches. My difficulty is that the number of entries in a time1 group does not generally equal the number of entries in the corresponding time2 group. How should I move forward?
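To make the goal concrete, here is a minimal sketch of one way (a) might be computed without requiring the groups to be the same size. It assumes the example data above are in memory and uses merge m:1 against the distinct (time2, index2) pairs rather than joinby. The names pairs2, in2 and matches_a are made up for illustration, and the sketch counts every non-missing row of index1 that finds a match, so repeated values within a time1 group are counted more than once.

```stata
* Sketch only: assumes time1, index1, time2, index2 are in memory
* pairs2, in2 and matches_a are hypothetical names

* Build a lookup of the distinct non-missing (time2, index2) pairs
preserve
keep time2 index2
drop if missing(time2, index2)
duplicates drop
rename (time2 index2) (time1 index1)
tempfile pairs2
save `pairs2'
restore

* Flag each (time1, index1) row that also occurs as a (time2, index2) pair
merge m:1 time1 index1 using `pairs2', keep(master match)
generate byte in2 = (_merge == 3)
drop _merge

* Number of matching rows within each time1 group
bysort time1: egen matches_a = total(in2)
```

For (b) the same steps would presumably apply with the roles of the two pairs of variables swapped. Note that merge re-sorts the data, so the original row order is not preserved.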

Thank you for reading. I really appreciate your time.

Frances Grace