I have data on automobile collisions in a particular region, including the street intersection where each collision took place. One variable, intcount, records how many times a collision's intersection appears in the dataset. For instance, with data on 4 collisions, one at the intersection "Main & Side" and three at the intersection "King & Queen", the data would look like this:

Collision ID   Intersection    intcount
1              Main & Side     1
2              King & Queen    3
3              King & Queen    3
4              King & Queen    3
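
For reference, a count like this can be generated roughly as follows. This is only a sketch on the toy data above, assuming the intersection is stored in a string variable; the names id and intersection are illustrative, not necessarily the ones in my real dataset.

```stata
* Toy data matching the table above; id and intersection are assumed names.
clear
input id str20 intersection
1 "Main & Side"
2 "King & Queen"
3 "King & Queen"
4 "King & Queen"
end

* Within each by-group, _N is the number of rows in that group,
* so every collision gets the total count for its intersection.
bysort intersection: generate intcount = _N

list, sepby(intersection)
```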

I want to regress the number of times an intersection appears in the dataset on particular traits of that intersection. So the dependent variable would be intcount and the independent variables would be things like traffic control, speed limit, etc.

The problem is that, because of the way I have set up the variable intcount, Stata counts each collision as a separate observation. Referring to the table above, I really have only two observations: one for Main & Side with an intcount value of 1, and one for King & Queen with an intcount value of 3. But Stata thinks I have four observations because it sees three instances of intcount=3 and one instance of intcount=1.

I'm looking for a way to recode the intcount variable so that its value is missing in all collisions except one for each intersection. That way when I run the regression the number of observations will be the number of distinct intersections, not the total number of collisions. Any help on how to do this would be greatly appreciated. Feel free also to let me know if there is an entirely different way to approach this issue that would be better.
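
To make the request concrete, here is a minimal sketch of the kind of recode I have in mind, run on the toy data above. It uses egen's tag() function to flag one row per intersection; the variable names id, intersection and tag are assumptions for illustration, and the regressors in the commented-out regress line (speedlimit, control) are hypothetical placeholders that do not exist in the toy data.

```stata
* Toy data matching the table above; variable names are assumed.
clear
input id str20 intersection intcount
1 "Main & Side"  1
2 "King & Queen" 3
3 "King & Queen" 3
4 "King & Queen" 3
end

* tag() marks exactly one row per distinct intersection with 1, the rest with 0.
egen byte tag = tag(intersection)

* Option A: set intcount to missing on every untagged row,
* so only one row per intersection enters a regression.
replace intcount = . if !tag

* Option B (instead of A): leave intcount untouched and restrict the sample.
* speedlimit and control are hypothetical regressors, hence commented out.
* regress intcount speedlimit i.control if tag
```

Either way the estimation sample should contain one row per distinct intersection. This only makes sense if the intersection traits are constant across all collisions at the same intersection, since the tagged row is an arbitrary one.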