Collision ID | Intersection | intcount
1            | Main & Side  | 1
2            | King & Queen | 3
3            | King & Queen | 3
4            | King & Queen | 3
I want to regress the number of times an intersection appears in the dataset on particular traits of that intersection. So the dependent variable would be intcount and the independent variables would be things like traffic control, speed limit, etc.
The problem is that, because of the way I have set up intcount, Stata counts each collision as a separate observation. In the table above, I really have only two observations: Main & Side with an intcount of 1, and King & Queen with an intcount of 3. But Stata sees four observations, because there are three rows with intcount = 3 and one row with intcount = 1.
I'm looking for a way to recode the intcount variable so that its value is missing in all collisions except one for each intersection. That way when I run the regression the number of observations will be the number of distinct intersections, not the total number of collisions. Any help on how to do this would be greatly appreciated. Feel free also to let me know if there is an entirely different way to approach this issue that would be better.
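One possible approach, offered as a sketch rather than a definitive answer, is to flag a single row per intersection with `egen`'s `tag()` function and set intcount to missing everywhere else. The variable names here are assumptions: `intersection` stands for the Intersection column, and `speedlimit` and `trafficcontrol` are hypothetical covariates, with `trafficcontrol` assumed to be a numeric-coded category.

```stata
* Assumed variable names: intersection, intcount, and the hypothetical
* covariates speedlimit and trafficcontrol (numeric-coded categories).

* Flag exactly one row per distinct intersection (1 = flagged, 0 = not).
egen byte onerow = tag(intersection)

* Keep intcount only on the flagged row; set it to missing elsewhere.
replace intcount = . if !onerow

* Rows with missing intcount are dropped from the estimation sample,
* so N equals the number of distinct intersections.
regress intcount speedlimit i.trafficcontrol
```

An alternative that leaves intcount untouched is to skip the `replace` step and restrict the sample instead, for example `regress intcount speedlimit i.trafficcontrol if onerow`. Either way, this assumes the intersection traits are constant across all collisions at the same intersection, so it does not matter which row is kept.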