Dear Stata users,

I am working with panel data on funds and am looking for a way to calculate standard errors (SEs) of a single variable (return) on a given day t. These SEs need to be clustered by the respective values of cluster_variable (which refers to different investment styles in this case). That is, I want the SEs to be calculated only over the observations with the same cluster_variable on day t, and not over the whole sample on that day. As you can see, cluster_variable is constant over time for each fund.

Here is a short example.



Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(fund t) double return byte cluster_variable
1 1  .1 1
2 1  .2 1
3 1 .08 2
4 1  .9 2
5 1  .7 2
1 2  .4 1
2 2  .5 1
3 2 .03 2
4 2  .2 2
5 2  .4 2
end




I have considered computing the SDs and then counting the observations (obs) within each cluster_variable and day t to obtain the SEs, following SE = SD/sqrt(obs). So I started with: egen SD = sd(return), by(cluster_variable t) to generate the following.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(fund t) double return byte cluster_variable float SD
1 1  .1 1 .07071068
2 1  .2 1 .07071068
3 1 .08 2  .4275512
4 1  .9 2  .4275512
5 1  .7 2  .4275512
1 2  .4 1 .07071068
2 2  .5 1 .07071068
3 2 .03 2  .1852026
4 2  .2 2  .1852026
5 2  .4 2  .1852026
end


Can anyone suggest a more elegant way to derive the desired SEs, or show me how to count the number of observations with the same cluster_variable on a given day t?
The counting result (obs) should look like this in a new variable:

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(fund t) double return byte cluster_variable float(SD obs)
1 1  .1 1 .07071068 2
2 1  .2 1 .07071068 2
3 1 .08 2  .4275512 3
4 1  .9 2  .4275512 3
5 1  .7 2  .4275512 3
1 2  .4 1 .07071068 2
2 2  .5 1 .07071068 2
3 2 .03 2  .1852026 3
4 2  .2 2  .1852026 3
5 2  .4 2  .1852026 3
end
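
For reference, the SE = SD/sqrt(obs) idea could be sketched as below. This is only a rough sketch, assuming that egen's count() function (which counts the nonmissing values of return within each by-group) gives the obs I am after; the variable name SE is just illustrative.

```stata
* Sketch only: SD, obs and SE are illustrative variable names
clear
input byte(fund t) double return byte cluster_variable
1 1  .1 1
2 1  .2 1
3 1 .08 2
4 1  .9 2
5 1  .7 2
1 2  .4 1
2 2  .5 1
3 2 .03 2
4 2  .2 2
5 2  .4 2
end

* SD of return within each cluster_variable on each day t
bysort cluster_variable t: egen SD = sd(return)

* Number of nonmissing return values in the same group
* (assumption: this is the count that should enter the SE)
bysort cluster_variable t: egen obs = count(return)

* Standard error of the group mean: SE = SD/sqrt(obs)
generate SE = SD / sqrt(obs)

sort t fund
list, sepby(t)
```

Groups with a single observation would get a missing SD, and therefore a missing SE, under this approach.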



The data above are a simplified example. The real dataset has more than 1,000 funds and around 12 distinct values of cluster_variable.

Best,
Daniel