Hello,

My data set consists of funds and their historical returns from 2004 to 2018. As a first step, I want to compare the performance of the best-performing 20% and the worst-performing 20% of funds (determined separately for each observation date) over the whole time period.

My variables are id (different for each fund), date, and hret (historical return).
I have already built dummy variables for the best and worst 20% of funds on each date. Now I want to test whether the difference in performance between these two groups is statistically significant over the whole time period (2004-2018).
So far, my code looks as follows:

Code:
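* xtile() is an egen function from the egenmore package (SSC); install once
ssc install egenmore
* Deciles of hret, computed separately within each date
egen hret_decile = xtile(hret), by(date) nq(10)
* Flag the top 20% (deciles 9-10) and the bottom 20% (deciles 1-2); all other funds are left missing
gen byte top_performer_hret = 1 if inlist(hret_decile, 9, 10)
gen byte bottom_performer_hret = 1 if inlist(hret_decile, 1, 2)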
I now have two dummy variables that flag each date’s top and bottom performers.

How can I now compare these two groups with respect to their hret (historical return)? I would like to see the mean hret of each group and use a t-test to check whether the difference is statistically significant. However, I do not know how to do this. I would appreciate any advice.
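
To make the question concrete, here is a rough sketch of the kind of comparison I have in mind. The variable name group is made up for illustration, and the test simply pools all fund-date observations, which ignores that the same fund appears on many dates, so I am not sure it is appropriate:

Code:
* Hypothetical single grouping variable: 1 = top 20%, 0 = bottom 20%, missing otherwise
gen byte group = 1 if inlist(hret_decile, 9, 10)
replace group = 0 if inlist(hret_decile, 1, 2)
* Mean, standard deviation, and number of observations of hret in each group
tabstat hret, by(group) statistics(mean sd n)
* Two-sample t-test of hret between the groups, pooled over all dates, allowing unequal variances
ttest hret, by(group) unequal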

Furthermore, is my approach of using dummy variables the best one or are there other (perhaps more suitable) possibilities?

Thank you for your help,

Tim Wolf