Hi everyone,

I am trying to generate a dummy variable that equals 1 when the variable of interest (the MA-Score, MA_SCORE_2018_w in the sample data below) is in the top quartile in both years t-2 and t-1. The paper's instructions are as follows: "To identify high-ability managers, we first form quartiles (by industry and year) of the MA-Score. We define High-Ability Managers as those in the top quartile of MA-Score in both years t-2 and t-1. This approach reduces the likelihood that idiosyncratic performance in a single year affects our identification of high-ability managers. Note that we do not expect managerial ability to change in the short run. Rather, we consider the scores across 2 years to reduce possible measurement error".

Can anyone help me implement these instructions? In particular, I don't understand how to form the quartiles by industry and year (sic_2 is my industry variable) and then build the dummy from them.

Sample data:
clear
input double year long gvkey float sic_2 double MA_SCORE_2018_w
1984 1001 58 .1674201
1985 1001 58 .0530939
1983 1003 57 .048832
1984 1003 57 .0081078
1986 1003 57 .0695462
1987 1003 57 .1106393
1988 1003 57 .0730525
1989 1003 57 .0283304
1980 1004 50 -.0183764
1981 1004 50 -.0333748
1982 1004 50 -.0341477
1983 1004 50 -.0444578
1984 1004 50 -.0505183
1985 1004 50 -.0110314
1986 1004 50 -.0288378
1987 1004 50 -.0385843
1988 1004 50 -.0431447
1989 1004 50 -.0293015
1990 1004 50 -.0577608
1991 1004 50 -.0493341
1992 1004 50 -.0543336
1993 1004 50 -.0513123
1994 1004 50 -.0827768
1995 1004 50 -.0793594
1996 1004 50 -.0692754
1997 1004 50 .0072931
1998 1004 50 -.0387679
1999 1004 50 -.0730754
2000 1004 50 -.0696314
2001 1004 50 -.020128
2002 1004 50 -.0847194
2003 1004 50 -.0698555
2004 1004 50 -.0677616
2005 1004 50 -.0726101
2006 1004 50 -.0773012
2007 1004 50 -.0549248
2008 1004 50 -.089513
2009 1004 50 -.0799284
2010 1004 50 -.0529496
2011 1004 50 -.0370836
end
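
One possible reading of the quoted procedure, as a minimal sketch rather than a confirmed implementation. It assumes that sic_2 is the industry identifier, that gvkey and year uniquely identify observations, and that "top quartile" means strictly above the 75th percentile of the MA-Score within each industry-year cell. The paper does not say how ties at the cutoff are handled, and the names p75, top_q and high_ability are made up for illustration.

```stata
* 75th percentile of the MA-Score within each industry-year cell
* (assumption: sic_2 is the industry variable)
egen double p75 = pctile(MA_SCORE_2018_w), p(75) by(sic_2 year)

* 1 if the firm-year is in the top quartile of its cell, 0 otherwise;
* missing when the score itself is missing
* (assumption: "top quartile" = strictly above the 75th percentile)
gen byte top_q = MA_SCORE_2018_w > p75 if !missing(MA_SCORE_2018_w)

* declare the panel so that L1. and L2. refer to calendar years t-1 and t-2
xtset gvkey year

* 1 if the firm was in the top quartile in both t-1 and t-2;
* missing when either lag is unavailable (for example, a gap in the panel)
gen byte high_ability = L1.top_q == 1 & L2.top_q == 1 ///
    if !missing(L1.top_q, L2.top_q)
```

Because the lags come from xtset rather than from _n-1 and _n-2, a firm with a missing year (gvkey 1003 has no 1985 observation) gets a missing value instead of silently picking up the wrong year. In the sample above most industry-year cells hold a single firm, so the cutoffs are only meaningful on the full dataset.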