Hello!

For my thesis I want to test a specific hypothesis that requires me to generate a variable UNDERPERFORMING indicating whether a given company (GVKEY) underperformed in a given year (YEAR).

My data has a panel structure. Every company has a unique company ID (GVKEY) and belongs to a Standard Industrial Classification (SIC) category. The SIC code consists of 2 digits (ranging from 20 to 39).

I would like to know whether there is a quick way to generate a dummy variable that equals 1 if
- a company belongs to the bottom 25% of companies (relative to the SIC median) based on a profit margin measure (EBIT)
- in that year only
- within its SIC category

For example: company 1932, SIC=21, YEAR=2010, EBIT=5%
--> in the UNDERPERFORMING column, the value should equal 1 if that 5% falls in the lowest 25% of EBIT values across all firms in SIC=21 in 2010

The same rule should apply to every observation, of course.
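
To make the rule concrete, here is a minimal sketch in Python of what I am after. It rests on several assumptions of mine: I read "bottom 25%" as "EBIT at or below the first quartile of its SIC-YEAR group", the quartile is computed with the default method of `statistics.quantiles`, groups with fewer than two firms get no flag, EBIT is stored in percent, and the function name `flag_underperformers` and every sample row other than company 1932 are made up for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def flag_underperformers(rows):
    """Return copies of rows with an UNDERPERFORMING key (1, 0, or None)."""
    # Collect the EBIT values of every (SIC, YEAR) group.
    groups = defaultdict(list)
    for row in rows:
        groups[(row["SIC"], row["YEAR"])].append(row["EBIT"])

    # First-quartile cutoff per group (assumption: default "exclusive" method).
    # Groups with fewer than 2 firms get no cutoff.
    cutoffs = {
        key: quantiles(values, n=4)[0]
        for key, values in groups.items()
        if len(values) >= 2
    }

    flagged = []
    for row in rows:
        cutoff = cutoffs.get((row["SIC"], row["YEAR"]))
        # None marks groups too small to rank; otherwise 1 if at or below cutoff.
        flag = None if cutoff is None else int(row["EBIT"] <= cutoff)
        flagged.append({**row, "UNDERPERFORMING": flag})
    return flagged


# Hypothetical sample data; EBIT is in percent.
rows = [
    {"GVKEY": 1932, "SIC": 21, "YEAR": 2010, "EBIT": 5.0},
    {"GVKEY": 1001, "SIC": 21, "YEAR": 2010, "EBIT": 10.0},
    {"GVKEY": 1002, "SIC": 21, "YEAR": 2010, "EBIT": 15.0},
    {"GVKEY": 1003, "SIC": 21, "YEAR": 2010, "EBIT": 20.0},
    {"GVKEY": 1004, "SIC": 21, "YEAR": 2010, "EBIT": 25.0},
    {"GVKEY": 2001, "SIC": 35, "YEAR": 2010, "EBIT": 2.0},
]
result = flag_underperformers(rows)
for row in result:
    print(row["GVKEY"], row["SIC"], row["YEAR"], row["UNDERPERFORMING"])
```

With this made-up sample, company 1932 is the only firm in SIC=21 for 2010 that gets a 1, and the single firm in SIC=35 is left unflagged because its group is too small to rank. A different quartile definition or a different treatment of ties at the cutoff would change which borderline firms are flagged.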

Right now, I only know how to do this manually, and that would cost me a lot of time.

I know it is quite complicated, and I would be very grateful if someone could help me!