Hello everyone,

I am currently sitting on an issue regarding taking the median of the 350 highest values per period. Basically I have the market equity (variable "me") for each year (variable "monthly_date") for multiple companies (variable "companies"), I want Stata to group all companies within that year, take the 350 companies with the highest market value and from the 350 companies with the highest market value I want to calculate the median.

My data looks like this:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int company float(monthly_date me)
 2 677   73280
 2 689  109610
 2 701  244600
 2 713  563190
 3 641   36570
 3 653   44750
 3 665  143170
 3 677   54250
 3 689   18690
 5 605   76750
 5 617   81250
 5 629   81000
 5 641  112480
 6 425   18190
 6 473   27920
 6 485   28800
 6 497   22210
 6 509   24310
 6 521   16050
 6 533   17640
 6 545   11290
 6 557    5120
 6 569    5460
 6 581    3220
 6 593    2930
 6 605    4780
 6 617    1110
 7 581  157430
 7 593  240560
 7 605  440400
 7 617  757340
 7 629  827040
 7 641  904180
 7 653  761700
 7 665 1041450
 7 677 1560010
 7 689 1990410
 7 701 2735150
 7 713 3030950
 8 425   77470
 8 437   92210
 8 449   80130
 8 461  174640
 8 473  191140
 8 485  825560
 8 497  892010
 8 509  340390
 8 521   95670
 8 533  191290
 8 545  287880
 8 557  953860
 8 569 1232120
 8 581  949150
 8 593  986490
 8 605 1426260
 8 617 2553610
 8 629 2979150
 8 641 4589000
 8 653 5965620
 8 665 5379720
 8 677 3682250
 8 689 3980100
 9 689  102530
 9 701   19490
10 641  393680
10 653  843550
10 665 1238050
10 677 1849900
10 689 1221640
10 701  514660
10 713  727080
11 581    3160
11 593    6980
11 605   11150
11 617    7070
11 629    7680
11 641    7640
11 653    7930
11 665   11710
11 677   15030
11 689   11860
11 701   23300
11 713   35440
12 545     470
12 557    1440
12 569    5680
12 581    3830
12 593   10080
12 605   18070
12 617   31590
12 629   52380
12 641  111210
12 653  104650
12 665  121710
12 677  253560
12 689  392530
12 701  680960
12 713  189260
13 569   95150
13 581   48910
end
format %tm monthly_date
I was thinking to do the command rowsort but I am not sure how to use it when I have a time variable that needs to be considered

After collecting the data I would simply run this command:
Code:
bys monthly_date: egen size = pctile(me), p(50)

Happy New Year to everyone and stay healthy