I am working with a dataset of firms in Brazil and I need to create some descriptive statistics at the municipal level.
In particular, I would like to create two variables.
The first should contain the name of the municipality (I have a string variable for municipality) with the largest value of a given numeric variable, named 'order_var'. The second should contain the name of the municipality with the second-largest value of 'order_var'.
The dataset looks like this:
municipality | order_var |
SALVADOR | 21.5 |
BELO HORIZONTE | 15.8 |
RIO DE JANEIRO | 31 |
CANOAS | 29.5 |
CAXIAS DO SUL | 25.4 |
and I would like it to look like this:
municipality | order_var | max1_var | max2_var |
SALVADOR | 21.5 | RIO DE JANEIRO | CANOAS |
BELO HORIZONTE | 15.8 | RIO DE JANEIRO | CANOAS |
RIO DE JANEIRO | 31 | RIO DE JANEIRO | CANOAS |
CANOAS | 29.5 | RIO DE JANEIRO | CANOAS |
CAXIAS DO SUL | 25.4 | RIO DE JANEIRO | CANOAS |
where 'max1_var' is the first variable described above and 'max2_var' is the second.
Can you help me with that?
Thank you very much.
Any help is greatly appreciated.
Note: here is the code to import the fictitious data (let's suppose that we have 10 firms):
clear
input str25 municipality order_var
"SALVADOR" 21.5
"BELO HORIZONTE" 15.8
"RIO DE JANEIRO" 31
"CANOAS" 29.5
"CAXIAS DO SUL" 25.4
end
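For reference, one possible approach is sketched below. It is not a definitive solution and assumes one observation per municipality, as in the example data; with several firms per municipality the data would first need to be reduced to one row per municipality. The names obs_id, max1_var and max2_var are introduced only for this illustration. The idea is to sort in descending order of 'order_var' and then read the municipality names in the first and second rows.

```stata
* Sketch only: assumes one observation per municipality, as in the example data.
* obs_id, max1_var and max2_var are names introduced for this illustration.

* Remember the original row order so it can be restored afterwards.
generate long obs_id = _n

* Sort from largest to smallest order_var. By default gsort places missing
* values last in a descending sort. Ties are broken alphabetically by
* municipality, which is an assumption rather than a requirement.
gsort -order_var municipality

* After sorting, row 1 holds the largest value and row 2 the second largest.
generate str25 max1_var = municipality[1]
generate str25 max2_var = municipality[2]

* Restore the original order and drop the helper variable.
sort obs_id
drop obs_id

list, noobs
```

With the example data above, this should fill max1_var with RIO DE JANEIRO and max2_var with CANOAS in every row.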