Dear All, I found this question here (in Chinese). The raw data set is:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int year long province byte Var1
2001 1 3
2001 2 4
2001 3 5
2002 1 4
2002 2 5
2002 3 8
end
label values province province
label def province 1 "北京", modify
label def province 2 "天津", modify
label def province 3 "河北", modify
The desired outcome is
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int year long province byte(Var1 Var2)
2001 1 3 0
2001 1 3 1
2001 1 3 2
2001 2 4 1
2001 2 4 0
2001 2 4 1
2001 3 5 2
2001 3 5 1
2001 3 5 0
2002 1 4 0
2002 1 4 1
2002 1 4 4
2002 2 5 1
2002 2 5 0
2002 2 5 3
2002 3 8 4
2002 3 8 3
2002 3 8 0
end
label values province province
label def province 1 "北京", modify
label def province 2 "天津", modify
label def province 3 "河北", modify
I know that the repeated rows of Var1 (three copies of each observation) can be obtained by
Code:
expand 3
sort year province
The problem is how to obtain Var2.

As you can see, for year 2001 in province 1 (or 北京), the values for Var2 are 0, 1, and 2. They are obtained as follows.
Take the first observation of the raw data (2001, 北京), where the value of Var1 is 3. From it, subtract the Var1 value of each province in the same year (3, 4, and 5) in turn, and take the absolute value of each difference.
Thus, the first three values for Var2, 0, 1, and 2 are calculated as |3-3|, |3-4|, and |3-5|, respectively. Any suggestions? Thanks.
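
One direction that might work, offered only as a sketch: instead of -expand-, form all within-year pairs of provinces with -joinby- and take the absolute difference. It assumes the raw (unexpanded) data set is in memory and that each year-province pair appears exactly once. The names province2, Var1_other, and others are hypothetical helpers introduced for illustration; they are not part of the original data.

```stata
* Sketch: pair every province with every province in the same year.
* Assumes the raw data (one row per year-province) is in memory.
preserve
rename (province Var1) (province2 Var1_other)   // hypothetical helper names
tempfile others
save `others'
restore

* -joinby- forms all pairwise combinations within each year,
* so 3 provinces per year yield 3 x 3 = 9 rows per year.
joinby year using `others'

* Absolute difference between own Var1 and each province's Var1.
generate Var2 = abs(Var1 - Var1_other)

sort year province province2
drop province2 Var1_other
list, sepby(year province)
```

For 2001 and province 1 this gives |3-3|, |3-4|, and |3-5|, in the same row order as the desired outcome. Because the number of rows per year comes from the number of provinces in that year, no hard-coded -expand 3- is needed. Since the code uses a local macro for the temporary file, it is best run as a whole from a do-file.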