1. Problem description
I have a data set that is produced by a data base in the format as in the table below, although this example is simplified. The data data set is outputted in a dataex script below the table. The data set is a result of merging two data sets, data1 and data2. In the example below the variable names show which variable comes from which data set. The two data sets are merged on country.
After I have this data I want to multiply the value in data2_varK with the value of the variable listed in data1_varY. So for country Aland I want to multiply data1_varY with data2_varK (22 * .25), but for Eland I wan to multiply data1_varZ with data2_varK (5 * .5). While most countries will only have the one value I want to use in data1_varX, data1_varY and data1_varZ, there are cases, like Cland in the example, that has multiple values.
We want to do this multiple times and sometimes it is not multiplication. It could also be generate a dummy if value in data1_varX, data1_varY or data1_varZ is higher than a cut-off value that also coming from data2.
country | data1_varX | data1_varY | data1_varZ | data2_varJ | data2_varK |
Aland | 22 | data1_varY | .25 | ||
Bland | 34 | data1_varY | .75 | ||
Cland | 15 | 42 | data1_varX | .6 | |
Dland | 24 | data1_varX | .85 | ||
Eland | 34 | data1_varZ | .75 | ||
Fland | 5 | data1_varZ | .5 |
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input str5 country byte(data1_varX data1_varY data1_varZ) str10 data2_varJ double(data2_varK result) "Aland" . 22 . "data1_varY" .25 5.5 "Bland" . 34 . "data1_varY" .75 25.5 "Cland" 15 42 . "data1_varX" .6 9 "Dland" 24 . . "data1_varX" .85 20.4 "Eland" . . 34 "data1_varZ" .75 25.5 "Fland" . . 5 "data1_varZ" .5 2.5 end
Test 2a;
Code:
gen result2a = data2_varK * `=data2_varJ[_n]'
Code:
gen result2a = data2_varK * data1_varY
Code:
bys data2_varJ : gen result2b = data2_varK * `=data2_varJ[_n]'
Code:
bys data2_varJ : gen result2b = data2_varK * data1_varY
Code:
bys data2_varJ : gen result2c = data2_varK * `=data2_varJ[1]'
3. What works but we are concerned will be too slow
Code:
gen results3a = . forvalue obs = 1/`=_N' { replace results3a = data2_varK * `=data2_varJ[`obs']' if _n == `obs' }
Thanks,
Kristoffer
0 Response to Dynamically change variable used in evaluation across rows, for example like this: gen result2a = *data2_varK * `=data2_varJ[_n]'
Post a Comment