Please, help me to resolve the following issue with reduced sample of data below:
Below is the dataset, where “ID” – is company id, “project_id” – is an identifier of the project; “share” – is the percentage of shares of that company; “leader” – is equal to 1 for a leader of the purchase (the are might be more than one leaders); “participant_num” – is total number of participants. What I want is to impute missing “share” from the same “project id” structure that has non-missing “share”. For example: “project_id” = 19 has 5 participants with one leader. The same structure has “project_id” 16820 and 16000. So, on average the leader in 16820 and 16000 has shares of 35.7 ([50+21.4]/2) and all the rest participant have 16.0875 ([(21.5+14.3+28.6+14.3)/2] + [(10+10+10+20)/2]) / 2. This numers should be replaced for “project_id” = 19 (i.e. "ID" =
7827 should equal to 35.7 and all the rest of the missing "shares" of “project_id” = 19 should equal to 16.0875 . Please, help me with this issue.
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input double(project_id ID share) float(participant_num leader) 19 6003 . 5 . 19 6039 . 5 . 19 5849 . 5 . 19 7827 . 5 1 19 5952 . 5 . 98 5893 33 3 . 98 5862 50 3 1 98 6835 17 3 . 99 6835 17 3 . 99 5893 33 3 . 99 5862 50 3 1 100 5862 . 3 1 100 6835 . 3 . 100 5893 . 3 . 311 5831 66.7 2 . 311 6162 33.3 2 1 433 6140 . 2 1 433 5950 . 2 . 437 5862 85.7 2 1 437 6168 14.3 2 . 590 6123 30 4 . 590 7878 30 4 . 590 7839 20 4 . 590 5914 20 4 . 627 6003 . 4 . 627 5891 . 4 . 627 5893 . 4 . 627 7910 . 4 . 628 6003 25 4 . 628 5893 25 4 . 628 7910 25 4 . 628 5891 25 4 . 735 6036 15 7 . 735 6139 10 7 . 735 5952 10 7 . 735 6033 25 7 1 735 5978 15 7 . 735 6075 15 7 . 735 5905 10 7 . 850 5889 25 7 1 850 6075 12.5 7 . 850 6135 12.5 7 . 850 6014 12.5 7 . 850 5952 12.5 7 . 850 5893 12.5 7 . 850 6123 12.5 7 . 907 6059 . 7 . 907 5891 . 7 1 907 6033 . 7 . 907 6082 . 7 . 907 6036 . 7 . 907 5982 . 7 . 907 5831 . 7 . 1116 6059 20 7 1 1116 5980 8 7 . 1116 6838 8 7 . 1116 5914 24 7 1 1116 2446 16 7 . 1116 6442 8 7 . 1116 6836 16 7 . 1117 5980 . 7 . 1117 2446 . 7 . 1117 5914 . 7 1 1117 6836 . 7 . 1117 6442 . 7 . 1117 6059 . 7 . 1117 6838 . 7 . 16000 7847 10 5 . 16000 7852 10 5 . 16000 13497 20 5 . 16000 5862 10 5 . 16000 7855 50 5 1 16819 7847 28.6 5 1 16819 13497 21.5 5 . 16819 7852 14.3 5 . 16819 5862 21.4 5 1 16819 7855 14.3 5 . 16820 7852 14.3 5 . 16820 5862 21.4 5 1 16820 7855 14.3 5 . 16820 7847 28.6 5 . 16820 13497 21.5 5 . end
0 Response to Replace missing observations with the average of non-missing observations
Post a Comment