Replace missing observations with the average of non-missing observations

Dear Stata Users,

Please, help me to resolve the following issue with reduced sample of data below:

Below is the dataset, where “ID” – is company id, “project_id” – is an identifier of the project; “share” – is the percentage of shares of that company; “leader” – is equal to 1 for a leader of the purchase (the are might be more than one leaders); “participant_num” – is total number of participants. What I want is to impute missing “share” from the same “project id” structure that has non-missing “share”. For example: “project_id” = 19 has 5 participants with one leader. The same structure has “project_id” 16820 and 16000. So, on average the leader in 16820 and 16000 has shares of 35.7 ([50+21.4]/2) and all the rest participant have 16.0875 ([(21.5+14.3+28.6+14.3)/2] + [(10+10+10+20)/2]) / 2. This numers should be replaced for “project_id” = 19 (i.e. "ID" =
7827 should equal to 35.7 and all the rest of the missing "shares" of “project_id” = 19 should equal to 16.0875 . Please, help me with this issue.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(project_id ID share) float(participant_num leader)
   19  6003    . 5 .
   19  6039    . 5 .
   19  5849    . 5 .
   19  7827    . 5 1
   19  5952    . 5 .
   98  5893   33 3 .
   98  5862   50 3 1
   98  6835   17 3 .
   99  6835   17 3 .
   99  5893   33 3 .
   99  5862   50 3 1
  100  5862    . 3 1
  100  6835    . 3 .
  100  5893    . 3 .
  311  5831 66.7 2 .
  311  6162 33.3 2 1
  433  6140    . 2 1
  433  5950    . 2 .
  437  5862 85.7 2 1
  437  6168 14.3 2 .
  590  6123   30 4 .
  590  7878   30 4 .
  590  7839   20 4 .
  590  5914   20 4 .
  627  6003    . 4 .
  627  5891    . 4 .
  627  5893    . 4 .
  627  7910    . 4 .
  628  6003   25 4 .
  628  5893   25 4 .
  628  7910   25 4 .
  628  5891   25 4 .
  735  6036   15 7 .
  735  6139   10 7 .
  735  5952   10 7 .
  735  6033   25 7 1
  735  5978   15 7 .
  735  6075   15 7 .
  735  5905   10 7 .
  850  5889   25 7 1
  850  6075 12.5 7 .
  850  6135 12.5 7 .
  850  6014 12.5 7 .
  850  5952 12.5 7 .
  850  5893 12.5 7 .
  850  6123 12.5 7 .
  907  6059    . 7 .
  907  5891    . 7 1
  907  6033    . 7 .
  907  6082    . 7 .
  907  6036    . 7 .
  907  5982    . 7 .
  907  5831    . 7 .
 1116  6059   20 7 1
 1116  5980    8 7 .
 1116  6838    8 7 .
 1116  5914   24 7 1
 1116  2446   16 7 .
 1116  6442    8 7 .
 1116  6836   16 7 .
 1117  5980    . 7 .
 1117  2446    . 7 .
 1117  5914    . 7 1
 1117  6836    . 7 .
 1117  6442    . 7 .
 1117  6059    . 7 .
 1117  6838    . 7 .
16000  7847   10 5 .
16000  7852   10 5 .
16000 13497   20 5 .
16000  5862   10 5 .
16000  7855   50 5 1
16819  7847 28.6 5 1
16819 13497 21.5 5 .
16819  7852 14.3 5 .
16819  5862 21.4 5 1
16819  7855 14.3 5 .
16820  7852 14.3 5 .
16820  5862 21.4 5 1
16820  7855 14.3 5 .
16820  7847 28.6 5 .
16820 13497 21.5 5 .
end

BJ Data Tech Solution

Home / Data Cleaning / Data management / Data Processing / Replace missing observations with the average of non-missing observations
Replace missing observations with the average of non-missing observations

0 Response to Replace missing observations with the average of non-missing observations

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Replace missing observations with the average of non-missing observations Replace missing observations with the average of non-missing observations

Related Posts with Replace missing observations with the average of non-missing observations

0 Response to Replace missing observations with the average of non-missing observations

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Replace missing observations with the average of non-missing observations
Replace missing observations with the average of non-missing observations