I am using Stata version 15.1 and I have 47 variables and 1,010 observations in my dataset. I am doing educational research with 112 colleges included. I have values that are missing and I want to do a mean replacement of some of these values by the institution. I do not want to just take the general mean and apply it since the mean per institution would be best suited to help with the missing data. I acknowledge the shortcomings of mean replacement, but I am looking at how this might change my outcomes with it included.


What I have researched - I have read two very helpful articles to address this and I feel certain my problem should be solved, but yet I am having technical issues. 1. https://www.statalist.org/forums/for...mean-or-median 2. https://www.stata.com/statalist/arch.../msg00062.html. The first article provided a code solution and the second article helped me to look at how to do a mean replacement based on insitution type.

My code (I tried to model it on article one and the code utilized there with the addition of "by unitid" which I understood to be supporting my approach by producing a mean replacement for missing values based on the mean of the insitutions values and not just the general mean)

Nick Cox recommended using the ds command instead of dropping string variables so I utilized this and then I tried to merge it with the code recommended in the first post by Joro Kolev.

. ds, has (type numeric)
. foreach var of varlist * {
2. summ `var'
3. replace `var' =r(mean), by unitid if missing (`var')
4. }

. ds, has (type numeric)
unitid total_~75000 total_stu_~v tuition_fe~d act_~75_pctl exp_aux_en~l net_revenue number_adm~d ne~e_missing act_compos.. number_app~g
year net_pr~75000 fips sa~d_25_pctl rev_total_~t exp_total_~t merge_admi~e number_enr~l tuition_fe~g act_compos.. number_adm~g
total_~30000 total~110000 type_of_aid sa~d_75_pctl exp_instru~l endowment~eg composite_~x merge_admi~t sat_crit_r.. exp_acad_s~g number_enr~g
net_pr~30000 net_p~110000 avg_tuitio~t sat_math_2~l exp_acad_s~l assets_net merge_fina~i net_price_.. sat_crit_r.. endowment~ng
total_~48000 total_stu_~e tuition_pu~d sat_math_7~l exp_studen~l cpi sex net_price_.. sat_math_2~g composite_~s
net_pr~48000 net_price_~e fees_publi~d act_~25_pctl exp_inst_s~l hepi number_app~d net_price_.. sat_math_7~g sex_missing

. foreach var of varlist * {
foreach command may not result from a macro expansion interactively or in do files
r(198);

I read in the article below that there may be an issue pasting from word. I did not do that and instead took my code directly from post #1 so I am hoping I haven't overlooked something quite obvious.
--https://ift.tt/8cMiwvY

This is the mdesc for the first few variables I have:

. mdesc

Variable | Missing Total Percent Missing
----------------+-----------------------------------------------
unitid | 0 1,010 0.00
year | 0 1,010 0.00
inst_name_~a | 14 1,010 1.39
opeid | 14 1,010 1.39
total_~30000 | 2 1,010 0.20
net_pr~30000 | 2 1,010 0.20
total_~48000 | 2 1,010 0.20
net_pr~48000 | 3 1,010 0.30
total_~75000 | 2 1,010 0.20
net_pr~75000 | 5 1,010 0.50
total~110000 | 2 1,010 0.20
net_p~110000 | 12 1,010 1.19
total_stu_~e | 2 1,010 0.20
net_price_~e | 28 1,010 2.77

------sorry the pasting did not come through very well. I tried a table insert and Excel, but it didn't make it work any better.

-->Can I provide anything else to help reach a solution? I tried my best to be thorough and to read the posting guidelines before sending across my first question, but I would be very happy to update or expand any information to reach a resolution. I really appreciate any help anyone might offer.