I'm regressing an outcome on a factor variable with many levels. I would like to store the coefficient and SE for each level in a dataset that I can process further. I thought I could do that using statsby, but the output I'm getting is not in a convenient format. Should I be using statsby differently, or is there a different command that does what I want?

Here's a toy version of my problem. Load the auto data, and break each auto's name into make and model.
sysuse auto.dta, clear
rename make make_model
split make_model, limit(2)
rename make_model1 make
rename make_model2 model
Now I'd like to regress price on make, separately for foreign and domestic autos. I can't do it the obvious way:
bysort foreign: reg price i.make
because Stata won't accept string variables as factor variables. (That always annoys me.) I need to encode the auto make as a number:
encode make, gen(makenum)
list makenum if make=="Datsun", nolabel /* Datsun is make number 7, but still labeled Datsun in output without the nolabel option */
Now I can run my regression:
bysort foreign: reg price i.makenum
And the output looks very clear. For example, the coefficient for a Datsun (makenum==7) is -1986.
Now I wrap this up with statsby:
sort foreign
statsby _b _se, by(foreign) clear: reg price i.makenum
Now I've got all my coefficients in a dataset in memory. But the way they're laid out is not very informative. The columns are named _stat_1, _stat_2, etc. The coefficient for Datsun is in _stat_7, which I might have guessed, but the standard error for Datsun is in _stat_31, which is impossible to guess unless you look at the variable labels.
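
For reference, this is roughly how I've been reading the labels to work out which column is which. It's only a minimal sketch: it assumes the statsby results from the command above are still in memory and that the result variables really are named _stat_1, _stat_2, and so on, as they are in my output.

```stata
* Assumes the statsby results are in memory and the result
* variables are named _stat_1, _stat_2, ... (as in my output)
describe _stat_*

* Print each result variable's name next to its variable label
foreach v of varlist _stat_* {
    local lbl : variable label `v'
    display "`v'" _col(12) "`lbl'"
}
```

That tells me which column holds which statistic, but it's a lookup I'd rather not have to do by hand.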

I wasn't expecting this. I would have thought that statsby would put the coefficient and standard error for Datsun (make 7) in columns called _b_make_7 and _se_make_7 (or really _b_Datsun and _se_Datsun, but maybe that's asking too much).

Is there a way to get what I want out of statsby? Or a different command that I should use? Thanks.

Paul

