Hello Listers,

I have a client who is taxing my skills when it comes to formatting tables for publication. That is something I wish I were better at, so I welcome the challenge. I'm working with survey data, and they have requested a table that lists each (categorical) survey question, all of its possible outcomes, and various percentages (weighted and unweighted), counts, and each response's "influence" on a particular outcome variable.

I have successfully created such a dataset, but I lack the skills to get it into MS Word the way I would like it to look. Something tells me that figuring this out properly could be useful in the future, so I am trying to find an algorithmic way to accomplish this rather than merging cells by hand in Word/Excel.

I wanted to create an MWE to show, so I found the NHANES2 dataset and applied my code.

webuse nhanes2, clear
svyset psu [pw=finalwgt], strata(strata)
label variable region "This variable does not have a very long label, but if it did have a very long label, then it would likely be cut off."

    quietly: postfile AnnexDataTabs str20 Variable str60 Question str40 ResponseOptions MeanORF SvyPcnt Obs Pcnt using "AnnexTabulations.dta", replace    
    foreach var of varlist region smsa sex race hlthstat heartatk diabetes sizplace female black orace fhtatk rural agegrp highlead {
        *display in red "1. Variable: `var'"               //  Just in case I need to debug and figure out where my problem is arising.
        quietly: tab `var'                                 //  So, what are we looking at here?
        local numRows = `r(r)'                             //  Figure out how many rows there are to the table.
        local numResponses = `r(N)'                        //  Responding Sample size (to make percents from counts later)
        local vlab: variable label `var'                   //  Storing the VARIABLE'S label.
        quietly: svy: tab `var'                            //  Running svy tab so I can store percentages of each response as a matrix
            matrix freqTable = e(Obs)                      //  matrix of observation counts
            matrix pcntTable = e(Obs)/`numResponses'       //  matrix of observation counts converted to percentages.
            matrix valueNames = e(Row)'                    //  values for row variable (ie 0, 1, 2, etc.)
            matrix svyFreqTable = e(Prop)                  //  matrix of cell proportions for each response
        *matrix list freqTable                             //  These are just here in case you want to peek at the matrices.
        *matrix list valueNames
        *matrix list svyFreqTable
        *display in red "2. Variable: `var', Variable Label: `vlab', Number of rows: `numRows'"  //  Just for QC when building.
        
        // In this section I'm going to start storing each table row's values as locals so that I can stuff them into the postfile at the end.
        forvalues x = 1/`numRows' {
            local y = valueNames[`x',1]
            local specificValueLabel: label (`var') `y'  //  Here I grab the VALUE label (the words) attached to this value of the variable.
            *display in red "3. Variable: `var', Row Number: `x', Var Value: `y'"    //  Uncomment to figure out where things are failing.
            local matRow`x' = freqTable[`x',1]
            local labRow`x' = valueNames[`x',1]
            quietly: svy: mean bpsystol if `var' == `y'
                matrix meanBP = e(b)                       //  e(b)         coefficient vector
                local MeanBP = meanBP[1,1]
            *display in red "This row: `labRow`x'', Number of obs: `matRow`x''"      //  Uncomment to figure out where things are failing.
            post AnnexDataTabs ("`var'") ("`vlab'") ("`specificValueLabel'") (round(`MeanBP',0.1)) (round(svyFreqTable[`x',1]*100,0.1)) (freqTable[`x',1]) (round(pcntTable[`x',1]*100,0.1)) 
        }
    }
    postclose AnnexDataTabs
    use "AnnexTabulations.dta", clear
    list, sepby(Variable)
The main question I have is whether there is a way to export this to MS Word (not TeX) so that each entry in the first column of the table appears once (centered vertically) instead of being repeated on every row. So I guess that's two questions: 1) suggestions on how to push to Word from the dataset (not from analysis), and 2) how to format the table as it's pushed. Thank you for any help you can provide. I think there is a third question about label lengths floating around in there as well, but I suspect this post from Nick might contain enough for me to solve that problem.

Below is the kind of visual I'm attempting to create:

[Image: mock-up of the desired table layout]
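
For concreteness, here is a rough sketch of the kind of thing being asked for. It assumes Stata 15 or later, where putdocx is available, and it assumes that the rowspan() and valign() cell options behave as documented. The table name annex, the helper variables row, first and nrows, and the output file name are made up for illustration. It is untested against the layout pictured above, so treat it as a starting point rather than a working solution.

```stata
* Sketch only: assumes Stata 15+ (putdocx) and the AnnexTabulations.dta built above.
use "AnnexTabulations.dta", clear

* Helper variables (hypothetical names): posted order, first row of each block, block size.
generate long row = _n
bysort Variable (row): generate byte first = (_n == 1)
bysort Variable (row): generate int  nrows = _N
sort row

putdocx begin
* Push the dataset itself (not estimation results) into a Word table with a header row.
putdocx table annex = data("Question ResponseOptions MeanORF SvyPcnt Obs Pcnt"), varnames

* Merge the Question cell downward over each variable's block of rows.
* Table row = observation number + 1 because of the header row.
forvalues i = 1/`=_N' {
    if first[`i'] & nrows[`i'] > 1 {
        putdocx table annex(`=`i'+1', 1), rowspan(`=nrows[`i']') valign(center)
    }
}
putdocx save "AnnexTabulations.docx", replace
```

The loop touches only the first row of each block, so variables with a single response row are left alone. Long labels would still be cut at the str60 width declared in the postfile call.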