Hi all,

I have a problem with the -collapse- command. In the code below, whenever I use -collapse-, it does not matter whether my two-category variable is coded (0 and 1), (1 and 2), or any other pair of numbers: the collapsed data always contain only one of the two categories (only 0, if my two categories are 0 and 1).

To explain myself better: suppose I have two variables, X and Z. Z is a numeric categorical variable with two categories, for example 2 and 3. Then, when I run -collapse-:
collapse X, by(Z) -> the result contains only one value of Z, namely 2. I do not understand why.

I should have obtained observations for Z = 2 and Z = 3 as a result.
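
As a minimal sketch of what I expect, here is a toy example. The data are made up for illustration only (they are not from my dataset), and X and Z are just the placeholder names used above:

```stata
* Hypothetical toy data: Z is a numeric variable with two categories, 2 and 3
clear
input X Z
10 2
20 2
30 3
50 3
end

* collapse computes the mean by default; by(Z) should return one row per level of Z
collapse X, by(Z)

* I expect two observations here: one for Z == 2 and one for Z == 3
list
```

In my real code the result has only one of the two rows.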

My code is the following:

Code:
* Create the labels for the loops
local trat_agrup agrupado
local trat_byciu byciudad

* Create the labels for the loops // EMPLOYMENT //
preserve
use "$data\labels_var.dta", clear
rename name_var variables
keep variables
duplicates drop variables, force
levelsof variables, local(variables)
disp `variables'
count    
restore

* Group the treatments
foreach base in `trat_agrup' `trat_byciu' {

* Load the dataset
    quietly use "$data\EMPSAL_synth_byciudad.dta", clear 
    quietly joinby cvemun using "$data/SYNTH_cvemun_`base'_weight.dta", unmatched(both)
    quietly tab _merge, mis
    
    quietly keep emp_* salprom_* salmed_* year month Weight_* norte_tax cvemun

* Separate the treated units
    quietly preserve
    quietly keep if norte_tax==1
    quietly collapse (mean) emp_* salprom_* salmed_* (sum) Weight_* (first) norte_tax, by(year month)
    quietly tempfile base_did_1
    quietly save `base_did_1', replace
    quietly restore

* Combine the control units with the aggregate of the treated units
    quietly preserve
    quietly keep if norte_tax==0
    quietly tempfile base_did_2
    quietly save `base_did_2', replace
    quietly restore

    quietly use `base_did_2', clear
    quietly append using `base_did_1'
    quietly replace cvemun=33000 if cvemun==.

* Regenerate the index for the variables
foreach var in `variables' {
            quietly gen Log`var'=log(`var')
            quietly gen byte baseyearmonth=1 if (year==2018 & month==9)
            quietly by cvemun (baseyearmonth), sort: gen Index`var' = Log`var' - Log`var'[1]
            quietly drop baseyearmonth
}    

    quietly drop norte_tax
    quietly gen norte_tax=(cvemun==33000)

    
* Create the labels for the loops // EMPLOYMENT //
    quietly preserve
    quietly use "$data\labels_var.dta", clear
    quietly keep if empsal=="emp"
    quietly rename name_var variables
    quietly keep variables
    quietly duplicates drop variables, force
    quietly levelsof variables, local(variables1)
    quietly disp `variables1'
    quietly count    
    quietly restore

foreach var in `variables1' {

* Generate only the trends for the control and treatment groups
            quietly preserve

            quietly collapse (mean) Index`var' [aw=Weight_`var'], by(year month norte_tax)
        
            quietly reshape wide Index`var',  i(year month) j(norte_tax)
        
            quietly gen date = ym(year,month)
            quietly tsset  date, monthly
            quietly rename (date Index`var'0 Index`var'1) (_time _Y_synthetic _Y_treated)

* Convert the estimated values to percentages
            quietly replace _Y_treated=_Y_treated*100 + 100
            quietly replace _Y_synthetic=_Y_synthetic*100 +100
            quietly gen dif=_Y_treated - _Y_synthetic
        
            quietly twoway     (line _Y_treated _time, lwidth(medthick) lpattern(dash) lcolor(green) sort)  //// 
                            (line _Y_synthetic _time, lwidth(medthick) lpattern(solid) lcolor(blue) sort) ////
                            (line dif _time, lwidth(medthick) lpattern(shortdash) lcolor(black) sort yaxis(2)), ////
                             xtitle("", /*margin(medium) height(12)*/ size(medsmall)) ////
                             xlabel(`=tm(2015m1)'(4)`=tm(2019m10)', angle(45) grid glwidth(medthin) glpattern(dash) labsize(medsmall)) ////
                             xline(`=tm(2019m1)', lwidth(medthin) lpattern(dash) lcolor(red)) ////
                             ytitle("Índice (%)", margin(medium) /*height(12)*/ size(medsmall)) ////
                             ylabel(70(10)125, angle(0) /*format(%03.2f)*/ grid glwidth(medthin) glpattern(dash) labsize(medsmall)) ////
                             ytitle("Diferencia T - C", axis(2) margin(medium) /*height(12)*/ size(medsmall)) ////
                             ylabel(-4(4)16, axis(2) angle(0) /*format(%03.2f) grid glwidth(medthin) glpattern(dash)*/ labsize(medsmall)) ///
                             yline(0, axis(2) lwidth(medthin) lpattern(solid) lcolor(gs9)) ///
                             text(125 `=tm(2019m1)' "Incremento" "ene. 2019", size(medium) color(grey) place(w)) ////
                             legend(order(1 "Tratamiento (Municipios Zona Norte)" ///
                                           2 "Control sintético (Municipios NO Zona Norte)" ///
                                           3 "Tratamiento - Control") row(3) size(medsmall) symxsize(*0.6) /*span region(lcolor(white))*/) ////
                             title("Variable: `var'", margin(medlarge)) ////
                             subtitle("`base'", height(-12)) ///
                             caption("NOTA1: Índice con base en septiembre 2018" ///
                                     "NOTA2: Estimaciones hasta mes de Mayo" size(small))       ////             
                             graphregion(fcolor(white))
            quietly graph export "$graph\synth_manu`base'_`var'.emf", replace font("Times New Roman")
            quietly restore
}
}
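
One check that might help narrow this down is sketched below. It rests on an assumption I have not confirmed: that the analytic weights could be zero or missing for one of the two groups, and that -collapse- leaves such observations out. The data and the names x, z and w are hypothetical stand-ins for Index`var', norte_tax and Weight_`var':

```stata
* Hypothetical toy data: w is zero or missing for every observation with z == 3
clear
input x z w
1 2 1
2 2 1
3 3 0
4 3 .
end

* Summarize the weight within each group just before the weighted collapse
tabstat w, by(z) statistics(n min max)

* Count the observations whose weight is zero or missing
count if missing(w) | w == 0
```

In my code, the same two commands could be run on Weight_`var' by norte_tax immediately before the weighted -collapse-.
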
I do not know how to upload a dataset to Statalist. Could you explain how to share one?

Thanks,

Alexis Rodas