Greetings,
I would like to run the following procedure 1,000 times and store the coefficient estimates from each subsample. Here is a summary of what I am doing. I have a data set of about 64,033 individuals, and creating a full spatial weight matrix has been difficult because of memory constraints. Instead, I use the loop below to draw a random sample from the full data set and build the weights from that sample. For each individual in the sample I also calculate the mean HIV prevalence among its nearest neighbours (HIV status is my dependent variable). I then keep only the drawn sample and run a probit regression on it. I now need an outer loop that repeats this whole procedure 1,000 times and stores the regression estimates from each randomly drawn sample, but writing it has been a challenge. Could anyone assist? The procedure I am trying to repeat is given below.
*Load full data
use "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\DHSFinalAnalysis.dta"
/*Create a random variable to select sub-sample*/
quietly generate random = runiform()
quietly generate sample = random<.1
/*Creating a temporary variable to hold the spatially lagged variable*/
quietly generate Wy = .
gsort -sample random
quietly generate idtemp = _n
/*Calculating the spatially lagged variable*/
/* for the dependent variable */
/*Identifying the beginning and end of the calculation to be made*/
quietly summarize idtemp if sample==1
/*Identifying the # of observations to loop on*/
global nstart = r(min)
global nend = r(max)
/*Starting the calculation for the same time period*/
forvalues i = $nstart/$nend {
*display `i'
/*Determining the number of nearest neighbours*/
global nn = 5
/*Identifying the geographical coordinate of the observation*/
quietly summarize longnum if idtemp==`i'
global xi = r(mean)
quietly summarize latnum if idtemp==`i'
global yi = r(mean)
/*Calculating the distance to all other observations*/
quietly generate distance = sqrt((longnum - $xi)^2 + (latnum - $yi)^2)
/*Creating a temporary variable*/
quietly generate temp = sample==0
quietly replace temp = 2 if idtemp==`i'
/*Sorting observations*/
gsort distance
/*Calculating the mean hiv prevalence of nearest neighbours*/
quietly summarize hivpositive if idtemp!=`i' & _n<=($nn + 1)
/*Replacing the value for the spatially lagged variable*/
quietly replace Wy = r(mean) if idtemp==`i'
/*Dropping unnecessary and temporary variables*/
drop distance temp
}
/*Save the temporary data set with a subsample*/
keep if sample==1
drop idtemp
*Save temporary data (that is, the random sample)
save "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\Temporary data\TempDHSdata4.dta", replace
/*Make the analysis*/
svy: probit hivpositive churchkm Wy churchW lnhospitalkm dhsprotestant2 Age Age2 Married female urban1 river10kmdum Explorer50kmdum Rail50kmdum lnElevationMean i.Province1 i.WealthIndex HIVKnowledge i.occupation2 i.highested
estimates store reg_1
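
One possible way to repeat the procedure above is an outer forvalues loop combined with postfile, which appends one row of estimates per draw to a results file. What follows is only a sketch under several assumptions: the data are already svyset, a reasonably recent Stata is in use (for runiform()), and only the coefficients and standard errors of Wy and churchkm are kept. The seed, the results file name probit_draws.dta and the choice of stored coefficients are illustrative and not part of the original procedure. Globals are replaced by locals, and preserve/restore brings back the full data after each draw.

```stata
* Sketch only: outer loop over replications around the single-draw procedure.
* Assumes the full data set is already svyset.
local reps 1000
local nn   5
set seed 12345                     // arbitrary seed, for reproducibility only

* Results file: one row per replication (file name is hypothetical)
tempname results
postfile `results' int rep double(b_Wy se_Wy b_churchkm se_churchkm) long N ///
    using "probit_draws.dta", replace

use "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\DHSFinalAnalysis.dta", clear

forvalues r = 1/`reps' {
    preserve                       // keep a copy of the full data
    quietly {
        * Draw the random subsample and number the observations
        generate double random = runiform()
        generate byte sample = random < .1
        generate double Wy = .
        gsort -sample random
        generate long idtemp = _n
        count if sample == 1
        local nend = r(N)          // sampled observations are idtemp 1..nend

        * Spatial lag of hivpositive for each sampled observation
        forvalues i = 1/`nend' {
            summarize longnum if idtemp == `i', meanonly
            local xi = r(mean)
            summarize latnum if idtemp == `i', meanonly
            local yi = r(mean)
            generate double distance = sqrt((longnum - `xi')^2 + (latnum - `yi')^2)
            sort distance
            summarize hivpositive if idtemp != `i' & _n <= `nn' + 1, meanonly
            replace Wy = r(mean) if idtemp == `i'
            drop distance
        }
        keep if sample == 1
    }

    * Estimate the model; skip the draw if estimation fails
    capture noisily svy: probit hivpositive churchkm Wy churchW lnhospitalkm ///
        dhsprotestant2 Age Age2 Married female urban1 river10kmdum ///
        Explorer50kmdum Rail50kmdum lnElevationMean i.Province1 ///
        i.WealthIndex HIVKnowledge i.occupation2 i.highested
    if _rc == 0 {
        post `results' (`r') (_b[Wy]) (_se[Wy]) (_b[churchkm]) (_se[churchkm]) (e(N))
    }
    restore                        // back to the full data for the next draw
}
postclose `results'

* Inspect the stored estimates across draws
use "probit_draws.dta", clear
summarize b_Wy se_Wy b_churchkm se_churchkm
```

Named coefficients are posted here, rather than stacking e(b) row by row, because a subsample may lack some levels of the factor variables, in which case the coefficient vector could have a different number of columns from one draw to the next. It is probably worth testing with a small value of reps first, since each draw sorts the full data set once per sampled observation.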