Greetings
I would like to run the following procedure 1,000 times and store the coefficient estimates from each subsample. Here is a summary of what I am doing. I have a data set of about 64,033 individuals, and it has been difficult to create a weight matrix for the full data because of memory constraints. Instead, I use the following loop to draw a random sample from the full data set and create weights based on that sample. I also calculate the mean HIV prevalence among the nearest neighbors of each individual in my sample; HIV status is my dependent variable. After this, I keep only the sample that has been drawn and run a probit regression on it. I need a loop that repeats this whole operation 1,000 times and stores the regression estimates from each randomly drawn sample, but writing it has been a challenge. Could anyone assist? The procedure I am trying to repeat is given below.
*Load full data
use "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\DHSFinalAnalysis.dta"
/*Create a random variable to select sub-sample*/
quietly generate random = runiform()
quietly generate sample = random<.1
/*Creating a temporary variable to hold the spatially lagged variable*/
quietly generate Wy = .
gsort -sample random
quietly generate idtemp = _n
/*Calculating the spatial lagged variable*/
/* for the dependent variable */
/*Identifying the beginning and end of the calculation to be made*/
quietly summarize idtemp if sample==1
/*Identifying the # of observations to loop on*/
global nstart = r(min)
global nend = r(max)
/*Starting the calculation for the same time period*/
forvalues i = $nstart/$nend {
*display `i'
/*Determining the number of nearest neighbours*/
global nn = 5
/*Identifying the geographical coordinate of the observation*/
quietly summarize longnum if idtemp==`i'
global xi = r(mean)
quietly summarize latnum if idtemp==`i'
global yi = r(mean)
/*Calculating the distance to all other observations*/
quietly generate distance = sqrt((longnum - $xi)^2 + (latnum - $yi)^2)
/*Creating a temporary variable*/
quietly generate temp = sample==0
quietly replace temp = 2 if idtemp==`i'
/*Sorting observations*/
gsort distance
/*Calculating the mean hiv prevalence of nearest neighbours*/
quietly summarize hivpositive if idtemp!=`i' & _n<=($nn + 1)
/*Replacing the value for the spatial lagged variable*/
quietly replace Wy = r(mean) if idtemp==`i'
/*Dropping unnecessary and temporary variables*/
drop distance temp
}
/*Save the temporary data set with a subsample*/
keep if sample==1
drop idtemp
*Save temporary data (that is, the random sample)
save "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\Temporary data\TempDHSdata4.dta", replace
/*Make the analysis*/
svy: probit hivpositive churchkm Wy churchW lnhospitalkm dhsprotestant2 Age Age2 Married female urban1 river10kmdum Explorer50kmdum Rail50kmdum lnElevationMean i.Province1 i.WealthIndex HIVKnowledge i.occupation2 i.highested
estimates store reg_1
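One possible way to repeat the procedure is to wrap it in an outer forvalues loop and write selected coefficients to a results file with postfile. The block below is only a sketch under stated assumptions: the seed, the output file name probit_reps.dta, and the choice to post just the Wy and churchkm coefficients are hypothetical; locals replace the globals; runiform() is used as the current name of uniform(); and the unused temp variable is left out. Posting to a file is used here instead of estimates store because Stata limits how many estimation results can be held in memory at once.

```stata
* Sketch only: outer loop over replications, posting selected estimates to a file.
* Assumptions: seed, output file name, and posted coefficients are illustrative.
set seed 12345
local fulldata "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\DHSFinalAnalysis.dta"
local nn = 5
tempname results
postfile `results' rep b_Wy se_Wy b_churchkm se_churchkm using "probit_reps.dta", replace

forvalues r = 1/1000 {
    * Reload the full data and draw a new random subsample
    use "`fulldata'", clear
    quietly generate random = runiform()
    quietly generate sample = random < .1
    quietly generate Wy = .
    gsort -sample random
    quietly generate idtemp = _n
    quietly summarize idtemp if sample == 1
    local nstart = r(min)
    local nend = r(max)

    * Spatial lag: mean hivpositive among the nearest neighbours of each sampled observation
    forvalues i = `nstart'/`nend' {
        quietly summarize longnum if idtemp == `i'
        local xi = r(mean)
        quietly summarize latnum if idtemp == `i'
        local yi = r(mean)
        quietly generate distance = sqrt((longnum - `xi')^2 + (latnum - `yi')^2)
        gsort distance
        quietly summarize hivpositive if idtemp != `i' & _n <= (`nn' + 1)
        quietly replace Wy = r(mean) if idtemp == `i'
        drop distance
    }

    * Keep the subsample and estimate; post results only if the model ran without error
    keep if sample == 1
    capture svy: probit hivpositive churchkm Wy churchW lnhospitalkm dhsprotestant2 ///
        Age Age2 Married female urban1 river10kmdum Explorer50kmdum Rail50kmdum ///
        lnElevationMean i.Province1 i.WealthIndex HIVKnowledge i.occupation2 i.highested
    if _rc == 0 {
        post `results' (`r') (_b[Wy]) (_se[Wy]) (_b[churchkm]) (_se[churchkm])
    }
}
postclose `results'

* Inspect the distribution of the stored estimates
use "probit_reps.dta", clear
summarize b_Wy se_Wy b_churchkm se_churchkm
```

Each replication re-sorts the full data once per sampled observation, so 1,000 replications will probably take a long time; testing with a small number of replications first (for example forvalues r = 1/5) seems prudent.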