Greetings

I would like to run the following operation 1,000 times and store the coefficient estimates from each subsample. To summarize what I am doing: I have a data set of about 64,033 individuals, and it has been difficult to create a weight matrix for the full data because of memory constraints. Instead, I use the loop below to draw a random sample from the full data set and create weights based on that sample. I also calculate the mean HIV prevalence among the neighbors of each observation in my sample; HIV status is my dependent variable. After this, I keep only the sample that was drawn and run a probit regression on it. I now need to repeat this operation 1,000 times and store the regression estimates from each randomly drawn sample. I need a loop that will let me do this, but writing it has been a challenge. Could anyone assist? The procedure I am trying to repeat 1,000 times is given below.

*Load full data
use "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\DHSFinalAnalysis.dta"

/*Create a random variable to select sub-sample*/
quietly generate random = runiform()
quietly generate sample = random<.1

/*Creating a temporary variable to hold the spatially lagged variable*/
quietly generate Wy = .
gsort -sample random
quietly generate idtemp = _n
/*Calculating the spatially lagged variable*/
/* for the dependent variable */
/*Identifying the beginning and end of the calculation to be made*/
quietly summarize idtemp if sample==1

/*Identifying the # of observations to loop on*/
global nstart = r(min)
global nend = r(max)

/*Starting the calculation for the same time period*/
forvalues i = $nstart/$nend {
*display `i'
/*Determining the number of nearest neighbours*/
global nn = 5
/*Identifying the geographical coordinate of the observation*/
quietly summarize longnum if idtemp==`i'
global xi = r(mean)
quietly summarize latnum if idtemp==`i'
global yi = r(mean)
/*Calculating the distance to all other observations*/
quietly generate distance = sqrt((longnum - $xi)^2 + (latnum - $yi)^2)
/*Creating a temporary variable*/
quietly generate temp = sample==0
quietly replace temp = 2 if idtemp==`i'
/*Sorting observations*/
gsort distance
/*Calculating the mean hiv prevalence of nearest neighbours*/
quietly summarize hivpositive if idtemp!=`i' & _n<=($nn + 1)
/*Replacing the value for the spatial lagged variable*/
quietly replace Wy = r(mean) if idtemp==`i'
/*Dropping unnecessary and temporary variables*/
drop distance temp
}

/*Save the temporary data set with a subsample*/
keep if sample==1
drop idtemp
*Save temporary data (that is, the random sample)
save "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\Temporary data\TempDHSdata4.dta", replace

/*Make the analysis*/

svy: probit hivpositive churchkm Wy churchW lnhospitalkm dhsprotestant2 Age Age2 Married female urban1 river10kmdum Explorer50kmdum Rail50kmdum lnElevationMean i.Province1 i.WealthIndex HIVKnowledge i.occupation2 i.highested
estimates store reg_1
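
For reference, here is a rough sketch of how I imagine the repetition might be structured. It wraps the steps above in an outer forvalues loop and uses postfile to save one row of estimates per draw. I have not run it on the real data. It assumes the data are already svyset, and the names reps, results, the seed value, and the output file probit_replications.dta are placeholders I made up. It stores only the coefficient and standard error on Wy; other coefficients could presumably be posted the same way.

```stata
* Sketch only, untested on the real data. Assumes the data are already svyset.
* reps, results, the seed, and the output file name are placeholders.
local reps 1000
local nn 5
set seed 12345

tempname results
postfile `results' rep b_Wy se_Wy using "probit_replications.dta", replace

use "D:\Users\mi8318ch\Stata Files\Medical Study\DHS Data\Panel Data DHS\New Merge\DHSFinalAnalysis.dta", clear

forvalues r = 1/`reps' {
    * Work on a copy so the full data come back after each draw
    preserve

    * Draw a 10% sub-sample and sort it to the top of the data
    quietly generate random = runiform()
    quietly generate sample = random < .1
    quietly generate Wy = .
    gsort -sample random
    quietly generate idtemp = _n
    quietly count if sample == 1
    local nend = r(N)

    * Spatial lag: mean hivpositive among the nn nearest observations
    forvalues i = 1/`nend' {
        quietly summarize longnum if idtemp == `i', meanonly
        local xi = r(mean)
        quietly summarize latnum if idtemp == `i', meanonly
        local yi = r(mean)
        quietly generate distance = sqrt((longnum - `xi')^2 + (latnum - `yi')^2)
        sort distance
        quietly summarize hivpositive if idtemp != `i' & _n <= (`nn' + 1)
        quietly replace Wy = r(mean) if idtemp == `i'
        drop distance
    }

    * Keep the drawn sample and fit the model
    keep if sample == 1
    capture noisily svy: probit hivpositive churchkm Wy churchW lnhospitalkm dhsprotestant2 Age Age2 Married female urban1 river10kmdum Explorer50kmdum Rail50kmdum lnElevationMean i.Province1 i.WealthIndex HIVKnowledge i.occupation2 i.highested

    * Post the estimates only if the model ran without error
    if _rc == 0 {
        post `results' (`r') (_b[Wy]) (_se[Wy])
    }

    restore
}

postclose `results'
```

If this ran, opening probit_replications.dta afterwards should give one row per successful draw, which could then be summarized. Since the inner loop sorts the full data once per sampled observation, I would expect 1,000 repetitions to be slow, so trying a small value of reps first seems sensible.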