Hi, I have a large dataset on course enrollment. Individual students take courses in different semesters, and observations are unique at the individual-semester-coursenum level. Each individual also has a graduation year ("cohort"). For each individual, I would like to draw a random sample of individuals from their cohort, where the sample size ("total") differs across individuals. The best way I can think of is to loop over individuals, use the randomtag command to draw each sample, and label the selected observations with an identifier for the draw (possibly the student's own unique identifier). For example, I could use the following commands:
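preserve
keep id cohort total
duplicates drop /* We now have one observation per individual */
sort id
local N = _N
set seed 1357
g randomgroup = . /* create once, before the loop, so g does not fail on the second pass */
forvalues i = 1/`N' {
    local id = id[`i']
    local year = cohort[`i']
    local groupsize = total[`i']
    capture drop selected /* randomtag needs a new variable name on every pass */
    randomtag if cohort == `year', count(`groupsize') g(selected)
    /* tag the selected individuals without wiping earlier draws; anyone drawn more than once keeps only the latest id */
    replace randomgroup = `id' if selected == 1
}
keep id randomgroup
sort id
save randomgroups.dta, replace
restore
merge m:1 id using randomgroups.dta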
I'm wondering if there is a faster way to do this, rather than looping over individual observations to generate random samples one at a time. Thank you for your suggestions.
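One loop-free possibility, offered as a rough sketch and not a tested solution: form every within-cohort pair with joinby, attach a uniform random number to each pair, and keep the first "total" pairs for each id. The names peer_id, u and the tempfile people are made up for illustration, and the sketch assumes that a student should not appear in their own sample and that all within-cohort pairs fit in memory (the pair count is the sum of squared cohort sizes).

```stata
* Sketch only: peer_id, u and the tempfile people are illustrative names.
preserve
keep id cohort total
duplicates drop                 // one observation per individual
tempfile people
save `people'

* Build every (id, peer_id) pair within a cohort.
rename id peer_id
drop total
joinby cohort using `people'
drop if peer_id == id           // assumption: a student is not in their own sample

* Random order within each id, then keep the first total peers.
set seed 1357
sort id peer_id                 // fix the row order so the draw is reproducible
generate double u = runiform()
bysort id (u): keep if _n <= total

keep id peer_id
save randomgroups, replace
restore
```

The result is a long file with one row per (id, peer_id) pair, so a student who is drawn into several samples appears once for each, which a single randomgroup variable cannot represent. If "total" exceeds the number of available peers in a cohort, the sketch simply keeps all of them.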