Dear statalisters,
I am running the code below to split a very large file (>15 GB) into smaller pieces.
I adapted it from this thread: https://www.statalist.org/forums/for...ort-large-file
scalar recordstart = 1
scalar stepsize = 20000000
qui describe using large_file.dta
scalar nrecords = r(N)
scalar num_files = ceil(nrecords / stepsize)
forval part = 1/`=num_files' {
    scalar start = 1 + ((`part' - 1) * stepsize)
    di "This is the value of start for iteration `part' :" start
    scalar stop = min((start + stepsize -1), nrecords)
    di "This is the value of stop for iteration `part' :" stop
    use "large_file.dta" in `=start'/`=stop', clear
    save "large_file_`part'", replace
}
Which gives me the following output:
This is the value of start for iteration 1 :1
This is the value of stop for iteration 1 :20000000
file large_file_1.dta saved
This is the value of start for iteration 2 :15706
This is the value of stop for iteration 2 :15859
file large_file_2.dta saved
This is the value of start for iteration 3 :17348
This is the value of stop for iteration 3 :21550
file large_file_3.dta saved
As you can see, the values of start/stop are wrong from the second iteration onwards. When I comment out the last two lines inside the loop, the loop prints the correct values:
forval part = 1/`=num_files' {
    scalar start = 1 + ((`part' - 1) * stepsize)
    di "This is the value of start for iteration `part' :" start
    scalar stop = min((start + stepsize -1), nrecords)
    di "This is the value of stop for iteration `part' :" stop
    // use "large_file.dta" in `=start'/`=stop', clear
    // save "large_file_`part'", replace
}
Output:
This is the value of start for iteration 1 :1
This is the value of stop for iteration 1 :20000000
This is the value of start for iteration 2 :20000001
This is the value of stop for iteration 2 :40000000
This is the value of start for iteration 3 :40000001
This is the value of stop for iteration 3 :43660920
What is causing this behaviour?
Thanks.
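A possible explanation, offered as a guess rather than a confirmed diagnosis: Stata scalars and variables share a namespace, and when a bare name in an expression matches both, the variable wins (the value in the first observation is used). If large_file.dta happens to contain variables named start and stop, or variables whose names those abbreviate, then once the first `use` has loaded data, `start` and `stop` would refer to those variables rather than to the scalars. That would be consistent with the values only going wrong from the second iteration onwards. Whether the dataset really has such variables is an assumption here. A minimal sketch that sidesteps the clash by keeping the loop bounds in local macros and wrapping the remaining scalars in scalar():

```stata
* Sketch only. Assumes the problem is a name clash between the scalars
* start/stop and variables in large_file.dta (not confirmed by the post).
scalar stepsize = 20000000
quietly describe using "large_file.dta"
scalar nrecords = r(N)
scalar num_files = ceil(scalar(nrecords) / scalar(stepsize))

forvalues part = 1/`=scalar(num_files)' {
    * Local macros cannot be shadowed by variables in the loaded data
    local first = 1 + (`part' - 1) * scalar(stepsize)
    local last  = min(`first' + scalar(stepsize) - 1, scalar(nrecords))
    display "Iteration `part': observations `first' to `last'"
    use "large_file.dta" in `first'/`last', clear
    save "large_file_`part'", replace
}
```

The names first and last are only illustrative. Running `describe start stop` after loading one of the pieces would be a quick way to check whether the assumed variables exist.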