Dear Statalisters,
I am running the code below to split a very large file (>15 GB) into smaller pieces.
I adapted it from this thread: https://www.statalist.org/forums/for...ort-large-file
// chunk size (observations per output file) and total number of observations
scalar recordstart = 1
scalar stepsize = 20000000
qui describe using large_file.dta
scalar nrecords = r(N)
scalar num_files = ceil(nrecords / stepsize)
forval part = 1/`=num_files' {
    // first and last observation of this chunk
    scalar start = 1 + ((`part' - 1) * stepsize)
    di "This is the value of start for iteration `part' :" start
    scalar stop = min((start + stepsize -1), nrecords)
    di "This is the value of stop for iteration `part' :" stop
    // load only this range of observations and save it as its own file
    use "large_file.dta" in `=start'/`=stop', clear
    save "large_file_`part'", replace
}
This gives me the following output:
This is the value of start for iteration 1 :1
This is the value of stop for iteration 1 :20000000
file large_file_1.dta saved
This is the value of start for iteration 2 :15706
This is the value of stop for iteration 2 :15859
file large_file_2.dta saved
This is the value of start for iteration 3 :17348
This is the value of stop for iteration 3 :21550
file large_file_3.dta saved
As you can see, the values of start and stop are wrong after the first iteration. When I comment out the last two lines of the loop body (the use and save commands), the loop displays the correct values:
forval part = 1/`=num_files' {
    scalar start = 1 + ((`part' - 1) * stepsize)
    di "This is the value of start for iteration `part' :" start
    scalar stop = min((start + stepsize -1), nrecords)
    di "This is the value of stop for iteration `part' :" stop
    // use "large_file.dta" in `=start'/`=stop', clear
    // save "large_file_`part'", replace
}
Output:
This is the value of start for iteration 1 :1
This is the value of stop for iteration 1 :20000000
This is the value of start for iteration 2 :20000001
This is the value of stop for iteration 2 :40000000
This is the value of start for iteration 3 :40000001
This is the value of stop for iteration 3 :43660920
What is causing this behaviour?
Thanks.
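A possible explanation, offered as an assumption rather than something confirmed in this post: Stata scalars and variables share one namespace, and when a name could refer to either, an expression resolves it to the variable. If large_file.dta happens to contain variables named start and stop, then once the data are in memory, start and stop in an expression would evaluate to those variables' values in the first observation, which would fit the small numbers displayed from iteration 2 onward. A minimal sketch of the same loop, using the scalar() pseudofunction to force the scalar interpretation:

```stata
// Sketch only. Assumes large_file.dta contains variables named start and stop
// (not confirmed here); scalar() makes every reference unambiguous.
scalar stepsize = 20000000
qui describe using large_file.dta
scalar nrecords = r(N)
scalar num_files = ceil(scalar(nrecords) / scalar(stepsize))
forval part = 1/`=scalar(num_files)' {
    // first and last observation of this chunk, always read from the scalars
    scalar start = 1 + ((`part' - 1) * scalar(stepsize))
    di "This is the value of start for iteration `part' :" scalar(start)
    scalar stop = min(scalar(start) + scalar(stepsize) - 1, scalar(nrecords))
    di "This is the value of stop for iteration `part' :" scalar(stop)
    use "large_file.dta" in `=scalar(start)'/`=scalar(stop)', clear
    save "large_file_`part'", replace
}
```

If the assumption holds, other ways to avoid the clash would be to keep the counters in local macros, or to name the scalars with tempname so they cannot collide with variable names in the dataset.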