Counting with _n

Hello Statalist,

Question here about counting with _n in large datasets. I am attempting to create a row identifier variable which simply takes the value of the row number. In the past I've used the command

Code:

gen row_id = _n

This works fine for the first ~16.5 million rows. After this point, however, the values in "row_id" do not match the row number! For example, here's what the output looks like:

True Row    row_id
16961210    16961210
16961211    16961212
16961212    16961212
16961213    16961212
16961214    16961214
16961215    16961216
16961216    16961216
16961217    16961216

As is apparent, "row_id" begins a cycle of correct then incorrect numbers, returning to the correct value on every other row before deviating again. I've also tried looping through the integers from 1 to 17 million to create "row_id" manually, and I run into the same problem. I'd include a full example here, but it seems too large to post.

Can anyone provide insight on what I'm doing wrong here?

I'm running StataMP 17.0 on Mac OS.

Thanks in advance!

Andy
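
A possible explanation, offered as an assumption because nothing in the post confirms it: generate stores new variables as float by default, and a float can represent every integer exactly only up to 2^24 = 16,777,216. Above that, only even integers are representable, which would match the pattern in the table (odd row numbers rounded to a neighboring even value). The sketch below reproduces the symptom and shows one likely fix, which is to request the long (or double) storage type explicitly. The variable name row_id_long and the 17,000,000-observation dataset are made up for illustration.

```stata
* Minimal sketch, assuming the default storage type is float
* (i.e. -set type- has not been changed).
clear
set obs 17000000

* Default float storage: integers above 2^24 are not all representable
gen row_id = _n

* Explicit long storage: exact for integers up to about 2.1 billion
gen long row_id_long = _n

* Compare the two variables over the rows shown in the post
list row_id row_id_long in 16961210/16961217

* The same rounding without any data: float() rounds to float precision
display %12.0f float(16961211)
display %12.0f float(16777216)
```

If the list output shows row_id_long matching the row number while row_id does not, the storage type is the cause. An existing variable cannot be repaired with recast alone, since the rounded values are already stored, so row_id would need to be dropped and generated again as long or double.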

BJ Data Tech Solution
