Hello everyone,
I need help reading in a 10 GB .csv dataset with very limited physical memory on my computer. I believe the problem is with my dictionary file.

I am trying to read in the FCC dataset (https://opendata.fcc.gov/Wireline/Fi...2019/whue-6pnt), which I have downloaded to a USB drive. I need only the data for California (CA). Because I only have about 10 GB of memory available on my machine (and I can't change that today), I want to infile only the observations for the state I need and only the variables I need.

I can see the data if I run
import delim using "Fixed_Broadband_Deployment_Data__December_2019.csv" , rowr(1:100) clear

but can't run
infile using fccdict in 1, clear

as I get the following:

'LogicalReco' cannot be read as a number for logicalrecordnumber[1]
'rdNumber,Pr' cannot be read as a number for providerid[1]
'oviderID,FR' cannot be read as a number for frn[1]
'rName,DBA' cannot be read as a number for censusblockfipscode[1]
'Name,Ho' cannot be read as a number for consumer[1]
'ldingCo' cannot be read as a number for maxadvertiseddownstreamspeedmbps[1]
'mpanyNam' cannot be read as a number for maxadvertisedupstreamspeedmbps[1]
(1 observation read)

.


. list

     +-------------------------------------------------------------------------------+
     | logica~r   provid~d   frn   state   census~e   consumer   m~down~s   m~upst~s |
     |-------------------------------------------------------------------------------|
  1. |        .          .     .      N,          .          .          .          . |
     +-------------------------------------------------------------------------------+



with the dictionary file fccdict.dct being

dictionary using "Fixed_Broadband_Deployment_Data__December_2019.csv" {
* for fcc data
*
*  1 logicalrecord~r  long    %12.0g  Logical Record Number
*  2 providerid       long    %12.0g  Provider ID
*  3 frn              long    %12.0g  FRN
*  4 providername     str46   %46s    Provider Name
*  5 dbaname          str39   %39s    DBA Name
*  6 holdingcompan~e  str46   %46s    Holding Company Name
*  7 holdingcompan~r  long    %12.0g  Holding Company Number
*  8 holdingcompan~l  str46   %46s    Holding Company Final
*  9 state            str2    %9s     State
* 10 censusblockfi~e  double  %10.0g  Census Block FIPS Code
* 11 technologycode   byte    %8.0g   Technology Code
* 12 consumer         byte    %8.0g   Consumer
* 13 m~downstreams~s  int     %8.0g   Max Advertised Downstream Speed (mbps)
* 14 m~upstreamspe~s  float   %9.0g   Max Advertised Upstream Speed (mbps)
* 15 business         byte    %8.0g   Business
*
_first(1)
_lines(1)
long logicalrecordnumber %12.0g
long providerid %12.0g
long frn %12.0g
str2 state %9s
double censusblockfipscode %10.0g
byte consumer %8.0g
int maxadvertiseddownstreamspeedmbps %8.0g
float maxadvertisedupstreamspeedmbps %9.0g

}



Running the full import,
import delim using "Fixed_Broadband_Deployment_Data__December_2019.csv"

just makes Stata hang, and according to Stata I need a lot more memory than I have. I am on my work computer, so getting more memory is a whole bureaucratic affair that will have to wait until Monday, and I need to process my dataset today.
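
Since rowrange() works on a small slice, one fallback I am considering is to import the file in chunks, keep only the CA rows and the eight variables from my dictionary, and append each chunk to a file on disk. A rough sketch is below. The chunk size (1,000,000 rows) and the output name ca_only.dta are placeholders I made up, and the variable names are the ones import delimited gave me in the 100-row test. I am also assuming that varnames(1) still takes the names from the header row when rowrange() starts further down, and that a range past the end of the file gives either zero observations or an error.

```stata
* Sketch only: chunk size and output file name are my own assumptions.
local csv "Fixed_Broadband_Deployment_Data__December_2019.csv"
* rows per pass; assumed small enough to fit in memory
local chunk 1000000
* row 1 holds the variable names, so data start at row 2
local start 2
* first = 1 until the output file has been written once
local first 1
local done 0

while !`done' {
    local end = `start' + `chunk' - 1
    * load only rows start..end; names are taken from the header row
    capture import delimited using "`csv'", varnames(1) ///
        rowrange(`start':`end') clear
    if _rc != 0 | _N == 0 {
        * past the end of the file (or a read error): stop looping
        local done 1
    }
    else {
        * keep only California and the variables I need
        keep if state == "CA"
        keep logicalrecordnumber providerid frn state censusblockfipscode ///
            consumer maxadvertiseddownstreamspeedmbps ///
            maxadvertisedupstreamspeedmbps
        * add the CA rows collected so far, then overwrite the running file
        if !`first' append using ca_only
        save ca_only, replace
        local first 0
        local start = `end' + 1
    }
}

* load the accumulated CA-only data
use ca_only, clear
```

Each pass presumably has to rescan the file from the top, so this may be slow, but only one chunk plus the CA rows collected so far should be in memory at any time.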

THANKS IN ADVANCE!

Information about my machine:

Stata/MP 14.2 for Windows (64-bit x86-64)
Revision 29 Jan 2018
Copyright 1985-2015 StataCorp LLC

Total physical memory: 16594036 KB
Available physical memory: 9371188 KB

538-user 4-core Stata network perpetual license:

Licensed to: Stata/MP 14 (4 cores)