I have a large patient-level COVID-19 test dataset that exceeds 400 MB. It is in Excel format, and loading it into Stata takes a long time: I timed it once and it took more than 2 hours.
Next, I tried saving the file as CSV, and loading that into Stata took less than 2 minutes.
My question is: what risks, in terms of data loss, exist when converting files to CSV and loading them into Stata? I have noticed date formats changing to strings, which is fixable. My concern is more about losing information or changing the quality of the data.
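To make the concern concrete, here is a minimal sketch of the kind of import and date fix I mean. The file name covid_tests.csv and the variable names test_date and ct_value are hypothetical, and "DMY" is only an assumption about how the dates were written out in the CSV. Reading every column as text first is one cautious way to keep long IDs and leading zeros from being altered on import, not necessarily the best one.

```stata
* Hypothetical file and variable names; adjust to the real dataset.
* Read every column as text so long IDs and leading zeros are kept as written.
import delimited using "covid_tests.csv", varnames(1) stringcols(_all) clear

* Rebuild a Stata daily date from the text column.
* "DMY" is an assumption about the order of day, month and year in the file.
generate test_date_num = date(test_date, "DMY")
format test_date_num %td

* Rows counted here had a date in the file that did not parse and need a manual look.
count if missing(test_date_num) & !missing(test_date)

* Convert only the columns known to be numeric; ID-like columns stay as strings.
destring ct_value, replace
```

Comparing the observation count and a few spot-checked records against the original Excel file would be a further check that nothing was dropped in the conversion.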
Appreciate any thoughts on this.