I have a large patient-level COVID-19 test dataset that exceeds 400 MB. It is in Excel format, and loading it into Stata takes a long time: I timed it once and it took more than 2 hours.
Next, I saved the file as CSV, and loading that into Stata took less than 2 minutes.
My question is: what risks, in terms of data loss, exist when converting a file to CSV and loading it into Stata? I have noticed date variables being read in as strings, which is fixable. My concern is more about losing information or degrading the quality of the data.
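To make the date issue concrete, here is a minimal sketch of the CSV route. The file name `covid_tests.csv`, the variable `test_date`, and the day/month/year order are all hypothetical and would need adjusting to the real data. Reading every column as a string first is one cautious option, since it stops Stata from guessing types and, for example, dropping leading zeros in ID codes.

```stata
* Hypothetical file and variable names, for illustration only
import delimited using "covid_tests.csv", varnames(1) stringcols(_all) clear

* Convert a date that arrived as text (assumed day/month/year order)
* into a numeric Stata date and give it a readable display format
generate test_date_num = date(test_date, "DMY")
format test_date_num %td

* Count rows where the text was non-empty but the conversion failed
count if missing(test_date_num) & !missing(test_date)
```

A non-zero count in the last step would point to date strings that do not match the assumed pattern and need a closer look.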
I would appreciate any thoughts on this.