I have a large patient-level COVID-19 test dataset that exceeds 400 MB. It is in Excel format, and loading it into Stata takes a long time. I timed it once and it took more than 2 hours.
Next, I tried saving the file as CSV, and loading that into Stata took less than 2 minutes.
My question is: what risks, in terms of data loss, exist when converting files to CSV and loading them into Stata? I have noticed date formats changing to strings, which is fixable. My concern is more about losing information or changing the quality of the data.
Appreciate any thoughts on this.
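To make the concern concrete, here is a minimal sketch of one way to scan an exported CSV for values that commonly change when an import step guesses a numeric type: zero-padded identifiers and very long integers. It is written in Python only because the post has no code of its own. The function name `flag_risky_cells`, the column names, and the sample rows are all hypothetical, and the two patterns are illustrative rather than an exhaustive list of risks.

```python
import csv
import io
import re

# Hypothetical patterns for cells that often change under numeric type guessing.
LEADING_ZERO = re.compile(r"^0\d+$")     # zero-padded IDs such as "00123"
LONG_INTEGER = re.compile(r"^\d{16,}$")  # 16+ digits can exceed a double's exact-integer range


def flag_risky_cells(csv_text):
    """Return (row_number, column_name, value, reason) for each risky cell."""
    findings = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # skip missing or overflow fields
            value = value.strip()
            if LEADING_ZERO.match(value):
                findings.append((row_number, column, value, "leading zero"))
            elif LONG_INTEGER.match(value):
                findings.append((row_number, column, value, "long integer"))
    return findings


# Made-up sample data, for illustration only.
sample = (
    "patient_id,test_date,barcode\n"
    "00123,2020-03-15,1234567890123456789\n"
    "456,2020-03-16,42\n"
)

for finding in flag_risky_cells(sample):
    print(finding)
```

Columns flagged this way are candidates for importing as strings instead of numbers, so the original characters are preserved.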