I have a large patient-level COVID-19 test dataset that exceeds 400 MB. It is in Excel format, and loading it into Stata takes a long time. I timed it once and it took more than 2 hours.
Next, I tried saving the file as CSV, and loading that into Stata took less than 2 minutes.
My question is: what risks, in terms of data loss, exist when converting files to CSV and loading them into Stata? I have noticed date formats changing to strings, which is fixable. My concern is more about losing information or changing the quality of the data.
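For context, here is a rough sketch of how the date issue could be handled after a CSV import. It is not tested on the real data. The file name covid_tests.csv, the variable name test_date, and the "DMY" ordering are all assumptions made for illustration. Reading every column as text first should at least keep Stata from converting anything at import time.

```stata
* Sketch only: the file name and the variable name test_date are hypothetical.
* Read every column as text so nothing is converted at import time.
import delimited using "covid_tests.csv", varnames(1) stringcols(_all) clear

* Convert the text date to a Stata daily date; "DMY" is an assumed ordering.
generate test_date_num = date(test_date, "DMY")
format test_date_num %td

* Non-empty text dates that failed to parse are counted here.
count if missing(test_date_num) & !missing(test_date)

* Total number of rows, to compare against the row count in the Excel sheet.
count
```

If the first count is not zero, the assumed date ordering is probably wrong for some rows. The second count only confirms that no rows were dropped, not that the cell contents are unchanged.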
Appreciate any thoughts on this.