I have a large patient-level COVID-19 test dataset that exceeds 400 MB. It is in Excel format, and loading it into Stata takes a long time. I timed it once, and it took more than 2 hours.
Next, I tried saving the file as CSV, and loading that into Stata took less than 2 minutes.
My question is: what risks, in terms of data loss, exist when converting files to CSV and loading them into Stata? I have noticed date variables being imported as strings, which is fixable. My concern is more about losing information or degrading the quality of the data.
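For reference, the date fix mentioned above could look roughly like the following. This is only a minimal sketch: the file name `covid_tests.csv`, the variable name `test_date`, and the day-month-year mask `"DMY"` are assumptions for illustration, not details of the actual dataset.

```stata
* Sketch only: file name, variable name, and date mask below are assumed
* Import every column as a string so nothing is silently retyped on import
import delimited using "covid_tests.csv", varnames(1) stringcols(_all) clear

* Convert the text date to a Stata daily date and give it a readable format
generate test_date_d = date(test_date, "DMY")
format test_date_d %td

* Non-empty date strings that failed to convert are counted here
count if missing(test_date_d) & test_date != ""
```

Importing everything as strings first is a cautious choice: identifiers with leading zeros or many digits stay exactly as they appear in the CSV, and each column can then be converted deliberately.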
Appreciate any thoughts on this.