Hi, I'm working with a big database (15 million observations, 2 GB) and I would like to make it as light as possible. The only way I have come up with is to code string variables as numbers (e.g. gender: "male" vs. "female" as 0 and 1). However, it's unclear to me what role value labels play. Does attaching a label such as 0 = "Male" make the database heavier? I'm asking because I have, for example, a variable "country" with more than 200 possible values. Also, are there any other tricks or things to consider in order to make a database lighter?
Cheers
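
For illustration, here is a minimal sketch of the approach described above, assuming the data are loaded in Stata and that the string variables really are named gender and country as in the question. The new names gender_n and country_n are placeholders. As far as storage goes, a value label is defined once for the whole dataset rather than repeated for each observation, so even a 200-entry label for country should add very little compared with 15 million string values.

```stata
* Sketch only: assumes string variables named gender and country are in memory.
* gender_n and country_n are hypothetical names chosen for this example.

* Convert each string variable to a numeric variable with value labels.
* encode numbers the categories 1, 2, 3, ... in alphabetical order
* (so gender becomes 1/2 here, not 0/1).
encode gender,  generate(gender_n)
encode country, generate(country_n)

* Drop the original strings once the encoded versions have been checked.
drop gender country

* Ask Stata to store every variable in the smallest type that holds it
* without loss of information.
compress
```

Note that encode creates variables of type long, so it is the later compress step that shrinks them to a smaller integer type where the values allow it. Running compress on its own is also worth trying, since it may shorten other string and numeric variables.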