Hi, I'm working with a big database (15 million observations, 2 GB) and I would like to make it as light as possible. The only way I have come up with to do that is to code string variables as numbers (e.g. gender: "male" vs. "female" as 0 and 1). However, it's unclear to me what role value labels play. Does attaching a label such as 0 = "Male" make the database heavier? I'm asking because I have, for example, a variable "country" with more than 200 possible values. Furthermore, are there any other tricks or things to consider in order to make a database lighter?
Cheers