I have a large dataset (1.7 million obs) and I need to create a separate contour plot for each member_id in the dataset.
I have the member_id, and each member has about 45,000 obs. After filtering and looking at a subset, I end up with about 35,000 obs for each member.
I tried using a histogram/contour plot to create a contour plot/heatmap, but with all of the data it took a really long time to render.
To increase the processing speed and distill the data, I did the following.
For the 35,000 obs, I have two continuous variables that are the X and Y variables, and I want their incidence/intensity displayed in a contour plot or a heat map.
I created two categorical variables, one with 15 categories and one with 11. One was built from substantive concerns (known categories with substantive import) and the other from distributional properties (reasonably sized groups in each interval).
I cross-tabbed the two categorical variables and then created a counter/incidence variable from the crosstab results.
I did it using brute force...
generate counter = 0
recode counter (0 = (number in crosstab)) if v1==1 & v2==1
recode counter (0 = (number in crosstab)) if v1==1 & v2==2
I did this for all of the cells in an 11x15 crosstab....
I then created a contour plot with the x variable, the y variable, and used the 'counter' as the z-variable.
The plots work and, with some options, convey the appropriate information.
BUT, this is hugely clumsy and labor intensive when I have 40 'members' and need to do this for 3-4 files per week.
What I would like to do is run the crosstab (tab v1 v2), generate the results, and use a mat command to automate the recode of 'counter_group1', placing each of the cell values into the code that creates my counter.
I hope this is clear; I have data to use for an example.
I have set up matrices and run tabs before, and then used the mat commands to export data into tables. I assume I can do the same here and use the mat command to place the results into a recode statement, so that I no longer have to take the crosstab results and type them into recode statements by hand.
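For reference, the matrix route described above might look roughly like the sketch below. It assumes v1 and v2 are numeric (matrow() and matcol() do not work with string variables) and uses the hypothetical matrix names freq, rows and cols. The matcell(), matrow() and matcol() options of tabulate store the cell counts and the row and column values, and a double loop then copies each count into counter.

```stata
* Sketch only: freq, rows and cols are hypothetical matrix names.
* Assumes v1 and v2 are numeric categorical variables.
tabulate v1 v2, matcell(freq) matrow(rows) matcol(cols)

generate counter = 0
forvalues i = 1/`=rowsof(freq)' {
    forvalues j = 1/`=colsof(freq)' {
        * rows is (number of rows) x 1, cols is 1 x (number of columns)
        quietly replace counter = freq[`i', `j'] ///
            if v1 == rows[`i', 1] & v2 == cols[1, `j']
    }
}
```

To do this per member, the same block could be run with an if member_id == ... qualifier on both the tabulate and the replace.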
Any thoughts?
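A possibly simpler route that skips the crosstab and matrices altogether: the number of observations in each v1 x v2 cell is just _N within a by-group, so the counter can be generated directly. The sketch below assumes xvar and yvar are (hypothetical) names for the two continuous variables, that member_id is a numeric, non-negative integer ID, and that v1 and v2 have no missing values in the subset being plotted (tabulate drops missing values by default, while bysort would count them as their own group).

```stata
* Sketch only: xvar and yvar are hypothetical variable names.
* Cell count within each member, attached to every observation in that cell
bysort member_id v1 v2: generate counter = _N

* One contour plot per member; twoway contour takes z, then y, then x
levelsof member_id, local(members)
foreach m of local members {
    twoway contour counter yvar xvar if member_id == `m', ///
        name(contour_`m', replace) title("Member `m'")
}
```

Because the counts come from the data in memory, the same do-file should work unchanged on each new weekly file, with no numbers typed in by hand.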