I have a big dataset (1.7 million observations) and I need to create contour plots by member_id within the dataset.
I have the member_id, and each member has about 45,000 obs. After filtering and looking at a subset, I end up with about 35,000 obs for each member.
I tried creating a contour plot/heatmap directly from all of the data, and it took a really long time to render.
To increase the processing speed and distill the data, I did the following.
For the 35,000 obs, I have two continuous variables that serve as the X and Y variables, and I want their incidence/intensity displayed in a contour plot or a heat map.
I created two categorical variables, a 15-category variable and an 11-category variable. One was defined by substantive concerns (known categories with substantive import) and the other by distributional properties (reasonably sized groups in each interval).
I cross-tabbed the two categorical variables and then created a counter/incidence variable using the crosstab results.
I did it using brute force...
generate counter = 0
recode counter (0 = <number in crosstab>) if v1 == 1 & v2 == 1
recode counter (0 = <number in crosstab>) if v1 == 1 & v2 == 2
I did this for all of the cells in the 11x15 crosstab.
I then created a contour plot with the x variable, the y variable, and used the 'counter' as the z-variable.
The plots work and, with some options, convey the appropriate information.
BUT, this is hugely clumsy and labor intensive when I have 40 'members' and need to do this for 3-4 files per week.
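One possible shortcut for the repetition across members, sketched under the assumption that the variables are named member_id, v1, and v2 as in this post: by-group processing can attach the cell count to every observation without typing any crosstab numbers. Note that, unlike tab, this would also assign a count to observations where v1 or v2 is missing.

```stata
* Sketch only: member_id, v1 and v2 are the variable names used in this post.
* Within each member, count the observations that fall in each v1 x v2 cell
* and store that count on every observation in the cell.
bysort member_id v1 v2: generate long counter = _N
```

Since _N is the number of observations in the current by-group, counter should match the corresponding cell of tab v1 v2 run on one member at a time.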
What I would like to do is run the crosstab (tab v1 v2), save the results, and use a mat command to automate the recode of 'counter_group1', placing each of the cell values in the code that creates my counter.
I hope this is clear and I have data to use for an example...
I have set up matrices, run tabs, and then used mat commands to export data into tables. I assume I can do the same here and use a mat command to place the results into a recode statement, so that I no longer have to take the crosstab results and type them into recode statements by hand.
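A minimal sketch of the matrix route described above, assuming v1 and v2 are the two categorical variables and using tabulate's matcell(), matrow(), and matcol() options to save the crosstab. The matrix names freq, rows, and cols are chosen only for illustration.

```stata
* Sketch only: v1 and v2 are the categorical variables from this post;
* freq, rows and cols are illustrative matrix names.
* matcell() saves the cell counts, matrow()/matcol() save the category values.
tabulate v1 v2, matcell(freq) matrow(rows) matcol(cols)

generate long counter = 0
local nr = rowsof(freq)
local nc = colsof(freq)
forvalues i = 1/`nr' {
    forvalues j = 1/`nc' {
        * copy the count for cell (i, j) onto the observations in that cell
        quietly replace counter = freq[`i', `j'] if v1 == rows[`i', 1] & v2 == cols[1, `j']
    }
}
```

To run this for one member at a time, the same block could presumably be wrapped in a loop over member_id, with a matching if qualifier on both the tabulate and the replace.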
Any thoughts?