I have a big dataset (1.7 million obs) and I need to create contour plots by member_id within the dataset.
I have the member_id, and each member has about 45,000 obs. After filtering and looking at a subset, I end up with about 35,000 obs for each member.
I tried a histogram/contour plot to create a contour plot or heat map, but with all of the data it took a really long time to render.
To speed up processing and distill the data, I did the following.
For the 35,000 obs, I have two continuous variables that are the X and Y variables, and I want their incidence/intensity displayed in a contour plot or a heat map.
I created two categorical variables, one with 15 categories and one with 11. One was defined by substantive concerns (known categories with substantive import) and the other by distributional properties (reasonably sized groups in each interval).
I cross-tabbed the two categorical variables and then created a counter/incidence variable from the crosstab results.
I did it using brute force:
generate counter = 0
recode counter (0 = <number in crosstab>) if v1 == 1 & v2 == 1
recode counter (0 = <number in crosstab>) if v1 == 1 & v2 == 2
I did this for all of the cells in the 11x15 crosstab.
I then created a contour plot with the x variable, the y variable, and used the 'counter' as the z-variable.
The plots work and, with some options, convey the appropriate information.
BUT, this is hugely clumsy and labor intensive when I have 40 'members' and need to do this for 3-4 files per week.
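For reference, the cell count that the recode statements assign by hand is just the number of observations sharing the same v1 and v2 values, so it can probably be generated directly without reading the crosstab at all. The following is a minimal sketch, assuming v1 and v2 are numeric or string category variables already in memory and that member_id identifies the members; the name counter_by_member is hypothetical.

```stata
* Count of observations in each v1-by-v2 cell, written to every row of that cell.
* Rows with a missing v1 or v2 get a missing counter, matching what -tab- leaves out.
bysort v1 v2: generate long counter = _N if !missing(v1, v2)

* Same idea, but with the cell counts computed separately within each member.
* counter_by_member is a hypothetical variable name.
bysort member_id v1 v2: generate long counter_by_member = _N if !missing(v1, v2)

* Quick check: each cell should show a single counter value equal to its frequency.
tabulate v1 v2, summarize(counter) means
```

Because the by-prefix handles all cells (and all members) in one pass, the same two lines should work unchanged on each weekly file, whatever the cell counts turn out to be.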
What I would like to do is run the crosstab (tab v1 v2), store the results, and use a mat command to automate the recode of 'counter_group1', placing each of the cell values into the code that creates my counter.
I hope this is clear; I have data to use for an example.
I have set up matrices, run tabs, and then used the mat commands to export data into tables. I assume I can do the same here and use the mat command to place the results into a recode statement, so that I no longer have to take the crosstab results and type them into recode statements by hand.
Any thoughts?
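One possible way to do what is described above with matrices is to have tabulate save the cell frequencies and the row and column values, then loop over the cells. This is only a sketch under some assumptions: v1 and v2 are numeric with integer category codes (matrow() and matcol() do not accept string variables), and the matrix names freq, rows, and cols are made up for illustration.

```stata
* Save cell frequencies plus the distinct row and column values as matrices.
quietly tabulate v1 v2, matcell(freq) matrow(rows) matcol(cols)

local nr = rowsof(freq)
local nc = colsof(freq)

generate long counter = 0

* Walk every cell of the crosstab and copy its frequency onto matching rows.
forvalues i = 1/`nr' {
    forvalues j = 1/`nc' {
        quietly replace counter = freq[`i', `j'] ///
            if v1 == rows[`i', 1] & v2 == cols[1, `j']
    }
}
```

The loop replaces the typed recode statements one for one, so it adapts automatically if the number of categories changes. To run it per member, the tabulate and the replace would each need an added condition such as if member_id == some value, inside a loop over the members.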