Hello,
I have been using R but recently started learning Stata. As brief background, I have a dataset that I received as homework in my statistics course last year. I am reusing it to rerun the course analyses in Stata. Because I already know what statistical results I should get, it is easy for me to check whether I am on the right track.
(1) I am trying to regress one dependent variable on two categorical variables and their interaction, as in "reg DV IV1##IV2".
The commands I am trying to use are "reg Score Condition##Experience" and "anova Score Condition##Experience".
Initially, the two independent variables are categorical, coded as 0 and 1.
However, I learned in my statistics class that centering categorical variables helps with interpretation when a regression includes interactions. Thus, I recoded 0 and 1 into -1 and 1 and tried rerunning the commands (after recoding, the mean of each variable is 0, which implies that the variables are correctly centered). The new commands I used are "reg Score ConditionC##ExperienceC" and "anova Score ConditionC##ExperienceC". The letter C at the end just indicates that the variables are centered. However, when I reran the commands, I got an error saying "ConditionC: factor variables may not contain noninteger values." I then tried putting "i." in front of the centered variables, as in "reg Score i.ConditionC##i.ExperienceC". However, that still did not work, and I got the same error message.
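For reference, the steps above look roughly like the sketch below. It is not my exact do-file; it assumes the original variables are named Condition and Experience and are coded 0/1, as described above.

```stata
* Recode 0/1 into -1/1 (sketch; missing values stay missing)
generate ConditionC  = cond(Condition == 1, 1, -1)  if !missing(Condition)
generate ExperienceC = cond(Experience == 1, 1, -1) if !missing(Experience)

* Check the means: they are exactly 0 only when both groups are the same size
summarize ConditionC ExperienceC

* Factor-variable notation with the original 0/1 coding
regress Score i.Condition##i.Experience

* Factor-variable notation with the -1/1 coding (this is the call that errors out)
capture noisily regress Score i.ConditionC##i.ExperienceC
```

The "capture noisily" prefix is only there so that a do-file keeps running after the error message is displayed.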
Based on the above and on searching through the forum, I arrived at my first question and would like to confirm my understanding: am I not allowed to have negative values in a factor variable? My rationale for asking is simple. To me, both the 0/1 coding and the -1/1 coding can represent YES or NO. However, Stata seems not to allow negative values in a categorical variable. Am I correct?
(2) Again after searching through the forum, I changed my code by putting "c." in front of each variable, as in "reg Score c.ConditionC##c.ExperienceC". This worked, and I got the statistical results I expected (as I mentioned, I already have the homework key, so I compared my Stata output with it).
Here comes my second question. If I put "c." in front of a categorical variable that is coded as -1 and 1, how does Stata interpret it? Simply as a continuous variable? How can it treat a categorical variable as a continuous one?
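To make the second question concrete: my understanding, which may be wrong, is that with "c." Stata uses the stored numbers -1 and 1 directly as regressors and multiplies them to form the interaction term. If so, the two regressions sketched below should give the same coefficients. CxE is just a hypothetical name I made up for the product term.

```stata
* Version with c. notation: main effects plus their product
regress Score c.ConditionC##c.ExperienceC

* Same model built by hand (CxE is a hypothetical variable name)
generate CxE = ConditionC * ExperienceC
regress Score ConditionC ExperienceC CxE
```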
Thank you for your help in advance!