Hello,
I have been using R but recently started learning Stata. To briefly introduce what I am trying to do: I have a dataset that I received as homework in my statistics course last year. I am reusing that dataset to rerun the course analyses in Stata. Because I already know what statistical results I should get, it is easier for me to see whether I am heading in the right direction.
(1) I am trying to regress one dependent variable on two categorical variables and their interaction, as in "reg DV IV1##IV2."
The commands I am trying to use are "reg Score Condition##Experience" and "anova Score Condition##Experience."
Initially, the two independent variables are categorical, each coded as 0 and 1.
However, I learned in my statistics class that centering categorical variables is always useful for interpretation when a regression includes interactions. Thus, I recoded 0 and 1 as -1 and 1 and tried rerunning the commands (after recoding, the mean of each variable is 0, which implies that the variables are correctly centered). The new commands I used are "reg Score ConditionC##ExperienceC" and "anova Score ConditionC##ExperienceC." The letter C just indicates that the variables are centered. However, when I ran them, I got an error saying "ConditionC: factor variables may not contain noninteger values." I then tried putting "i." in front of the centered variables, as in "reg Score i.ConditionC##i.ExperienceC." However, that still did not work, and I got the same error message.
Based on the above and on searching through the forum, I arrived at my first question and would like to confirm my understanding: am I not allowed to have negative values in a factor variable? My rationale for this question is simple. To me, both 0 and 1, and -1 and 1, can stand for YES and NO. However, Stata seems not to allow negative values in a categorical variable. Am I correct?
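For reference, a minimal sketch of the setup that factor-variable notation does accept. It assumes the original 0/1 variables Condition and Experience from the post are still in memory. Stata's factor-variable notation ("i." and "##" without "c.") expects nonnegative integer levels, which is why the -1/1 versions are rejected. The name Condition01 is hypothetical and only illustrates mapping a -1/1 variable back to 0/1.

```stata
* Factor-variable notation works with the original 0/1 codes
regress Score i.Condition##i.Experience

* anova treats the listed variables as categorical by default
anova Score Condition##Experience

* Hypothetical: if only the -1/1 version exists, map it back to 0/1 first
generate Condition01 = (ConditionC == 1) if !missing(ConditionC)
```

With 0/1 coding the coefficients on the main-effect terms are simple effects at the reference level of the other factor, so they will generally differ from the centered results even though the fitted model is the same.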
(2) Again after searching through the forum, I changed my command by putting "c." in front of each variable, for example "reg Score c.ConditionC##c.ExperienceC." This worked well, and I got the statistical results I should have gotten (as I mentioned, I already have the homework key, so I compared my Stata results with the key).
Here comes my second question. If I put "c." in front of my categorical variable, which is coded as -1 and 1, how does Stata interpret the variable? Simply as a continuous variable? How can it interpret a categorical variable as a continuous variable?
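One way to see what "c." does is to build the interaction by hand and compare. The sketch below assumes the original 0/1 variables Condition and Experience exist and that ConditionC and ExperienceC have not been created yet. The product variable CxE is a hypothetical name used only for illustration.

```stata
* Effect-code the 0/1 indicators as -1/+1 (skip if ConditionC and ExperienceC already exist)
generate ConditionC  = 2*Condition  - 1
generate ExperienceC = 2*Experience - 1

* c. tells Stata to use the stored numbers as they are, so ## expands to
* the two main effects plus their product
regress Score c.ConditionC##c.ExperienceC

* The same model with the product term constructed manually
generate CxE = ConditionC * ExperienceC
regress Score ConditionC ExperienceC CxE
```

Both regress commands should report the same coefficients. With "c." Stata does not create indicator variables for levels; it simply enters the -1/1 numbers as a regressor, which for a two-level variable carries the same information as a single dummy.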
Thank you for your help in advance!