Dear all,
First and foremost, a happy New Year to all of you!
In my current research project, I am analyzing the use of language on Twitter and its effect on a financing event.
My data is structured as follows:
1. For all observations, I have Twitter language values for a certain period.
2. This period starts with the first financing event and ends either with the second financing event or, if no second event occurred (i.e., the observation is right-censored), with the last recorded tweet.
3. What do I want to find out? Does the language used on Twitter affect the probability of a second financing event, or reduce the time until it occurs?
4. I set up my data using the following code:
stset TimeToSecond, failure(SecondFunding)
TimeToSecond is the number of days between the first and the second event (or the last tweet if no second event occurred). SecondFunding indicates the financing event and is coded 0/1 (1 = occurred, 0 = did not occur in the considered period).
5. I then estimate the effects with a Cox model:
stcox languageVariables* controlVariables*
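For readers who want to verify a setup like the one in steps 4 and 5, a minimal sketch follows. It assumes the variable names used in this post and relies only on standard st commands; no output is shown because it depends on the data.

```stata
* Declare the survival-time data exactly as in step 4
stset TimeToSecond, failure(SecondFunding)

* Summarize the st setup: subjects, failures, and time at risk
stdescribe

* Kaplan-Meier curve: share of observations without a second event so far
sts graph

* Cox model from step 5 (hazard ratios are reported by default)
stcox languageVariables* controlVariables*
```

Comparing the failure count from stdescribe with the number of observations coded SecondFunding = 1 is a quick way to confirm that censoring was declared as intended.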
What's my problem?
The results I get are plausible. Nevertheless, I suspect that the proportional hazards assumption does not hold for my data. To test the assumption, I re-estimated my models, interacting the independent variables with my time variable (as suggested in a textbook).
stcox languageVariables* controlVariables*, tvc(languageVariables* controlVariables*) texp(TimeToSecond)
The result of this estimation is that some of the interactions of my control variables are significant, which is a sign of non-proportional hazards (according to the book).
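A complementary check could look like the sketch below. It assumes the same variable names and an already stset dataset. The first part runs the Schoenfeld-residual test that Stata provides after stcox. The second part is a variant of the interaction model in which texp() is written in terms of analysis time _t, which is what the option is documented to take; whether this changes the conclusions for these data is not known.

```stata
* Refit the baseline Cox model
stcox languageVariables* controlVariables*

* Schoenfeld-residual test of proportional hazards:
* one test per covariate plus a global test
estat phtest, detail

* Variant of the interaction model: covariates in tvc() are
* multiplied by a function of analysis time _t
* (texp(_t) is the default; texp(ln(_t)) is a common alternative)
stcox languageVariables* controlVariables*, ///
    tvc(languageVariables* controlVariables*) texp(_t)
```

In the tvc() output, the coefficients in the main equation describe the effect at _t = 0, and the coefficients in the tvc equation describe how that effect changes with the chosen function of time.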
My question would now be:
Is it actually a problem if only some of the control-variable interactions are significant, while the interactions of the explanatory variables (languageVariables*) are not? And what alternatives are there for handling this correctly (i.e., something like a model for "disproportional hazards")?
Best regards and stay healthy