Hello everybody,

For my master's thesis, I need help with a logistic regression. The topic is the effect of team composition on success in soccer over one season. About my data:
The time variable is the match day (1-34), and the group variable is the team (18 teams). Each team's game is treated as a separate observation, so there are 612 observations (18 teams x 34 match days).

The dependent variable is binary (0 = not won, 1 = won).
Independent variables are:

1. the number of core players (from 0 to 11 players)
2. the percentage of core players playing on the respective match day (0-100%)
3. the percentage of players who also played in the last game (0-100%)
4. the percentage of times a team has played together throughout the season (0-100%)

In addition, there is a control variable for home games (0 = away, 1 = home).
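
In case it helps to see how I understand the panel structure, this is roughly how I would declare the data before any xt command. It is only a sketch: team and matchday are placeholder names I am assuming for the group and time variables, and team is assumed to be numeric.

```stata
* Sketch only: "team" and "matchday" are assumed variable names.
* team identifies the 18 groups, matchday is the time variable (1-34).
xtset team matchday

* Check that the panel looks as expected (18 teams, 34 match days each).
xtdescribe
```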

Am I right that xtlogit is the model I should use here?
Can I run this regression with the percentages as they are (0-100), or do I have to transform the variables? If so, how?
Do I have to examine the independent variables separately, like this:

xtlogit win no_of_core_team_players i.outwards
xtlogit win percentage_of_core_team i.outwards
xtlogit win percentage_of_line_up_last_game i.outwards
xtlogit win percentage_of_line_up_all_games i.outwards

Or all together in a single model:

xtlogit win no_of_core_team_players percentage_of_core_team percentage_of_line_up_last_game percentage_of_line_up_all_games i.outwards

Do further calculations such as odds ratios or average marginal effects (AME) make sense?
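
To make that question concrete, this is what I would try for the full model. It is a sketch and I am not sure it is the right approach: it assumes a random-effects specification, and predict(pu0) asks margins for probabilities with the random effect set to zero, which may not be the quantity I actually want.

```stata
* Sketch, assuming a random-effects model; the "or" option reports odds ratios.
xtlogit win no_of_core_team_players percentage_of_core_team ///
    percentage_of_line_up_last_game percentage_of_line_up_all_games ///
    i.outwards, re or

* Average marginal effects on the probability of winning,
* computed with the random effect set to zero (pu0).
margins, dydx(*) predict(pu0)
```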
What other options are possible, e.g. to calculate effects between the teams?

Or must a completely different approach be used here?
I have attached the Excel file; it would be great if someone could take a look at it.


I am very grateful for any help!

Regards,
Julius