Hi,

I have little experience with logistic regression on panel data and would like to know which STATA command is the correct one for my purpose and in which form the data have to be present.

I have data on acquisitions and have merged this with a manager dataset such that for each deal in my sample I have information on the CFO who was in place at the acquirer at the time of the deal.

The main sample criteria are as follows:
- US deals (acquirer and target are US companies).
- Deals announced between 2000-2018
- Deal Value: > 1 million USD
- Listing Acquiror: Public
- Listing Target: Public/Private
- Deal type: Completed deals only

In addition to some variables on deal characteristics, I have demographic information on the CFOs in place such as age, nationality and gender.

I would like to analyze the influence of certain CFO characteristics (such as his or her M&A experience) on the probability that a deal is being announced.

Following related literature, the dependent variable is a dummy that assumes the value of one if a firm completes at least one acquisition in a given year, and zero otherwise.


Deal Sample (without some of the deal characteristics):

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str7 Acq_ID str26 CFO_Name str6 CFO_Board str7 CFO_ID str1 CFO_Gender str13 CFO_Nat byte CFO_MBA float CFO_No_Deals int(CFO_Start CFO_End) double(Deal_No Deal_Value) int Deal_Announced
"9386"  "Bill Tamblyn"       "No" "323826" "M" "American" 0  0 13666 16192 950128020    91.237 14649
"24075" "Cameron Waite"      "No" "457917" "M" ""         1  0 13270 15461 958086020         . 14612
"16427" "Sean Orr"           "No" "61392"  "M" "American" 1  0 14396 14641 958589020         . 14613
"2355"  "Fred Anderson Jr"   "No" "55417"  "M" "American" 1  3 13209 16223 959635020         . 14616
"12839" "Philip Roth"        "No" "321983" "M" ""         1  3 13270 16299 959721020         . 14614
"23701" "Joe Tyson"          "No" "207203" "M" ""         0  0 14518 17104 960108020      12.2 14619
"32542" "Gene Godick"        "No" "451029" "M" ""         0  0 14518 14792 961102020         . 14620
"2382"  "Joe Bronson"        "No" "35168"  "M" "American" 1  0 13880 14945 961107020  1846.235 14621
"32542" "Gene Godick"        "No" "451029" "M" ""         0  0 14518 14792 961108020         . 14620
"32019" "Michael Sileck Jr"  "No" "43521"  "M" "American" 1 10 14518 15341 961134020   688.045 14621
"32672" "Joel Katz"          "No" "327563" "M" "American" 0 10 14335 15188 961332020   553.593 14619
"27292" "Chris Dodds"        "No" "44621"  "M" ""         1  4 14426 17304 961446020  2612.997 14622
"29159" "Joe Bellino"        "No" "344488" "M" "American" 1  0 13788 15364 961490020         . 14622
"29159" "Joe Bellino"        "No" "344488" "M" "American" 1  0 13788 15364 961499020         . 14622
"10266" "H Austin"           "No" "60057"  "M" "American" 1  0 12904 15614 962226020 16006.384 14627
"17374" "Tony Muller"        "No" "58146"  "M" "American" 1  7 13898 14731 962266020 15393.528 14626
"2593"  "Nicholas Winiewicz" "No" "327998" "M" "American" 1  0 14276 14915 962357020         . 14615
"31618" "Ron Lataille"       "No" "480213" "M" ""         0  0 14610 18786 962539020     7.338 14626
"12698" "Tim Barker"         "No" "324704" "M" "American" 0  0 14610 14823 962597020         . 14627
"1221"  "Doctor Tim Weller"  "No" "205750" "M" "American" 0  0 14546 15676 962650020       200 14627
"5168"  "Bill Ruehle"        "No" "62132"  "M" "American" 1  8 13666 16527 962661020   110.718 14627
"7130"  "Andy Hajducky III"  "No" "326377" "M" "American" 0  3 13057 15157 962797020         . 14628
"6908"  "Larry Carter"       "No" "32837"  "M" "American" 0  5 13696 14792 963097020     214.5 14628
"6908"  "Larry Carter"       "No" "32837"  "M" "American" 0  5 13696 14792 963098020       567 14628
"18958" "Steve Lifshatz"     "No" "325382" "M" "American" 0  0 13515 15341 963191020    191.25 14629
end
format %tddd/nn/CCYY CFO_Start
format %tddd/nn/CCYY CFO_End
format %tddd/nn/CCYY Deal_Announced
format %15.0g Deal_No
Whereas:
Acq_ID: Unique Identifier of the Acquiring firm
CFO_Board: "Yes" if the CFO is on the Board of the Acquiring firm, "No" otherwise.
CFO_ID: Unique Identifier of the CFO in place
CFO_MBA: 1 if the CFO has an MBA degree, 0 otherwise
CFO_Start: Day on which the person became CFO in the acquiring firm
CFO_End: Day on which the Person ends his/her role as CFO in the acquiring firm
CFO_No_Deals: Number of deals a specific CFO has already managed (outside of the acquiring firm) up to the announcement date of a deal. -> Proxy for M&A experience
Deal_No: Unique Identifier of a Deal

Correct me if I am wrong but in my deal data set, the data is in inappropriate form for a logistic regression analysis, right? The above mentioned analysis I would like to perform refers to the firm level, whereas the data in the deal sample is at deal level.
Therefore, as a first step, I created a firm-deal dataset that contains the number of deals per year on a firm basis (I still have to recode the data as being 0 or 1 for the logistic regression):

Firm-Deal Sample:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str7 Acq_ID float(Deals_2000 Deals_2001 Deals_2002 Deals_2003 Deals_2004 Deals_2005 Deals_2006 Deals_2007 Deals_2008 Deals_2009 Deals_2010 Deals_2011 Deals_2012 Deals_2013 Deals_2014 Deals_2015 Deals_2016 Deals_2017 Deals_2018)
"1000824" 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
"1001"    0 0 1 3 4 0 1 0 0 0 0 0 0 0 0 0 0 0 0
"1001479" 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 1 0 0
"10015"   0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
"10024"   1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
"1002402" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
"10031"   0 0 0 0 1 0 2 0 3 0 0 0 0 0 0 0 0 0 0
"10037"   0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
"1004151" 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
"1005570" 0 0 0 0 0 0 0 1 1 2 2 0 0 0 1 0 0 0 0
"100636"  0 0 0 0 1 1 0 2 0 0 0 0 0 0 0 0 0 0 0
"1007477" 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
"1007505" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0
"10080"   0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0
"10100"   0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
"10104"   0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 3 0 0 0
"1011"    0 0 0 0 0 1 0 1 2 0 0 1 0 0 0 0 0 0 0
"1011137" 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 0 0 0
"1011333" 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0
"10130"   0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
"1013081" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 2
"1013530" 0 0 0 0 0 0 0 1 0 1 2 1 0 0 0 0 1 0 2
"10137"   0 1 1 2 0 2 2 0 0 0 0 0 0 0 0 0 0 0 0
"1015505" 0 0 0 0 2 0 1 7 0 0 0 0 0 0 0 0 0 0 0
"1015577" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
end
I was wondering if creating this dataset is even necessary or if there is a way to perform the logistic regression based on the deal sample.
Either way, I am not quite sure how to consider my independent variables in my analysis. In the deal dataset CFO_No_Deals is given on a deal level: Assuming that is is actually necessary to build the Firm-Deal Sample, I would have to create another variable for each existing Deals_2000 - Deals_2018 variable that contains the number of deals that the CFO has managed, right?

Any help is much appreciated.
Thank you very much.