Weakly balanced Panel Data Set

Hi Statalisters,

I am a novice Stata user and this is my first post. I'm working with Stata 14 on Windows 7.

I'm working on a panel data set of all commercial banks (the ID variable) in the U.S. for the period Q4-1995 to Q4-2018 (the time variable). I only use the Q4 data of each year, so the data are at the bank-year level. My goals are

1) to calculate four bank risk proxies,
2) to show the correlations between all four risk proxies,
3) and finally to run two regressions with the risk proxies (a binary probability model).

I have converted the string variables name and date to the numeric variables name2 and date2. I have replaced missing values with 0 and have checked for other missing values.
There are duplicates among my bank names (because banks with the same name have several branches), which I fixed by generating the ID variable id.
I tried to use the dataex command to provide a data sample, but I'm not sure whether I used the command correctly, so here is a data example as well:

name2                                  date2     asset  lnatres  ore

1st American State Bank of Minnesota   Q4 1995   16050
1st Bank                               Q4 1995   15908  16050
1st Bank & Trust                       Q4 1995   12888  16050
1st Bank of Troy                       Q4 1995   16050
1st Business Bank                      Q4 1995   16050
1st Constitution Bank                  Q4 1995   16050
1st Financial Bank South Dakota        Q4 1995   16050
1st Floyd Bank                         Q4 1995   16050
1st National Bank                      Q4 1995   16050

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float id long name2 float date2 double asset long(lnatres ore)
 1  6 143   16050   103   37
 2  7 143   99230  1419    0
 3  8 143   54413   450    0
 4 11 143   14627   132    0
 5 13 143  922734  8995  692
 6 18 143  130484  1077    0
 7 23 143   77569   393    0
 8 28 143   40343   921    5
 9 30 143   26353   245    0
10 34 143   62315   207    0
11 37 143   38636   266  111
12 42 143   20637   114    0
13 43 143   20313   184    0
14 44 143 1736391 27470 1452
15 47 143   54356   257  270
end
format %tq!Qq-CCYY date2
label values name2 name2
label def name2 6 "1st American State Bank of Minnesota", modify
label def name2 7 "1st Bank", modify
label def name2 8 "1st Bank & Trust", modify
label def name2 11 "1st Bank of Troy", modify
label def name2 13 "1st Business Bank", modify
label def name2 18 "1st Choice Bank", modify
label def name2 23 "1st Constitution Bank", modify
label def name2 28 "1st Financial Bank South Dakota", modify
label def name2 30 "1st Floyd Bank", modify
label def name2 34 "1st National Bank", modify
label def name2 37 "1st National Community Bank", modify
label def name2 42 "1st Security Bank of Laurel", modify
label def name2 43 "1st Security Bank of West Yellowstone, Montana", modify
label def name2 44 "1st Source Bank", modify
label def name2 47 "1st State Bank and Trust Company of Palos Hills", modify

Code:

replace p3asset = 0 if (p3asset >= .)

Code:

mdesc

Code:

gen id = _n

Code:

sort id date2

Code:

xtset id date2
       panel variable:  id (weakly balanced)
        time variable:  date2, Q4-1995 to Q4-2018
                delta:  1 quarter

Code:

xtdescribe

      id:  1, 2, ..., 172431                                 n =     172431
   date2:  Q4-1995, Q4-1996, ..., Q4-2018                    T =         24
           Delta(date2) = 1 quarter
           Span(date2)  = 93 periods
           (id*date2 uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                         1       1       1         1         1       1       1

     Freq.  Percent    Cum. |  Pattern*
 ---------------------------+--------------------------
     9940      5.76    5.76 |  1.......................
     9527      5.53   11.29 |  .1......................
     9142      5.30   16.59 |  ..1.....................
     8773      5.09   21.68 |  ...1....................
     8579      4.98   26.65 |  ....1...................
     8314      4.82   31.48 |  .....1..................
     8079      4.69   36.16 |  ......1.................
     7887      4.57   40.74 |  .......1................
     7769      4.51   45.24 |  ........1...............
    94421     54.76  100.00 | (other patterns)
 ---------------------------+--------------------------
   172431    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXX
 ------------------------------------------------------
 *Each column represents 4 periods.

A weakly balanced dataset arises if each panel contains the same number of observations, but NOT the same time points. I believe this is my case, because I have many more observations in one time period.
Can you please help me get a balanced panel? Does a weakly balanced panel influence my correlation and regression results?

Thank you very much!

Katharina
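
One possible reading of the -xtdescribe- output above, offered as a sketch rather than a diagnosis: -gen id = _n- gives every observation its own number, so each id occurs exactly once and T_i is 1 for every panel (n = 172431 panels, all with a single observation). That fits the definition of weakly balanced quoted in the post: the same number of observations per panel (one), at different time points. The toy example below contrasts this with an identifier that stays constant within a bank. The variable bankkey and the three made-up banks are assumptions for illustration only; since the post says bank names are not unique, the real data would need whatever stable bank identifier the source provides. The delta(4) option is also an assumption, reflecting that only Q4 observations are kept.

```stata
* Toy data: 3 hypothetical banks, each observed in Q4 of 1995, 1996, 1997.
* bankkey is a made-up stable identifier, not a variable from the post.
clear
set obs 9
gen long bankkey = ceil(_n/3)
gen date2 = yq(1995 + mod(_n - 1, 3), 4)
format %tq date2

* (a) One id per row, as in -gen id = _n-:
*     every "panel" then contains exactly one observation.
gen long id_row = _n
xtset id_row date2, delta(4)
xtdescribe

* (b) One id per bank: each panel contains all three Q4 observations.
egen long id_bank = group(bankkey)
xtset id_bank date2, delta(4)
xtdescribe
```

In (a) the distribution of T_i is 1 throughout, as in the output in the post; in (b) it should be 3 throughout. With the real data, -xtdescribe- after a step like (b) would show the actual T_i distribution, which may well be unbalanced if banks enter or exit between 1995 and 2018.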
