Appropriate regression model

My dataset records the amount of lubricant added to a process at various times. The lubricant is added in predetermined volumes:

Code:

. tab lub

        lub |      Freq.     Percent        Cum.
------------+-----------------------------------
     100.00 |          1        0.65        0.65
     200.00 |          3        1.95        2.60
     300.00 |          5        3.25        5.84
     400.00 |         26       16.88       22.73
     450.00 |          1        0.65       23.38
     500.00 |          1        0.65       24.03
     600.00 |         57       37.01       61.04
     800.00 |         16       10.39       71.43
     900.00 |         11        7.14       78.57
    1000.00 |          7        4.55       83.12
    1050.00 |          1        0.65       83.77
    1200.00 |         21       13.64       97.40
    1400.00 |          1        0.65       98.05
    1500.00 |          2        1.30       99.35
    2000.00 |          1        0.65      100.00
------------+-----------------------------------
      Total |        154      100.00

. reg lub t

      Source |       SS           df       MS      Number of obs   =       151
-------------+----------------------------------   F(1, 149)       =     10.84
       Model |  996788.379         1  996788.379   Prob > F        =    0.0012
    Residual |  13696754.7       149   91924.528   R-squared       =    0.0678
-------------+----------------------------------   Adj R-squared   =    0.0616
       Total |    14693543       150  97956.9536   Root MSE        =    303.19

------------------------------------------------------------------------------
         lub |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           t |   4.627758   1.405351     3.29   0.001     1.850765     7.40475
       _cons |   513.9383   64.98879     7.91   0.000     385.5196     642.357
------------------------------------------------------------------------------

When I investigate the residuals using dpplot and qenvnormal from SSC, I have no worries about normality.
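For readers who do not have the SSC commands installed, a rough equivalent of these residual checks can be sketched with official Stata commands only. This is a minimal sketch, assuming lub and t are in memory as in the data subset further down; the variable name res is made up for illustration.

```stata
* Sketch only: residual checks with official commands (not dpplot/qenvnormal)
quietly regress lub t

* "res" is a hypothetical name for the stored residuals
predict double res, residuals

* Normal quantile plot of the residuals
qnorm res

* Residuals versus fitted values, with a reference line at zero
rvfplot, yline(0)
```

Because lub takes only a few distinct values, the residual-versus-fitted plot will show parallel diagonal bands, one per value of lub. That pattern comes from the discreteness of the outcome and is not by itself evidence of a problem.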

My concern is that lub, although a continuous variable, can take only specific values. Is it legitimate to use OLS in this situation, or should I consider other techniques such as npregress?
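To make the comparison concrete, here is a minimal sketch of how the two approaches could be run side by side. It assumes lub and t are in memory and a Stata version with npregress (15 or later); the robust standard errors and the jittered plot are my own additions for illustration, not something already done above.

```stata
* Sketch only: OLS versus a nonparametric fit of lub on t

* OLS with heteroskedasticity-robust standard errors
regress lub t, vce(robust)

* Kernel regression (local-linear by default) of lub on t
npregress kernel lub t

* Plot of the nonparametric fit, run after npregress kernel
npgraph

* Jittered scatter with the linear fit and a lowess smooth overlaid
twoway (scatter lub t, jitter(2)) (lfit lub t) (lowess lub t)
```

If the lowess and kernel curves stay close to the straight line, that would suggest the linear fit is not badly misleading as a summary of the mean of lub given t.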

I would be grateful for any advice.

This is a data subset:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(lub t)
 400  54
 600  27
1000  42
 800  29
 200  30
 200  29
 400  34
 400  48
 400  39
 400  36
 300  43
 800  12
 100  48
2000  51
1000  46
 400  29
 600  30
 400  15
 400  37
 400  31
 600  19
 800  66
 800  21
 400  29
 800  29
 400  38
 400  27
 400  61
 400  59
 400  46
1400  54
 400  29
 800  43
 800  22
 600  29
 800  67
 600 100
 400  66
1200  67
 800   .
 400  31
 800  29
 800  32
 600  67
 800  55
 600  59
 400  61
1200  60
1000  51
 400  18
end

Thank you,

Janet

Stata IC 16.0
