Dear Statalist,
I have a dataset that records the time at which staff members successfully pick products by bar code. For instance, the first row means that the first picker (pickid=1) picked an item at 01oct2019 12:41:37.
I observe that a picker may work several shifts in a day. For instance, in the 8th observation, pickid 1 picks an item at 01oct2019 15:00:35, then rests for about two hours and starts his second shift that day at around 17:00.
[CODE]
clear
input float pickid double pick float(date shift)
1 1885552897000 21823 1
1 1885553368000 21823 1
1 1885553548000 21823 1
1 1885554267000 21823 1
1 1885555750000 21823 1
1 1885559182000 21823 1
1 1885561029000 21823 1
1 1885561235000 21823 1
1 1885568751000 21823 2
1 1885568794000 21823 2
1 1889124609000 21864 1
1 1889125397000 21864 1
1 1889125511000 21864 1
1 1889126216000 21864 1
1 1889126746000 21864 1
1 1889127005000 21864 1
1 1889127208000 21864 1
1 1889128021000 21864 1
1 1889128155000 21864 1
1 1889128219000 21864 1
1 1891763437000 21895 1
1 1891763843000 21895 1
1 1891765585000 21895 1
1 1891765845000 21895 1
1 1891766075000 21895 1
1 1891766170000 21895 1
1 1891767753000 21895 1
1 1891768296000 21895 1
1 1891790940000 21895 2
1 1891791650000 21895 2
2 1889050832000 21864 1
2 1889051543000 21864 1
2 1889055404000 21864 1
2 1889058548000 21864 1
2 1889061753000 21864 1
2 1889062340000 21864 1
2 1889063595000 21864 1
2 1889065232000 21864 1
2 1889116535000 21864 2
2 1889116760000 21864 2
end
format %tc pick
format %td date
[/CODE]
I want to generate a variable SHIFT that numbers the shifts a picker works in a day, with shift=1 for the picker's first shift of the day. If the gap between the current pick and the previous pick is more than 2 hours, the current pick (and later picks) belongs to the next shift, and so on. A picker can have several shifts a day. Can someone help me generate a SHIFT variable?
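To make the rule concrete, here is a rough sketch of what I have in mind, assuming the example data above is in memory. The names gap and myshift are placeholders I made up, and I am not sure this is the best way to do it.

[CODE]
* Gap (in milliseconds) to the previous pick of the same picker on the same day;
* it is missing for the first pick of each picker-day.
bysort pickid date (pick): gen double gap = pick - pick[_n-1]

* Start at 1 and add 1 each time the gap exceeds two hours.
* The !missing() guard matters because Stata treats missing as larger than any number.
by pickid date: gen int myshift = 1 + sum(gap > msofhours(2) & !missing(gap))

* Compare with the shift column typed into the example data.
assert myshift == shift
[/CODE]

The running sum(), combined with the by prefix, restarts the count for every picker-day.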
In addition, I want to generate two experience variables. The first, CURRENTEXP, is the picker's cumulative working hours within a day up to the current pick. I plan to construct a proxy for it as follows; please let me know if this is not appropriate. For picker 1 in the first shift of the first day, CURRENTEXP at the current pick is the previous pick time minus the first pick time in shift 1 (01oct2019 12:41:37). For picker 1 in the second shift, CURRENTEXP is [(last pick time in shift 1 - first pick time in shift 1) + (previous pick time - first pick time in shift 2)], and so on.
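A minimal sketch of this proxy, assuming the example data above is in memory and using its shift variable. The helper name dayinc is hypothetical, and setting CURRENTEXP to the time worked in earlier shifts at the first pick of a new shift is my own assumption.

[CODE]
* Time worked between consecutive picks of the same shift (milliseconds);
* 0 at the first pick of a shift, so breaks between shifts are not counted.
bysort pickid date shift (pick): gen double dayinc = cond(_n == 1, 0, pick - pick[_n-1])

* Running total within the picker-day, measured up to the PREVIOUS pick:
* subtracting the current increment removes the time since the previous pick.
bysort pickid date (pick): gen double currentexp = (sum(dayinc) - dayinc) / msofhours(1)
label variable currentexp "Hours worked today before the current pick"
[/CODE]

For picker 1 on 01oct2019 this gives 0 at the first pick and about 2.32 hours (8338 seconds, the length of shift 1) at the first pick of shift 2.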
The second variable, TOTEXP, is the picker's cumulative working hours over all days up to the current pick. For instance, on the nth day, for picker 1 in the first shift, TOTEXP is the cumulative working hours over the previous n-1 days plus the cumulative working hours so far that day. The logic I have in mind is as follows; please let me know if I am wrong:
(Day 1 shift 1 last pick time - Day 1 shift 1 first pick time) + (Day 1 shift 2 last pick time - Day 1 shift 2 first pick time) + (Day 2 shift 1 last pick time - Day 2 shift 1 first pick time) + …
Can someone help me generate these two variables?
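The same idea could be carried across days. This is only a sketch under the same assumptions (example data in memory, shift as given); allinc is a made-up helper name.

[CODE]
* Time worked between consecutive picks of the same shift (milliseconds);
* 0 at the first pick of each shift, so breaks and nights are not counted.
bysort pickid date shift (pick): gen double allinc = cond(_n == 1, 0, pick - pick[_n-1])

* Running total over ALL days of a picker, measured up to the previous pick.
bysort pickid (pick): gen double totexp = (sum(allinc) - allinc) / msofhours(1)
label variable totexp "Hours worked over all days before the current pick"
[/CODE]

Because the running sum is taken by pickid only, the completed shifts of earlier days are added automatically, which matches the sum of (last pick time - first pick time) terms above.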
My last question is about estimation. For each pick, I can calculate how long the picker takes to find the product. For instance, picker 1 takes 10 seconds to find product 1 and 20 seconds to find product 2. I have multiple products, multiple pickers, and multiple days. I see that the literature uses accelerated failure-time (AFT) models to study the factors affecting pick time, such as picker experience. Is it appropriate to use an AFT model here? I would use the following code:
[CODE]
stset pick_time   // should I take the log here?
streg totexp currentexp, distribution(lognormal) time nolog
[/CODE]
I am sorry for asking so many questions; I do not have much experience with this kind of dataset. I hope I have made myself clear. Any help will be highly appreciated. Thank you very much!
Best,
Changjun