Dear Statalist,
I have a dataset that records the time at which staff successfully pick products, identified by bar code. For instance, the first row means that the first picker (pickid=1) picked an item at 01oct2019 12:41:37.
I observe that a picker may work several shifts on a given day. For instance, in the 8th observation, picker 1 picks an item at 01oct2019 15:00:35, then rests for about two hours and starts his second shift that day at around 17:00.
[CODE]
clear
input float pickid double pick float(date shift)
1 1885552897000 21823 1
1 1885553368000 21823 1
1 1885553548000 21823 1
1 1885554267000 21823 1
1 1885555750000 21823 1
1 1885559182000 21823 1
1 1885561029000 21823 1
1 1885561235000 21823 1
1 1885568751000 21823 2
1 1885568794000 21823 2
1 1889124609000 21864 1
1 1889125397000 21864 1
1 1889125511000 21864 1
1 1889126216000 21864 1
1 1889126746000 21864 1
1 1889127005000 21864 1
1 1889127208000 21864 1
1 1889128021000 21864 1
1 1889128155000 21864 1
1 1889128219000 21864 1
1 1891763437000 21895 1
1 1891763843000 21895 1
1 1891765585000 21895 1
1 1891765845000 21895 1
1 1891766075000 21895 1
1 1891766170000 21895 1
1 1891767753000 21895 1
1 1891768296000 21895 1
1 1891790940000 21895 2
1 1891791650000 21895 2
2 1889050832000 21864 1
2 1889051543000 21864 1
2 1889055404000 21864 1
2 1889058548000 21864 1
2 1889061753000 21864 1
2 1889062340000 21864 1
2 1889063595000 21864 1
2 1889065232000 21864 1
2 1889116535000 21864 2
2 1889116760000 21864 2
end
format %tc pick
format %td date
[/CODE]
I want to generate a variable SHIFT that numbers the shifts a picker works within a day. For instance, shift=1 for the picker's first shift of the day. If the gap between the current pick and the previous pick is more than 2 hours, then the current pick (and all later picks) belongs to the second shift, and so on. A picker can have several shifts a day. Can someone help me generate a SHIFT variable?
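A minimal sketch of the two-hour rule, assuming the example data above are in memory. The name shift_new is hypothetical, chosen so that the result can be compared with the shift column already present in the example. msofhours() converts hours to milliseconds, the unit in which %tc values are stored.

```stata
* Sketch only: shift_new is a hypothetical variable name
* Within each picker-day, sorted by pick time, start a new shift whenever
* the gap from the previous pick exceeds two hours
bysort pickid date (pick): gen int shift_new = 1 + sum(_n > 1 & (pick - pick[_n-1]) > msofhours(2))

* Check the result against the shift column in the example data
assert shift_new == shift
```

The `_n > 1` guard is needed because the first pick of a day has no previous pick, and Stata treats a missing value as larger than any number.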
In addition, I want to generate two experience variables. The first, CURRENTEXP, is the picker's cumulative working hours within the day up to the current pick. I plan to construct a proxy for it as follows; please let me know if this is not appropriate. For picker 1 in the first shift of the first day, CURRENTEXP at the current pick is the previous pick time minus the first pick time in shift 1 (01oct2019 12:41:37). For picker 1 in the second shift, CURRENTEXP is [(last pick time in shift 1 - first pick time in shift 1) + (previous pick time - first pick time in shift 2)], and so on.
The second variable, TOTEXP, is the picker's cumulative working hours over all days up to the current pick. For instance, on the nth day, for picker 1 in the first shift, TOTEXP should be the cumulative working hours over the previous n-1 days plus the cumulative working hours so far that day. The logic I have in mind for the completed days is as follows; please let me know if I am wrong:
(Day 1 shift 1 last pick time - Day 1 shift 1 first pick time) + (Day 1 shift 2 last pick time - Day 1 shift 2 first pick time) + (Day 2 shift 1 last pick time - Day 2 shift 1 first pick time) + …
Can someone help me generate these two variables?
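One way the logic above might be sketched, assuming the example data (including its shift column) are in memory. The helper variable worked is hypothetical; currentexp and totexp are measured in hours and count only time between consecutive picks in the same shift, so breaks between shifts are excluded. Following the description above, both variables run up to the previous pick, not the current one.

```stata
* Sketch only: worked is a hypothetical helper variable
* Hours elapsed since the previous pick in the same shift (0 at a shift's first pick)
bysort pickid date shift (pick): gen double worked = cond(_n == 1, 0, (pick - pick[_n-1]) / msofhours(1))

* Hours worked earlier the same day, up to the previous pick
bysort pickid date (pick): gen double currentexp = sum(worked) - worked

* Hours worked over all days so far, up to the previous pick
bysort pickid (pick): gen double totexp = sum(worked) - worked
```

At the first pick of a second shift, worked is 0, so currentexp equals the full length of the first shift, which matches the bracketed formula for shift 2.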
My last question is about estimation. For each pick, I can calculate how long the picker takes to find the product. For instance, picker 1 takes 10s to find product 1 and 20s to find product 2. I have multiple products, multiple pickers, and multiple days. I see the literature using the accelerated failure-time (AFT) model to study the factors affecting pick time, such as picker experience. Is it appropriate to use the AFT model here? I would use the following code:
[CODE]
stset pick_time    // should I take the log here?
streg totexp currentexp, distribution(lognormal) time nolog
[/CODE]
I am sorry for asking so many questions; I do not have much experience with this kind of dataset. I hope I have made myself clear. Any help will be highly appreciated. Thank you very much!
Best,
Changjun