This is my first post, but I've read the FAQ, so I hope it is acceptable. To give some context, I'm working with a dataset that records power outages. Each observation is one sensor's record of when an outage began (outage_time) and when power was restored (restore_time). Here is an example dataset:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input int(outage_id site_id) long(outage_time restore_time) str24 sensor_id
1 14 1528913151 1528919452 "530039001351363038393739"
1 14 1528913153 1528919542 "200031000951343334363138"
1 19 1528913151 1528919423 "3b0045000151353432393339"
1 36 1528913152 1528935236 "2b004b001251363038393739"
1 36 1528913151 1528935235 "380025001451343334363036"
2 14 1529042683 1529047119 "530039001351363038393739"
2 16 1529042684 1529047117 "43005d000951343334363138"
2 17 1529042684 1529047119 "280021001251363038393739"
2 30 1529042675 1529061132 "48003c001151363038393739"
2 39 1529042682 1529061134 "560044000151353432393339"
2 44 1529042682 1529061134 "500030001951353339373130"
2 46 1529042683 1529061132 "2e001f001251363038393739"
2 46 1529042684 1529061134 "1e0036000951343334363138"
end
My goal: create a new variable 'med_restore_time' that holds the median restore time within each 'outage_id'. I'm using the egen function median() in Stata 17.0. Here is what I have tried:
Code:
* begin by looking at what the median should be
desc restore_time
quietly sum restore_time if outage_id==1, d
di %12.0g `r(p50)'
quietly sum restore_time if outage_id==2, d
di %12.0g `r(p50)'

* try median using egen
by outage_id: egen med_restore1 = median(restore_time)
format %12.0g med_restore1
desc med_restore1

* now let's try using different storage types
recast double restore_time
by outage_id: egen double med_restore2 = median(restore_time) // specify type
format %12.0g med_restore2
desc med_restore2
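One way to check whether storage type explains the difference between the two attempts is to look at how a float holds one of these timestamps. The sketch below assumes the example data are in memory and that new variables default to float, which the first line reports rather than assumes. The names med_f, med_d, and gap are placeholders and not part of the code above. A float carries 24 bits of precision, so whole numbers above 16,777,216 are generally not stored exactly, and values of this size (about 1.5 billion) are rounded to a multiple of 128.

```stata
* Sketch only: med_f, med_d, and gap are hypothetical names for illustration.
* Assumes the -dataex- example above has been run and is in memory.

* Report the default storage type used when no type is specified
display c(type)

* Compare one timestamp as entered with the same value at float precision
display %12.0f 1528919452
display %12.0f float(1528919452)

* Compute the within-outage median twice, once as float and once as double
bysort outage_id: egen float  med_f = median(restore_time)
bysort outage_id: egen double med_d = median(restore_time)
format %12.0f med_f med_d

* A nonzero gap means the float version lost precision
generate double gap = med_d - med_f
tabulate outage_id, summarize(gap)
```

If gap is nonzero, specifying double on the egen call (as in the second attempt) or changing the default with set type double before creating the variable would be the things to try.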
Best,
Adam