I have ID, time, and sentiment as variables. Based on the first value of sentiment for each ID, I want to find when sentiment has increased by at least 0.25 (or by 25%) compared with that first value. As soon as sentiment has increased by 0.25 or 25%, the "required" variable should be 1; otherwise, it should be missing.
The tricky part is that I need to know when the increase of 0.25 or 25% first occurred. So "required" should be 1 only at the first observation where the increase happens, and missing everywhere else.
For example, for ID 1, sentiment at times 5 and 6 is 1.25 and 1.4 respectively. Although both values meet the threshold, I am only interested in the first increase.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(id time) double sentiment byte required
1  1    1 .
1  2    1 .
1  3  1.2 .
1  4 1.22 .
1  5 1.25 1
1  6  1.4 .
1  7  1.2 .
1  8  1.3 .
1  9  1.4 .
1 10  1.2 .
2  1  1.2 .
2  2   .9 .
2  3    . .
2  4   .8 .
2  5  .95 .
2  6    1 .
2  7  1.5 1
2  8  1.2 .
2  9  1.3 .
2 10   .7 .
end
I have written some simple code, but I am not sure how reliable it is for more than 50,000 IDs and more than 10 million observations. It also looks very messy.
Code:
generate a = sentiment if time == 1 // this generates the first value of sentiment
bysort id: replace a = a[_n-1] if a ==. // now the first value of sentiment is carried forward to populate "a"
generate b = sentiment - a // this will let me know if the value is greater than 0.25 or not.
generate c = 1 if b >= 0.25 & !missing(sentiment) // This will generate value of 1 if "b" is equal to or greater than 0.25. Furthermore, it will ignore any missing value in the original sentiment
// QUESTION: for "c" is there any way to find the increase by lets say 25%? not by a specific number like 0.25?
by id (time), sort: gen d = sum(c) // this will sum my non missing values
by id: gen e = d if d == 1 & d[_n - 1] != d // this will give me answer of 1 for my first occurence
Thanks
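One possible way to shorten the steps above is sketched below. It is only a sketch, not a tested solution for the full dataset: the names first0, hit, nhit, and required2 are made up for illustration, and it assumes the earliest time within each ID holds a nonmissing starting value. The example data are repeated so that the block runs on its own.

```stata
* Sketch only: first0, hit, nhit, and required2 are hypothetical names.
* Assumes the earliest time in each id has a nonmissing sentiment.
clear
input byte(id time) double sentiment byte required
1  1    1 .
1  2    1 .
1  3  1.2 .
1  4 1.22 .
1  5 1.25 1
1  6  1.4 .
1  7  1.2 .
1  8  1.3 .
1  9  1.4 .
1 10  1.2 .
2  1  1.2 .
2  2   .9 .
2  3    . .
2  4   .8 .
2  5  .95 .
2  6    1 .
2  7  1.5 1
2  8  1.2 .
2  9  1.3 .
2 10   .7 .
end

* First value of sentiment within each id, with observations ordered by time
bysort id (time): generate double first0 = sentiment[1]

* 1 where the absolute increase is at least 0.25, else 0.
* The missing() guard matters because Stata treats missing as larger than any number.
generate byte hit = (sentiment - first0 >= 0.25) & !missing(sentiment, first0)

* For a 25% increase instead, the condition could be written as
*   sentiment >= 1.25 * first0
* which assumes first0 is positive; values exactly on the boundary are
* subject to floating-point rounding.

* Running count of qualifying rows within id; the first one has nhit == 1
bysort id (time): generate long nhit = sum(hit)
generate byte required2 = 1 if hit & nhit == 1
```

On the example data, required2 is 1 at time 5 for ID 1 and at time 7 for ID 2, which matches "required". The approach uses one sort plus a few passes over the data, so it should in principle cope with millions of observations, though that has not been checked here.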