I have poorly entered data in which the date is embedded within other text in a string variable (var1). I need to extract the date from it, as shown below:
Code:
var1 | Extracted date |
XXXXXXXX XXX 15/14.04 | 15.04 |
XXXXXXXX XXX 12.04 | 12.04 |
XXXXXXXX XXX 11.04(10.04) | 11.04 |
XXXXXXXX XXX 12.04 | 12.04 |
XXXXXXXX XXX XXXXX 15/14.04 | 15.04 |
XXXXXXXX XXX 20.04.2020 | 20.04 |
XXXXXXXX XXX 15(13.04) | 15.04 |
XXXXXXXX XXX 20/17.04 | 20.04 |
XXXXXXXXXXX XXXXXXXX XXX 15/14.04 | 15.04 |
XXXXXXXXXXXX XXX 18.04/XXXXXX | 18.04 |
Code:
gen dated1 = regexs(0) if regexm(var1, "([0-9][0-9]*[0-9][0-9]*[0-9][0-9]*$)")
* I get 2020 for observations in format "XXXXXXXX XXX 20.04.2020"
gen dated2 = regexs(1) if(regexm(var1, ".*([0-9][0-9]*[0-9][0-9])*"))
* invalid number, outside of allowed range
gen dated3 = regexs(0) if(regexm(report1, ".*([0-9][0-9]*[0-9][0-9])*"))
* dated3 is the same as var1
gen dated4 = regexs(0) if(regexm(var1, "(*([0-9][0-9]+)*([0-9][0-9]+))*"))
* regexp: ?+* follows nothing
gen dated5 = regexs(0) if(regexm(var1, "(*([0-9][0-9])*([0-9][0-9]))*"))
* regexp: ?+* follows nothing
gen day = regexs(0) if regexm(var1, "[0-9][0-9]*")
gen month = regexs(0) if(regexm(var1, "*[0-9][0-9]*([0-9][0-9])*"))
* regexp: ?+* follows nothing
Thank you
Deepali
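
One possible approach, offered only as a sketch: judging from the examples in the table, the wanted day looks like the first two-digit number in var1, and the wanted month looks like the two digits that follow the first "dd." pattern. The code below assumes exactly that, and also assumes the surrounding text (the XXX parts) never contains digits. The variable names day, month and extracted are illustrative, not part of the original data.

```stata
* Assumption: the first two consecutive digits in var1 are the day
gen day = regexs(0) if regexm(var1, "[0-9][0-9]")

* Assumption: the month is the two digits after the first "dd." pattern;
* the backslash makes the dot literal instead of "any character"
gen month = regexs(1) if regexm(var1, "[0-9][0-9]\.([0-9][0-9])")

* Combine into dd.mm; left empty when no month was found
gen extracted = day + "." + month if month != ""
```

Under these assumptions, "15/14.04" and "15(13.04)" would both give 15.04, and "20.04.2020" would give 20.04, because each regexm() call stops at the first match rather than the last. Rows where the text itself contains digits would need a stricter pattern.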