I have been provided with a dataset that contains a string variable 'term' which details the date range of a term duration.
Within the string is a start date (i.e ..period from DD/MM/YYYY) and end date (i.e. ..to DD/MM/YYYY).
I would like to create two variables "date_start" and "date_end" from the original variable provided. Is there a solution to extract each of these two dates noting that three dates are provided within the one string? I have attempted using regexs(1) but have not been able to successfully achieve the outcome.
A small sample of the data is provided here:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input long id str100 term float(date_start date_end) 75 "The term of this agreement is for the period from 01-01-2020 to 31-12-2020 commencing on 01-01-2020." . . 96 "The term of this agreement is for the period from 20-04-2015 to 29-04-2016 commencing on 20-04-2015" . . 117 "The term of this agreement is for the period from 01-01-2020 to 31-12-2020 commencing on 01-01-2020." . . 117 "The term of this agreement is for the period from 01-01-2018 to 31-12-2018 commencing on 01-01-2018" . . 117 "The term of this agreement is for the period from 01-01-2017 to 31-12-2017 commencing on 01-01-2017" . . 138 "The term of this agreement is for the period from 30-04-2020 to 30-04-2021 commencing on 30-04-2020." . . 138 "The term of this agreement is for the period from 30-04-2020 to 30-04-2021 commencing on 30-04-2020." . . 159 "The term of this agreement is for the period from 01-01-2018 to 31-12-2018 commencing on 01-01-2018" . . 180 "The term of this agreement is for the period from 30-04-2020 to 30-04-2021 commencing on 30-04-2020." . . 201 "The term of this agreement is for the period from 01-01-2020 to 31-12-2020 commencing on 01-01-2020." . . end label values id id
I am using Stata 16.
Thanks in advance,
Liam
0 Response to Extracting multiple dates from a string
Post a Comment