In a dataset, I have rows with fyear (fiscal year) and CIK numbers, which are company identifiers from Compustat.
See the code below.
The problem is that the CIK numbers are stored as a string (str7). I need to change CIK to a numeric variable in order to merge this dataset with my other dataset, where CIK is numeric. I used gen nummericCIK = real(CIK), but Stata then seems to remove all the 0's in the CIK number. It is fine that Stata removes the leading 0's, because my CIK numbers should not start with 0, but it is wrong to remove the 0's in the rest of the number.
For example, the first CIK number is 0912057. Stata should remove the first 0 here, but not the second one, so the result should be 912057.
I tried "replace CIK = subinstr(CIK, "0", 1)" and this works for removing the 0's; however, when I then try to destring the variable, Stata keeps giving me the error that the variable contains nonnumeric characters.
Does anyone know what to do?
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int fyear str7 CIK
1993 "0912057"
1993 "0032377"
1993 "0950131"
1993 "0353944"
1993 "0038777"
1993 "0912057"
1993 "0912057"
1993 "0868016"
1993 "0950124"
1993 "0950123"
1993 "0950152"
1993 "0060302"
1993 "0051296"
1993 "0950131"
1993 "0891618"
1993 "0808450"
1993 "0096935"
1993 "0889810"
1993 "0912057"
1993 "0950131"
1993 "0912057"
1993 "0034501"
1993 "0898430"
end
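For reference, here is a minimal sketch of the conversion, assuming the example data above is loaded in memory. The variable name numCIK is hypothetical, and the sketch has only been reasoned through against the example rows, not the full dataset. It first lists any values that cannot be read as numbers, which is the usual reason destring reports nonnumeric characters, and then converts the string directly without any subinstr() step.

```stata
* Assumes the dataex example above is in memory (int fyear, str7 CIK).
* numCIK is a hypothetical name for the new numeric variable.

* Show any nonempty CIK values that cannot be read as a number.
* For the example data this should list nothing.
list CIK if missing(real(CIK)) & !missing(CIK)

* Convert the string to a numeric variable, keeping the original.
destring CIK, generate(numCIK)

* Only the leading zero is dropped: "0912057" becomes 912057.
assert numCIK == 912057 in 1

* Display every digit rather than a shortened general format.
format numCIK %12.0f
list CIK numCIK in 1/5
```

If the first list command shows rows in the full dataset, those values contain something other than digits (for example spaces or letters) and would need cleaning before destring can convert them.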