Dear All,

Consider the following example code:
Code:
clear all
input hhid year income
    117  2 63000
    117  3 74500
    117  4 72200
    119  1 33200
    119  2 37040
    120 -1 67200
    120  0 91600
end

list, sepby(hhid)
reshape wide income, i(hhid) j(year)

In the above
  • hhid is the ID of a household,
  • year indicates deviation from some selected base year, and
  • income is income of that household in that year.
Running this code in Stata 16.1 (Windows) results in the following error message:
Code:
income-1 invalid variable name
r(198);
I have not yet checked whether this runs without error in Stata 17.

I did not find anywhere in reshape's documentation a requirement that the i and j values be non-negative, hence I expected the command to work in this case as well.

Tracing the code indicates that the problem is deep within reshape's own code, around reshape.Widefix --> reshape.Subname, which attempts to create a new variable name by suffixing the original variable name with the category code.

My expectation is that variable income_1 is created for j=-1, while income1 is created for j=1. In other words, an underscore would replace the minus sign, which is what currently makes the variable name invalid.

However, since this is StataCorp's supplied code, it needs to be fixed by the developers. If the above expectation cannot be met, perhaps a simple check of the category codes and a clear message such as "Negative codes are not allowed in variable %j%, but found ### occurrences" would be very helpful.
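In the meantime, a user-side guard along these lines could be run before calling reshape. This is only a sketch of the kind of check suggested above, not something reshape does itself; it assumes the j variable is named year as in the example, and the message wording is mine.

```stata
* Hypothetical pre-check, run before -reshape wide-; not part of -reshape-
quietly count if year < 0
if r(N) > 0 {
    display as error "Negative codes are not allowed in variable year, but found " r(N) " occurrences"
    exit 198
}
```

In a do-file this stops execution with return code 198 before reshape is reached, so the user sees the cause rather than the invalid-name error.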

Once it is known that the problem is due to the negatives, I understand that I can use numerous workarounds, such as rebasing the year. The real problem is the obscure error message coming from somewhere deep inside the code.
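For readers who land here with the same error, two such workarounds are sketched below. They assume the example data are in memory in long form (before any reshape call); the names year0 and ystr are introduced only for illustration. The second one relies on reshape's string option and should give the income_1 / income1 naming described above.

```stata
* Workaround 1: rebase year so that its smallest value becomes 0
preserve
summarize year, meanonly
generate year0 = year - r(min)
drop year                      // otherwise year varies within hhid
reshape wide income, i(hhid) j(year0)
list
restore

* Workaround 2: use a string j with "_" standing in for the minus sign
generate ystr = cond(year < 0, "_" + string(abs(year)), string(year))
drop year
reshape wide income, i(hhid) j(ystr) string
list
```

With rebasing, the shift has to be remembered in order to map the suffixes back to the original years; the string variant keeps the original values readable in the variable names.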

PS: I was also surprised to see that -reshape- utilizes globals, meaning some globals such as S_1 will be ruined by the -reshape- command without any notice to the user.

Thank you, Sergiy Radyakin