I just realised that I have no idea what logic governs how scalars and returned results such as r(something) are dereferenced and used. I have been using them with some success for two decades now, but if you asked me to explain why I use one form rather than another, I would have nothing to say... I have simply memorised which form works in a particular context.

Can somebody put some order and logic into all of this, or point to references that I can read to clarify it for myself?

Examples follow:


Example 1: In this example, referring to r(mean) directly works, dereferencing it as `r(mean)' works too, and passing it through a scalar works as well.

Code:
. sysuse auto, clear
(1978 Automobile Data)

. clonevar price2 = price

. clonevar price3 = price

. qui summ price

. replace price = . if price<r(mean)
(52 real changes made, 52 to missing)

. qui summ price2

. replace price2 = . if price2<`r(mean)'
(52 real changes made, 52 to missing)

. qui summ price3

. sca Mean = r(mean)

. replace price3 = . if price3<Mean
(52 real changes made, 52 to missing)


. summ price*

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       price |         22    9814.364    3022.929       6229      15906
      price2 |         22    9814.364    3022.929       6229      15906
      price3 |         22    9814.364    3022.929       6229      15906

.
Then I hit the first problem. When I try to dereference the scalar Mean as `Mean', it does not work:

Code:
. sysuse auto, clear
(1978 Automobile Data)

. summ price

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       price |         74    6165.257    2949.496       3291      15906

. return list

scalars:
                  r(N) =  74
              r(sum_w) =  74
               r(mean) =  6165.256756756757
                r(Var) =  8699525.974268788
                 r(sd) =  2949.495884768919
                r(min) =  3291
                r(max) =  15906
                r(sum) =  456229

. sca Mean = r(mean)

. replace price = . if price < `Mean'
invalid syntax
r(198);

So when I type -return list-, Stata reports r(mean) under scalars... Yet I am able to dereference it as `r(mean)', whereas when I manually create the scalar Mean, I am not able to dereference it as `Mean'...
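
For comparison, here is a minimal sketch of two other forms that I would expect to work with the scalar, assuming I have understood the `=exp' macro expansion and the scalar() function correctly. The variable price2 is just the clone from Example 1, recreated so that the block stands alone, and I have deliberately not pasted any output:

Code:
sysuse auto, clear
clonevar price2 = price
quietly summarize price
scalar Mean = r(mean)
* `=Mean' asks Stata to evaluate the expression Mean and paste the result in as text
replace price = . if price < `=Mean'
* scalar(Mean) names the scalar explicitly, in case a variable has a similar name
replace price2 = . if price2 < scalar(Mean)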

The mystery gets deeper when I try to use those in loops:


Example 2:

Code:
. sysuse auto, clear
(1978 Automobile Data)

. keep in 1/3
(71 observations deleted)

. qui summ price

. forvalues i = 1/r(N) {
  2. dis `i'
  3. }
invalid syntax
r(198);

. forvalues i = 1/`r(N)' {
  2. dis `i'
  3. }
1
2
3

. sca Nobs = r(N)

. forvalues i = 1/Nobs {
  2. dis `i'
  3. }
invalid syntax
r(198);

. forvalues i = 1/`Nobs' {
  2. dis `i'
  3. }
invalid syntax
r(198);

. dis r(N)
3

. dis Nobs
3

. dis `Nobs'


. dis `r(N)'
3

.
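So here, in the forvalues range, Stata accepted only the second syntax, 1/`r(N)', and rejected everything else as invalid syntax. Meanwhile display accepts r(N), Nobs and `r(N)', and `Nobs' appears to expand to nothing at all.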
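
As a further data point, here is a sketch of two forms that I would expect forvalues to accept, on the assumption that the range has to consist of literal numbers by the time forvalues reads it. The local macro name n is my own choice, and again I have not pasted output:

Code:
sysuse auto, clear
keep in 1/3
quietly summarize price
scalar Nobs = r(N)
* copy r(N) into a local macro straight away, before anything can overwrite r()
local n = r(N)
* `=Nobs' evaluates the scalar and substitutes its value before forvalues parses the range
forvalues i = 1/`=Nobs' {
    display `i'
}
* a local macro can be dereferenced with `n', unlike the scalar Nobs
forvalues i = 1/`n' {
    display `i'
}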

Does anybody see any logic to all of this?