I would like to collapse string variables after obtaining a concatenated list of all distinct values by group. For example, my data looks like this:
HTML Code:
clear
input str8 item str8 color
"Shirt" "Green"
"Shirt" "Green"
"Shirt" "Gray"
"Jeans" "Gray"
"Jeans" "Gray"
"Jeans" "Blue"
"Jeans" "Black"
end

list
     +---------------+
     |  item   color |
     |---------------|
  1. | Shirt   Green |
  2. | Shirt   Green |
  3. | Shirt    Gray |
  4. | Jeans    Gray |
  5. | Jeans    Gray |
     |---------------|
  6. | Jeans    Blue |
  7. | Jeans   Black |
     +---------------+
preserve
And I would like to collapse by item while not losing any of the distinct values of color. I would like these distinct values in alphabetical order as a comma-separated list. So far I have managed to accomplish it this way:
HTML Code:
egen group_id = group(item)
gen color_concat = ""
forvalues val=1/2 {
    levelsof color if group_id == `val', clean separate(", ")
    replace color_concat = r(levels) if group_id == `val'
}

collapse (first) color_concat, by(item)
list
     +---------------------------+
     |  item        color_concat |
     |---------------------------|
  1. | Jeans   Black, Blue, Gray |
  2. | Shirt         Gray, Green |
     +---------------------------+
However, I wonder if this is a poor use of levelsof and if there is a faster, more elegant solution. I am trying to apply this to datasets with over several hundred thousand observations and this gets very, very slow. I have tried this solution with the code below, but unfortunately I still obtain repeated levels rather than distinct. I would appreciate any help with this.
HTML Code:
restore
sort item color, stable
by item : gen allcolors = color[1]
by item : replace allcolors = allcolors[_n-1] + ", " + color if _n > 1
by item : replace allcolors = allcolors[_N]

collapse (first) allcolors, by(item)
list
     +---------------------------------+
     |  item                 allcolors |
     |---------------------------------|
  1. | Jeans   Black, Blue, Gray, Gray |
  2. | Shirt        Gray, Green, Green |
     +---------------------------------+