Hi,
I would like to identify observations that are very similar to each other within a single string variable.
Let's say for instance that I have a variable with 4 observations, such as:
var1
obs 1: "cat"
obs 2: "caty"
obs 3: "the cat is beautiful"
obs 4: "cat"
I would like a distance measure that tells me that observations 1 and 4 are identical, observations 1 and 2 are quite similar, and observations 1 and 3 are very different. Is this possible?
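To make the idea concrete, here is a minimal sketch of the kind of measure I have in mind. It is written in Python rather than Stata, purely for illustration, and it assumes that Levenshtein edit distance (the number of single-character insertions, deletions, or substitutions needed to turn one string into another) is an acceptable notion of similarity. The function names `levenshtein` and `normalized_distance` are my own, not part of any package.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Scale the edit distance to [0, 1] by the longer string's length."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# The four observations from the example above.
var1 = ["cat", "caty", "the cat is beautiful", "cat"]

# Compare observation 1 with each of the others.
for obs_number, value in enumerate(var1[1:], start=2):
    print(obs_number, levenshtein(var1[0], value),
          normalized_distance(var1[0], value))
```

With this measure, observations 1 and 4 have distance 0, observations 1 and 2 have distance 1 (0.25 after normalizing), and observations 1 and 3 have distance 17 (0.85 after normalizing), which is the ordering I am after.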
Thanks