Dear all,
I have a variable called place_birth in my dataset. Some of the locations weren't recorded properly, for example:
place_birth
Feucherolles (Saint-James = Le château royal de Sainte-Gemme)
ST B(?), canton de Chaillot
(?Chanvrand)Canton de La Guiche
Seine-Inférieure (Seine-Maritime)
Épinay-sur-Seine ,
Autine (?) Outines
Darrois ? Darvois
I would like to do two things.
First, separate the text that is set off by parentheses (), a comma, or an = sign from the rest of the string. I can use what I separate to create a new variable called place_new.
Second, clean both variables of stray characters such as ?, =, a trailing ., /, etc.
For example
Épinay-sur-Seine ,
should look like
Épinay-sur-Seine
Replace ? and (?) with a comma, so that
Autine (?) Outines
becomes
Autine , Outines
For this one:
Feucherolles (Saint-James = Le château royal de Sainte-Gemme)
Eliminate "Saint-James =" and just leave:
Feucherolles (Le château royal de Sainte-Gemme)
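The cleaning rules above could be sketched roughly as follows. This is only an illustration in Python, on the assumption that the values can be handled as plain strings (the post does not name a tool). The function name clean_place and the exact set of trailing characters to strip are hypothetical choices, not part of the original data or request.

```python
import re

def clean_place(value: str) -> str:
    """Hypothetical helper: apply the cleaning rules described in the post."""
    # Drop the text before "=" inside parentheses: "(A = B)" -> "(B)".
    value = re.sub(r"\(\s*[^()=]*=\s*", "(", value)
    # Turn a standalone "(?)" or a free-standing "?" into a comma.
    value = re.sub(r"\s*\(\?\)\s*|\s+\?\s+", " , ", value)
    # Collapse runs of commas that the previous step may create.
    value = re.sub(r"(?:\s*,\s*){2,}", " , ", value)
    # Remove stray signs and whitespace left at the end (assumed set of signs).
    value = re.sub(r"[\s,.;/=?]+$", "", value)
    return value.strip()

examples = [
    "Épinay-sur-Seine ,",
    "Autine (?) Outines",
    "Feucherolles (Saint-James = Le château royal de Sainte-Gemme)",
]
for raw in examples:
    print(repr(raw), "->", repr(clean_place(raw)))
```

A "?" that sits inside parentheses next to a name, as in (?Chanvrand), is deliberately left alone here, since it is easier to drop when the parenthesised part is extracted.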
Then I can split the strings on commas and parentheses so that, for example:
place_birth
(?Chanvrand)Canton de La Guiche
becomes:
place_new
Chanvrand
Or:
place_birth
Seine-Inférieure (Seine-Maritime)
becomes, in the new variable:
place_new
Seine-Maritime
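One possible way to build place_new from the parenthesised part is sketched below in Python, purely as an illustration. It assumes that only the first pair of parentheses matters and that, when an = sign appears inside, the text after it is the part to keep. The function name extract_place_new is hypothetical, and splitting on commas is not covered because the examples do not show what the result should be.

```python
import re

def extract_place_new(value: str) -> str:
    """Hypothetical helper: return the cleaned text of the first (...) group."""
    match = re.search(r"\(([^()]*)\)", value)
    if match is None:
        return ""  # no parentheses, so nothing to put in place_new
    inner = match.group(1)
    # If the group contains "=", keep only what follows the last one.
    inner = inner.split("=")[-1]
    # Strip stray signs and spaces from both ends (assumed set of signs).
    return inner.strip(" ?.,/=")

for raw in ["(?Chanvrand)Canton de La Guiche", "Seine-Inférieure (Seine-Maritime)"]:
    print(repr(raw), "->", repr(extract_place_new(raw)))
```

A value such as Autine (?) Outines yields an empty string with this sketch, because nothing is left inside the parentheses once the ? is removed.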