Hi everyone,
I'm working on a project that combines multiple tennis datasets (originally CSV files, which I read into Stata with a loop). Two columns in the original data, LRank (the rank of the player who loses the match) and LPts (the number of points the losing player has coming into the match), contain numbers except for a few rows where the value is "N/A". When I run the loop, all the datasets are read into Stata; however, once Stata reaches a dataset that contains "N/A" in one of these columns, it doesn't bring in any of the values for LRank or LPts. I've tried the code below, but it usually gives a type mismatch error:
gen LRank_str_num = string(LRank)
gen LPts_str_num = string(LPts)
replace LRank_str_num = 99999 if LRank_str_num == "N/A"
destring LRank_str_num, replace
replace LPts_str_num = 99999 if LPts_str_num == "N/A"
destring LPts_str_num, replace
I've included the rest of my loop below for reference, but nothing seems to be wrong with it. The problem is the LRank and LPts columns when they contain "N/A". I believe all I need is a line in the loop that changes those values to a number (e.g., replace LPts = 99999 if LPts == "N/A"), or else to convert the entire column to string, change the "N/A" values to a number, and then convert the column back to integer values.
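One possible sketch of the string route described above, run on a made-up three-row dataset rather than the real files. It assumes the columns arrive as strings whenever a file contains "N/A", and that the text is exactly "N/A" with no surrounding spaces; the values are invented for illustration. Note that string() accepts only numeric arguments and that a string variable can only be assigned quoted text, either of which would produce a type mismatch.

```stata
* Toy data standing in for one imported tournament file (values are made up)
clear
input str5 LRank str5 LPts
"12"  "2450"
"N/A" "N/A"
"87"  "640"
end

* LRank and LPts are string variables here because of the "N/A" row.
* The replacement value must itself be a string ("99999", in quotes);
* assigning the number 99999 to a string variable is a type mismatch.
foreach v of varlist LRank LPts {
    replace `v' = "99999" if `v' == "N/A"
    destring `v', replace
}

* Both variables should now be numeric
describe LRank LPts
list
```

If a sentinel such as 99999 could later be mistaken for a real rank, replacing "N/A" with "" instead would leave those rows as Stata missing values after destring.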
cd "/Users/rtwb11/Dropbox/23-ECON3720-Project/Original Data"
local files : dir "`filepath'" files "*.csv"
di `"`files'"'
tempfile master
save `master', replace empty
foreach x of local files {
    di "`x'"
    qui: import delimited "`x'", delimiter(",") case(preserve) clear
    qui: gen id = subinstr("`x'", ".csv", "", .)
    append using `master', force
    save `master', replace
}
save projectdata
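A hedged sketch of where such a conversion might sit in the loop. It assumes every file has columns named LRank and LPts and that the CSV files are in the current working directory (hence "." in place of the filepath local, which is not defined in the code shown). It turns "N/A" into missing values rather than 99999, and it drops the force option on append so that any remaining type mismatch shows up as an error instead of silently losing values; both are choices for this illustration, not requirements.

```stata
* Sketch only: make LRank and LPts numeric in each file before appending
clear
local files : dir "." files "*.csv"
tempfile master
save `master', replace emptyok

foreach x of local files {
    import delimited "`x'", delimiter(",") case(preserve) clear

    foreach v of varlist LRank LPts {
        * Only files containing "N/A" import these columns as strings
        capture confirm string variable `v'
        if !_rc {
            replace `v' = "" if strtrim(`v') == "N/A"
            destring `v', replace
        }
    }

    gen id = subinstr("`x'", ".csv", "", .)
    append using `master'
    save `master', replace
}
```

With the column types made consistent before each append, the LRank and LPts values from earlier files should no longer be discarded when a later file contains "N/A".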
I am new to Stata and appreciate any help. If additional code or data would be helpful, please let me know. I would include -dataex- output, but each tournament has more than 100 rows, so -dataex- doesn't show the problem I am describing: the first dataset doesn't contain "N/A" in any of its rows. Thank you!