I am struggling to find a way to get Stata to ignore these \" combinations without removing the binding quotes, since dropping the binding would cause interior commas to be interpreted as delimiters. It may be possible in another language to do a replace-all of \" in the raw CSVs, but as I am not very familiar with, e.g., Python, I am hoping there is a complete solution in Stata.
Here is an example mytest1.csv with six rows, including some of the problem rows:
Code:
"Kuzmin, S","UKR","0.0","Rakhmanin, Y","Lugansk","8","2019-07-21","14129876","Kuzmin, S","1741","20","14126630","Rakhmanin, Y","2053","0.00","-2.80","b","0","r" "Kuzmin, S","UKR","0.0","Medvedsky, V","Lugansk","9","2019-07-21","14129876","Kuzmin, S","1741","20","14138395","Medvedsky, V","1817","0.00","-8.00","b","0","r" "Vysochin, S","UKR","1.0","Drobot, S","\"Cup Independence - 2019 - \"A\" \"Open\"","1","2019-08-23","14103516","Vysochin, S","2493","10","14131129","Drobot, S","2093","1.00","0.80","b","0","r" "Piesik, P","POL","1.0","Kaluzny, K","Turniej Szachowy \"Ferie Zimowe 2009' - grupa A - o Puchar Burmistrza Malborka","3","2009-02-03","1136194","Piesik, P","2197","15","1147285","Kaluzny, K","1847","1.00","1.65","w","0","r" "Gopal, , K.n.","IND","1.0","Karthik, P","Namuduru,","1","2009-01-27","5001447","Gopal, , K.n.","2204","15","5089719","Karthik, P","1854","1.00","1.65","w","0","r" "Blackman, J","BAR","1.0","Wilson, A",\N,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
Code:
import delimited "mytest1", clear
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str12 v4 strL v5 str34 v6 str36 v7
"Rakhmanin, Y" "Lugansk" "8" "2019-07-21"
"Medvedsky, V" "Lugansk" "9" "2019-07-21"
"Drobot, S" `"\"Cup Independence - 2019 - \"A\" \"Open\","1","2019-08-23","14103516","Vysochin"' `" S","2493","10","14131129","Drobot"' `" S","2093","1.00","0.80","b","0","r""'
"Kaluzny, K" `"Turniej Szachowy \"Ferie Zimowe 2009' - grupa A - o Puchar Burmistrza Malborka","3","2009-02-03","1136194","Piesik"' `" P","2197","15","1147285","Kaluzny"' `" K","1847","1.00","1.65","w","0","r""'
"Karthik, P" "Namuduru," "1" "2009-01-27"
"Wilson, A" "\N" "\N" "\N"
end
Code:
import delimited "mytest1", bindquotes(nobind) clear
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str9 v1 str3 v2 str6 v3 str5 v4 str10 v5 str8 v6 str80 v7
`""Kuzmin"' `" S""' `""UKR""' `""0.0""' `""Rakhmanin"' `" Y""' `""Lugansk""'
`""Kuzmin"' `" S""' `""UKR""' `""0.0""' `""Medvedsky"' `" V""' `""Lugansk""'
`""Vysochin"' `" S""' `""UKR""' `""1.0""' `""Drobot"' `" S""' `""\"Cup Independence - 2019 - \"A\" \"Open\""'
`""Piesik"' `" P""' `""POL""' `""1.0""' `""Kaluzny"' `" K""' `""Turniej Szachowy \"Ferie Zimowe 2009' - grupa A - o Puchar Burmistrza Malborka""'
`""Gopal"' " " `" K.n.""' `""IND""' `""1.0""' `""Karthik"' `" P""'
`""Blackman"' `" J""' `""BAR""' `""1.0""' `""Wilson"' `" A""' "\N"
end
Code:
import delimited "mytest1", stripquotes(no) clear
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str15 v1 str5(v2 v3) str14 v4 strL v5 str34 v6
`""Kuzmin, S""' `""UKR""' `""0.0""' `""Rakhmanin, Y""' `""Lugansk""' `""8""'
`""Kuzmin, S""' `""UKR""' `""0.0""' `""Medvedsky, V""' `""Lugansk""' `""9""'
`""Vysochin, S""' `""UKR""' `""1.0""' `""Drobot, S""' `""\"Cup Independence - 2019 - \"A\" \"Open\","1","2019-08-23","14103516","Vysochin"' `" S","2493","10","14131129","Drobot"'
`""Piesik, P""' `""POL""' `""1.0""' `""Kaluzny, K""' `""Turniej Szachowy \"Ferie Zimowe 2009' - grupa A - o Puchar Burmistrza Malborka","3","2009-02-03","1136194","Piesik"' `" P","2197","15","1147285","Kaluzny"'
`""Gopal, , K.n.""' `""IND""' `""1.0""' `""Karthik, P""' `""Namuduru,""' `""1""'
`""Blackman, J""' `""BAR""' `""1.0""' `""Wilson, A""' "\N" "\N"
end
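For reference, the replace-all of \" mentioned at the top might be doable without leaving Stata by running -filefilter- on the raw file before importing. The following is only a sketch, not run against the full data: it assumes mytest1.csv sits in the working directory, the output name mytest1_clean.csv is made up for illustration, and swapping \" for a single quote is my choice (it changes the text of the affected tournament names). The last loop assumes the unquoted \N entries stand for missing values.

```stata
* Sketch only: mytest1_clean.csv is a hypothetical name for the cleaned copy.

* Replace every backslash + double quote pair (\") with a single quote (').
* In -filefilter- patterns, \BS is a backslash, \Q is a double quote,
* and \RQ is a right single quote.
filefilter "mytest1.csv" "mytest1_clean.csv", from("\BS\Q") to("\RQ") replace

* With the escaped quotes gone, the default quote binding should keep
* interior commas inside their fields. stringcols(_all) reads everything
* as string so the \N cleanup below does not hit a type mismatch.
import delimited "mytest1_clean.csv", varnames(nonames) stringcols(_all) clear

* Assumption: \N marks a missing value, so blank it out.
foreach v of varlist _all {
    replace `v' = "" if `v' == "\N"
}
```

If the single quotes are unwanted, using to("") in the -filefilter- call would simply delete the \" pairs instead. After the import, -destring- could be applied to the columns that are meant to be numeric.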