I hope you can help me with some questions about nearest neighbour matching in Stata. This is my first time using the program for research, so please excuse what may seem like basic questions. I have tried to research these issues properly before asking, but some of the possible solutions are still not clear to me.
Goal
I want to use nearest neighbour matching to compare fraudulent crowdfunding campaigns with "normal" campaigns on IndieGoGo on several outcome variables (e.g. Funds Raised Percent (funds_raised_percent), but also Number of FB friends).
So far I have collected 60 fraudulent (treated) campaigns and I have circa 19,000 "normal" campaigns. I specifically want to do nearest neighbour matching so I can find the most comparable control campaigns and further collect data on those as well in order to compare the treated sample on those further data points. I would like to do this with replacement in order to decrease the potential for biases.
Problem
I think the teffects nnmatch command in Stata is the best way to go. I first ran the following in Stata:
Code:
gen obs=_n
sort obs
teffects nnmatch (funds_raised_percent goal2 yr category_n currency_n) (fraud), nn(1) gen(match)
Code:
teffects nnmatch (funds_raised_percent goal2) (fraud), ematch(yr category_n currency_n)
When I add the ematch() option as in the code above, I again get the error: 16585 observations have no exact matches; they are identified in the osample() variable.
Only after repeating the command with osample() three times did it run, but this cut my already limited treatment sample in half.
Code:
teffects nnmatch (funds_raised_percent goal2) (fraud), ematch(yr category_n currency_n) osample(no_match)
teffects nnmatch (funds_raised_percent goal2) (fraud) if no_match == 0, ematch(yr category_n currency_n) osample(no_match2)
teffects nnmatch (funds_raised_percent goal2) (fraud) if no_match == 0 & no_match2 == 0, ematch(yr category_n currency_n) osample(no_match3)
teffects nnmatch (funds_raised_percent goal2) (fraud) if no_match == 0 & no_match2 == 0 & no_match3 == 0, ematch(yr category_n currency_n) osample(no_match4)
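To show what I mean by the treatment sample shrinking, here is a minimal sketch of how the dropped observations could be counted after the first pass. It assumes the first teffects nnmatch call above has already run and created no_match; it is only a diagnostic, not a fix.

```stata
* Assumes no_match was created by osample(no_match) in the first call above
* Cross-tabulate treatment status against the "no exact match" flag
tabulate fraud no_match

* Number of fraudulent (treated) campaigns without an exact match
count if fraud == 1 & no_match == 1
```

The same two commands could be repeated with no_match2 and no_match3 to see how many treated campaigns each further pass removes.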
Additional Questions
Furthermore, I have a question about the pscore command in Stata. I see many examples that first calculate the propensity score with pscore before running psmatch2. However, the examples I find for teffects nnmatch or teffects psmatch do not do this. Is this step only necessary when using psmatch2?
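For reference, the examples I have seen for teffects psmatch look roughly like the sketch below, where the treatment model is given inside the command and there is no separate pscore step. The choice of covariates here is my own assumption, and I have not checked whether the default logit treatment model is sensible with only 60 treated campaigns.

```stata
* Sketch only: the covariates in the treatment model are an assumption
* Outcome equation first, then treatment variable followed by its predictors
teffects psmatch (funds_raised_percent) (fraud goal2 i.category_n i.currency_n), nn(1)
```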
Additionally, as mentioned above, I want to compare the two groups on multiple outcomes. Do I simply rerun the command once for each outcome, or is there a better way to do this?
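What I have in mind by rerunning the command for every outcome is something like the loop below. This is only a sketch: fb_friends is a hypothetical variable name standing in for Number of FB friends, and the matching covariates are copied from my first command.

```stata
* fb_friends is a hypothetical name; replace it with the actual outcome variables
foreach y of varlist funds_raised_percent fb_friends {
    display as text _newline "Outcome: `y'"
    * Same matching covariates each time, only the outcome changes
    teffects nnmatch (`y' goal2 yr category_n currency_n) (fraud), nn(1)
    estimates store nn_`y'
}

* Show the stored results side by side
estimates table nn_*
```

Adding the atet option to each call would target the effect on the treated campaigns rather than the default average treatment effect, if that turns out to be the estimand I need.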
Lastly, I want to make sure that my results are robust. I found examples that use the pscore command and then check the matches graphically, or use pstest to evaluate the matches Stata has produced. Is this also possible when using teffects? Or are there other robustness checks I could run?
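From what I can tell, the tebalance postestimation commands may be the teffects counterpart to pstest. A minimal sketch, assuming Stata 14 or later and that the matching command itself runs without error:

```stata
* Fit the matching estimator first; tebalance works on the last teffects results
teffects nnmatch (funds_raised_percent goal2 yr category_n currency_n) (fraud), nn(1)

* Standardized differences and variance ratios, raw versus matched sample
tebalance summarize

* Density of one covariate in the raw and matched samples
tebalance density goal2
```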
Below is an extract of the first 100 observations of my data. I hope my questions are clear so far, and I would be very grateful to anyone willing to discuss these matters with me.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float fraud double funds_raised_percent float(goal2 close_date2) long(category_n currency_n)
1 14.4182 50000 20375 22 11
1 .3644 10000 20429 9 11
1 .15964 50000 20539 17 11
1 72.5397 10000 20076 17 11
1 1.7944170771756978 9135 19711 17 11
1 5.96116 50000 20287 17 11
1 15.02533 100000 19582 9 11
1 10.743513281919451 29175 21436 12 11
1 197.105 4000 20828 22 11
1 615.295 5000 21534 19 11
1 61.4848 10000 21277 12 11
1 30.69185 20000 21023 2 11
1 45.92786666666667 30000 20801 19 11
1 11.887853333333334 75000 20568 19 11
1 .46478 50000 21384 9 11
1 80.68761333333333 75000 20181 22 11
1 7.6123 30000 20560 2 11
1 12.0938 5000 21413 12 11
1 61.0788 20000 21252 11 11
1 65.3859 20000 21307 11 11
1 75.0848 20000 21344 19 11
1 45.16225 20000 21428 3 11
1 197.8408 10000 21387 9 11
1 300.3877 10000 21161 13 11
1 6.9476 5000 21346 17 11
1 17.4329 30000 21380 12 11
1 16.357866666666666 30000 21162 12 11
1 16.57308 25000 21399 22 11
1 2.46165 80000 20943 13 11
1 4.814493333333333 75000 21181 12 5
1 69.9352 20000 21393 22 11
1 26.513475 80000 20470 2 11
1 41.270525 40000 20463 12 11
1 4.789 25000 21042 3 11
1 3.81288 25000 20700 8 6
1 265.455 3000 21193 2 11
1 103.1740649404028 4866 21103 9 11
1 3.4396 10000 21219 19 11
1 138.031 1000 19922 22 11
1 2.3247125 80000 20934 22 11
1 5.80118 50000 21428 22 11
1 3.6510333333333334 30000 21289 22 11
1 .08903 100000 20752 22 11
1 28.0302 50000 19992 9 11
1 10.80585 100000 19828 12 11
1 .406605 1000000 20274 10 11
1 53.102716636922466 15129 21314 12 11
1 3.1738125 160000 20386 13 11
1 6.181226626776365 13370 20095 17 11
1 7.449975 40000 19703 17 11
1 204.342 500 19804 17 11
1 3.86392 100000 19692 12 11
1 8.81946 50000 20680 12 11
1 5.137 2000 21295 17 11
1 28.2963 10000 21074 17 11
1 238.7733 10000 21575 17 11
0 .518 500 21066 9 6
0 .007 100000 20964 5 11
0 .0025 12000 19499 8 11
0 .0436 50000 21267 25 2
0 .024333333333 3000 19772 4 11
0 .065 30000 21730 10 11
0 .289 10000 21037 14 11
0 2.481066666667 15935.484 21166 9 11
0 .528 5000 19073 12 11
0 .000416666667 36000 20335 19 11
0 .216 2500 19998 14 11
0 .539090669246 11371 21864 26 11
0 .11755 20000 19935 23 11
0 1.064 20000 21165 24 11
0 1.2 750 20202 18 11
0 .0735 10000 20319 15 11
0 .021666666667 6000 20111 18 5
0 .199333333333 15000 19762 14 2
0 .194 3000 20612 4 5
0 2.934666666667 1544.298 20930 11 11
0 .005 30000 20927 13 11
0 .149 10000 19601 1 5
0 .016834189378 40691 21378 24 11
0 .024 5000 21697 11 11
0 .013432 250000 20121 8 11
0 1.0172 15476.8 21882 15 11
0 1.2 500 20553 23 11
0 .39 500 21740 5 6
0 .125566666667 120000 20321 22 11
0 .007857142857 21000 21367 5 2
0 .171833333333 6000 19084 8 11
0 1.03612 62175.23 20918 1 11
0 .524 20000 21637 21 11
0 1.135 1000 20022 26 2
0 1 6000 20317 17 11
0 5.3128 5423.129 21739 22 11
0 1.0198 10133.36 21777 1 11
0 2.451 1000 21723 26 11
0 .399230769231 13000 21644 16 11
0 1.101230769231 6500 21430 20 11
0 31.6692 2545.565 21636 22 6
0 .003706666667 75000 20120 18 11
0 11.283083399105 4397.025 21856 13 11
0 15.7015 13103.59 21642 17 11
end
format %tdMon_DD,_CCYY close_date2
label values category_n category_n
label def category_n 1 "Art", modify
label def category_n 2 "Audio", modify
label def category_n 3 "Camera Gear", modify
label def category_n 4 "Comics", modify
label def category_n 5 "Culture", modify
label def category_n 8 "Environment", modify
label def category_n 9 "Fashion & Wearables", modify
label def category_n 10 "Film", modify
label def category_n 11 "Food & Beverages", modify
label def category_n 12 "Health & Fitness", modify
label def category_n 13 "Home", modify
label def category_n 14 "Human Rights", modify
label def category_n 15 "Local Businesses", modify
label def category_n 16 "Music", modify
label def category_n 17 "Phones & Accessories", modify
label def category_n 18 "Photography", modify
label def category_n 19 "Productivity", modify
label def category_n 20 "Tabletop Games", modify
label def category_n 21 "Transportation", modify
label def category_n 22 "Travel & Outdoors", modify
label def category_n 23 "Video Games", modify
label def category_n 24 "Web Series & TV Shows", modify
label def category_n 25 "Wellness", modify
label def category_n 26 "Writing & Publishing", modify
label values currency_n currency_n
label def currency_n 2 "CAD", modify
label def currency_n 5 "EUR", modify
label def currency_n 6 "GBP", modify
label def currency_n 11 "USD", modify
Did you solve it? I ran into the same problem as you did.