Hello,
I am trying to run the Stata code below. Everything runs until the very end, where I get the error 'command i is unrecognized' r(199). How can I avoid this error? I am new to Stata and am not sure what is going wrong. I have attached the pharmacy_small.dta file to this post so that you can run the code on your computer.
Stata code:
clear
//import the pharmacy_small Stata dataset
use pharmacy_small
// change the variables store_type, area, and compliance into binary categorical variables with 0's and 1's
generate chain = store_type == "CHAIN"
generate north = area == "North"
// numericize all the string categorical variables while retaining the same label
encode county, generate(county_num)
python:
# install sklearn, sfi, numpy, and pandas packages first
# make sure to install them first!
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn import metrics # import scikit-learn metrics module for accuracy calculation
from sfi import Data
import numpy as np
import pandas as pd
# Use the sfi Data class to pull data from Stata variables
X = pd.DataFrame(Data.get("educate north county_num chain"),
columns = ['educate', 'north', 'county_num', 'chain'])
Y = pd.DataFrame(Data.get("compliance"), columns = ['compliance'])
# split the pharmacy_small dataset into a training and a test set using the python commands
# splitting data into a test and training set is much easier in Python than in Stata (takes 1 line)
# 'test_size = 0.25' tells Python that we want to reserve 25% of our data for the test set
# train_test_split() will automatically shuffle the data before the split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.25)
end
clear
gen Alpha = .
gen AUC = .
local i = 0
range alphas 0.0 1.0 20
foreach a in alphas {
i++
python: a = Data.get("a")
// predict using the best value for alpha
python: mnb = MultinomialNB(alpha = a, class_prior = None, fit_prior = True)
// calculate probability of each class on the test set
// '[:, 1]' at the end extracts the probability for each pharmacy to be under compliance
python: Y_mnb_score = mnb.fit(X_train, np.ravel(Y_train)).predict_proba(X_test)[:, 1]
// make test_compliance python variable
python: test_compliance = Y_test['compliance']
// transfer the python variables Y_mnb_score and test_compliance to STATA
python: Data.setObsTotal(len(Y_mnb_score))
python: Data.addVarFloat('mnbScore')
python: Data.store(var = 'mnbScore', obs = None, val = Y_mnb_score)
python: Data.setObsTotal(len(test_compliance))
python: Data.addVarFloat('testCompliance')
python: Data.store(var = 'testCompliance', obs = None, val = test_compliance)
roctab testCompliance mnbScore
replace AUC = r(area) in `i' // at this point I am getting an error, I think
replace Alpha = `a'
}
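A note on the loop above: Stata has no `i++` statement, so it reads `i` as a command name, which is the likely source of the r(199) message. A local counter is normally incremented with `local ++i` or `local i = `i' + 1`. The bookkeeping the loop is attempting (an evenly spaced grid of alpha values, a row counter, and one AUC per alpha) can be sketched in plain Python as below. This is only a minimal sketch under stated assumptions, not a tested replacement for the script: `alpha_grid`, `auc`, `sweep`, and `score_fn` are hypothetical names introduced here, and the pairwise AUC is a generic rank-based formula rather than a reproduction of `roctab`.

```python
# Sketch only: pure Python, no Stata or scikit-learn needed to run it.
# All function names here are hypothetical, chosen for illustration.

def alpha_grid(start=0.0, stop=1.0, n=20):
    """Return n evenly spaced values from start to stop inclusive,
    which is what `range alphas 0.0 1.0 20` is meant to produce."""
    step = (stop - start) / (n - 1)
    return [start + k * step for k in range(n)]


def auc(labels, scores):
    """Area under the ROC curve, computed as the share of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sweep(alphas, score_fn, labels):
    """Return one (row, alpha, auc) tuple per alpha.

    score_fn is an assumed callable mapping alpha -> list of scores
    for the test observations, in the same order as labels.
    """
    results = []
    # enumerate supplies the 1-based row counter that `i++` was meant to track
    for row, a in enumerate(alphas, start=1):
        results.append((row, a, auc(labels, score_fn(a))))
    return results
```

In the script above, `score_fn` would wrap the existing call `MultinomialNB(alpha = a).fit(X_train, np.ravel(Y_train)).predict_proba(X_test)[:, 1]`, and `labels` would be the values of `Y_test['compliance']`. Collecting the results in a Python list first means Stata only has to receive one finished table of alphas and AUCs, instead of resizing the dataset on every pass through the loop.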
Thank you for your help!