Dear all,
I have a collection of around 2,400 PDFs of parliamentary debate transcripts that I would like to import into Stata. Having found no easy way to import PDFs into Stata directly, I have batch converted them to text files.
I have tried using multimport (multimport delimited, extensions (txt) clear) to bring in all of the text files at once. However, this command on its own does not do what I need: it returns only 200 observations, when there should be around 1 million. I have read the help file and tried alternative approaches (for example, a loop involving import delimited), but I couldn't solve the issue.
The attraction of multimport is that I can potentially record the filename as a new variable, which would be helpful in later processing.
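For concreteness, the loop alternative with a filename variable might look roughly like the sketch below. It is only a sketch under stated assumptions: the text files are in the current working directory, each line of a file should become one observation, and a tab delimiter with quote binding switched off keeps most lines in a single column. The names stacked, filename and line are illustrative and are not part of multimport.

```stata
* Sketch: stack every .txt file in the working directory,
* one observation per line, tagged with its source filename.
* Run from a do-file (the /// line continuation needs one).
clear
tempfile stacked
save `stacked', emptyok

* Assumption: all converted text files sit in the current directory.
local files : dir "." files "*.txt"

foreach f of local files {
    * Assumption: tab as delimiter and no quote binding, so that
    * commas and quotation marks in the transcript do not split
    * or merge lines.
    import delimited using "`f'", delimiters("\t") varnames(nonames) ///
        stringcols(_all) bindquotes(nobind) stripquotes(no) clear

    * Record where each line came from and its position in the file.
    generate filename = "`f'"
    generate long line = _n

    * Add the files read so far, then save the running total.
    append using `stacked'
    save `stacked', replace
}

use `stacked', clear
sort filename line
```

If a transcript itself contains tab characters, the text of that line will spill into v2, v3 and so on, so the delimiter may need adjusting for the files at hand.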
I have two questions based on this:
1. Is conversion of PDFs to text files before importing the appropriate way to approach this problem?
2. If multimport is the correct command, does anyone have any insight on how to tailor the command to get the appropriate output?
Thanks,
Nate