Hello,
I'm trying to scrape a webpage. I have imported the HTML code into a single variable and am trying to extract my data using string functions.
To parse on delimiters that vary only in the middle (hence the asterisks below), I want to do something like this:
split htmlcode, p(`"<div class="ysf-"'*"<a"*">" [2nd string following data])
However, I don't think split accepts an asterisk as a wildcard inside p().
Can anyone recommend another way to do this?
Thank you!
-Reese
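
One possible approach, offered as a rough sketch rather than a tested solution: replace every variable delimiter with a fixed marker using a regular expression, then split on that marker. This assumes Stata 14 or later (for ustrregexra()), that the page source fits in an ordinary str# variable, and that the marker "@@@" never occurs in the page. The sample HTML and the variable names marked and piece are made up for illustration.

```stata
* Hypothetical example data: one observation holding a fragment of page source
clear
set obs 1
gen htmlcode = `"<div class="ysf-a"><a href="x">Alpha</a></div><div class="ysf-b"><a href="y">Beta</a></div>"'

* Each asterisk in the intended delimiter becomes a lazy wildcard (.*?),
* and every match is replaced by the fixed marker @@@ (assumed absent from the page)
gen marked = ustrregexra(htmlcode, `"<div class="ysf-.*?<a.*?>"', "@@@")

* split can now parse on the fixed marker; the pieces go into piece1, piece2, ...
split marked, parse("@@@") generate(piece)
```

Each piece still carries whatever follows the data up to the next marker, so a second ustrregexra() call (or a second split on the closing tag) would be needed to trim it.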