Hello,
I am a complete noob to programming and computers (literally: I know nothing. I found this website because it seems to provide the service I'm looking for, and I am barely figuring it out). I'm looking to scrape a large number of pages from a certain website (about 434,000 pages), but I do not want the web scraper to scrape everything. I want it to skip any item on those pages that contains a certain term, like so:
Let's say I was searching eBay for violins and typed "violin" into the search bar. Let's also say that I am not looking for beginner violins (instruments are generally made with the player's skill level in mind, so the violins an advanced musician would buy are quite different from the ones a beginner would buy). How do I instruct the web scraper, when it goes through the pages of violins, to ignore the items that contain the word "beginner"? In other words, what selectors do I use to accomplish this? I am at a loss as to how to do this.
eBay Example Sitemap:
{"_id":"ebay-violins","startUrl":["violin | eBay a.s-item__link","type":"SelectorLink"},{"delay":0,"id":"price","multiple":false,"parentSelectors":["violin-title"],"regex":"","selector":"span[itemprop='price']","type":"SelectorText"},{"delay":0,"id":"seller-page-link","multiple":false,"parentSelectors":["violin-title"],"selector":"a[aria-label='friendstore1682010 (Member ID)']","type":"SelectorLink"}]}
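To make the requirement concrete, here is a minimal Python sketch of the filtering I have in mind, written as a clean-up step on already-scraped rows rather than as a selector. It is only an illustration under assumptions: the function name `drop_listings_with_term` is made up, the sample rows are invented, and the column name `violin-title` is assumed to match the selector id in the sitemap above.

```python
def drop_listings_with_term(rows, term, field="violin-title"):
    """Return only the rows whose `field` text does not contain `term`.

    The comparison is case-insensitive, so "Beginner" and "BEGINNERS"
    are both treated as matches and dropped.
    """
    needle = term.lower()
    return [
        row for row in rows
        if needle not in str(row.get(field, "")).lower()
    ]


# Hypothetical sample rows (not real eBay data); the keys are assumed
# to mirror the selector ids used in the sitemap.
sample_rows = [
    {"violin-title": "Beginner Violin 4/4 Outfit", "price": "59.99"},
    {"violin-title": "Antique German Violin", "price": "1200.00"},
    {"violin-title": "Student violin for BEGINNERS", "price": "75.00"},
]

kept = drop_listings_with_term(sample_rows, "beginner")
for row in kept:
    print(row["violin-title"], row["price"])
```

What I would really like is the same effect at the selector level, so that the unwanted item pages are never visited at all. If the scraper accepts jQuery-style pseudo-selectors, something like `a.s-item__link:not(:contains('beginner'))` might do it, but I have not verified this, and as far as I understand `:contains` is case-sensitive, so "Beginner" would need its own rule.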