I'm trying to scrape a product page to find out whether a particular stocked item is no longer in stock. That's easy when the page says "out of stock" or similar.
The trouble starts when the product URL is no longer active and returns a 404 error page. I thought it would be easy: just scrape the text on the 404 page. But you can't.
You can't scrape any information from a page that is a 404 page. No text, no images, no HTML, no elements, nothing. Web Scraper just closes.
Data preview works perfectly fine while you are on the 404 page creating the selectors, but once you run the scrape, it closes without collecting any data. I tested this on multiple sites and multiple URLs. It happens on any 404 page, whether it's a redirect to a 404 error page or a standard 404 on the same URL.
Test any URL that returns a 404 error page to see for yourself.
I would love some feedback on this.
I am scraping data from ebay.com in particular.
Here's a quick demo sitemap:
{"_id":"thissux","startUrl":["https://www.ebay.com/dontexist"],"selectors":[{"id":"title","type":"SelectorText","parentSelectors":["_root"],"selector":"p.error-header__headline","multiple":false,"regex":"","delay":0}]}