What happens when the internet connection is lost during a scraping run?

Does Web Scraper know how to handle this so that you don't lose records? If it can't load a particular page because the connection is down, does it patiently wait and retry until the connection is available again? Or does it skip those pages and leave gaps in your scraped data?

WS doesn't really have any failover, so what happens during a bad connection depends on the type of pagination you use.

If you are using the page range type of pagination, page[1-100], it will just keep trying to load the next pages, waiting between loads for however many seconds you set as the Page load delay. If the connection was bad or lost on any of those pages, you will usually see some rows that are mostly null,null,null.
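If you want to find those gaps after a run, one option is to scan the exported CSV for rows where every data field is empty or null and note which pages they came from, so you can re-scrape just those. Below is a rough sketch in Python, not anything built into WS. It assumes the export has metadata columns named web-scraper-order and web-scraper-start-url and that failed pages show up as empty or "null" values. The function name find_empty_rows and the sample columns title and price are made up for illustration, so adjust them to match your own sitemap.

```python
import csv
import io

# Assumed metadata column names; change these to match your own export.
META_COLUMNS = {"web-scraper-order", "web-scraper-start-url"}
# Values treated as "nothing was scraped" (an assumption about the export).
EMPTY_VALUES = {"", "null"}


def find_empty_rows(csv_file, meta_columns=META_COLUMNS):
    """Return (row_number, start_url) for rows whose data fields are all empty."""
    reader = csv.DictReader(csv_file)
    failed = []
    for row_number, row in enumerate(reader, start=1):
        data_values = [
            (value or "").strip().lower()
            for column, value in row.items()
            # Skip metadata columns and any overflow fields (key None).
            if column is not None and column not in meta_columns
        ]
        if data_values and all(v in EMPTY_VALUES for v in data_values):
            failed.append((row_number, row.get("web-scraper-start-url", "")))
    return failed


if __name__ == "__main__":
    # Tiny made-up export: pages 2 and 3 failed to load.
    sample = io.StringIO(
        "web-scraper-order,web-scraper-start-url,title,price\n"
        "1-1,https://example.com/page/1,Widget,9.99\n"
        "1-2,https://example.com/page/2,null,null\n"
        "1-3,https://example.com/page/3,,\n"
    )
    for row_number, url in find_empty_rows(sample):
        print(row_number, url)
```

With a real export you would pass an open file instead of the sample, for example open("export.csv", newline="", encoding="utf-8"), and then run a second, smaller scrape over the URLs it reports.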

If you are using link or click-based pagination, WS will probably stop because it cannot find the next link, and you will only have the results scraped up to that point.
