Web-scraper-order

Hi,

I noticed there is now a 'web-scraper-order' column with what appears to be a unix-style timestamp, and that set me wondering as to where and how the scraped information is stored.

For example, if I run a weekly scrape on a site, is only the most recent scrape stored? Or are previous scrapes stored which can be retrieved if you know where to look?

Also, taking 1514847081-111 as an example, what does the '111' after the '-' signify? (That example was taken from a scrape with 91 results and the suffixes started at 97 and ran consecutively to 187.)

Lastly, this column is excluded from the CSV output. Is there a reason for that? (Might be useful to include it or have the option.)

Thank you.


Only the most recent scraping job's data is stored for each sitemap. You can't retrieve data from previous scraping jobs because Web Scraper deletes the previous data when a new scraping job starts.

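The "web-scraper-order" column contains unix_time_stamp-record_counter. The record counter starts at 1 and keeps incrementing until the end of your session, so the '111' in your example is that record's counter value rather than its position within a single scrape.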
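If you want to work with these values outside the extension, here is a minimal Python sketch that splits a value into its two parts and sorts rows by them. It assumes the format is exactly unix_time_stamp-record_counter as described above; `parse_order` and the sample `rows` are made up for illustration and are not part of Web Scraper.

```python
from datetime import datetime, timezone


def parse_order(value: str) -> tuple[int, int]:
    """Split 'unix_time_stamp-record_counter' into two integers."""
    timestamp, counter = value.split("-", 1)
    return int(timestamp), int(counter)


# Example value taken from the question above.
ts, counter = parse_order("1514847081-111")
scraped_at = datetime.fromtimestamp(ts, tz=timezone.utc)  # timestamp as a UTC datetime

# Hypothetical rows; only the "web-scraper-order" key mirrors the real column.
rows = [
    {"web-scraper-order": "1514847081-111"},
    {"web-scraper-order": "1514847081-97"},
    {"web-scraper-order": "1514847081-187"},
]

# Sort by (timestamp, counter) as numbers, not as text.
rows_sorted = sorted(rows, key=lambda r: parse_order(r["web-scraper-order"]))
```

Parsing the counter as an integer matters here: sorted as plain text, "111" would come before "97".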


I have one more question on this topic. After I scrape the items on a page, the record_counter does not reflect the original order of the items on the scraped page. Is there a way to preserve that order?