Forward and Backwards ?page=[n-m] Scraping

When you specify a base URL that contains the variable ?page=[n-m], Web Scraper automatically cycles through the pages between n and m. However, it currently does so in reverse: if I specify ?page=[1-20], Web Scraper starts on page 20, then goes to 19, then 18, and so on. Please add a checkbox so that it starts on page 1 and iterates forward to 2, then 3, etc. Thanks.


bump. It's been 12 months and the developers have missed my post. I'm sorry, it's probably something I did wrong.

I don't think this is going to be implemented, because the result is the same either way: you have the column with the order number, so you only need to sort that column however you want.

The site that I scrape has items ordered newest to oldest, with pagination going from newest batches to oldest batches of results. When I scrape result pages ?page=[1-100] then Webscraper starts on page 100 then goes to page 99 and 98. When new items are added to results page 1, every couple minutes, then the last couple results from each page are pushed onto the next page. This causes them to be missed and skipped over by Webscraper because it's going backwards, not forwards.

If Webscraper were to go forwards, starting at results page 1, then page 2, and so on, then when items are pushed from the previous page to the next page, they will be scraped twice instead of scraped 0 times. I would rather they get scraped twice than not at all.
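To make the difference concrete, here is a toy simulation in Python. It is only a sketch under simplifying assumptions that are mine, not measured from the real site: items are plain integers, a page holds 2 items, exactly one new item arrives between page requests, and the page range is long enough to cover items pushed towards the end. The function name simulate_scrape is made up for illustration.

```python
def simulate_scrape(page_order, initial_feed, page_size, first_new_id):
    """Visit pages in the given order while one new item arrives per request."""
    feed = list(initial_feed)  # newest item first, as on the site
    scraped = []
    next_id = first_new_id
    for page in page_order:
        start = (page - 1) * page_size
        scraped.extend(feed[start:start + page_size])
        # Assumption: a new item lands at the top before the next request,
        # pushing every older item one position further down.
        feed.insert(0, next_id)
        next_id += 1
    return scraped


original = [6, 5, 4, 3, 2, 1]  # items present when the scrape starts
backward = simulate_scrape(range(5, 0, -1), original, 2, 7)
forward = simulate_scrape(range(1, 6), original, 2, 7)

missed_backward = sorted(set(original) - set(backward))
missed_forward = sorted(set(original) - set(forward))
duplicates_forward = len(forward) - len(set(forward))

print("missed going backward:", missed_backward)
print("missed going forward:", missed_forward)
print("duplicates going forward:", duplicates_forward)
```

In this toy setup the backward pass skips some of the original items, while the forward pass sees all of them and only picks up duplicates, which can be dropped afterwards.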

I don't know why WS works "backwards". I occasionally face issues with this so my workaround is to just bulk-generate URLs that are in the correct order. You can start with a list of descending numbers (many ways to create this):
100
99
98
...
3
2
1

Then just prepend the needed text to create URLs:
something?page=3
something?page=2
something?page=1

After that, you can bulk-add the URLs to your sitemap.
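If you would rather script the bulk-generation step, here is a minimal Python sketch. The base URL https://example.com/something and the function name build_page_urls are placeholders I made up; swap in your own. It just emits the descending list described above, one URL per line, ready to be bulk-added.

```python
def build_page_urls(base_url, first_page, last_page, param="page"):
    """Return one URL per page, highest page number first."""
    if first_page > last_page:
        raise ValueError("first_page must not exceed last_page")
    # Use "&" if the base URL already carries a query string.
    separator = "&" if "?" in base_url else "?"
    return [
        f"{base_url}{separator}{param}={page}"
        for page in range(last_page, first_page - 1, -1)
    ]


if __name__ == "__main__":
    # Hypothetical base URL; prints page=100 down to page=1.
    for url in build_page_urls("https://example.com/something", 1, 100):
        print(url)
```

If your setup needs the list in ascending order instead, reverse the returned list before pasting it.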


Same issue here, I'm not really sure why this cycles the numbers backwards, but given that it's been >2 years, any way to have it walk in the correct order yet...?