Add Delay parameter to Pagination

Selectors of type Pagination do not seem to honor the Request interval (ms) and Page load delay (ms) values. Please add a Delay parameter to the Pagination selector.

Hi,

Can you please post a sitemap where the pagination delay is an issue? Thanks

As an example:

{"_id":"pagination_nhl","startUrl":["https://www.nhl.com/"],"selectors":[{"id":"stats_url","linkType":"linkFromHref","multiple":false,"parentSelectors":["_root"],"selector":".nhl-c-header__menu--collapsible > li:nth-of-type(2) a","type":"SelectorLink"},{"id":"skaters","linkType":"linkFromHref","multiple":false,"parentSelectors":["stats_url"],"selector":"a:contains('All Leaders')","type":"SelectorLink"},{"id":"row_element","multiple":true,"parentSelectors":["skaters","pagination_next"],"selector":"div.rt-tbody div.rt-tr-group","type":"SelectorElement"},{"id":"pagination_next","paginationType":"clickMore","parentSelectors":["skaters","pagination_next"],"selector":"div.pagination button.next-button","type":"SelectorPagination"},{"id":"name","multiple":false,"parentSelectors":["row_element"],"regex":"","selector":"a","type":"SelectorText"}]}

If we set both Request interval (ms) and Page load delay (ms) to 10000, these values are respected while the scraper navigates down to the Pagination selector. Once the Pagination selector takes over, scraping runs as fast as it can.

I have run the sitemap with the standard delay; the pagination worked fine and captured all player names. I don't really see why a delay is necessary.

Thanks for the fast reply. The sitemap I provided was just a practical example to demonstrate that the Pagination selector does not follow the Request interval delay set for the scraping session. Plenty of websites restrict access based on request count and/or frequency, so it would be nice to be able to adjust a delay parameter for such use cases.

Understood, thank you for your input. We will consider adding this feature in future releases of the extension.

In general, the aim is to run scraping jobs as fast as possible. If request frequency is an issue, however, the pagination can also be handled with the 'element click' selector type, which has a custom delay option.

Please see the reference below:

{"_id":"pagination_nhl","startUrl":["https://www.nhl.com/"],"selectors":[{"id":"stats_url","linkType":"linkFromHref","multiple":false,"parentSelectors":["_root"],"selector":".nhl-c-header__menu--collapsible > li:nth-of-type(2) a","type":"SelectorLink"},{"id":"skaters","linkType":"linkFromHref","multiple":false,"parentSelectors":["stats_url"],"selector":"a:contains('All Leaders')","type":"SelectorLink"},{"id":"row_element","multiple":true,"parentSelectors":["pagination_next"],"selector":"div.rt-tbody div.rt-tr-group","type":"SelectorElement"},{"clickActionType":"real","clickElementSelector":"div.pagination button.next-button","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickMore","delay":7000,"discardInitialElements":"do-not-discard","id":"pagination_next","multiple":false,"parentSelectors":["skaters"],"selector":"_parent_","type":"SelectorElementClick"},{"id":"name","multiple":false,"parentSelectors":["row_element"],"regex":"","selector":"a","type":"SelectorText"}]}
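For anyone who wants to apply the same change to an existing sitemap without editing the JSON by hand, here is a minimal Python sketch of the rewrite shown above. It is not an official tool: the function name `pagination_to_element_click` is made up for illustration, and the Element click field values are simply copied from the reference sitemap in this thread, so whether they suit other sites is an assumption. The example sitemap is trimmed to the two selectors that change.

```python
import copy
import json


def pagination_to_element_click(sitemap, selector_id, delay_ms):
    """Return a copy of `sitemap` in which the SelectorPagination named
    `selector_id` is replaced by a SelectorElementClick with a click delay.

    Hypothetical helper; field values mirror the reference sitemap above.
    """
    result = copy.deepcopy(sitemap)  # leave the caller's sitemap untouched
    target = next(
        (s for s in result["selectors"]
         if s["id"] == selector_id and s["type"] == "SelectorPagination"),
        None,
    )
    if target is None:
        raise ValueError(f"no SelectorPagination with id {selector_id!r}")

    # The pagination selector lists itself as a parent; Element click does not.
    outer_parents = [p for p in target["parentSelectors"] if p != selector_id]
    click_target = target["selector"]

    target.clear()
    target.update({
        "id": selector_id,
        "type": "SelectorElementClick",
        "parentSelectors": outer_parents,
        "selector": "_parent_",
        "multiple": False,
        "clickElementSelector": click_target,
        "clickType": "clickMore",
        "clickActionType": "real",
        "clickElementUniquenessType": "uniqueCSSSelector",
        "discardInitialElements": "do-not-discard",
        "delay": delay_ms,  # wait after each click, in milliseconds
    })

    # Child selectors now hang only off the click selector, as in the
    # reference sitemap (row_element: ["pagination_next"]).
    for sel in result["selectors"]:
        if sel is not target and selector_id in sel["parentSelectors"]:
            sel["parentSelectors"] = [
                p for p in sel["parentSelectors"] if p not in outer_parents
            ]
    return result


# Trimmed version of the original sitemap from this thread.
example = {
    "_id": "pagination_nhl",
    "startUrl": ["https://www.nhl.com/"],
    "selectors": [
        {"id": "row_element", "multiple": True,
         "parentSelectors": ["skaters", "pagination_next"],
         "selector": "div.rt-tbody div.rt-tr-group",
         "type": "SelectorElement"},
        {"id": "pagination_next", "paginationType": "clickMore",
         "parentSelectors": ["skaters", "pagination_next"],
         "selector": "div.pagination button.next-button",
         "type": "SelectorPagination"},
    ],
}

converted = pagination_to_element_click(example, "pagination_next", 7000)
print(json.dumps(converted))
```

The printed JSON can be pasted into the extension's sitemap import. The `delay` value of 7000 matches the reference above; adjust it to whatever the target site tolerates.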