Inconsistency in Pagination :/

Hi Guys,

I am trying to scrape data on different projects on Crowdcube. Since the listing page only has a "load more" button, I used pagination to navigate through all the projects. However, my results are very inconsistent: sometimes the scraper clicks the "load more" button 3 times and then scrapes the data (resulting in 3x15 projects scraped), and other times it clicks it, say, 5 times (resulting in 5x15 projects scraped).

Ideally, I want to scrape about 500 projects, but the scraper doesn't paginate "far enough". Really looking forward to your help. Thanks!

Url: https://www.crowdcube.com/companies

Sitemap:
{"_id":"crowdcubev2","startUrl":["https://www.crowdcube.com/companies"],"selectors":[{"id":"pagination","parentSelectors":["_root","pagination"],"paginationType":"clickMore","selector":"button.cc-pagination__load","type":"SelectorPagination"},{"id":"link","parentSelectors":["_root","pagination"],"type":"SelectorLink","selector":"a.cc-card__link","multiple":true,"delay":0},{"id":"view_link","parentSelectors":["link"],"type":"SelectorLink","selector":".cc-company__history a","multiple":false,"delay":0},{"id":"money_raised","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Raised') + dd","multiple":false,"delay":0,"regex":""},{"id":"amount_investors","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Investors') + dd","multiple":false,"delay":0,"regex":""},{"id":"target","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Target') + dd","multiple":false,"delay":0,"regex":""},{"id":"equity","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Equity') + dd","multiple":false,"delay":0,"regex":""},{"id":"pre-money valuation","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Pre-money valuation') + dd","multiple":false,"delay":0,"regex":""},{"id":"share price","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Share price') + dd","multiple":false,"delay":0,"regex":""},{"id":"description","parentSelectors":["view_link"],"type":"SelectorText","selector":".column p","multiple":false,"delay":0,"regex":""},{"id":"followers","parentSelectors":["view_link"],"type":"SelectorText","selector":"div:nth-of-type(3) dt","multiple":false,"delay":0,"regex":""},{"id":"largest investment","parentSelectors":["view_link"],"type":"SelectorText","selector":"div:nth-of-type(5) dt","multiple":false,"delay":0,"regex":""},{"id":"rewards","parentSelectors":["view_link"],"type":"SelectorText","selector":".cc-linearSelectorDescriptor div.cc-headedPanel__content","multiple":false,"delay":0,"regex":""},{"id":"Name","parentSelectors":["view_link"],"type":"SelectorText","selector":"div.cc-pitchCover__title","multiple":false,"delay":0,"regex":""},{"id":"date","parentSelectors":["link"],"type":"SelectorText","selector":".cc-company__history td:nth-of-type(1)","multiple":false,"delay":0,"regex":""},{"id":"industry type","parentSelectors":["view_link"],"type":"SelectorText","selector":"li:nth-of-type(4) p","multiple":false,"delay":0,"regex":""}]}

@peterpane Hello, have you tried replacing the 'pagination' selector with an 'Element click' selector instead (with the 'Page load delay' set to at least 3,500-5,000 ms)?

Example:

{"_id":"crowdcubev2","startUrl":["https://www.crowdcube.com/companies"],"selectors":[{"clickElementSelector":"button.cc-pagination__load","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickMore","delay":3500,"discardInitialElements":"do-not-discard","id":"pagination","multiple":true,"parentSelectors":["_root"],"selector":"div.cc-grid__cell","type":"SelectorElementClick"},{"delay":0,"id":"link","multiple":false,"parentSelectors":["pagination"],"selector":"a.cc-card__link","type":"SelectorLink"},{"delay":0,"id":"view_link","multiple":false,"parentSelectors":["link"],"selector":".cc-company__history a","type":"SelectorLink"},{"delay":0,"id":"money_raised","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"dt:contains('Raised') + dd","type":"SelectorText"},{"delay":0,"id":"amount_investors","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"dt:contains('Investors') + dd","type":"SelectorText"},{"delay":0,"id":"target","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"dt:contains('Target') + dd","type":"SelectorText"},{"delay":0,"id":"equity","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"dt:contains('Equity') + dd","type":"SelectorText"},{"delay":0,"id":"pre-money valuation","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"dt:contains('Pre-money valuation') + dd","type":"SelectorText"},{"delay":0,"id":"share price","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"dt:contains('Share price') + dd","type":"SelectorText"},{"delay":0,"id":"description","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":".column p","type":"SelectorText"},{"delay":0,"id":"followers","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"div:nth-of-type(3) dt","type":"SelectorText"},{"delay":0,"id":"largest investment","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"div:nth-of-type(5) dt","type":"SelectorText"},{"delay":0,"id":"rewards","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":".cc-linearSelectorDescriptor div.cc-headedPanel__content","type":"SelectorText"},{"delay":0,"id":"Name","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"div.cc-pitchCover__title","type":"SelectorText"},{"delay":0,"id":"date","multiple":false,"parentSelectors":["link"],"regex":"","selector":".cc-company__history td:nth-of-type(1)","type":"SelectorText"},{"delay":0,"id":"industry type","multiple":false,"parentSelectors":["view_link"],"regex":"","selector":"li:nth-of-type(4) p","type":"SelectorText"}]}
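For readability, here is just the pagination part of that sitemap, pretty-printed (the full sitemap has to stay on one line to be importable; this excerpt is only an illustration of what changed). The 'pagination' entry is now a 'SelectorElementClick' that clicks the "load more" button and collects each result card, instead of a 'SelectorPagination':

```json
{
  "id": "pagination",
  "type": "SelectorElementClick",
  "parentSelectors": ["_root"],
  "selector": "div.cc-grid__cell",
  "clickElementSelector": "button.cc-pagination__load",
  "clickElementUniquenessType": "uniqueCSSSelector",
  "clickType": "clickMore",
  "delay": 3500,
  "discardInitialElements": "do-not-discard",
  "multiple": true
}
```

Note that the 'link' selector's parent is now 'pagination' rather than '_root', so links are extracted from the clicked-in elements.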

@ViestursWS
Thanks a thousand times for the super quick reply.

I am currently trying it and so far, it goes through all the "load more" buttons. Hope the data gets scraped correctly as well. I'll let you know.

Is there any way to limit the number of pagination runs, so that I only get the first 500 projects?

Best

Worked very well, thanks @ViestursWS