Hi Guys,
I am trying to scrape data on different projects from Crowdcube. Since the page only has a "load more" button, I used a pagination selector to navigate through all the projects. However, my results are very inconsistent: sometimes the scraper clicks the "load more" button 3 times before scraping the data (resulting in 3x15 projects), and other times it clicks it, say, 5 times (resulting in 5x15 projects).
Ideally, I want to scrape about 500 projects, but the scraper doesn't paginate far enough. Really looking forward to your help. Thanks!
URL: https://www.crowdcube.com/companies
Sitemap:
{"_id":"crowdcubev2","startUrl":["https://www.crowdcube.com/companies"],"selectors":[{"id":"pagination","parentSelectors":["_root","pagination"],"paginationType":"clickMore","selector":"button.cc-pagination__load","type":"SelectorPagination"},{"id":"link","parentSelectors":["_root","pagination"],"type":"SelectorLink","selector":"a.cc-card__link","multiple":true,"delay":0},{"id":"view_link","parentSelectors":["link"],"type":"SelectorLink","selector":".cc-company__history a","multiple":false,"delay":0},{"id":"money_raised","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Raised') + dd","multiple":false,"delay":0,"regex":""},{"id":"amount_investors","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Investors') + dd","multiple":false,"delay":0,"regex":""},{"id":"target","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Target') + dd","multiple":false,"delay":0,"regex":""},{"id":"equity","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Equity') + dd","multiple":false,"delay":0,"regex":""},{"id":"pre-money valuation","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Pre-money valuation') + dd","multiple":false,"delay":0,"regex":""},{"id":"share price","parentSelectors":["view_link"],"type":"SelectorText","selector":"dt:contains('Share price') + dd","multiple":false,"delay":0,"regex":""},{"id":"description","parentSelectors":["view_link"],"type":"SelectorText","selector":".column p","multiple":false,"delay":0,"regex":""},{"id":"followers","parentSelectors":["view_link"],"type":"SelectorText","selector":"div:nth-of-type(3) dt","multiple":false,"delay":0,"regex":""},{"id":"largest investment","parentSelectors":["view_link"],"type":"SelectorText","selector":"div:nth-of-type(5) dt","multiple":false,"delay":0,"regex":""},{"id":"rewards","parentSelectors":["view_link"],"type":"SelectorText","selector":".cc-linearSelectorDescriptor div.cc-headedPanel__content","multiple":false,"delay":0,"regex":""},{"id":"Name","parentSelectors":["view_link"],"type":"SelectorText","selector":"div.cc-pitchCover__title","multiple":false,"delay":0,"regex":""},{"id":"date","parentSelectors":["link"],"type":"SelectorText","selector":".cc-company__history td:nth-of-type(1)","multiple":false,"delay":0,"regex":""},{"id":"industry type","parentSelectors":["view_link"],"type":"SelectorText","selector":"li:nth-of-type(4) p","multiple":false,"delay":0,"regex":""}]}
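
For reference, here is a rough calculation of how many "load more" clicks the scraper would need to reach my target. It assumes every click adds 15 projects, which is what I observed but may not hold for the whole listing; the function name `clicks_needed` is just illustrative.

```python
import math


def clicks_needed(target_projects: int, projects_per_click: int = 15) -> int:
    """Return how many "load more" clicks are needed to reach the target.

    Assumption: every click adds the same number of projects
    (15 per click, as observed in my runs).
    """
    if target_projects <= 0:
        return 0
    # Round up, because a partial batch still requires a full click.
    return math.ceil(target_projects / projects_per_click)


print(clicks_needed(500))  # 34
```

So for about 500 projects the pagination would have to click roughly 34 times, far more than the 3 to 5 clicks I am currently getting.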