Help with this pagination

I'm having problems with the pagination again. Can anybody help?

Url: https://ranking-empresas.eleconomista.es/ranking_empresas_nacional.html

Sitemap:
{"_id":"eleconomista","startUrl":["https://ranking-empresas.eleconomista.es/ranking_empresas_nacional.html"],"selectors":[{"id":"Empresa","type":"SelectorLink","parentSelectors":["_root"],"selector":"td.tal a","multiple":true,"delay":0},{"id":"empresa","type":"SelectorText","parentSelectors":["Empresa"],"selector":"tr.even:contains('Denominación') td.tal:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"url","type":"SelectorLink","parentSelectors":["Empresa"],"selector":"tr.even:contains('Página Web') a.url","multiple":false,"delay":0},{"id":"pagination","type":"SelectorLink","parentSelectors":["_root"],"selector":"li:nth-of-type(6) a","multiple":true,"delay":0}]}

As there is no link under the pagination button, you have to use the Element Click selector to iterate through the pages. Here is an updated sitemap:

{"_id":"eleconomista","startUrl":["https://ranking-empresas.eleconomista.es/ranking_empresas_nacional.html"],"selectors":[{"id":"Empresa","type":"SelectorLink","parentSelectors":["element-click"],"selector":"td.tal a","multiple":false,"delay":0},{"id":"empresa","type":"SelectorText","parentSelectors":["Empresa"],"selector":"tr.even:contains('Denominación') td.tal:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"url","type":"SelectorLink","parentSelectors":["Empresa"],"selector":"tr.even:contains('Página Web') a.url","multiple":false,"delay":0},{"id":"element-click","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"tr.tr_hover_even","multiple":true,"delay":"2000","clickElementSelector":"li.arrow a:contains('»')","clickType":"clickOnce","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"}]}

Thanks, but it does not work. Are you sure it's set up correctly this way? (See screenshot.)

{"_id":"eleconomista","startUrl":["https://ranking-empresas.eleconomista.es/ranking_empresas_nacional.html"],"selectors":[{"id":"Empresa","type":"SelectorLink","parentSelectors":["element-click"],"selector":"td.tal a","multiple":false,"delay":0},{"id":"empresa","type":"SelectorText","parentSelectors":["Empresa"],"selector":"tr.even:contains('Denominación') td.tal:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"url","type":"SelectorLink","parentSelectors":["Empresa"],"selector":"tr.even:contains('Página Web') a.url","multiple":false,"delay":0},{"id":"element-click","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"tr.tr_hover_even","multiple":true,"delay":"2000","clickElementSelector":"li.arrow a:contains('»')","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"}]}

I had the click type of the Element Click selector set to 'click once' for testing purposes and forgot to change it back. With 'click more' it should work now.
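In case it helps anyone comparing the two sitemaps: the only difference is the "clickType" value on the Element Click selector ("clickOnce" vs. "clickMore"). As far as I understand it, "clickOnce" clicks the next-page arrow a single time, while "clickMore" keeps clicking as long as new elements appear. Below is a minimal Python sketch that patches an exported sitemap accordingly. The helper name set_click_type and the trimmed-down example string are assumptions for illustration only; they are not part of Web Scraper.

```python
import json


def set_click_type(sitemap_json: str, click_type: str = "clickMore") -> str:
    """Return the sitemap JSON with every Element Click selector set to click_type."""
    sitemap = json.loads(sitemap_json)
    for selector in sitemap["selectors"]:
        # Only Element Click selectors carry a clickType field.
        if selector["type"] == "SelectorElementClick":
            selector["clickType"] = click_type
    # ensure_ascii=False keeps characters such as '»' readable in the output.
    return json.dumps(sitemap, ensure_ascii=False, separators=(",", ":"))


# Trimmed-down version of the sitemap above, for illustration only.
example = (
    '{"_id":"eleconomista","selectors":[{"id":"element-click",'
    '"type":"SelectorElementClick","clickType":"clickOnce"}]}'
)
fixed = set_click_type(example)
print(fixed)
```

The patched string can then be pasted into the sitemap import dialog in place of the old one.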

Looks fantastic, but I'm now struggling with CAPTCHAs.

Do you know where I could get a solution for this?

You can often avoid CAPTCHAs while scraping by using a proxy and rotating your IP address periodically. Cloud Web Scraper has this feature and you can try it for free. More info: https://www.webscraper.io/cloud-scraper
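To illustrate what "rotating" means outside of the Web Scraper extension, here is a rough Python sketch using only the standard library. The proxy addresses are hypothetical placeholders and the helper rotating_openers is an assumption for illustration; this is not how Cloud Web Scraper implements the feature.

```python
import itertools
import urllib.request

# Hypothetical proxy endpoints; replace with ones you actually have access to.
PROXIES = [
    "http://proxy-a.example.com:8080",
    "http://proxy-b.example.com:8080",
]


def rotating_openers(proxies):
    """Yield (proxy, opener) pairs forever, cycling through the proxy list."""
    for proxy in itertools.cycle(proxies):
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        yield proxy, urllib.request.build_opener(handler)


openers = rotating_openers(PROXIES)
# Each next() call switches to the following proxy; no request is sent here.
first_proxy, opener = next(openers)
second_proxy, _ = next(openers)
third_proxy, _ = next(openers)
# To fetch a page through the current proxy: opener.open(url, timeout=30)
```

After the last proxy in the list, the generator starts again from the first one, so each request (or batch of requests) can go out through a different IP address.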

Have you tried using residential proxies? I tried these ones, and they rarely trigger any CAPTCHAs at all. They're a bit pricey, but if you need to do a lot of scraping, I think it's worth it.