Pagination - help please

I simply cannot get this pagination to work! I can't use array as I have multiple search items and want to have a universal sitemap. Can someone please help me?

Url: https://www.yellowpages.com.au/search/listings?clue=Auto+wreckers+%26+recyclers&eventType=pagination&pageNumber=1&referredBy=www.yellowpages.com.au

Sitemap:
{"_id":"yellowpages","startUrl":["https://www.yellowpages.com.au/search/listings?clue=Auto+wreckers+%26+recyclers&locationClue=&lat=&lon=&selectedViewMode=list"],"selectors":[{"id":"business_page","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":"div.cell:nth-of-type(n+3) div.srp-brand-bar-container a.listing-name, div.cell div.cell div.body a.listing-name","multiple":true,"delay":0},{"id":"business_name","type":"SelectorText","parentSelectors":["business_page"],"selector":"h1.listing-name","multiple":false,"regex":"","delay":0},{"id":"business_number","type":"SelectorText","parentSelectors":["business_page"],"selector":"a.click-to-call span.text div","multiple":false,"regex":"","delay":0},{"id":"business_address","type":"SelectorText","parentSelectors":["business_page"],"selector":"p.listing-address","multiple":false,"regex":"","delay":0},{"id":"business_class_location","type":"SelectorText","parentSelectors":["business_page"],"selector":"h2.listing-heading","multiple":false,"regex":"","delay":0},{"id":"pagination","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":"a.pagination:nth-of-type(6)","multiple":true,"delay":0}]}

There are two ways to accomplish this.

#1 - Take a page out of Iconoclast's book. Since the URL indicates which page you're on, change the Start URL (in Metadata) to:

https://www.yellowpages.com.au/search/listings?clue=Auto+wreckers+%26+recyclers&eventType=pagination&pageNumber=[1-29]&referredBy=UNKNOWN0

(Remember to delete the pagination selector!)
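
To make the range notation concrete, here is a minimal Python sketch of what a `[1-29]` range in the start URL stands for: one URL per page number. The `expand_range` helper is hypothetical and only illustrates the idea; it is not how Web Scraper implements ranges internally, and the template below drops the `referredBy` parameter for brevity.

```python
import re

def expand_range(url_template):
    """Expand the first [start-end] range in a URL into one URL per number."""
    match = re.search(r"\[(\d+)-(\d+)\]", url_template)
    if match is None:
        return [url_template]  # no range present, nothing to expand
    start, end = int(match.group(1)), int(match.group(2))
    prefix = url_template[:match.start()]
    suffix = url_template[match.end():]
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]

# Assumed template: the search URL from this thread with a page range.
template = (
    "https://www.yellowpages.com.au/search/listings"
    "?clue=Auto+wreckers+%26+recyclers&eventType=pagination&pageNumber=[1-29]"
)
urls = expand_range(template)
print(len(urls))  # one URL per page in the range
print(urls[0])
print(urls[-1])
```

Because every page is a start URL in this approach, no pagination selector is needed, which is why it should be deleted.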

#2 - Use Link Selector Pagination

For this you almost had it correct. The pagination link needs to come second (not first), and you don't need to select multiple. I also found a better CSS selector (.navigation:last). Everything else you had was correct. I got 3 pages in and then the site crashed, so you might need to use a higher delay. @iconoclast can verify this.

Here is the modified sitemap:

{"_id":"a-delete-yellow-pages","startUrl":["https://www.yellowpages.com.au/search/listings?clue=Auto+wreckers+%26+recyclers&locationClue=&lat=&lon=&selectedViewMode=list"],"selectors":[{"id":"pagination","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":".navigation:LAST","multiple":false,"delay":0},{"id":"business_page","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":"div.search-contact-card.call-to-actions-4 div.srp-brand-bar-container a.listing-name","multiple":true,"delay":0},{"id":"business_name","type":"SelectorText","parentSelectors":["business_page"],"selector":"h1.listing-name","multiple":false,"regex":"","delay":0},{"id":"business_number","type":"SelectorText","parentSelectors":["business_page"],"selector":"a.click-to-call span.text div","multiple":false,"regex":"","delay":0},{"id":"business_address","type":"SelectorText","parentSelectors":["business_page"],"selector":"p.listing-address","multiple":false,"regex":"","delay":0},{"id":"business_class_location","type":"SelectorText","parentSelectors":["business_page"],"selector":"h2.listing-heading","multiple":false,"regex":"","delay":0}]}

Scratch that - After page 5 it changes element selectors and layout. Here is a better sitemap to capture the details you're looking for.

{"_id":"a-delete-yellow-pages","startUrl":["https://www.yellowpages.com.au/search/listings?clue=Auto+wreckers+%26+recyclers&eventType=pagination&pageNumber=1"],"selectors":[{"id":"pagination","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":".navigation:LAST","multiple":false,"delay":0},{"id":"business_page","type":"SelectorElement","parentSelectors":["_root","pagination"],"selector":"div.cell.find-show-more-trial:nth-of-type(n+3),div.search-contact-card.call-to-actions-4","multiple":true,"delay":0},{"id":"business_name","type":"SelectorText","parentSelectors":["business_page"],"selector":".listing-name","multiple":false,"regex":"","delay":0},{"id":"business_number","type":"SelectorText","parentSelectors":["business_page"],"selector":".contact-text:first","multiple":false,"regex":"","delay":0},{"id":"business_address","type":"SelectorText","parentSelectors":["business_page"],"selector":".listing-address","multiple":false,"regex":"","delay":0},{"id":"business_class_location","type":"SelectorText","parentSelectors":["business_page"],"selector":".listing-heading","multiple":false,"regex":"","delay":0},{"id":"Email","type":"SelectorHTML","parentSelectors":["business_page"],"selector":"div.call-to-action.first:nth-of-type(2)","multiple":false,"regex":"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])","delay":0},{"id":"Webpage","type":"SelectorHTML","parentSelectors":["business_page"],"selector":"div.call-to-action-group:nth-of-type(2) div.call-to-action:nth-of-type(1)","multiple":false,"regex":"[-a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)","delay":0}]

There should be 2000 results, I scraped about 1500. Not sure why I missed a bunch but I suspect my element selector is off. @iconoclast, any thoughts?

Thanks so much for this but this sitemap is coming up as invalid?

Hmm, try this. Perhaps I didn't copy everything:

{"_id":"a-delete-yellow-pages","startUrl":["https://www.yellowpages.com.au/search/listings?clue=Auto+wreckers+%26+recyclers&eventType=pagination&pageNumber=1"],"selectors":[{"id":"pagination","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":".navigation:LAST","multiple":false,"delay":0},{"id":"business_page","type":"SelectorElement","parentSelectors":["_root","pagination"],"selector":"div.cell.find-show-more-trial:nth-of-type(n+3),div.search-contact-card.call-to-actions-4","multiple":true,"delay":0},{"id":"business_name","type":"SelectorText","parentSelectors":["business_page"],"selector":".listing-name","multiple":false,"regex":"","delay":0},{"id":"business_number","type":"SelectorText","parentSelectors":["business_page"],"selector":".contact-text:first","multiple":false,"regex":"","delay":0},{"id":"business_address","type":"SelectorText","parentSelectors":["business_page"],"selector":".listing-address","multiple":false,"regex":"","delay":0},{"id":"business_class_location","type":"SelectorText","parentSelectors":["business_page"],"selector":".listing-heading","multiple":false,"regex":"","delay":0},{"id":"Email","type":"SelectorHTML","parentSelectors":["business_page"],"selector":"div.call-to-action.first:nth-of-type(2)","multiple":false,"regex":"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])","delay":0},{"id":"Webpage","type":"SelectorHTML","parentSelectors":["business_page"],"selector":"div.call-to-action-group:nth-of-type(2) div.call-to-action:nth-of-type(1)","multiple":false,"regex":"[-a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)","delay":0}]}

It was just the last } that was missing haha - but it keeps scraping only about 1300-1500 items, when there are 8000 to scrape? Not sure why...
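
A missing closing brace like this can be caught before importing by running the sitemap text through any JSON parser. Below is a minimal Python sketch; `check_sitemap` is a hypothetical helper name, and the `good`/`bad` strings are made-up stand-ins rather than the real sitemap.

```python
import json

def check_sitemap(text):
    """Return (True, None) if text parses as JSON, else (False, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # Report where the parser gave up, which is usually near the problem.
        return False, f"{err.msg} at line {err.lineno}, column {err.colno}"
    return True, None

# Made-up miniature sitemap, used only to demonstrate the check.
good = '{"_id":"demo","startUrl":["https://example.com"],"selectors":[]}'
bad = good[:-1]  # same text with the final } cut off

print(check_sitemap(good))
print(check_sitemap(bad))
```

This only confirms that the text is well-formed JSON; it does not check that the selectors themselves are right.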

Jay - I'm only showing 2200 items in the list. Where are you seeing 8000? That being said, I'm not entirely sure why we aren't capturing all of them.

Maybe @iconoclast can weigh in, but from what I'm seeing the selectors change after page 5, and perhaps they change again somewhere in the pagination. I accounted for the first shift by using two selectors in the element selector, but perhaps I need a third or fourth?
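
One way to test that suspicion is to count scraped rows per page and look for pages that returned few or no rows. The Python sketch below makes several assumptions: that each exported row carries the URL of the page it came from in a column named `pagination-href`, that rows with no URL came from the start page, and that the sample rows are invented for illustration. Adjust the column name to whatever your export actually contains.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

def rows_per_page(rows, url_field="pagination-href"):
    """Count rows per pageNumber found in each row's page URL."""
    counts = Counter()
    for row in rows:
        query = parse_qs(urlparse(row.get(url_field) or "").query)
        # Assumption: rows without a page URL came from the start page.
        page = int(query["pageNumber"][0]) if "pageNumber" in query else 1
        counts[page] += 1
    return counts

def thin_pages(counts, last_page, expected):
    """List pages from 1..last_page with fewer rows than expected."""
    return [p for p in range(1, last_page + 1) if counts.get(p, 0) < expected]

# Invented sample rows, standing in for a real export.
base = "https://www.yellowpages.com.au/search/listings?pageNumber="
rows = [
    {"pagination-href": ""},
    {"pagination-href": base + "2"},
    {"pagination-href": base + "2"},
    {"pagination-href": base + "4"},
]
counts = rows_per_page(rows)
print(thin_pages(counts, last_page=4, expected=2))  # [1, 3, 4]
```

If the thin pages cluster after a certain page number, that is a hint the layout changes there and the element selector needs another alternative.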