Data stopped scraping when adding pagination ($20 to whoever can help!)

Hello guys, I can't seem to figure this out. If I add a simple root page and nothing else but a set of link selectors, the scraper gets the data fine (I intentionally don't add a child selector, so it does not click the link).

But when I add an "Element click" selector underneath, even with a 15-second delay to hit "Next", no data is scraped at all.

I am essentially just trying to start from https://www.foamortgage.com/find-a-mortgage-advisor/?search=90807 and then harvest all the links under people's names, hit Next, and repeat.

Help greatly appreciated, as I've been staring at this all day!

{"_id":"financeofamerica1","startUrl":["https://www.foamortgage.com/find-a-mortgage-advisor/?page=2?search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?search=90807"],"selectors":[{"id":"contactlinks","type":"SelectorLink","parentSelectors":["_root"],"selector":"h4 a","multiple":true,"delay":0},{"id":"next","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"a.next","multiple":false,"delay":"15000","clickElementSelector":"a.next","clickType":"clickOnce","discardInitialElements":false,"clickElementUniquenessType":"uniqueHTMLText"}]}

Hi again guys, I've carried on experimenting and importing other people's sitemaps after reading forum threads from people with similar problems.

This is what I'm working with right now. When I preview the "next" element in the selector's Edit view, the Next page works fine, but when I hit Scrape, it closes the page right after displaying the first page and doesn't scrape any data.

All I am trying to do is scrape the link addresses under each individual person's name on that list.

If anyone can help me fix this I can PayPal you $20.

{"_id":"jamberry-leechfinance","startUrl":["https://www.foamortgage.com/find-a-mortgage-advisor/?search=90807"],"selectors":[{"id":"pagination","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.result-box.result-id-5721 div.people-wrapper, div.pagination-container","multiple":true,"delay":"2000","clickElementSelector":"a.next","clickType":"clickOnce","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"}]}

Thanks for any assistance

Adam - There is no need to offer a bounty. This community will help you for free.

Since the URL changes when you move to the next page, you can't use an Element Click selector for the pagination. You need a Link selector tied to .next (not a.next), and you make it a child of itself as well as of the root. Then you create an Element selector tied to each advisor row and make it a child of both the root and your pagination Link selector.

Note: after 120 results the Next button disappears and no more pages will load. This might be a site limit.

Set the page load delay to more than 8000 (milliseconds).

{"_id":"jamberry-leechfinance","startUrl":["https://www.foamortgage.com/find-a-mortgage-advisor/?search=90807"],"selectors":[{"id":"Page","type":"SelectorLink","parentSelectors":["_root","Page"],"selector":".next","multiple":false,"delay":"3000"},{"id":"Element Select","type":"SelectorElement","parentSelectors":["_root","Page"],"selector":"div.advisor-wrapper","multiple":true,"delay":0},{"id":"Name","type":"SelectorText","parentSelectors":["Element Select"],"selector":"h4 a","multiple":false,"regex":"","delay":0},{"id":"Title","type":"SelectorText","parentSelectors":["Element Select"],"selector":"span.advisor-name","multiple":false,"regex":"","delay":0},{"id":"Number","type":"SelectorText","parentSelectors":["Element Select"],"selector":"span.advisor-nmls a","multiple":false,"regex":"","delay":0},{"id":"Office Phone","type":"SelectorText","parentSelectors":["Element Select"],"selector":"a.advisor-phone","multiple":false,"regex":"","delay":0},{"id":"Mobile Number","type":"SelectorText","parentSelectors":["Element Select"],"selector":"a.advisor-mobile","multiple":false,"regex":"","delay":0},{"id":"Rating","type":"SelectorText","parentSelectors":["Element Select"],"selector":"div.ratingContainer","multiple":false,"regex":"","delay":0}]}
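For anyone unsure what "a Link selector that is a child of itself" does, here is a rough Python sketch of the idea. It is only a toy model, not how Web Scraper is implemented: the PAGES dictionary, its URLs, and the names in it are made up for illustration.

```python
# Toy model of a paginated listing: each "page" has some advisor names and
# an optional next-page URL. URLs and names are invented for illustration.
PAGES = {
    "/advisors?page=1": {"names": ["Ann", "Bob"], "next": "/advisors?page=2"},
    "/advisors?page=2": {"names": ["Cid", "Dee"], "next": "/advisors?page=3"},
    "/advisors?page=3": {"names": ["Eve"], "next": None},  # no Next button
}


def crawl(start_url):
    """Collect names from every page reachable by following "next" links."""
    results = []
    seen = set()
    queue = [start_url]
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue  # never visit the same page twice
        seen.add(url)
        page = PAGES[url]
        # The Element selector runs on the root page and on every page
        # opened through the pagination link (parents: _root and Page).
        results.extend(page["names"])
        # The Link selector is its own parent, so it is evaluated again on
        # each page it opens, until no Next link is found.
        if page["next"] is not None:
            queue.append(page["next"])
    return results


print(crawl("/advisors?page=1"))
```

The point of the self-parenting is the loop: every page opened by the Next link is searched for another Next link, and the data selectors run on each of those pages as well.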

If you want to crawl through each profile and grab the e-mail and full address, this will get you that:

{"_id":"example-finance-of-america-morgage-bf-solution","startUrl":["https://www.foamortgage.com/find-a-mortgage-advisor/?search=90807"],"selectors":[{"id":"Page-next","type":"SelectorLink","parentSelectors":["_root","Page-next"],"selector":".next","multiple":false,"delay":"9000"},{"id":"Element-Select","type":"SelectorElement","parentSelectors":["_root","Page-next"],"selector":"div.advisor-wrapper","multiple":true,"delay":"2000"},{"id":"Profile Link","type":"SelectorLink","parentSelectors":["Element-Select"],"selector":"h4 a","multiple":false,"delay":0},{"id":"Name","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"h1","multiple":false,"regex":"","delay":0},{"id":"Title","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.lo-advisor-info h2","multiple":false,"regex":"","delay":0},{"id":"Number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.advisor-nmls a","multiple":false,"regex":"","delay":0},{"id":"office number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.contact-info div:nth-of-type(1) a","multiple":false,"regex":"","delay":0},{"id":"mobile number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.contact-info div:nth-of-type(2) a","multiple":false,"regex":"","delay":0},{"id":"Email","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.advisor-email a","multiple":false,"regex":"","delay":0},{"id":"City/State","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.branch-location","multiple":false,"regex":".+?(?=\\s+Google)","delay":0},{"id":"Address","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.branch-location","multiple":false,"regex":"(?:(?!Google).)*.","delay":0}]}

Since @iconoclast would mention this, here is a way to do it using URL ranges for all 10 pages.

{"_id":"example-finance-of-america-morgage-bf-solution","startUrl":["https://www.foamortgage.com/find-a-mortgage-advisor/?page=[1-10]&search=90807"],"selectors":[{"id":"Element-Select","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.advisor-wrapper","multiple":true,"delay":"4000"},{"id":"Profile Link","type":"SelectorLink","parentSelectors":["Element-Select"],"selector":"h4 a","multiple":false,"delay":0},{"id":"Name","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"h1","multiple":false,"regex":"","delay":0},{"id":"Title","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.lo-advisor-info h2","multiple":false,"regex":"","delay":0},{"id":"Number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.advisor-nmls a","multiple":false,"regex":"","delay":0},{"id":"office number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.contact-info div:nth-of-type(1) a","multiple":false,"regex":"","delay":0},{"id":"mobile number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.contact-info div:nth-of-type(2) a:not(:contains("@"))","multiple":false,"regex":"","delay":0},{"id":"Email","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.advisor-email a","multiple":false,"regex":"","delay":0},{"id":"City/State","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.branch-location","multiple":false,"regex":".+?(?=\s+Google)","delay":0},{"id":"Address","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.branch-location","multiple":false,"regex":"(?:(?!Google).)*.","delay":0}]}
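As I understand it, the [1-10] in the start URL is expanded into one start URL per page number. A minimal Python sketch of that expansion, assuming only the plain [start-end] form (Web Scraper's own handling may differ, and expand_range is a made-up helper name, not part of the extension):

```python
import re


def expand_range(url):
    """Expand the first [start-end] numeric range in url into concrete URLs."""
    match = re.search(r"\[(\d+)-(\d+)\]", url)
    if match is None:
        return [url]  # nothing to expand
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = url[:match.start()], url[match.end():]
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]


urls = expand_range(
    "https://www.foamortgage.com/find-a-mortgage-advisor/?page=[1-10]&search=90807"
)
for u in urls:
    print(u)
```

So a range start URL should behave like listing the ten page URLs by hand in startUrl.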

Thank you so much man! You've saved me a lot of grey hair! (And hopefully made me a lot of money :smiley:)

One small snag...the last piece of code you posted appears not to be working. Can you double check please bro?

Which one? I've posted three examples.

The 3rd one, the one you said you got 192 records with. For all 10 pages with the URL range.

Let's see if this works

{"_id":"example-finance-of-america-morgage-bf-solution","startUrl":["https://www.foamortgage.com/find-a-mortgage-advisor/?page=[1-10]&search=90807"],"selectors":[{"id":"Element-Select","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.advisor-wrapper","multiple":true,"delay":"4000"},{"id":"Profile Link","type":"SelectorLink","parentSelectors":["Element-Select"],"selector":"h4 a","multiple":false,"delay":0},{"id":"Name","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"h1","multiple":false,"regex":"","delay":0},{"id":"Title","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.lo-advisor-info h2","multiple":false,"regex":"","delay":0},{"id":"Number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.advisor-nmls a","multiple":false,"regex":"","delay":0},{"id":"office number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.contact-info div:nth-of-type(1) a","multiple":false,"regex":"","delay":0},{"id":"mobile number","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.contact-info div:nth-of-type(2) a:not(:contains(\"@\"))","multiple":false,"regex":"","delay":0},{"id":"Email","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.advisor-email a","multiple":false,"regex":"","delay":0},{"id":"City/State","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.branch-location","multiple":false,"regex":".+?(?=\\s+Google)","delay":0},{"id":"Address","type":"SelectorText","parentSelectors":["Profile Link"],"selector":"div.branch-location","multiple":false,"regex":"(?:(?!Google).)*.","delay":0}]}
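Comparing this sitemap with the earlier URL-range one, the visible difference is escaping: here the quotes around @ in the "mobile number" selector and the backslash in the City/State regex are escaped. A quick Python check on two shortened fragments (made up for the test, modeled on those two fields) shows why the unescaped form would not parse as JSON:

```python
import json

# Fragment modeled on the earlier sitemap: inner quotes and \s NOT escaped.
broken = r'{"selector": "a:not(:contains("@"))", "regex": ".+?(?=\s+Google)"}'
# Same fragment with the inner quotes and the backslash escaped.
fixed = r'{"selector": "a:not(:contains(\"@\"))", "regex": ".+?(?=\\s+Google)"}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(broken))  # False
print(is_valid_json(fixed))   # True
```

Running a pasted sitemap through a check like this before importing is a cheap way to catch forum formatting that has eaten a backslash.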

Hey man, appreciate you helping me with this, the code works - but the scrape totally doesn't.

Referring to the one you sent with the 10 page ranges

It scrapes the first page (actually the 10th), then steps through every other page and scrapes nothing.

I also intend to add more start pages, e.g. https://www.foamortgage.com/find-a-mortgage-advisor/?search=90807 where 90807 is a zip code, and I have a list of all their branches nationwide.

Is there any reason this wouldn't work?
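If it helps, the startUrl list for many zip codes could be generated rather than typed by hand. A small Python sketch, with some assumptions: the zip codes below are placeholders, start_urls is a made-up helper, and the [1-10] page range is just carried over from this thread (other zip codes may have fewer or more pages).

```python
import json

BASE = "https://www.foamortgage.com/find-a-mortgage-advisor/"


def start_urls(zip_codes, pages="[1-10]"):
    """Build one range-style start URL per zip code."""
    return [f"{BASE}?page={pages}&search={z}" for z in zip_codes]


# Placeholder list; keep zip codes as strings so leading zeros survive.
zips = ["90807", "10001"]

# Prints a JSON array that can be pasted in as the sitemap's "startUrl" value.
print(json.dumps(start_urls(zips), indent=2))
```

Whether the scrape then behaves across all of those pages is a separate question from building the list, so it is probably worth testing with two or three zip codes first.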

I tried and the same thing happened to me. I even tried listing the pages separately, but it only scraped the first 20 (1 page).

sitemap:
{"_id":"forum","startUrl":["https://www.foamortgage.com/find-a-mortgage-advisor/?page=1&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=2&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=3&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=4&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=5&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=6&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=7&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=8&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=9&search=90807","https://www.foamortgage.com/find-a-mortgage-advisor/?page=10&search=90807"],"selectors":[{"id":"parent","type":"SelectorLink","parentSelectors":["_root"],"selector":"h4 a","multiple":true,"delay":0},{"id":"link","type":"SelectorLink","parentSelectors":["parent"],"selector":"div.advisor-nmls a","multiple":false,"delay":0}]}

So maybe do one page at a time while you figure this out?

I have found a solution: just increase the delay and it'll scrape all the pages.

Not sure what the problem is; my guess would be to set higher delays and make sure an ad blocker is installed. I sent you the data, check your email. Not sure why the problem is happening on your side.

I’ll post a video showing it working.

Only the second delay is needed. I had mine set at 9000. Setting the delay on the first Element selector to 8000 also worked without the other delays.
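If you'd rather not click through the selector editor each time you try a different delay, the exported sitemap JSON can be edited in a script and re-imported. A hedged Python sketch: set_delay is a made-up helper, the example sitemap is a cut-down stand-in, and writing the delay as a string simply mirrors how the non-zero delays appear in the sitemaps in this thread.

```python
import json


def set_delay(sitemap_json, selector_id, delay_ms):
    """Return the sitemap JSON with one selector's delay changed."""
    sitemap = json.loads(sitemap_json)
    for selector in sitemap["selectors"]:
        if selector["id"] == selector_id:
            # Stored as a string, matching the exported sitemaps above.
            selector["delay"] = str(delay_ms)
    return json.dumps(sitemap, separators=(",", ":"))


# Cut-down stand-in for a real exported sitemap.
example = (
    '{"_id":"demo","startUrl":["https://example.com/"],"selectors":['
    '{"id":"Element-Select","type":"SelectorElement","parentSelectors":["_root"],'
    '"selector":"div.advisor-wrapper","multiple":true,"delay":"4000"}]}'
)

updated = set_delay(example, "Element-Select", 8000)
print(updated)
```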