Describe the problem.
I am scraping a private site using the Edge extension. The sitemap paginates by clicking a "load 25 more" button until all 600 profile tiles are loaded. Each tile has a link to the profile page, where I want to scrape the name and contact info. This works if the start URL is a search filter limited to about 50 tiles. The full run of 600 tiles crashes with no data extracted once the pagination selector stops because the "load 25 more" button has disappeared. The error it shows is "Error - Extracted data size limit exceeded".
I can pull contact info off the tiles for all 600 if I use an Element Click selector with a bunch of Text selectors, but there is a unique ID on the full profile page that I am trying to grab. If I use the Link selector with a Text selector as its child, it does not scrape the ID.
Basically, it works as expected for a small number of profiles but fails on the larger run. I realize that without the HTML there is limited help to be given.
Sitemap:
{"_id":"SSC2025-Volunteer_scrape-Rancho_pod","startUrl":["REDACTED"],"selectors":[{"id":"LoadMore","paginationType":"clickMore","parentSelectors":["_root","LoadMore"],"selector":".button.grid__item ptrn-translate","type":"SelectorPagination"},{"id":"ProfileLink","linkType":"linkFromHref","multiple":true,"parentSelectors":["LoadMore"],"selector":"a.card__header-link","type":"SelectorLink"},{"id":"IDNum","multiple":false,"parentSelectors":["ProfileLink"],"regex":"","selector":"dt:contains('Identification Number') + .data__value span","type":"SelectorText"},{"id":"Name","multiple":false,"parentSelectors":["ProfileLink"],"regex":"","selector":"span.transliteration__vernacular","type":"SelectorText"},{"id":"MobilePhone","multiple":false,"parentSelectors":["ProfileLink"],"regex":"","selector":"dt:contains('Mobile Phone') + .data__value span","type":"SelectorText"},{"id":"HomePhone","multiple":false,"parentSelectors":["ProfileLink"],"regex":"","selector":"dt:contains('Home Phone') + .data__value span","type":"SelectorText"},{"id":"Email","multiple":false,"parentSelectors":["ProfileLink"],"regex":"","selector":"dt:contains('Email Address') + .data__value span","type":"SelectorText"},{"id":"Department","multiple":false,"parentSelectors":["ProfileLink"],"regex":"","selector":"dt:contains('Department') + .data__value span","type":"SelectorText"}]}
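In case the one-line sitemap is hard to read, here is a minimal Python sketch (a standalone helper, not part of Web Scraper) that lists the selector hierarchy from the `parentSelectors` fields. The `SITEMAP` string is a trimmed copy of the sitemap above with only three of the eight selectors, and the function names `children_by_parent` and `format_tree` are made up for illustration.

```python
import json

# Trimmed copy of the sitemap above (three of the eight selectors), for illustration only.
SITEMAP = '''{"_id":"SSC2025-Volunteer_scrape-Rancho_pod","startUrl":["REDACTED"],"selectors":[
{"id":"LoadMore","parentSelectors":["_root","LoadMore"],"type":"SelectorPagination"},
{"id":"ProfileLink","parentSelectors":["LoadMore"],"type":"SelectorLink"},
{"id":"IDNum","parentSelectors":["ProfileLink"],"type":"SelectorText"}]}'''


def children_by_parent(sitemap):
    """Map each parent selector id to the ids of the selectors nested under it."""
    tree = {}
    for selector in sitemap["selectors"]:
        for parent in selector["parentSelectors"]:
            tree.setdefault(parent, []).append(selector["id"])
    return tree


def format_tree(tree, node="_root", path=()):
    """Return indented lines for the hierarchy below `node`."""
    lines = []
    indent = "  " * len(path)
    for child in tree.get(node, []):
        # LoadMore lists itself as a parent, so stop instead of recursing forever.
        if child == node or child in path:
            lines.append(f"{indent}{child} (loops back)")
            continue
        lines.append(f"{indent}{child}")
        lines.extend(format_tree(tree, child, path + (node,)))
    return lines


if __name__ == "__main__":
    print("\n".join(format_tree(children_by_parent(json.loads(SITEMAP)))))
```

With the full sitemap pasted in, the remaining Text selectors (Name, MobilePhone, and so on) would appear next to IDNum under ProfileLink, which in turn sits under LoadMore.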