Help with paginated data please

Good morning team,

I'm trying to scrape a list of exhibitors (formatted in li and p elements) on 306 pages.

I've come up with the sitemap below. It seems to change pages and show data when I preview, but the scrape finishes after a few seconds with either no results or only the entries from the first page.

I have played around with different selectors, elements and delays, but unfortunately to no avail.

If you could please give me a hand with this, I'd really appreciate it.

Many thanks!

Cheers
Clemens

Url: http://www.chinainternationalbeauty.com/gzen/exibitionMenu.html#page-1

Sitemap:
{"_id":"cibe","startUrl":["http://www.chinainternationalbeauty.com/gzen/exibitionMenu.html"],"selectors":[{"id":"items","type":"SelectorHTML","parentSelectors":["_root","next"],"selector":".g_list1li","multiple":true,"regex":"","delay":0},{"id":"next","type":"SelectorElementClick","parentSelectors":["_root","next"],"selector":".light-theme a","multiple":true,"delay":"20000","clickElementSelector":".light-theme a","clickType":"clickOnce","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"}]}

[quote="Clemens, post:1, topic:1225"]
{"_id":"cibe","startUrl":["http://www.chinainternationalbeauty.com/gzen/exibitionMenu.html"],"selectors":[{"id":"items","type":"SelectorHTML","parentSelectors":["_root","next"],"selector":".g_list1li","multiple":true,"regex":"","delay":0},{"id":"next","type":"SelectorElementClick","parentSelectors":["_root","next"],"selector":".light-theme a","multiple":true,"delay":"20000","clickElementSelector":".light-theme a","clickType":"clickOnce","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"}]}
[/quote]

There are a few ways to do this. The easiest is to use an Element click selector with its click type set to "click more" (`clickMore` in the sitemap).

Then, as a child of that selector, you select each row.

{"_id":"cibe","startUrl":["http://www.chinainternationalbeauty.com/gzen/exibitionMenu.html"],"selectors":[{"id":"Pagnation","type":"SelectorElementClick","parentSelectors":["_root"],"selector":".g_list1li","multiple":true,"delay":0,"clickElementSelector":".page-link:last","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"},{"id":"Row Data","type":"SelectorHTML","parentSelectors":["Pagnation"],"selector":"p.word5","multiple":false,"regex":"","delay":0}]}

Note that it will click through all 600+ pages before it scrapes any data.
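
If it helps to see how the selectors nest, here is a rough Python sketch that prints the parent/child tree of a sitemap. It is not part of Web Scraper; the helper name `selector_tree` is made up for illustration, and the JSON is the sitemap above trimmed to the fields that define the tree. In the original sitemap both selectors list `_root` and `next` as parents, whereas here the row selector sits under the click selector.

```python
import json

# The sitemap from this reply, trimmed to the fields that define the tree.
SITEMAP = json.loads("""
{"_id": "cibe", "selectors": [
  {"id": "Pagnation", "type": "SelectorElementClick",
   "parentSelectors": ["_root"], "selector": ".g_list1li",
   "clickElementSelector": ".page-link:last", "clickType": "clickMore"},
  {"id": "Row Data", "type": "SelectorHTML",
   "parentSelectors": ["Pagnation"], "selector": "p.word5"}
]}
""")


def selector_tree(sitemap, parent="_root", depth=0, path=()):
    """Return indented lines listing the selectors nested under `parent`.

    Hypothetical helper for illustration only. `path` holds the ancestors
    already visited so that self-referencing parents do not recurse forever.
    """
    lines = []
    new_path = path + (parent,)
    for sel in sitemap["selectors"]:
        if parent in sel["parentSelectors"]:
            indent = "  " * depth
            lines.append(f'{indent}{sel["id"]} ({sel["type"]}): {sel["selector"]}')
            # Only descend into selectors we have not already passed through.
            if sel["id"] not in new_path:
                lines.extend(selector_tree(sitemap, sel["id"], depth + 1, new_path))
    return lines


if __name__ == "__main__":
    print("\n".join(selector_tree(SITEMAP)))
```

Running it on the sitemap above lists `Pagnation` at the top level with `Row Data` indented beneath it. You can paste any other exported sitemap into `SITEMAP` to check its nesting the same way.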

Hi Bret,

Thanks so much for your help! You even scraped the list and saved it in a spreadsheet, how good is that!

Thanks again, have a great day!

Cheers
Clemens
