How could I scrape with this pagination?

I'm trying to scrape the TikTok Library, but I don't know how to handle the "see more" button. I've tried everything from Element Scroll and Pagination to the selector in the example below (Element click). Can anyone help?

Url: https://library.tiktok.com/ads?region=ES&start_time=1664575200000&end_time=1703409951102&adv_name=supplements&adv_biz_ids=&query_type=1&sort_type=last_shown_date,desc

Sitemap:
{"_id":"titktok","startUrl":["https://library.tiktok.com/ads?region=ES&start_time=1664575200000&end_time=1703409951102&adv_name=supplements&adv_biz_ids=&query_type=1&sort_type=last_shown_date,desc"],"selectors":[{"id":"company","parentSelectors":["_root"],"type":"SelectorElementClick","clickActionType":"real","clickElementSelector":"span.loading_more_text","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","multiple":true,"selector":".ad_card a"},{"id":"name","parentSelectors":["company"],"type":"SelectorText","selector":"span.ad_info_text","multiple":false,"regex":""}]}

Like this:

{"_id":"TIKTOK","startUrl":["https://library.tiktok.com/ads?region=ES&start_time=1664575200000&end_time=1703409951102&adv_name=supplements&adv_biz_ids=&query_type=1&sort_type=last_shown_date,desc"],"selectors":[{"clickActionType":"real","clickElementSelector":"body","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard","id":"pause","multiple":false,"parentSelectors":["more"],"selector":"body","type":"SelectorElementClick"},{"id":"element","multiple":true,"parentSelectors":["more"],"selector":"div.ad_card","type":"SelectorElement"},{"id":"name","multiple":false,"parentSelectors":["element"],"regex":"","selector":"span[class*=\"ad_info_text\"]","type":"SelectorText"},{"id":"more","paginationType":"clickMore","parentSelectors":["_root","more"],"selector":"span.loading_more_text","type":"SelectorPagination"},{"extractAttribute":"href","id":"URL","multiple":false,"parentSelectors":["element"],"selector":"a.link","type":"SelectorElementAttribute"}]}

Oh, very nice! Thanks! But I tried it with a different keyword and it does not work. Do you know why?
https://library.tiktok.com/ads?region=ES&start_time=1664575200000&end_time=1703409951102&adv_name=gmbh&adv_biz_ids=&query_type=1&sort_type=last_shown_date,desc

Did you change the start URL in your sitemap?

Yes! The "see more" button gets clicked, but nothing is scraped...

Hi, the scraper will most likely run into a timeout, since there are a lot of results. Try narrowing the search filter down to a couple of hundred results.
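
One way to narrow the filter is to shorten the date window in the start URL. The `start_time` and `end_time` values in the URLs above appear to be Unix timestamps in milliseconds. The following is only a sketch under that assumption: `to_epoch_ms` and `build_start_url` are hypothetical helper names, and UTC midnight is used here for simplicity, whereas the URLs in this thread look aligned to local midnight in Spain.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://library.tiktok.com/ads"


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def build_start_url(adv_name: str, start: datetime, end: datetime, region: str = "ES") -> str:
    """Rebuild the search URL from this thread with a custom date window.

    The parameter names are copied from the URLs posted above; their exact
    meaning on TikTok's side is an assumption.
    """
    params = {
        "region": region,
        "start_time": to_epoch_ms(start),
        "end_time": to_epoch_ms(end),
        "adv_name": adv_name,
        "adv_biz_ids": "",
        "query_type": 1,
        "sort_type": "last_shown_date,desc",
    }
    # safe="," keeps the comma in sort_type unescaped, as in the original URL
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


# Example: a one-week window instead of more than a year
url = build_start_url(
    "gmbh",
    datetime(2024, 3, 1, tzinfo=timezone.utc),
    datetime(2024, 3, 8, tzinfo=timezone.utc),
)
print(url)
```

The printed URL can then be pasted into the `startUrl` field of the sitemap. If the result count is still too high, the window can be shortened further.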

I tried to add some details about the publisher (location, etc.), so I converted a selector into a link (URL) selector and added text selectors under it. The problem is that nothing happens.

{"_id":"TIKTOK","startUrl":["https://library.tiktok.com/ads?region=ES&start_time=1709247600000&end_time=1709938800000&adv_name=gmbh&adv_biz_ids=&query_type=1&sort_type=last_shown_date,desc"],"selectors":[{"clickActionType":"real","clickElementSelector":"body","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard","id":"pause","multiple":false,"parentSelectors":["more"],"selector":"body","type":"SelectorElementClick"},{"elementLimit":0,"id":"element","multiple":true,"parentSelectors":["more"],"scroll":false,"selector":"div.ad_card","type":"SelectorElement"},{"id":"name","multiple":false,"parentSelectors":["element"],"regex":"","selector":"span[class*=\"ad_info_text\"]","type":"SelectorText"},{"id":"more","paginationType":"clickMore","parentSelectors":["_root","more"],"selector":"span.loading_more_text","type":"SelectorPagination"},{"id":"URL","linkType":"linkFromHref","multiple":true,"parentSelectors":["element"],"selector":"a.link","type":"SelectorLink"},{"id":"location","multiple":false,"parentSelectors":["URL"],"regex":"","selector":".byted-Table-Cell_hover[aria-colindex='2'] div","type":"SelectorText"},{"id":"uniques","multiple":false,"parentSelectors":["URL"],"regex":"","selector":"[aria-colindex='3'] div","type":"SelectorText"},{"id":"interests","multiple":false,"parentSelectors":["URL"],"regex":"","selector":".targeting_additional_parameters_table_row:contains('Intereses') td:nth-of-type(2)","type":"SelectorText"},{"id":"paid by","multiple":false,"parentSelectors":["URL"],"regex":"","selector":"div.ad_advertiser_value:nth-of-type(6)","type":"SelectorText"}]}

Hi,

I removed the 'pause' selector; I'm not sure why it was there. The sitemap seems to be working fine:

{"_id":"TIKTOK","startUrl":["https://library.tiktok.com/ads?region=ES&start_time=1709247600000&end_time=1709938800000&adv_name=gmbh&adv_biz_ids=&query_type=1&sort_type=last_shown_date,desc"],"selectors":[{"elementLimit":0,"id":"element","multiple":true,"parentSelectors":["more"],"scroll":false,"selector":"div.ad_card","type":"SelectorElement"},{"id":"name","multiple":false,"parentSelectors":["element"],"regex":"","selector":"span[class*=\"ad_info_text\"]","type":"SelectorText"},{"id":"more","paginationType":"clickMore","parentSelectors":["_root","more"],"selector":"span.loading_more_text","type":"SelectorPagination"},{"id":"URL","linkType":"linkFromHref","multiple":true,"parentSelectors":["element"],"selector":"a.link","type":"SelectorLink"},{"id":"location","multiple":false,"parentSelectors":["URL"],"regex":"","selector":".ad_details_targeting_title:contains('Ubicación')  ++ div td[aria-colindex=\"2\"] div","type":"SelectorText"},{"id":"uniques","multiple":false,"parentSelectors":["URL"],"regex":"","selector":"[aria-colindex='3'] div","type":"SelectorText"},{"id":"interests","multiple":false,"parentSelectors":["URL"],"regex":"","selector":".targeting_additional_parameters_table_row:contains('Intereses') td:nth-of-type(2)","type":"SelectorText"},{"id":"paid by","multiple":false,"parentSelectors":["URL"],"regex":"","selector":"div.ad_advertiser_value:nth-of-type(6)","type":"SelectorText"}]}
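
Since the sitemaps are hard to read as a single line of JSON, it may help to print the selector hierarchy when comparing two versions. This is a minimal sketch, not part of Web Scraper: `selector_tree` and `format_tree` are hypothetical helpers, and the sitemap below is an abbreviated copy of the one above that keeps only the `id` and `parentSelectors` fields.

```python
import json

# Abbreviated copy of the sitemap above (only ids and parents are kept)
SITEMAP_JSON = """
{"_id": "TIKTOK", "selectors": [
  {"id": "element",  "parentSelectors": ["more"]},
  {"id": "name",     "parentSelectors": ["element"]},
  {"id": "more",     "parentSelectors": ["_root", "more"]},
  {"id": "URL",      "parentSelectors": ["element"]},
  {"id": "location", "parentSelectors": ["URL"]}
]}
"""


def selector_tree(sitemap: dict) -> dict:
    """Map each parent id to the ids of the selectors that list it as a parent."""
    children = {}
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            children.setdefault(parent, []).append(sel["id"])
    return children


def format_tree(children: dict, node: str = "_root", depth: int = 0, path: tuple = ()) -> list:
    """Return the hierarchy as indented lines, marking self-references."""
    lines = ["  " * depth + node]
    for child in children.get(node, []):
        if child == node or child in path:
            # The pagination selector lists itself as a parent; do not recurse
            lines.append("  " * (depth + 1) + child + " (recursive)")
            continue
        lines.extend(format_tree(children, child, depth + 1, path + (node,)))
    return lines


tree_lines = format_tree(selector_tree(json.loads(SITEMAP_JSON)))
print("\n".join(tree_lines))
```

For the sitemap above this shows `more` directly under `_root`, `element` under `more`, and the detail selectors such as `location` under `URL`. Running the same helper on an exported sitemap makes it easier to spot a selector that ended up under the wrong parent.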

Sounds great, but do you know why I don't get any results? Thanks!

No, sorry, I cannot see what happens on your PC. Keep in mind that the scraper will first click the 'load more' button until all elements are loaded and only then start returning data.
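
To illustrate why a large result set can end with no data at all, here is a toy model of that behaviour. It is not how Web Scraper is implemented: `run_click_more` is a hypothetical function, and the page size, click delay, and timeout are made-up numbers chosen only to show the effect.

```python
def run_click_more(total_items: int, page_size: int,
                   seconds_per_click: float, timeout_seconds: float) -> list:
    """Toy model: click 'load more' until everything is loaded, then extract.

    Extraction only happens after the loop finishes, so a timeout during
    the clicking phase yields no rows at all (an assumption of this model).
    """
    loaded = min(page_size, total_items)  # items visible before any click
    elapsed = 0.0
    while loaded < total_items:
        elapsed += seconds_per_click
        if elapsed > timeout_seconds:
            return []  # timed out before extraction started
        loaded = min(loaded + page_size, total_items)
    return [f"ad_{i}" for i in range(loaded)]


# A couple of hundred results: 9 clicks, well within the timeout
small = run_click_more(total_items=200, page_size=20,
                       seconds_per_click=2.0, timeout_seconds=60.0)

# Thousands of results: the clicking phase alone exceeds the timeout
large = run_click_more(total_items=5000, page_size=20,
                       seconds_per_click=2.0, timeout_seconds=60.0)

print(len(small), len(large))
```

In this model the small search returns every row, while the large one returns nothing even though the button was clicked many times, which matches the symptom described in this thread.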

What I wanted to ask: does it work on your computer?

In my case it clicks all the "more" buttons but does not scrape any content.

I deleted the "more" selector from the root and moved the other ones up into the root, and now it scrapes content, but only what is visible on screen.


Yes, the sitemap I posted before works on my PC. You can try to run it on a different PC, or incognito mode, or disable all other extensions that might interfere.