Load more element has different values

Hello,

I am trying to scrape the CBC News search results and have run into a wall. I can't get the Element Click selector to click the "load more" button more than once, so it only scrapes part of the results.

{"_id":"leslyn","startUrl":["https://www.cbc.ca/search?q=leslyn%20lewis&section=news&sortOrder=relevance&media=all"],"selectors":[{"id":"more","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.card-content","multiple":true,"delay":2000,"clickElementSelector":"button.sclt-loadmore1","clickType":"clickMore","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueText"},{"id":"info","type":"SelectorGroup","parentSelectors":["more"],"selector":"h3, time","delay":0,"extractAttribute":""}]}

As you can see, the clickElementSelector "button.sclt-loadmore1" has a number at the end of the class name. Each time the page loads more results, that number grows by one: button.sclt-loadmore2 -> button.sclt-loadmore3.

Is there a workaround for this? I have tried everything I could think of and I really don't want to make dozens of sitemaps to scrape the full search results.

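The "begins with" attribute selector (^=) works great for this. It matches any button whose class attribute starts with sclt-loadmore, whatever number follows. Try: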

Click selector: div > button[class^='sclt-loadmore']
Click type: Click More
Click element uniqueness: Unique HTML
Delay: 3000
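
Applied to the sitemap from the question, those settings might look roughly like the sketch below. I am assuming "uniqueHTML" is how the "Unique HTML" option is written in the exported JSON, so compare it against an export from your own extension before relying on it. Everything else is copied from the original sitemap, with only the click selector, uniqueness type and delay changed.

```json
{
  "_id": "leslyn",
  "startUrl": ["https://www.cbc.ca/search?q=leslyn%20lewis&section=news&sortOrder=relevance&media=all"],
  "selectors": [
    {
      "id": "more",
      "type": "SelectorElementClick",
      "parentSelectors": ["_root"],
      "selector": "div.card-content",
      "multiple": true,
      "delay": 3000,
      "clickElementSelector": "div > button[class^='sclt-loadmore']",
      "clickType": "clickMore",
      "discardInitialElements": "do-not-discard",
      "clickElementUniquenessType": "uniqueHTML"
    },
    {
      "id": "info",
      "type": "SelectorGroup",
      "parentSelectors": ["more"],
      "selector": "h3, time",
      "delay": 0,
      "extractAttribute": ""
    }
  ]
}
```

Note that ^= only matches when sclt-loadmore is the very first thing in the class attribute. If the button ever carries another class in front of it, [class*='sclt-loadmore'] ("contains") would be the more forgiving choice.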

Optional: you can also add a limit with a :not() condition. For example, this stops the clicking once the 5th button appears:

div > button[class^='sclt-loadmore']:not([class*='loadmore5'])

Ref: https://www.w3schools.com/cssref/css_selectors.asp
