The page has a 'load more' button at the bottom, so I set up a selector to click it and bring up more articles. However, neither click option works for me. With 'click once' I get a total of only 25 articles, even though 25 are listed initially, so a single click on 'load more' should return 50. There are 700+ articles on climate change, but when I change the selector to 'click more' so that it keeps going until there are no new elements, it keeps clicking even after all the articles have appeared. The page then returns "No results" over and over without end, so the scraper never gets to scraping any of the data. I tried every option for "click element uniqueness" and got the same results. Ideally I would like to scrape about 250 of the 723 articles, but scraping all of them would be fine as long as the 'No results' loop stops being an issue.
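To make the behaviour I am after concrete, here is a rough sketch in Python. It is not Web Scraper code and does not touch the real site: `fake_site` and `load_batch` are made-up stand-ins for the page and for one click of 'load more', and the batch size of 25 is just what I see on the page. The point is the stopping rule: stop once the target count is reached, or as soon as a click adds nothing new.

```python
from typing import Callable, List


def collect_articles(load_batch: Callable[[int], List[str]],
                     target: int,
                     max_clicks: int = 100) -> List[str]:
    """Collect items until `target` is reached, a click adds nothing new,
    or `max_clicks` clicks have been made (a safety cap)."""
    seen: List[str] = list(load_batch(0))  # items listed before any click
    clicks = 0
    while len(seen) < target and clicks < max_clicks:
        clicks += 1
        batch = load_batch(clicks)  # hypothetical: result of one 'load more' click
        new_items = [item for item in batch if item not in seen]
        if not new_items:
            break  # nothing new appeared (the "No results" case), so stop
        seen.extend(new_items)
    return seen[:target]


def fake_site(total: int = 723, page_size: int = 25) -> Callable[[int], List[str]]:
    """Hypothetical stand-in for the search page: each click reveals up to
    `page_size` more items, and returns an empty list once all are shown."""
    def load_batch(click_number: int) -> List[str]:
        start = click_number * page_size
        return [f"article-{i}" for i in range(start, min(start + page_size, total))]
    return load_batch


if __name__ == "__main__":
    print(len(collect_articles(fake_site(), target=250)))
    print(len(collect_articles(fake_site(), target=10_000)))
```

With a rule like this the first call would stop at the 250 articles I want, and the second would stop once the simulated page runs out of articles instead of clicking forever.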
Url: You searched for climate change - Reason Foundation
Sitemap:
{"_id":"reasf","startUrl":["You searched for climate change"],"selectors":[{"delay":0,"id":"link","multiple":true,"parentSelectors":["_root"],"selector":".title - Reason Foundation a","type":"SelectorLink"},{"clickElementSelector":"div#loadmore","clickElementUniquenessType":"uniqueText","clickType":"clickMore","delay":3500,"discardInitialElements":"do-not-discard","id":"continue","multiple":true,"parentSelectors":["_root"],"selector":"div#loadmore","type":"SelectorElementClick"},{"delay":0,"id":"title","multiple":false,"parentSelectors":["link"],"regex":"","selector":"h1","type":"SelectorText"},{"delay":0,"id":"summary","multiple":false,"parentSelectors":["link"],"regex":"","selector":"h3.entry-subtitle","type":"SelectorText"},{"delay":0,"id":"author","multiple":false,"parentSelectors":["link"],"regex":"","selector":".author-name span","type":"SelectorText"},{"delay":0,"id":"date","multiple":false,"parentSelectors":["link"],"regex":"","selector":"header time","type":"SelectorText"},{"delay":0,"extractAttribute":"","id":"all","parentSelectors":["link"],"selector":".entry-content p","type":"SelectorGroup"}]}
I also tried the approach from the topic "Please help. Is there any way to limit "load more" click element selector? :)", but got the same results as above. Just in case, here is that sitemap as well:
{"_id":"reasf","startUrl":["You searched for climate change"],"selectors":[{"delay":0,"id":"link","multiple":true,"parentSelectors":["_root"],"selector":".title - Reason Foundation a","type":"SelectorLink"},{"clickElementSelector":"<div id="loadmore" style="" class="" value="475">Load More","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickMore","delay":3500,"discardInitialElements":"do-not-discard","id":"continue","multiple":true,"parentSelectors":["_root"],"selector":"div#loadmore","type":"SelectorElementClick"},{"delay":0,"id":"title","multiple":false,"parentSelectors":["link"],"regex":"","selector":"h1","type":"SelectorText"},{"delay":0,"id":"summary","multiple":false,"parentSelectors":["link"],"regex":"","selector":"h3.entry-subtitle","type":"SelectorText"},{"delay":0,"id":"author","multiple":false,"parentSelectors":["link"],"regex":"","selector":".author-name span","type":"SelectorText"},{"delay":0,"id":"date","multiple":false,"parentSelectors":["link"],"regex":"","selector":"header time","type":"SelectorText"},{"delay":0,"extractAttribute":"","id":"all","parentSelectors":["link"],"selector":".entry-content p","type":"SelectorGroup"}]}
Would greatly appreciate any help!
