Scraping forum sites with two scrollable panels


URL: LIHKG forum (LIHKG 討論區)

As you can see in the desktop version of the page, there are two panels: one for the list of posts and one for the threads within a post. I want to scroll the post panel to scrape all the post links first, and then, for each post, scroll to scrape all of its contents. Scrolling does not seem to work in this case. Can anyone help?

Thanks a lot.


Hi,

You can try this setup:

{"_id":"lihkg-com3","startUrl":["https://lihkg.com/thread/3780738/page/1"],"selectors":[{"delay":5000,"elementLimit":500,"id":"element-left-col","multiple":true,"parentSelectors":["_root"],"scrollElementSelector":".C_300Fi04lLpFu_ZznJqq","selector":".qoAmEqNpZRLf2KVKZ8DsC > div","type":"SelectorElementScroll"},{"id":"link","linkType":"linkFromHref","multiple":false,"parentSelectors":["element-left-col"],"selector":"a._2A_7bGY9QAXcGu1neEYDJB","type":"SelectorLink"},{"id":"elements-right-col","multiple":true,"parentSelectors":["pagination"],"selector":"[id*=\"page-\"] ~","type":"SelectorElement"},{"id":"post-html","multiple":false,"parentSelectors":["elements-right-col"],"regex":"","selector":"_parent_","type":"SelectorHTML"},{"id":"pagination","paginationType":"clickOnce","parentSelectors":["link","pagination"],"selector":"[id=\"page-1\"] select option","type":"SelectorPagination"}]}
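To make the nesting in the sitemap above easier to follow, here is a minimal Python sketch that prints the selector hierarchy from the parentSelectors fields. It uses a trimmed copy of the sitemap (only id, type and parentSelectors are kept), and the helper names children_by_parent and format_tree are made up for this illustration; they are not part of Web Scraper.

```python
import json
from collections import defaultdict

# Trimmed copy of the sitemap above: only id, type and parentSelectors are kept.
SITEMAP_JSON = """
{"_id": "lihkg-com3", "selectors": [
  {"id": "element-left-col", "type": "SelectorElementScroll", "parentSelectors": ["_root"]},
  {"id": "link", "type": "SelectorLink", "parentSelectors": ["element-left-col"]},
  {"id": "elements-right-col", "type": "SelectorElement", "parentSelectors": ["pagination"]},
  {"id": "post-html", "type": "SelectorHTML", "parentSelectors": ["elements-right-col"]},
  {"id": "pagination", "type": "SelectorPagination", "parentSelectors": ["link", "pagination"]}
]}
"""


def children_by_parent(sitemap):
    """Map each parent id to the ids of the selectors that list it as a parent."""
    tree = defaultdict(list)
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            tree[parent].append(sel["id"])
    return dict(tree)


def format_tree(tree, node="_root", path=()):
    """Return indented lines for the hierarchy below `node`.

    A selector that appears in its own ancestry (here: pagination) is
    listed once with a marker instead of being expanded again.
    """
    path = path + (node,)
    indent = "  " * (len(path) - 1)
    lines = []
    for child in tree.get(node, []):
        if child in path:
            lines.append(f"{indent}{child} (loops back)")
        else:
            lines.append(f"{indent}{child}")
            lines.extend(format_tree(tree, child, path))
    return lines


tree = children_by_parent(json.loads(SITEMAP_JSON))
print("\n".join(format_tree(tree)))
```

The output shows element-left-col at the top, link under it, pagination under link, and elements-right-col with post-html under pagination. The pagination selector also lists itself as a parent, which appears to be how the setup keeps following the page options of each thread.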

It works like a charm. Thank you very much.

By the way, I have a question about the scroller. How can I add another scrollElementSelector to the scroller? Is this something that cannot be done in the Chrome extension? Thanks for your help.

{"id":"element-left-col","parentSelectors":["_root"],"type":"SelectorElementScroll","selector":".qoAmEqNpZRLf2KVKZ8DsC > div","multiple":true,"delay":5000,"scrollElementSelector":".C_300Fi04lLpFu_ZznJqq","elementLimit":500}

Since it is an inner scroll, a custom code snippet has to be inserted into the sitemap JSON. It is a bit tricky though to find the correct selector.
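The exact steps are not spelled out here, but going by the selector JSON above, the inner scroll container is set through the scrollElementSelector key on the SelectorElementScroll entry. A minimal Python sketch, assuming you export the sitemap as JSON, patch it, and import the result again; set_scroll_element is a hypothetical helper written for this illustration, not something Web Scraper provides.

```python
import copy
import json


def set_scroll_element(sitemap, selector_id, css_selector):
    """Return a copy of `sitemap` with scrollElementSelector set on one selector.

    Only entries of type SelectorElementScroll are accepted, since that is
    the only selector type in this thread that carries the key.
    """
    updated = copy.deepcopy(sitemap)
    for sel in updated["selectors"]:
        if sel["id"] == selector_id:
            if sel.get("type") != "SelectorElementScroll":
                raise ValueError(f"{selector_id} is not a SelectorElementScroll")
            sel["scrollElementSelector"] = css_selector
            return updated
    raise KeyError(selector_id)


# The scroll selector from this thread, shown without the inner-scroll key.
sitemap = {
    "_id": "lihkg-com3",
    "startUrl": ["https://lihkg.com/thread/3780738/page/1"],
    "selectors": [
        {
            "id": "element-left-col",
            "parentSelectors": ["_root"],
            "type": "SelectorElementScroll",
            "selector": ".qoAmEqNpZRLf2KVKZ8DsC > div",
            "multiple": True,
            "delay": 5000,
            "elementLimit": 500,
        }
    ],
}

# Class name copied from the setup above; it will differ on other sites.
patched = set_scroll_element(sitemap, "element-left-col", ".C_300Fi04lLpFu_ZznJqq")
print(json.dumps(patched))
```

The printed JSON can then be pasted back into the sitemap import. The class name has to be that of the panel that actually scrolls, which is the tricky part mentioned above.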

I see. Thanks for your response. Is there any documentation I can refer to about this custom code snippet? Am I understanding correctly that it has to be added directly to the sitemap JSON and cannot be added in the Chrome extension?