How can I enter the search result for this particular website?

Describe the problem.
This is the base sitemap into which I want to import multiple URLs containing search queries.
A search usually returns 2-3 results. I want to open the page of the first result and then extract further information from it (e.g. book title, ratings, price).

Steps I want to do:

  1. Go to search page
  2. Create Element for first result
  3. Create Link within the Element
  4. Go to a specific book page to further extract information

Question 1: I managed to do this for Goodreads and other websites, but not for this one. I can't create an "element" selector that is not specific to a book name. Please help.

Question 2: How can I scrape the link of the page I am scraping? I managed to scrape the "title", but not the "link". Thanks!

Url: 博客來 - Your current search keyword is: 賺錢公司也會倒閉!讀財報最常犯的40個誤解

Sitemap:
{"_id":"Books","startUrl":["Error a","multiple":false,"linkType":"linkFromHref"},{"id":"Book-name","parentSelectors":["Book-link"],"type":"SelectorText","selector":"h1","multiple":false,"regex":""},{"id":"Foreign-name","parentSelectors":["Book-link"],"type":"SelectorText","selector":".mod h2 a","multiple":false,"regex":""},{"id":"Ratings","parentSelectors":["Book-link"],"type":"SelectorText","selector":"div.average","multiple":false,"regex":""},{"id":"Ratings-count","parentSelectors":["Book-link"],"type":"SelectorText","selector":"div.sum:nth-of-type(3)","multiple":false,"regex":""}]}

Hi,

Here is a reference sitemap on how to iterate through the listings:

{"_id":"books","startUrl":["https://search.books.com.tw/search/query/key/%E8%B3%BA%E9%8C%A2%E5%85%AC%E5%8F%B8%E4%B9%9F%E6%9C%83%E5%80%92%E9%96%89%EF%BC%81%E8%AE%80%E8%B2%A1%E5%A0%B1%E6%9C%80%E5%B8%B8%E7%8A%AF%E7%9A%8440%E5%80%8B%E8%AA%A4%E8%A7%A3"],"selectors":[{"id":"wrapper","multiple":true,"parentSelectors":["_root"],"selector":".table-td:has(h4)","type":"SelectorElement"},{"id":"link","linkType":"linkFromHref","multiple":false,"parentSelectors":["wrapper"],"selector":"h4 a","type":"SelectorLink"},{"id":"name","multiple":false,"parentSelectors":["link"],"regex":"","selector":"h1","type":"SelectorText"},{"extractAttribute":"href","id":"listing-link","multiple":false,"parentSelectors":["link"],"selector":"[rel=\"canonical\"]","type":"SelectorElementAttribute"}]}
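To make the nesting in the reference sitemap above easier to see, here is a small Python sketch that prints the selectors as a tree based on their "parentSelectors". It is only an illustration, not part of Web Scraper itself; the helper names `selector_tree` and `render` are made up for this example, and the start URL is left out for brevity.

```python
import json
from collections import defaultdict

# Selectors copied from the reference sitemap (startUrl omitted for brevity).
SELECTORS_JSON = r'''
[
 {"id":"wrapper","multiple":true,"parentSelectors":["_root"],"selector":".table-td:has(h4)","type":"SelectorElement"},
 {"id":"link","linkType":"linkFromHref","multiple":false,"parentSelectors":["wrapper"],"selector":"h4 a","type":"SelectorLink"},
 {"id":"name","multiple":false,"parentSelectors":["link"],"regex":"","selector":"h1","type":"SelectorText"},
 {"extractAttribute":"href","id":"listing-link","multiple":false,"parentSelectors":["link"],"selector":"[rel=\"canonical\"]","type":"SelectorElementAttribute"}
]
'''

def selector_tree(selectors):
    """Group selectors by the id of each parent they are attached to."""
    children = defaultdict(list)
    for sel in selectors:
        for parent in sel["parentSelectors"]:
            children[parent].append(sel)
    return children

def render(children, parent="_root", depth=0):
    """Return indented lines, one per selector, walking down from `parent`."""
    lines = []
    for sel in children.get(parent, []):
        lines.append("  " * depth + f'{sel["id"]} ({sel["type"]}): {sel["selector"]}')
        lines.extend(render(children, sel["id"], depth + 1))
    return lines

tree_lines = render(selector_tree(json.loads(SELECTORS_JSON)))
print("\n".join(tree_lines))
```

The output shows "wrapper" under the root, "link" inside "wrapper", and both "name" and "listing-link" inside "link", which means those last two are extracted on the book page that the link opens, not on the search page.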

I hope this helps!

Thanks a lot! It works!

One follow-up question: for the listing link, how can I select "[rel="canonical"]" using the mouse cursor? Or is it something I have to remember, and does it work on every site?

Hi,

No, this selector cannot be picked with the mouse; it is identified by inspecting the page's HTML. The same selector should work on most websites, as long as the element is present in the HTML.
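As a rough way to check whether a page has such an element, here is a minimal Python sketch using only the standard library. The sample HTML and the example.com URL are made up for illustration, and the names `CanonicalFinder` and `find_canonical` are hypothetical. Note that it matches "canonical" as one of the space-separated rel values, which is slightly looser than the exact-match CSS selector [rel="canonical"].

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Records the href of the first <link> tag whose rel includes "canonical"."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")

# Hypothetical page source; a real page would be saved or fetched separately.
SAMPLE_HTML = """<html><head>
<title>Example book</title>
<link rel="canonical" href="https://example.com/books/123">
</head><body><h1>Example book</h1></body></html>"""

def find_canonical(html):
    """Return the canonical URL found in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

print(find_canonical(SAMPLE_HTML))
```

If this returns None for a page's source, the canonical selector will most likely come back empty for that site, and the link has to be taken from somewhere else.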