Hi, nice to meet you all,
I need a little help retrieving data from the Shopee website. The data I want is the store location and store name (primary), plus the number of products sold, followers, and store performance (optional). To get the store name I have to open each product page, while the store location is available from the thumbnail in the search results.
There are two things I'd like help with:
- I want to remove duplicate rows from the data I retrieve.
- The data I retrieved comes from only one page. How can I make the scraper move to the next page automatically once all the data on the current page has been retrieved?
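To show what I mean by removing duplicates, here is a minimal Python sketch. It assumes the scraped rows are exported to CSV and that the column names match the selector ids in my sitemap (`toko` for the store name, `Lokasi` for the location). The function name `drop_duplicate_stores` and the sample rows are made up for illustration only.

```python
import csv
import io


def drop_duplicate_stores(rows, key="toko"):
    """Keep the first row seen for each store name.

    `key` is assumed to be the store-name column of the exported CSV.
    Rows with an empty store name are kept, since they cannot be compared.
    """
    seen = set()
    unique = []
    for row in rows:
        # Normalise the name so "Store A" and " store a " count as the same store.
        name = (row.get(key) or "").strip().lower()
        if name and name in seen:
            continue
        if name:
            seen.add(name)
        unique.append(row)
    return unique


# Made-up sample standing in for the exported CSV file.
sample = io.StringIO(
    "toko,Lokasi\n"
    "Store A,KOTA BANDUNG\n"
    "Store B,KOTA SURABAYA\n"
    "Store A,KOTA BANDUNG\n"
)
unique_rows = drop_duplicate_stores(list(csv.DictReader(sample)))
print([row["toko"] for row in unique_rows])
```

The same store appears once per product in the search results, so keeping only the first row per store name would give one row per store.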
Search results page (thumbnails), used for the store location data.
Url: https://shopee.co.id/Perawatan-Kecantikan-cat.11043145?page=0
Product page, used for the store name and store performance data.
Url: https://shopee.co.id/Wardah-UV-Shield-Essential-Sunscreen-Gel-SPF-30PA-Sun-Protect-Tabir-Surya-i.45687268.4056380630?sp_atk=d8fd9992-0e59-4f8c-a3ad-b083eddee21d&xptdk=d8fd9992-0e59-4f8c-a3ad-b083eddee21d
Here is the sitemap I have created, based on a reference from @leemeng.
Sitemap:
{"_id":"test-shopee","startUrl":["https://shopee.co.id/Perawatan-Kecantikan-cat.11043145?page=0"],"selectors":[{"id":"Results wrapper","multiple":false,"parentSelectors":["_root"],"selector":"div[role='main'] div.shopee-search-item-result","type":"SelectorElement"},{"delay":2500,"elementLimit":0,"id":"Separate scroller","multiple":true,"parentSelectors":["Results wrapper"],"selector":"div.col-xs-2-4:nth-of-type(n+8)","type":"SelectorElementScroll"},{"id":"Product wrappers","multiple":true,"parentSelectors":["Results wrapper"],"selector":"div.col-xs-2-4","type":"SelectorElement"},{"id":"Product name","multiple":false,"parentSelectors":["Product wrappers"],"regex":"","selector":"div[data-sqe='name']","type":"SelectorText"},{"id":"Lokasi","multiple":false,"parentSelectors":["Product wrappers"],"regex":"","selector":"div.mrz-bA","type":"SelectorText"},{"id":"Page link","multiple":false,"parentSelectors":["Product wrappers"],"selector":"a","type":"SelectorLink"},{"id":"Page number","multiple":false,"parentSelectors":["Results wrapper"],"regex":"","selector":"div.shopee-page-controller button.shopee-button-solid","type":"SelectorText"},{"id":"toko","multiple":false,"parentSelectors":["Page link"],"regex":"","selector":"div.VlDReK","type":"SelectorText"},{"id":"penilaian (in Thousand)","multiple":false,"parentSelectors":["Page link"],"regex":"[0-9]+\\,[0-9]+","selector":"[data-testid='shop_ratings_section_pdp'] span","type":"SelectorText"},{"id":"product (in Thousand)","multiple":false,"parentSelectors":["Page link"],"regex":"[0-9]+\\,[0-9]+","selector":"span.vUG3KX","type":"SelectorText"},{"id":"Presentase","multiple":false,"parentSelectors":["Page link"],"regex":"","selector":"[data-testid='shop_response_rate_section_pdp'] span","type":"SelectorText"},{"id":"Pengikut","multiple":false,"parentSelectors":["Page link"],"regex":"[0-9]+\\,[0-9]+","selector":"[data-testid='shop_follower_section_pdp'] span","type":"SelectorText"},{"delay":3000,"elementLimit":500,"id":"page-scroller","multiple":true,"parentSelectors":["Page link"],"selector":"div.page-product","type":"SelectorElementScroll"}]}
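For the page switching, one idea I am considering is to list every page in `startUrl`, since it is already an array in the sitemap. The Python sketch below only builds that list. It assumes the category pages are numbered consecutively starting from `page=0`. The helper `build_start_urls` and the `last_page=2` value are hypothetical, and I have not confirmed how many pages the category really has.

```python
import json

# Category URL from my sitemap, without the page parameter.
BASE_URL = "https://shopee.co.id/Perawatan-Kecantikan-cat.11043145"


def build_start_urls(base_url, last_page):
    """Return one URL per page, from page=0 up to and including last_page."""
    return [f"{base_url}?page={n}" for n in range(last_page + 1)]


# Trimmed stand-in for the sitemap above; the selectors are left out here.
sitemap = {
    "_id": "test-shopee",
    "startUrl": [f"{BASE_URL}?page=0"],
    "selectors": [],
}

# last_page=2 is a placeholder, not the real page count.
sitemap["startUrl"] = build_start_urls(BASE_URL, last_page=2)
print(json.dumps(sitemap["startUrl"], indent=2))
```

The printed list could then be pasted into the `startUrl` field of the sitemap, but I am not sure this is the recommended way to handle pagination.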
Thank you.