Describe the problem.
I started with Pagination >> Link Selector, then click into the listing and capture title, manufacturer code, label, pack size, etc. But I cannot get the link selector to select the link on this specific website. Can someone please help me? What am I doing wrong? Thanks.
Url: US Foods
Sitemap:
{id:"sitemap code"}
Hi,
The select tool cannot select the links because they don't exist in the HTML; they are only generated dynamically after you click on a product.
I can think of a workaround to open the product pages. Let me know if this is still required, since the thread was posted 20 days ago.
Cheers
Hi Jan,
Yes, I am still trying to figure out a way. Please let me know how I can achieve this. Thank you!
Ok, I see. Here are the steps:
- As the scroll is a bit fussy, you will have to manually scroll down to load all products. Once all products are loaded, scroll to the top of the page and click 'Data preview' for the product-card selector in this sitemap:
{"_id":"usfoods-ids","startUrl":["https://order.usfoods.com/touch/search2?correlationId=68447"],"selectors":[{"elementLimit":0,"id":"product-card","multiple":true,"parentSelectors":["_root"],"scroll":true,"selector":"app-selectable-search-result","type":"SelectorElement"},{"id":"product-id","multiple":false,"multipleType":"singleColumn","parentSelectors":["product-card"],"regex":"","selector":".usf-product-card-num-text","type":"SelectorText"}]}
Copy the IDs to an Excel sheet or another spreadsheet.
- You can use 'Find and replace' to replace # with
https://order.usfoods.com/touch/products/
Now you have a list of product URLs you can access directly.
- Set the product URLs as start URLs for the following sitemap to scrape them directly. You can add more selectors as required. The URLs have to be copied into the JSON in the format:
"URL1","URL2","URL3"
{"_id":"usfoods-products","startUrl":["https://order.usfoods.com/touch/products/6606132"],"selectors":[{"id":"title","multipleType":"singleColumnWithSeparator","parentSelectors":["product-page"],"regex":"","selector":"[data-cy=\"pdp-summary-desc-text\"]","type":"SelectorText"},{"id":"size","multiple":false,"multipleType":"singleColumn","parentSelectors":["product-page"],"regex":"","selector":"[data-cy=\"pdp-summary-sales-pack-size-text\"]","type":"SelectorText"},{"id":"code","multiple":false,"multipleType":"singleColumn","parentSelectors":["product-page"],"regex":"","selector":"[data-cy=\"pdp-summary-product-number-text\"]","type":"SelectorText"},{"id":"Manufacturer-nr","multiple":false,"multipleType":"singleColumn","parentSelectors":["product-page"],"regex":"","selector":"[data-cy=\"manufacturer-product-number-list\"]","type":"SelectorText"},{"elementLimit":0,"id":"product-page","multiple":true,"parentSelectors":["_root"],"scroll":false,"selector":"[data-cy=\"product-detail-content\"]","type":"SelectorElement"}]}
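If the list of IDs is long, the find-and-replace and copy-paste steps can also be scripted. Here is a minimal Python sketch, assuming the scraped IDs look like "#6606132" (a "#" followed by the product number). The helper names ids_to_urls and with_start_urls are made up for this example, and the ID 1234567 is only a placeholder, not a real product.

```python
import json

# Base URL taken from the find-and-replace step above.
BASE_URL = "https://order.usfoods.com/touch/products/"


def ids_to_urls(raw_ids):
    """Turn scraped IDs (assumed format: '#6606132') into product URLs.

    Blank entries and duplicates are skipped; the original order is kept.
    """
    urls = []
    seen = set()
    for raw in raw_ids:
        product_id = raw.strip().lstrip("#").strip()
        if not product_id or product_id in seen:
            continue
        seen.add(product_id)
        urls.append(BASE_URL + product_id)
    return urls


def with_start_urls(sitemap, urls):
    """Return a copy of the sitemap dict with its startUrl list replaced."""
    updated = dict(sitemap)
    updated["startUrl"] = list(urls)
    return updated


if __name__ == "__main__":
    # Placeholder input: paste the IDs copied from 'Data preview' here.
    scraped_ids = ["#6606132", " #1234567 ", "", "#6606132"]
    # Shortened sitemap for illustration; paste the full selectors list
    # from the usfoods-products sitemap above instead of the empty list.
    sitemap = {"_id": "usfoods-products", "startUrl": [], "selectors": []}
    print(json.dumps(with_start_urls(sitemap, ids_to_urls(scraped_ids))))
```

The printed JSON can then be pasted into 'Import Sitemap'. Using json.dumps keeps the quoting and commas of the "URL1","URL2","URL3" format correct, so you don't have to build that list by hand.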