Web Scraper version: 1.75.7
Chrome version: 122.0.6261.111
OS: Arch Linux
Link to the site you were scraping: Batteries & Power Adapters Parts & Upgrades - XPS Laptops | Dell USA
Sitemap:
{"_id":"DellSearchList","startUrl":["https://www.dell.com/en-us/shop/pfydresults/278727?categoryId=8490&sid=52","https://www.dell.com/en-us/shop/pfydresults/269121?categoryId=8490&sid=52","https://www.dell.com/en-us/shop/pfydresults/273243?categoryId=8490&sid=52","https://www.dell.com/en-us/shop/pfydresults/275214?categoryId=8490&sid=52","https://www.dell.com/en-us/shop/pfydresults/233361?categoryId=8490&sid=52"],"selectors":[{"id":"paginator","parentSelectors":["_root","paginator"],"paginationType":"auto","type":"SelectorPagination","selector":"button.dds__pagination__next-page"},{"id":"product_page","parentSelectors":["_root","paginator"],"type":"SelectorLink","selector":".ps-title a","multiple":true,"linkType":"linkFromHref"},{"id":"parts","parentSelectors":["product_page"],"type":"SelectorText","selector":"div.ps-product-info","multiple":false,"regex":""}]}
The product lists for XPS L321X and XPS 17 (9720) scrape fine. The XPS 17 (9710) page lists 4 products, but only one gets scraped. For XPS 17 (9700), all products are skipped entirely.
However, if I scrape pages individually, it works:
{"_id":"DellSearchSingle","startUrl":["https://www.dell.com/en-us/shop/pfydresults/273243?categoryId=8490&sid=52"],"selectors":[{"id":"paginator","parentSelectors":["_root","paginator"],"paginationType":"auto","type":"SelectorPagination","selector":"button.dds__pagination__next-page"},{"id":"product_page","parentSelectors":["_root","paginator"],"type":"SelectorLink","selector":".ps-title a","multiple":true,"linkType":"linkFromHref"},{"id":"parts","parentSelectors":["product_page"],"type":"SelectorText","selector":"div.ps-product-info","multiple":false,"regex":""}]}
In this case, I scrape XPS 17 (9710) with no other URLs and it scrapes all 4 products fine. Nothing is changed about the scraper except for the list of start URLs.
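Since single-URL sitemaps scrape correctly, one possible stopgap is to split the multi-URL sitemap into one sitemap per start URL and import each separately. This is a minimal sketch in Python, not something Web Scraper provides: the `split_sitemap` helper and the `_1`, `_2` suffix scheme for `_id` are made up for illustration, and the example uses only two of the start URLs above to keep it short.

```python
import json

# Shortened copy of the DellSearchList sitemap (two of the five start URLs).
SITEMAP_JSON = """
{"_id":"DellSearchList",
 "startUrl":["https://www.dell.com/en-us/shop/pfydresults/278727?categoryId=8490&sid=52",
             "https://www.dell.com/en-us/shop/pfydresults/273243?categoryId=8490&sid=52"],
 "selectors":[
  {"id":"paginator","parentSelectors":["_root","paginator"],"paginationType":"auto",
   "type":"SelectorPagination","selector":"button.dds__pagination__next-page"},
  {"id":"product_page","parentSelectors":["_root","paginator"],"type":"SelectorLink",
   "selector":".ps-title a","multiple":true,"linkType":"linkFromHref"},
  {"id":"parts","parentSelectors":["product_page"],"type":"SelectorText",
   "selector":"div.ps-product-info","multiple":false,"regex":""}]}
"""


def split_sitemap(sitemap: dict) -> list[dict]:
    """Return one sitemap per start URL; selectors are left unchanged."""
    singles = []
    for i, url in enumerate(sitemap["startUrl"], start=1):
        single = dict(sitemap)                    # shallow copy of the top level
        single["_id"] = f"{sitemap['_id']}_{i}"   # hypothetical naming scheme
        single["startUrl"] = [url]
        singles.append(single)
    return singles


sitemap = json.loads(SITEMAP_JSON)
singles = split_sitemap(sitemap)

# Print each single-URL sitemap as compact JSON, ready to paste into the import dialog.
for single in singles:
    print(json.dumps(single, separators=(",", ":")))
```

Each printed line is a sitemap with the same selectors and exactly one start URL, which matches the DellSearchSingle setup that works.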
This happens whether I use a list of start URLs or scrape the links to each accessory list from Batteries & Power Adapters Parts & Upgrades - XPS Laptops | Dell USA. In both cases, certain products get skipped.
From what I can tell, changing the order of the start URLs changes which products on which pages get skipped. So it appears to be a problem with how Web Scraper internally queues the links to load.