Scraping multiple select boxes (product variations)

I need to scrape a shop. The products sometimes have variations, sometimes multiple variations, in the form of two or three select dropdowns.
Right now my sitemap does navigate through the select box options, but it does not scrape all 4 variations (see my example sitemap with an example product). It scrapes one variation twice and misses another one.
As we have many different options with different names across the whole shop, we need to scrape the variations just by selecting/clicking through all possible combinations of the select dropdown options (no option names or anything similar can be used).

Url: Schwingen & Umlenkschwingen für Harley-Davidson

Sitemap:
{"_id":"RICKS_SHOP","startUrl":["https://www.ricks-motorcycles.shop/shop/wg/ricks-parts/schwingen/"],"selectors":[{"id":"Produklinks","parentSelectors":["_root"],"type":"SelectorLink","selector":".col-lg-4 .obs-product-card-title a","multiple":true,"linkType":"linkFromHref"},{"id":"variation-click","parentSelectors":["Produklinks"],"type":"SelectorElementClick","clickActionType":"real","clickElementSelector":"select option:not(:contains('Auswählen:')","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard-when-click-element-exists","multiple":true,"selector":"#oneboxshop-pages-product-out div.wp-block-columns"},{"id":"Name","parentSelectors":["variation-click"],"type":"SelectorText","selector":"h1","multiple":false,"regex":""},{"id":"Preis","parentSelectors":["variation-click"],"type":"SelectorText","selector":"div.obs-product-price","multiple":false,"regex":""},{"id":"Artikelnummer","parentSelectors":["variation-click"],"type":"SelectorText","selector":":contains('Artikelnummer') div.obs-product-item_mpn_boxss","multiple":false,"regex":""},{"id":"Features_BOX","parentSelectors":["variation-click"],"type":"SelectorHTML","selector":".obs-product-features-table tbody","multiple":false,"regex":""},{"id":"IMAGES","parentSelectors":["variation-click"],"type":"SelectorGroup","selector":"li.obs-product-page-thumbnail-slider-item img","extractAttribute":"src"},{"id":"Var_selected","parentSelectors":["variation-click"],"type":"SelectorText","selector":"select option:selected","multiple":false,"regex":""},{"id":"Var_selected_1","parentSelectors":["variation-click"],"type":"SelectorText","selector":"tr:contains('Felgenmaß') select option:selected","multiple":false,"regex":""}]}

Hi,

Please see below an example with 2 dropdown variants, which will work with any variant label:

{"_id":"RICKS_SHOP","startUrl":["https://www.ricks-motorcycles.shop/shop/wg/ricks-parts/schwingen/"],"selectors":[{"id":"Produklinks","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":".col-lg-4 .obs-product-card-title a","type":"SelectorLink"},{"clickActionType":"real","clickElementSelector":".obs-product-variations-table tr:nth-of-type(2) select option:not([value=\"0\"])","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard-when-click-element-exists","id":"var-2-click","multiple":true,"parentSelectors":["var-1-click"],"selector":"_parent_","type":"SelectorElementClick"},{"id":"Name","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"Preis","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":"div.obs-product-price","type":"SelectorText"},{"id":"Artikelnummer","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":":contains('Artikelnummer') div.obs-product-item_mpn_boxss","type":"SelectorText"},{"id":"Features_BOX","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":".obs-product-features-table tbody","type":"SelectorHTML"},{"extractAttribute":"src","id":"IMAGES","parentSelectors":["var-2-click"],"selector":"li.obs-product-page-thumbnail-slider-item img","type":"SelectorGroup"},{"id":"var-1-title","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":".obs-product-variations-table tr:nth-of-type(1) label","type":"SelectorText"},{"id":"var-1-value","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":".obs-product-variations-table tr:nth-of-type(1) select option[selected]","type":"SelectorText"},{"clickActionType":"real","clickElementSelector":".obs-product-variations-table tr:nth-of-type(1) select option:not([value=\"0\"])","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard-when-click-element-exists","id":"var-1-click","multiple":true,"parentSelectors":["Produklinks"],"selector":"body","type":"SelectorElementClick"},{"id":"var-2-title","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":".obs-product-variations-table tr:nth-of-type(2) label","type":"SelectorText"},{"id":"var-2-value","multiple":false,"parentSelectors":["var-2-click"],"regex":"","selector":".obs-product-variations-table tr:nth-of-type(2) select option[selected]","type":"SelectorText"}]}

Hi JanAp,

thanks, that does indeed work with 2 select dropdowns.
But sometimes we have 4 select dropdowns, sometimes 3, 1 or none.
Is there a way to automatically use all select boxes that are available, or to scrape the product once if none exist? Otherwise I would have to adjust the scraping for hundreds of products individually, which is impossible.

You can add as many as required by applying the same logic: var-3-click would be a child selector of var-2-click and so on.
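
To illustrate the nesting, here is a rough Python sketch that generates the chain of click selectors for any number of dropdown levels instead of writing them by hand. It is only a sketch: the function name `build_variant_selectors` is made up for this example, and it assumes that every dropdown sits in its own row of `.obs-product-variations-table` and that the placeholder option has `value="0"`, as in the sitemap above.

```python
import json

# Table selector copied from the sitemap above; assumed to hold one dropdown per row.
TABLE = ".obs-product-variations-table"


def build_variant_selectors(levels, first_parent="Produklinks"):
    """Hypothetical helper: build var-1-click .. var-N-click, each a child of
    the previous one, plus title/value selectors under the deepest click."""
    clicks = []
    parent = first_parent
    for k in range(1, levels + 1):
        row = f"{TABLE} tr:nth-of-type({k})"
        clicks.append({
            "id": f"var-{k}-click",
            "type": "SelectorElementClick",
            "parentSelectors": [parent],
            # The first click works on the page body, later ones on their parent.
            "selector": "body" if k == 1 else "_parent_",
            "multiple": True,
            "clickActionType": "real",
            # Assumption: the placeholder option has value="0".
            "clickElementSelector": f'{row} select option:not([value="0"])',
            "clickElementUniquenessType": "uniqueCSSSelector",
            "clickType": "clickOnce",
            "delay": 2000,
            "discardInitialElements": "discard-when-click-element-exists",
        })
        parent = f"var-{k}-click"

    # After the loop, `parent` is the deepest click (or first_parent if levels == 0).
    texts = []
    for k in range(1, levels + 1):
        row = f"{TABLE} tr:nth-of-type({k})"
        for suffix, css in (("title", "label"), ("value", "select option[selected]")):
            texts.append({
                "id": f"var-{k}-{suffix}",
                "type": "SelectorText",
                "parentSelectors": [parent],
                "selector": f"{row} {css}",
                "multiple": False,
                "regex": "",
            })
    return clicks + texts


if __name__ == "__main__":
    # Print the selectors for three dropdown levels.
    print(json.dumps(build_variant_selectors(3), indent=2))
```

The generated objects would go into the sitemap's "selectors" array. The data selectors (Name, Preis, Artikelnummer and so on) would then need their "parentSelectors" pointed at the deepest click, for example var-3-click.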

It does not matter if there are no variants. The scraper will simply collect all data available without performing any clicks.
