How would you scrape the available colors of a product?

I am scraping carpet collections from this starting point on the Stark website: https://www.starkcarpet.com/search?category=CARPET.

Once I click into a collection (example collection: SHARNA), how do I make sure it grabs each of the 7 colors in the collection, along with each color's name, image, model code, and own URL?

Right now it only grabs one of the colors available in each collection.

{"_id":"starkcarpet","startUrl":["https://www.starkcarpet.com/search?category=CARPET"],"selectors":[{"id":"collection","type":"SelectorLink","parentSelectors":["_root"],"selector":"a.productLink","multiple":true,"delay":0},{"id":"collectionname","type":"SelectorText","parentSelectors":["collection"],"selector":"h1.prodName","multiple":false,"regex":"","delay":0},{"id":"indivcolorname","type":"SelectorText","parentSelectors":["collection"],"selector":"span.selectedColorWayValue","multiple":false,"regex":"","delay":0},{"id":"image","type":"SelectorImage","parentSelectors":["collection"],"selector":"img.prodFullImg","multiple":false,"delay":0},{"id":"construction","type":"SelectorText","parentSelectors":["collection"],"selector":"[for='carpet-ac-1'] span.articleHeading","multiple":false,"regex":"","delay":0},{"id":"yarncomosition","type":"SelectorText","parentSelectors":["collection"],"selector":"[for='carpet-ac-2'] span.articleHeading","multiple":false,"regex":"","delay":0},{"id":"patternrepeat","type":"SelectorText","parentSelectors":["collection"],"selector":"[for='carpet-ac-9'] span.articleHeading","multiple":false,"regex":"","delay":0},{"id":"category","type":"SelectorText","parentSelectors":["collection"],"selector":"[for='carpet-ac-10'] span.articleHeading","multiple":false,"regex":"","delay":0},{"id":"modelnumber","type":"SelectorText","parentSelectors":["collection"],"selector":"div.productCode","multiple":false,"regex":"","delay":0}]}

Thanks for any guidance on this!
-Jenny

Hi @JennyK

You need an Element Click selector that clicks through each of the color swatches, so that the child selectors are extracted once per color.

See my example here:

{"_id":"carpet-color-click","startUrl":["https://www.starkcarpet.com/sharna-carpet?c=BRUDOVEWIDE1937"],"selectors":[{"id":"element-click-color","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"body:has(h1[class=\"prodName\"])","multiple":true,"delay":"900","clickElementSelector":"img.swatchColorImg","clickType":"clickOnce","discardInitialElements":"discard-when-click-element-exists","clickElementUniquenessType":"uniqueCSSSelector"},{"id":"color","type":"SelectorText","parentSelectors":["element-click-color"],"selector":"span.selectedColorWayValue","multiple":false,"regex":"","delay":0},{"id":"image - 1","type":"SelectorElementAttribute","parentSelectors":["element-click-color"],"selector":"div.productDisplay img:nth(0)","multiple":false,"extractAttribute":"src","delay":0},{"id":"image - 2","type":"SelectorElementAttribute","parentSelectors":["element-click-color"],"selector":"div.productDisplay img:nth(1)","multiple":false,"extractAttribute":"src","delay":0},{"id":"image - 3","type":"SelectorElementAttribute","parentSelectors":["element-click-color"],"selector":"div.productDisplay img:nth(2)","multiple":false,"extractAttribute":"src","delay":0},{"id":"code","type":"SelectorText","parentSelectors":["element-click-color"],"selector":"div.productCode","multiple":false,"regex":"","delay":0}]}
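The example sitemap starts from a single product page, while the original sitemap starts from the collection listing. One possible way to combine them is to re-parent the click selector under the `collection` link selector. The Python sketch below does only that dictionary manipulation. The helper name `nest_under` is hypothetical, the sitemaps are trimmed to a few fields for brevity, and whether the merged sitemap behaves as intended on the site is an assumption that would need to be checked in Web Scraper itself.

```python
import copy
import json


def nest_under(parent_sitemap, child_sitemap, parent_id):
    """Return a new sitemap where the child's _root-level selectors hang under parent_id.

    Hypothetical helper: it only rewrites "parentSelectors" and does not
    check for duplicate selector ids.
    """
    merged = copy.deepcopy(parent_sitemap)
    for selector in copy.deepcopy(child_sitemap["selectors"]):
        selector["parentSelectors"] = [
            parent_id if parent == "_root" else parent
            for parent in selector["parentSelectors"]
        ]
        merged["selectors"].append(selector)
    return merged


# Trimmed versions of the two sitemaps from this thread (most fields omitted).
collections = {
    "_id": "starkcarpet",
    "startUrl": ["https://www.starkcarpet.com/search?category=CARPET"],
    "selectors": [
        {"id": "collection", "type": "SelectorLink",
         "parentSelectors": ["_root"], "selector": "a.productLink",
         "multiple": True},
    ],
}
color_click = {
    "_id": "carpet-color-click",
    "selectors": [
        {"id": "element-click-color", "type": "SelectorElementClick",
         "parentSelectors": ["_root"],
         "clickElementSelector": "img.swatchColorImg", "multiple": True},
        {"id": "color", "type": "SelectorText",
         "parentSelectors": ["element-click-color"],
         "selector": "span.selectedColorWayValue", "multiple": False},
    ],
}

merged = nest_under(collections, color_click, "collection")
print(json.dumps(merged, indent=2))
```

The printed JSON could then be pasted into Import Sitemap after restoring the omitted fields from the full sitemaps above.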

As for the product URLs: you can see that the end of each product URL is the SKU (for example, `?c=BRUDOVEWIDE1937`).

You can add the URL through Cloud Scraper like this. See the screenshots below.

Simply add a Virtual Column based on the code column, then modify it with the Parser feature so that it becomes the full URL.
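If you would rather build the URLs after exporting the data, the same idea can be sketched in Python. This is only an illustration: the function name `color_url` is made up, and it assumes that the scraped code text is exactly the value of the `c` query parameter, as in the SHARNA example URL above.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def color_url(collection_url, code):
    """Build a per-color URL by setting the `c` query parameter to the scraped code.

    Assumption: the scraped code (e.g. from div.productCode) matches the
    SKU used in the URL. Any existing query string is replaced.
    """
    parts = urlsplit(collection_url)
    query = urlencode({"c": code.strip()})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(color_url("https://www.starkcarpet.com/sharna-carpet", " BRUDOVEWIDE1937 "))
# prints https://www.starkcarpet.com/sharna-carpet?c=BRUDOVEWIDE1937
```

If the scraped code contains extra text such as a label, it would need to be stripped first (for example with a regex in the selector) before building the URL.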




Hope it helps.

This was helpful, thank you!
