Adding selector element attribute creates gap in table

Describe the problem.

Url: https://www.superc.ca/epicerie-en-ligne/circulaire?sortOrder=relevance&filter=%3Arelevance%3Adeal%3ACirculaire+et+promotions

Sitemap:
{"_id":"superc","startUrl":["https://www.superc.ca/epicerie-en-ligne/circulaire?sortOrder=relevance&filter=%3Arelevance%3Adeal%3ACirculaire+et+promotions"],"selectors":[{"elementLimit":0,"id":"product wrapper","multiple":true,"parentSelectors":["_root"],"scroll":false,"selector":".tiles-container","type":"SelectorElement"},{"extractAttribute":"data-product-code","id":"Product code","multiple":true,"parentSelectors":["product wrapper"],"selector":".tile-product","type":"SelectorElementAttribute"},{"id":"product name","multiple":true,"parentSelectors":["product wrapper"],"regex":"","selector":".head__title","type":"SelectorText"}]}

Hey, any time I add an Element attribute selector, the rows in the table no longer line up:

Is there a way to fix this?

thanks

Hi,

The wrapper selector should match each product card, and the data selectors nested inside should not have 'multiple' checked:

{"_id":"superc","startUrl":["https://www.superc.ca/epicerie-en-ligne/circulaire?sortOrder=relevance&filter=%3Arelevance%3Adeal%3ACirculaire+et+promotions"],"selectors":[{"elementLimit":0,"id":"product wrapper","multiple":true,"parentSelectors":["_root"],"scroll":false,"selector":".tile-product","type":"SelectorElement"},{"extractAttribute":"data-product-code","id":"Product code","multiple":false,"parentSelectors":["product wrapper"],"selector":"_parent_","type":"SelectorElementAttribute"},{"id":"product name","multiple":false,"parentSelectors":["product wrapper"],"regex":"","selector":".head__title","type":"SelectorText"}]}
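
To illustrate why a per-card wrapper keeps the columns aligned, here is a rough Python sketch. It is not how Web Scraper works internally. The markup is a hypothetical, heavily simplified stand-in for the real page, and `rows_per_card` is a made-up helper name. The point is only that when each product card is processed on its own, the code and the name read from that card always land in the same row.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (assumption); the real page is more complex.
html = """
<div class="tiles-container">
  <div class="tile-product" data-product-code="A1"><div class="head__title">Apples</div></div>
  <div class="tile-product" data-product-code="B2"><div class="head__title">Bread</div></div>
</div>
""".strip()

root = ET.fromstring(html)

def rows_per_card(root):
    """Build one row per product card, reading both fields from the same card."""
    rows = []
    for card in root.iter("div"):
        if card.get("class") != "tile-product":
            continue  # skip the container and the title divs
        title = card.find(".//div[@class='head__title']")
        rows.append({
            "Product code": card.get("data-product-code"),  # attribute of the card itself
            "product name": title.text if title is not None else None,
        })
    return rows

for row in rows_per_card(root):
    print(row)
```

In the corrected sitemap the same idea shows up as the wrapper matching `.tile-product` and the attribute selector using `_parent_`, so the attribute is read from the card itself.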

Ah perfect, thank you. I have another question. Is it possible to extract information that is stored like this in an element attribute:

data-product="{'ProductId':'00000_000000006882010089','BrandName':'Compliments','FullDisplayName':'Champignons blancs','CategoryName':null,'IsAgeRequired':false,'SizeLabel':'','Size':'227 g','ProductUrl':'/fr/produit/champignonsblancs/00000_000000006882010089','ProductImageUrl':'https://sbs-prd-cdn-products.azureedge.net/media/image/product/fr/medium/0006882010089.jpg','HasNewPrice':false,'PromotionName':null,'RegularPrice':2.99000,'SalesPrice':2.49000,'CustomerProductComment':null,'PopularityFactor':4.0,'ProductSKU':'0006882010089_00000','RawSKU':null,'CountryOfOrigin':''}"

For example, if I wanted to get "ProductId" or "FullDisplayName".

thanks!

You would have to scrape the whole JSON block (in {}) and extract the required data points in post-processing, for instance, by applying a regex in Excel.
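
If a script is easier than a spreadsheet, here is a minimal post-processing sketch in Python. It assumes the scraped attribute value looks exactly like the string quoted above, with single-quoted keys and string values. The `raw` string is a shortened copy of that sample, and `extract_field` is a hypothetical helper name. Because the block uses single quotes, it is not valid JSON as written, so the sketch pulls fields out with a regex instead of a JSON parser. A value that contains an apostrophe would be cut short by this pattern.

```python
import re

def extract_field(raw: str, key: str):
    """Return the single-quoted string value stored under `key`, or None."""
    # Matches 'key':'value' and captures everything up to the next single quote.
    pattern = r"'" + re.escape(key) + r"'\s*:\s*'([^']*)'"
    match = re.search(pattern, raw)
    return match.group(1) if match else None

# Shortened copy of the data-product value from the question (assumed format).
raw = (
    "{'ProductId':'00000_000000006882010089','BrandName':'Compliments',"
    "'FullDisplayName':'Champignons blancs','CategoryName':null,"
    "'Size':'227 g','RegularPrice':2.99000,'SalesPrice':2.49000}"
)

print(extract_field(raw, "ProductId"))        # 00000_000000006882010089
print(extract_field(raw, "FullDisplayName"))  # Champignons blancs
print(extract_field(raw, "CategoryName"))     # None (value is null, not a quoted string)
```

Numeric fields such as `RegularPrice` are not quoted, so they would need a separate pattern.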

But I mean, is it possible to extract the data points within Web Scraper itself, or only in Excel afterwards?

Not within the extension, only in Web Scraper Cloud.

Let me confirm... it is possible to do this without post-processing :)

Could you elaborate? :)

Search for the older RegEx version of Web Scraper and use it to grab the portion of text you need.