Doesn’t scrape all products?

Hello

Why doesn't it scrape all products? The page has 1519 products, and with this sitemap I can "only" get 1337.

I have tried running it again many times. Can anyone help me?

Thank you very much.

{"_id":"obsessive_v2","startUrl":["Sinto-me Sexy! - Cupcake Love Store [itemprop='name'] a","multiple":true,"delay":0},{"id":"titulo","type":"SelectorText","parentSelectors":["product"],"selector":"h1","multiple":false,"regex":"","delay":0},{"id":"categoria","type":"SelectorText","parentSelectors":["product"],"selector":"li:nth-of-type(3) span[itemprop='name']","multiple":false,"regex":"","delay":0},{"id":"preco","type":"SelectorText","parentSelectors":["product"],"selector":"span[itemprop='price']","multiple":false,"regex":"","delay":0},{"id":"descricao","type":"SelectorText","parentSelectors":["product"],"selector":".tab-pane div.product-description","multiple":false,"regex":"","delay":0},{"id":"material","type":"SelectorText","parentSelectors":["product"],"selector":"dt:contains('Material') + dd","multiple":false,"regex":"","delay":0},{"id":"cor","type":"SelectorText","parentSelectors":["product"],"selector":"dt:contains('Cor') + dd","multiple":false,"regex":"","delay":0},{"id":"ean","type":"SelectorText","parentSelectors":["product"],"selector":"dd[itemprop='gtin13']","multiple":false,"regex":"","delay":0},{"id":"brand","type":"SelectorImage","parentSelectors":["product"],"selector":"img.img","multiple":false,"delay":0}]}

Hi @Txill

Unfortunately, your sitemap code is not valid. Please make sure it can be imported.
I would also recommend posting it as preformatted text.
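One quick way to check a sitemap before posting it is to confirm that it parses as JSON. The sketch below is only an illustration: `check_sitemap` and `REQUIRED_KEYS` are hypothetical names, and the list of required keys is an assumption based on the keys visible in the sitemaps in this thread, not an official schema.

```python
import json

# Assumption: top-level keys taken from the sitemaps posted in this thread.
REQUIRED_KEYS = ("_id", "startUrl", "selectors")

def check_sitemap(text):
    """Return a list of problems found in a sitemap string (empty if none)."""
    try:
        sitemap = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} at character {err.pos}"]
    if not isinstance(sitemap, dict):
        return ["top-level value is not a JSON object"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in sitemap]

# Shortened stand-in for a sitemap whose startUrl array was mangled on paste.
broken = '{"_id":"obsessive_v2","startUrl":["a","multiple":true,"delay":0}]}'
# Made-up minimal sitemap that parses and has all three keys.
ok = '{"_id":"demo","startUrl":["https://example.com/?page=[1-3]"],"selectors":[]}'

print(check_sitemap(broken))  # one "not valid JSON" message
print(check_sitemap(ok))      # empty list
```

If the first check reports a JSON error, the paste was most likely altered by the forum editor, which is why preformatted text helps.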

Hi there,

Sorry about that.
Here is the code:

{"_id":"obsessive_v2","startUrl":["https://www.cupcake.pt/13-sinto-me-sexy?page=[1-64]"],"selectors":[{"id":"product","type":"SelectorLink","parentSelectors":["_root"],"selector":".product-description.col-xs-12 [itemprop='name'] a","multiple":true,"delay":0},{"id":"titulo","type":"SelectorText","parentSelectors":["product"],"selector":"h1","multiple":false,"regex":"","delay":0},{"id":"categoria","type":"SelectorText","parentSelectors":["product"],"selector":"li:nth-of-type(3) span[itemprop='name']","multiple":false,"regex":"","delay":0},{"id":"preco","type":"SelectorText","parentSelectors":["product"],"selector":"span[itemprop='price']","multiple":false,"regex":"","delay":0},{"id":"descricao","type":"SelectorText","parentSelectors":["product"],"selector":".tab-pane div.product-description","multiple":false,"regex":"","delay":0},{"id":"material","type":"SelectorText","parentSelectors":["product"],"selector":"dt:contains('Material') + dd","multiple":false,"regex":"","delay":0},{"id":"cor","type":"SelectorText","parentSelectors":["product"],"selector":"dt:contains('Cor') + dd","multiple":false,"regex":"","delay":0},{"id":"ean","type":"SelectorText","parentSelectors":["product"],"selector":"dd[itemprop='gtin13']","multiple":false,"regex":"","delay":0},{"id":"brand","type":"SelectorImage","parentSelectors":["product"],"selector":"img.img","multiple":false,"delay":0}]}

Is this OK?
Regards

@Txill I tested it in Cloud Scraper and it seems that all of the 1519 products were extracted.

Product link selector: .h3 a
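For anyone who wants to apply that change without editing the JSON by hand, here is a minimal sketch. It assumes the product link selector is the only thing that changes, which the reply does not confirm, and `set_selector` is a hypothetical helper, not part of Web Scraper.

```python
import json

def set_selector(sitemap_text, selector_id, new_css):
    """Return the sitemap JSON with one selector's CSS expression replaced."""
    sitemap = json.loads(sitemap_text)
    for sel in sitemap["selectors"]:
        if sel["id"] == selector_id:
            sel["selector"] = new_css
    # Compact separators keep the output in the single-line style used for import.
    return json.dumps(sitemap, separators=(",", ":"))

# Abbreviated version of the sitemap from this thread (only the link selector kept).
original = json.dumps({
    "_id": "obsessive_v2",
    "startUrl": ["https://www.cupcake.pt/13-sinto-me-sexy?page=[1-64]"],
    "selectors": [{
        "id": "product",
        "type": "SelectorLink",
        "parentSelectors": ["_root"],
        "selector": ".product-description.col-xs-12 [itemprop='name'] a",
        "multiple": True,
        "delay": 0,
    }],
})

updated = set_selector(original, "product", ".h3 a")
print(updated)
```

The output can be pasted into the sitemap import dialog in place of the original.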

Hi viesturs,

Thank you.
Did you only change the product link to .h3 a?
If not, can you please share the code?
Regards

Hi there,

I also tested it in Cloud Scraper and got different results with the exact same sitemap, as you can see in the attached image.

What am I doing wrong?

Regards

@Txill That is because it is a "test" job. Test jobs are limited to 500 pages.
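To see why a 500-page limit cuts this job short, here is a rough count of the pages the sitemap has to load. It is a sketch under one stated assumption: each listing page and each product page counts as one page toward the limit. `expand_range_url` is a hypothetical helper that mimics the `[1-64]` range in the start URL.

```python
import re

def expand_range_url(url):
    """Expand one [start-end] range in a start URL into individual URLs."""
    match = re.search(r"\[(\d+)-(\d+)\]", url)
    if match is None:
        return [url]
    start, end = int(match.group(1)), int(match.group(2))
    return [url[:match.start()] + str(n) + url[match.end():]
            for n in range(start, end + 1)]

TEST_JOB_PAGE_LIMIT = 500  # limit quoted in the reply above
PRODUCTS_ON_SITE = 1519    # product count from the original post

listing_pages = expand_range_url("https://www.cupcake.pt/13-sinto-me-sexy?page=[1-64]")

# Assumption: one page per listing page plus one page per product.
pages_needed = len(listing_pages) + PRODUCTS_ON_SITE

print(len(listing_pages))                  # 64
print(pages_needed)                        # 1583
print(pages_needed > TEST_JOB_PAGE_LIMIT)  # True
```

Under that assumption the full scrape needs about three times as many pages as a test job allows, so a test run can only return part of the catalogue.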

Can you please help me? I copied the exact sitemap I posted here and it gives me 0 records, as you can see below.

What am I doing wrong?

Regards