Pagination works, but items on the first page are skipped

Hey there. The following sitemap works like a charm, except that it skips the first page of products. Any idea why? Thanks a lot!

The pagination type is set to Auto. The Link type also works, but all the other types never enter the individual product pages and just keep browsing through the catalogue instead.

That part is not a problem, though. The scrape works perfectly; it just skips the first page.

URL: Time for Gifts | Ozone.bg

Sitemap:

{"_id":"ozone2","startUrl":["https://www.ozone.bg/promo_koleda/?kategoriya=knijarnica"],"selectors":[{"id":"Browse pages","paginationType":"auto","parentSelectors":["_root","Browse pages"],"selector":".products-list > div.col-xs-12 a.next","type":"SelectorPagination"},{"id":"Items","multiple":true,"parentSelectors":["Browse pages"],"selector":"a.product-box","type":"SelectorLink"},{"id":"Title","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"Author","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Автор') td","type":"SelectorText"},{"id":"Publisher","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Издателство') td","type":"SelectorText"},{"id":"Year","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Година') td","type":"SelectorText"},{"id":"Covertype","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Издание') td","type":"SelectorText"},{"id":"NumPages","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Брой страници') td","type":"SelectorText"},{"id":"Barcode","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Баркод') td","type":"SelectorText"},{"id":"Annotation","multiple":false,"parentSelectors":["Items"],"regex":"","selector":".full-description-content div.col-xs-12","type":"SelectorText"},{"clickElementSelector":".main-image.slick-current img","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","id":"Click","multiple":true,"parentSelectors":["Items"],"selector":".main-image.slick-current img","type":"SelectorElementClick"},{"id":"Image","multiple":false,"parentSelectors":["Items"],"selector":".slick-current img.main-image-nosrc","type":"SelectorImage"}]}

@roshoivanov Hi, you should use the 'Link' type pagination in this case.

Please note that results only start to be returned once the scraper has gone through all of the pagination links.

Learn more: My scraping job is running, although no results are being returned - Web Scraper Knowledge Base


That's fine, but it still skips the results on the first page. Do you have any idea why that is?

You could make your "Items" selector a child of both _root and the paginator, and that should solve the issue:

{"_id":"ozone2","startUrl":["https://www.ozone.bg/promo_koleda/?kategoriya=knijarnica"],"selectors":[{"id":"Items","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root","Browse pages"],"selector":"a.product-box","type":"SelectorLink"},{"id":"Browse pages","paginationType":"auto","parentSelectors":["_root","Browse pages"],"selector":".products-list > div.col-xs-12 a.next","type":"SelectorPagination"},{"id":"Title","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"Author","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Автор') td","type":"SelectorText"},{"id":"Publisher","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Издателство') td","type":"SelectorText"},{"id":"Year","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Година') td","type":"SelectorText"},{"id":"Covertype","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Издание') td","type":"SelectorText"},{"id":"NumPages","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Брой страници') td","type":"SelectorText"},{"id":"Barcode","multiple":false,"parentSelectors":["Items"],"regex":"","selector":"tr:contains('Баркод') td","type":"SelectorText"},{"id":"Annotation","multiple":false,"parentSelectors":["Items"],"regex":"","selector":".full-description-content div.col-xs-12","type":"SelectorText"},{"clickElementSelector":".main-image.slick-current img","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","id":"Click","multiple":true,"parentSelectors":["Items"],"selector":".main-image.slick-current img","type":"SelectorElementClick"},{"id":"Image","multiple":false,"parentSelectors":["Items"],"selector":".slick-current img.main-image-nosrc","type":"SelectorImage"}]}
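If you would rather patch an exported sitemap than edit it by hand, the same change can be sketched in a few lines of Python. This is only an illustration: the helper name `add_root_parent` is made up for this post, and the trimmed-down sitemap keeps just the fields needed to show the idea. The assumption behind the fix is that a selector only runs on the start page when `_root` is listed in its `parentSelectors`.

```python
import json


def add_root_parent(sitemap: dict, selector_id: str) -> dict:
    """Return a copy of the sitemap in which the given selector
    also lists "_root" among its parentSelectors."""
    # Deep-copy via a JSON round-trip so the input is left untouched.
    patched = json.loads(json.dumps(sitemap))
    for selector in patched["selectors"]:
        if selector["id"] == selector_id and "_root" not in selector["parentSelectors"]:
            selector["parentSelectors"].insert(0, "_root")
    return patched


# Trimmed-down version of the original sitemap (hypothetical subset of fields).
original = {
    "_id": "ozone2",
    "selectors": [
        {"id": "Browse pages", "type": "SelectorPagination",
         "parentSelectors": ["_root", "Browse pages"]},
        {"id": "Items", "type": "SelectorLink",
         "parentSelectors": ["Browse pages"]},
    ],
}

fixed = add_root_parent(original, "Items")
print(json.dumps(fixed["selectors"][1]["parentSelectors"]))
```

The patched dictionary can then be dumped with `json.dumps` and pasted back through the sitemap import dialog.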