Does anyone know how to scrape all the product information including the image?

Hi, I am studying at CBS.
We are working on a project. Our mentor asked us to investigate a brand we're interested in, so I attempted to collect product data from a site.

The site is: Handcrafted Fine Jewelry | From Our Hands To Yours | MollyJewelryUS

As this is my first time trying this, I intend to start by collecting the products in the engagement category.
I managed to scrape the first page of engagement products, all 20 items, which is good!
But how do I scrape the second page, third page, and so on?

I have watched many videos and tried many times, but it doesn't work.

Please check my sitemap code below:
{"_id":"mollyjewelry","startUrl":["https://mollyjewelryus.com/"],"selectors":[{"id":"engagement","linkType":"linkFromHref","multiple":false,"parentSelectors":["_root"],"selector":".menu-item-30338.c-top-menu__item > a","type":"SelectorLink"},{"id":"collectionpage","linkType":"linkFromHref","multiple":true,"parentSelectors":["engagement"],"selector":".c-product-grid__title-wrap a:nth-of-type(1)","type":"SelectorLink"},{"id":"title","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"price","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":"bdi","type":"SelectorText"},{"id":"description","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":".c-product__short-description p","type":"SelectorText"},{"id":"image1","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image2","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image3","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":".active img.c-product__slider-img","type":"SelectorText"},{"id":"image4","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image5","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image6","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image7","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image8","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"}]}

Please help.

Hi,

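To do that, add a pagination selector to the sitemap that matches the Next button. The pagination selector should be a parent of the product link selector (collectionpage), and it should also list itself as one of its own parents so that it keeps following the Next button from page to page: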

{"_id":"mollyjewelry","startUrl":["https://mollyjewelryus.com/"],"selectors":[{"id":"engagement","linkType":"linkFromHref","multiple":false,"parentSelectors":["_root"],"selector":".menu-item-30338.c-top-menu__item > a","type":"SelectorLink"},{"id":"collectionpage","linkType":"linkFromHref","multiple":true,"parentSelectors":["pagination"],"selector":".c-product-grid__title-wrap a:nth-of-type(1)","type":"SelectorLink"},{"id":"title","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"price","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":"bdi","type":"SelectorText"},{"id":"description","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":".c-product__short-description p","type":"SelectorText"},{"id":"image1","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image2","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image3","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"regex":"","selector":".active img.c-product__slider-img","type":"SelectorText"},{"id":"image4","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image5","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image6","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image7","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"image8","multiple":false,"multipleType":"singleColumn","parentSelectors":["collectionpage"],"selector":".active img.c-product__slider-img","type":"SelectorImage"},{"id":"pagination","paginationType":"auto","parentSelectors":["engagement","pagination"],"selector":"a.next","type":"SelectorPagination"}]}
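
Because the one-line sitemap makes the change hard to spot, here is a rough Python sketch of the same two edits applied to a trimmed copy of the sitemap. The helper name add_pagination and its parameters are made up for illustration and are not part of Web Scraper; the selector ids and the "a.next" selector are taken from the sitemap above, and the detail selectors (title, price, images) are left out for brevity.

```python
import copy
import json


def add_pagination(sitemap, listing_parent, link_id, next_selector="a.next"):
    """Return a patched copy of a sitemap (hypothetical helper, not a Web Scraper API).

    1. The product link selector is re-parented under "pagination".
    2. A pagination selector is appended whose parents are the listing
       page selector and the pagination selector itself.
    """
    patched = copy.deepcopy(sitemap)  # leave the caller's sitemap untouched
    for selector in patched["selectors"]:
        if selector["id"] == link_id:
            selector["parentSelectors"] = ["pagination"]
    patched["selectors"].append({
        "id": "pagination",
        "paginationType": "auto",
        "parentSelectors": [listing_parent, "pagination"],
        "selector": next_selector,
        "type": "SelectorPagination",
    })
    return patched


# Trimmed version of the original sitemap: only the two link selectors.
sitemap = {
    "_id": "mollyjewelry",
    "startUrl": ["https://mollyjewelryus.com/"],
    "selectors": [
        {
            "id": "engagement",
            "linkType": "linkFromHref",
            "multiple": False,
            "parentSelectors": ["_root"],
            "selector": ".menu-item-30338.c-top-menu__item > a",
            "type": "SelectorLink",
        },
        {
            "id": "collectionpage",
            "linkType": "linkFromHref",
            "multiple": True,
            "parentSelectors": ["engagement"],
            "selector": ".c-product-grid__title-wrap a:nth-of-type(1)",
            "type": "SelectorLink",
        },
    ],
}

patched = add_pagination(sitemap, listing_parent="engagement", link_id="collectionpage")

# Print the patched sitemap as JSON text.
print(json.dumps(patched, indent=2))
```

Listing "pagination" among its own parents is what lets the selector be applied again on page 2, page 3, and so on, while "collectionpage" being its child means the product links are collected on every page it reaches.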

Thanks so much.
It works!
