Scrape images from gallery that loads images on request

Hello,

The product gallery slider on the given page loads only 3 images into the DOM at a time: the active image, the previous one, and the next one. In the data preview of every step, everything seems to work as it should, but the scraper doesn't produce any results.

Here are my step-by-step instructions:

1 - Pagination (next button type)
2 - On every page, I get the elements (product items)
3 - On every element, I get the link to the product page
4 - On the product page, I collect some data and have a popup instruction that opens the product gallery and loads the full-size images into the DOM. It loads only 3 images, as explained above. To get around this, I click on each thumbnail while capturing the image URLs. If you preview the data on that step, you will see that it is being received correctly.

When scraping, I see the bot going through the pages, etc., but no data is being scraped. If I refresh, nothing happens.

Any ideas what is wrong here?

Thanks in advance.

Url: Revell 1/25 Tour Truck 'Rammstein' Gift Set # 07658

Sitemap:
{"_id":"emodels-revell","startUrl":["https://www.emodels.co.uk/brands/revell"],"selectors":[{"id":"pagination","parentSelectors":["_root","pagination"],"paginationType":"auto","selector":"a.next","type":"SelectorPagination"},{"id":"product-element-card","parentSelectors":["pagination"],"type":"SelectorElement","selector":"li.product","multiple":true,"delay":0},{"id":"product-page-link","parentSelectors":["product-element-card"],"type":"SelectorLink","selector":"a.product-item-link","multiple":false,"delay":0},{"id":"product-name","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"span[itemprop='name']","multiple":false,"delay":0,"regex":""},{"id":"product-sku","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"div[itemprop='sku']","multiple":false,"delay":0,"regex":""},{"id":"product-ean","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"div[itemprop='ean']","multiple":false,"delay":0,"regex":""},{"id":"product-stock-status","parentSelectors":["product-page-link"],"type":"SelectorText","selector":".stock span","multiple":false,"delay":0,"regex":""},{"id":"product-type","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"div[itemprop='product_type']","multiple":false,"delay":0,"regex":""},{"id":"product-regular-price","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"#old-price-65759 span","multiple":false,"delay":0,"regex":""},{"id":"product-promo-price","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"#product-price-65759 span","multiple":false,"delay":0,"regex":""},{"id":"product-description","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"div#description","multiple":false,"delay":0,"regex":""},{"id":"product-short-description","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"tr:contains('Short Description') td","multiple":false,"delay":0,"regex":""},{"id":"product-brand","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"tr:contains('Manufacturer') td","multiple":false,"delay":0,"regex":""},{"id":"product-scale","parentSelectors":["product-page-link"],"type":"SelectorText","selector":"tr:contains('Scale') td","multiple":false,"delay":0,"regex":""},{"id":"product-gallery-popup","parentSelectors":["product-page-link"],"type":"SelectorPopupLink","selector":".fotorama__stage__frame.fotorama__active img.fotorama__img","multiple":false,"delay":0},{"id":"product-gallery-popup-thumb-click","parentSelectors":["product-gallery-popup"],"type":"SelectorElementClick","clickElementSelector":"div.fotorama__thumb","clickElementUniquenessType":"uniqueHTMLText","clickType":"clickMore","delay":150,"discardInitialElements":"do-not-discard","multiple":true,"selector":"div.fotorama__thumb"},{"id":"product-gallery-image-full","parentSelectors":["product-gallery-popup"],"type":"SelectorGroup","selector":"img.fotorama__img--full","delay":0,"extractAttribute":"src"}]}

@svidersky.vitaly Hi. You can extract all of the images with a 'Grouped' selector: use .fotorama__thumb img as the selector and src as the 'Attribute name'.

To transform the size of these images, some additional data post-processing is needed, which can be done through Web Scraper Cloud.

Sitemap example:

{"_id":"emodels-revell","startUrl":["https://www.emodels.co.uk/revell-1-25-tour-truck-rammstein-gift-set-07658.html"],"selectors":[{"delay":0,"id":"product-ean","multiple":false,"parentSelectors":["product-page"],"regex":"","selector":"div[itemprop='ean']","type":"SelectorText"},{"delay":2000,"id":"product-page","multiple":true,"parentSelectors":["_root"],"selector":"body:has(h1.page-title)","type":"SelectorElementScroll"},{"delay":0,"extractAttribute":"src","id":"all-images","parentSelectors":["product-page"],"selector":".fotorama__thumb img","type":"SelectorGroup"}]}

The post-processing steps, as shown in the screenshots:

  • Apply Regex - (?<="all-images-src":")[^"]+

  • Transform the image links using 'Replace text'
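
For anyone who wants to try the same two steps outside Web Scraper Cloud, here is a rough Python sketch. It assumes the 'Grouped' selector output is a compact JSON string of {"all-images-src": ...} objects, which is what the regex above implies. The example.com URLs and the "/thumb/" to "/full/" replacement are made up for illustration; the real text to replace depends on how the site names its image sizes.

```python
import json
import re

# Hypothetical grouped output: a JSON string of {"all-images-src": url} objects.
# The example.com URLs and the "/thumb/" path segment are invented for this sketch.
grouped = json.dumps(
    [
        {"all-images-src": "https://example.com/media/thumb/a.jpg"},
        {"all-images-src": "https://example.com/media/thumb/b.jpg"},
    ],
    separators=(",", ":"),  # compact form, no space after ':'
)

# Step 1: the same pattern as in the 'Apply Regex' step.
# The lookbehind anchors on the key, [^"]+ captures the URL up to the closing quote.
IMAGE_URL = re.compile(r'(?<="all-images-src":")[^"]+')
thumb_urls = IMAGE_URL.findall(grouped)

# Step 2: the 'Replace text' step, with an assumed thumbnail-to-full-size mapping.
full_urls = [url.replace("/thumb/", "/full/") for url in thumb_urls]

print(full_urls)
```

Because the replacement targets a path segment rather than a file name, the same rule would apply to every product, provided the site uses one consistent URL scheme for its image sizes.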

Helpful resources:

https://webscraper.io/documentation/web-scraper-cloud
https://webscraper.io/documentation/web-scraper-cloud/parser/regex-match

@ViestursWS Hi, thanks for your message. I'll try it, but at first glance, the solution of replacing the image name will only work for that specific product, while I need to go through every product available on this page: https://www.emodels.co.uk/brands/revell

When trying each step in my sitemap (preview data button), I can get to all the full-sized images. What I can't understand is why the scraper doesn't produce any results.

Thanks again.