Link within an iFrame

Hi! I'm looking to scrape information about individual pairs of shoes listed on a website. I was able to get the basic information just fine, but I would love to grab a couple of links that are available within an iframe on each page. I can isolate the iframe page by page, but running a ton of individual scrapes sounds like a nightmare. Is there any way to pull, say, the first four links in the Marketplace section? I have tried so many things but can't seem to come up with a solution. Thank you in advance!

Url: Nike Air Max 1 Women's "Luxe" | Nike | Release Dates, Sneaker Calendar, Prices & Collaborations

@hexzerorouge Hi, it appears that the iframe source is hosted on a slightly different domain. To scrape the desired data, you can use the iframe source link as its own start URL instead.

Example:

{"_id":"solecollector-com","startUrl":["https://embed.solecollector.com/#/sneaker-offers/42887/solecollector?location=webSD"],"selectors":[{"id":"link-elements","multiple":true,"parentSelectors":["_root"],"selector":"li.price-list__item","type":"SelectorElement"},{"id":"link","multiple":false,"parentSelectors":["link-elements"],"selector":"a","type":"SelectorLink"}]}


Hi @ViestursWS. I noticed that too, so it's reassuring that my skills are on the right track! :)

That said, is there any way to get around needing to use different start URLs? I am essentially trying to recompile everything by Nike / sub-brand / style / colorway (as in the navigation link at the top of each page). That hierarchy starts two steps back (Nike / sub-brand) from the page I linked in my original post. Because of the way the site is set up, and correct me if I'm wrong, there would be thousands of unique start URLs.

If that's the case, will I have any issue stacking thousands of start URLs in the JSON that was provided by @ViestursWS?

@hexzerorouge Hi, you could extract the iframe source link with the original sitemap. Then use the extracted URLs as the start URLs of a separate sitemap, and merge the scraped results from both sitemaps afterwards if necessary.
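
The merge step can be done outside Web Scraper once both scrapes are exported. Below is a minimal Python sketch, not an official tool. It assumes the first export has a column holding the iframe source (here hypothetically named `iframe-src`), and that the second export records the start URL of each row in `web-scraper-start-url` and the link target in `link-href`. Those column names are assumptions, so adjust them to match your actual exports. The `limit=4` default reflects the "first four links" asked about above.

```python
from collections import defaultdict


def merge_results(pages, offers,
                  iframe_col="iframe-src",
                  start_col="web-scraper-start-url",
                  link_col="link-href",
                  limit=4):
    """Attach up to `limit` offer links to each product-page row.

    `pages` and `offers` are lists of dicts, e.g. rows read with
    csv.DictReader. All column names are assumptions.
    """
    # Group offer links by the iframe URL they were scraped from,
    # keeping the order in which they appear in the export.
    links_by_iframe = defaultdict(list)
    for row in offers:
        links_by_iframe[row[start_col]].append(row[link_col])

    merged = []
    for page in pages:
        links = links_by_iframe.get(page[iframe_col], [])[:limit]
        # Copy the page row and add the matched links as a new field.
        merged.append({**page, "offer_links": links})
    return merged
```

To use it on CSV exports, read each file with `csv.DictReader`, pass the two row lists to `merge_results`, and write the result out in whatever shape you need.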

Multiple start URLs can be added to a sitemap using the 'Bulk Start Url Import' feature within Web Scraper Cloud, which handles up to 20,000 start URLs.
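
If you prefer to build the second sitemap as JSON yourself rather than use the bulk import, the following Python sketch shows one way to do it. It reuses the selectors from the example sitemap above and fills `startUrl` with the extracted iframe links. The function name `build_sitemap` and the default sitemap id are hypothetical, and whether a sitemap with thousands of start URLs imports smoothly is something to verify on your side.

```python
import json


def build_sitemap(start_urls, sitemap_id="solecollector-offers"):
    """Return a sitemap dict with one start URL per extracted iframe link.

    `sitemap_id` is a hypothetical name; the selectors are copied from
    the example sitemap posted earlier in this thread.
    """
    # Drop duplicate URLs while preserving their original order.
    unique_urls = list(dict.fromkeys(start_urls))
    return {
        "_id": sitemap_id,
        "startUrl": unique_urls,
        "selectors": [
            {
                "id": "link-elements",
                "multiple": True,
                "parentSelectors": ["_root"],
                "selector": "li.price-list__item",
                "type": "SelectorElement",
            },
            {
                "id": "link",
                "multiple": False,
                "parentSelectors": ["link-elements"],
                "selector": "a",
                "type": "SelectorLink",
            },
        ],
    }


if __name__ == "__main__":
    urls = [
        "https://embed.solecollector.com/#/sneaker-offers/42887/solecollector?location=webSD",
    ]
    # Print the JSON so it can be pasted into the sitemap import dialog.
    print(json.dumps(build_sitemap(urls)))
```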