Weird Iframe Scrape

Can I get a little assistance with this sitemap?

I want to scrape the Member Directory and need the following:

a: Scroll Down to handle dynamic loading (That I can do)
b: Element click to load the details pop-up
c: Scrape from within the iframe.

This is where it goes wrong: I can't seem to nail the right selectors to grab data from within the iframe. In the past I would have used the pop-up selector, but that doesn't seem to work.

Any thoughts?

{"_id":"vcplatform","startUrl":["Member Directory in","multiple":true,"parentSelectors":["_root"],"selector":"div.collection-list-item-2","type":"SelectorElementScroll"},{"id":"Link-to-popup","linkType":"linkFromInlineScript","multiple":false,"parentSelectors":["click in"],"selector":"img","type":"SelectorLink"}]}

Hi,

You can find below a reference sitemap for scraping the profiles:

{"_id":"vcplatform","startUrl":["https://www.vcplatform.com/directory"],"selectors":[{"id":"name","multiple":false,"parentSelectors":["iframe"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"iframe","multiple":true,"parentSelectors":["click"],"selector":"iframe:iframe body","type":"SelectorElement"},{"id":"title","multiple":false,"parentSelectors":["iframe"],"regex":"","selector":"h4","type":"SelectorText"},{"clickActionType":"real","clickElementSelector":"div.collection-list-item-2:nth-of-type(-n+5) div.text-block-3","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":0,"discardInitialElements":"discard-when-click-element-exists","id":"click","multiple":true,"parentSelectors":["_root"],"selector":"html","type":"SelectorElementClick"},{"clickActionType":"real","clickElementSelector":"a.close-btn","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":0,"discardInitialElements":"discard-when-click-element-exists","id":"close","multiple":true,"parentSelectors":["click"],"selector":"_parent_","type":"SelectorElementClick"}]}
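In case the nesting is hard to read from the raw JSON, here is a small Python sketch (not part of Web Scraper itself) that prints the selector hierarchy from the parentSelectors fields. The embedded sitemap is a trimmed copy of the one above, keeping only id, parentSelectors, selector and type; the function names are made up for illustration.

```python
import json

# Trimmed copy of the reference sitemap: only the fields needed to see the hierarchy.
SITEMAP = json.loads("""
{"_id": "vcplatform",
 "selectors": [
  {"id": "name",   "parentSelectors": ["iframe"], "selector": "h1", "type": "SelectorText"},
  {"id": "iframe", "parentSelectors": ["click"],  "selector": "iframe:iframe body", "type": "SelectorElement"},
  {"id": "title",  "parentSelectors": ["iframe"], "selector": "h4", "type": "SelectorText"},
  {"id": "click",  "parentSelectors": ["_root"],  "selector": "html", "type": "SelectorElementClick"},
  {"id": "close",  "parentSelectors": ["click"],  "selector": "_parent_", "type": "SelectorElementClick"}
 ]}
""")


def children_of(sitemap, parent_id):
    """Return the selectors that list parent_id among their parentSelectors."""
    return [s for s in sitemap["selectors"] if parent_id in s["parentSelectors"]]


def render_tree(sitemap, parent_id="_root", depth=0):
    """Return indented lines describing the selector tree below parent_id."""
    lines = []
    for sel in children_of(sitemap, parent_id):
        lines.append("  " * depth + f'{sel["id"]} ({sel["type"]}): {sel["selector"]}')
        lines.extend(render_tree(sitemap, sel["id"], depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(SITEMAP)))
```

The tree shows "click" directly under _root, with "iframe" and "close" as its children, and "name" and "title" nested under "iframe".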

For testing purposes, I did not add the scroll and limited the selection to the first 5 profiles, since the data will be returned only after clicking through all results.
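The 5-profile limit comes from the :nth-of-type(-n+5) part of the clickElementSelector. As a rough sketch, assuming you want every profile, the helper below (a hypothetical name, plain Python, untested against the live site) strips that pseudo-class from the selector string. You can of course just edit the selector by hand in the sitemap instead.

```python
import re


def drop_nth_of_type(css_selector):
    """Remove any :nth-of-type(...) pseudo-class from a CSS selector string."""
    return re.sub(r":nth-of-type\([^)]*\)", "", css_selector)


# clickElementSelector taken from the reference sitemap.
limited = "div.collection-list-item-2:nth-of-type(-n+5) div.text-block-3"
unlimited = drop_nth_of_type(limited)

if __name__ == "__main__":
    print(unlimited)
```

With the limit removed, the scrape will take longer to return data, because results only come back after every profile has been clicked through.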

Let me know if you have any additional questions regarding this or need further assistance.

Thank You! That worked. I have a few questions on it:

RE: Click Element

  • Selector: Why did you set it to "html" instead of "iframe:iframe body"? Was that because you needed a separate child to house the Element selector (iframe) and the Element Click selector (close the iframe)?
  • Discard: Why did you select "Discard when click element exists"? I have always been foggy as to when to discard.

RE: Id (iframe)
Why do you have that set to multiple? Wouldn't it inherit the multiple attribute from the original parent (ID click)?

Thanks Again, that was a unique one.

Hi,

I suppose it would also work with 'iframe:iframe body' as the click selector value. Sometimes it is necessary to move the selectors around, so it is more convenient to have a wrapper selector.
The 'Discard initial elements' choice is not so important in this particular case; it is just the most commonly used option, so I selected it out of habit.

The 'iframe' selector would also work with multiple unchecked. Having multiple checked is simply the default setting for the Element type, and it does not affect the end result here.