How to Scrape Popup Element Overlays?

I have tried looking this up, but have not been able to find this exact same use case.

I am trying to scrape from this site. When you click on a business, a popup overlay appears on the screen. I need to figure out how to get all of the information from the popup of every business on the site.

URL: https://nativeowned.com/home-services/ (Native Home Services Directory | Native Owned)

Here is my sitemap attempt, but it's not working at all.

Sitemap:
{"_id":"NativeOwnedHomeServices","startUrl":["https://nativeowned.com/home-services/"],"selectors":[{"id":"Businesses","multiple":true,"parentSelectors":["_root"],"selector":"[dialogbox] div#cbeb-view-entry-modal-field-container","type":"SelectorElement"},{"id":"Bio","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"div","type":"SelectorText"},{"id":"Type","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"div.cbeb-entry-tag:nth-of-type(n+2)","type":"SelectorText"},{"id":"Phone","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"[field-key='phone'] div","type":"SelectorText"},{"id":"Email","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"[field='email'] a","type":"SelectorText"},{"id":"Address","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":".cbeb-center p","type":"SelectorText"},{"id":"WebLink","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":".cbeb-entry-view a[target]","type":"SelectorText"},{"id":"Facebook","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"div[space-evenly]","type":"SelectorText"}]}

Any help or ideas are greatly appreciated. Thank you.

Hi,

Please have a look at the reference sitemap below:

{"_id":"NativeOwnedHomeServices","startUrl":["https://nativeowned.com/home-services/"],"selectors":[{"id":"Businesses","multiple":true,"parentSelectors":["listing-click"],"selector":"[dialogbox]:not([signup]) .cbeb-modal-content","type":"SelectorElement"},{"id":"title","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"div.cbeb-modal-title-text","type":"SelectorText"},{"id":"Bio","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":".cbeb-entry-viewfield-container .cbeb-entry-view p","type":"SelectorText"},{"id":"Type","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"div.cbeb-entry-tag","type":"SelectorText"},{"id":"Phone","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"[field-key='phone'] div","type":"SelectorText"},{"id":"Email","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":"[field='email'] a","type":"SelectorText"},{"id":"Address","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":".cbeb-center p","type":"SelectorText"},{"id":"WebLink","multiple":false,"parentSelectors":["Businesses"],"regex":"","selector":".cbeb-entry-view a[target]","type":"SelectorText"},{"extractAttribute":"href","id":"Facebook","multiple":false,"parentSelectors":["Businesses"],"selector":"a:has([title=\"Facebook\"])","type":"SelectorElementAttribute"},{"clickActionType":"real","clickElementSelector":"button:nth-of-type(2)","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":0,"discardInitialElements":"do-not-discard","id":"close","multiple":false,"parentSelectors":["Businesses"],"selector":"_parent_","type":"SelectorElementClick"},{"clickActionType":"real","clickElementSelector":"button[showmore]","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickMore","delay":0,"discardInitialElements":"do-not-discard","id":"pagination","multiple":false,"parentSelectors":["_root"],"selector":"body","type":"SelectorElementClick"},{"clickActionType":"real","clickElementSelector":"div.cb-beo-overview-title","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":0,"discardInitialElements":"discard-when-click-element-exists","id":"listing-click","multiple":true,"parentSelectors":["_root"],"selector":"body","type":"SelectorElementClick"}]}
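
The one-line JSON makes the selector nesting hard to see, so here is a rough Python sketch (not part of Web Scraper itself) that prints the parent/child tree of a sitemap. The embedded sitemap is a shortened copy of the reference one, keeping only a few selectors and only the `id`, `type`, and `parentSelectors` fields; `build_tree` and `tree_lines` are hypothetical helper names made up for this illustration.

```python
import json
from collections import defaultdict

# Shortened copy of the reference sitemap: only ids, types, and parents are kept.
SITEMAP = json.loads("""
{"_id": "NativeOwnedHomeServices",
 "selectors": [
  {"id": "Businesses", "type": "SelectorElement", "parentSelectors": ["listing-click"]},
  {"id": "title", "type": "SelectorText", "parentSelectors": ["Businesses"]},
  {"id": "Phone", "type": "SelectorText", "parentSelectors": ["Businesses"]},
  {"id": "close", "type": "SelectorElementClick", "parentSelectors": ["Businesses"]},
  {"id": "pagination", "type": "SelectorElementClick", "parentSelectors": ["_root"]},
  {"id": "listing-click", "type": "SelectorElementClick", "parentSelectors": ["_root"]}
 ]}
""")


def build_tree(sitemap):
    """Map each parent id to the list of (child id, child type) pairs."""
    children = defaultdict(list)
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            children[parent].append((sel["id"], sel["type"]))
    return children


def tree_lines(children, node="_root", depth=0):
    """Return indented lines describing the selector tree below `node`."""
    lines = []
    for child_id, child_type in children.get(node, []):
        lines.append("  " * depth + f"{child_id} ({child_type})")
        lines.extend(tree_lines(children, child_id, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(tree_lines(build_tree(SITEMAP))))
```

In the printed tree, `Businesses` sits under the `listing-click` click selector rather than directly under `_root`, which is the main structural difference from the original attempt, where `Businesses` was parented to `_root` and no selector clicked the listings.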

Oh wow, that was much more complex than I thought it would be. Thanks for taking the time to help with this.

One thing I noticed when running this is that the "see more" selector for the pagination isn't working. It only scrapes the first 15 businesses on the page...

Never mind! I think the issue was that I needed a longer Page Load delay. With that change, this worked perfectly!
