Scraping hidden phone/email/website information

Hi,

I cannot scrape phone numbers from the site.

I used an Element attribute selector. In the site's Elements preview, "phone-content" appears to store the full phone number, but when I select it the data output is still null.

Could you help me figure out how to extract the full phone numbers from the site, as well as emails and website URLs?

Url: https://www.pkt.pl/szukaj/geodezja/warszawa

Sitemap:

{"_id":"pkt_geodezja","startUrl":["https://www.pkt.pl/szukaj/geodezja/warszawa"],"selectors":[{"id":"name","type":"SelectorText","parentSelectors":["_root"],"selector":".company-name a","multiple":true,"regex":"","delay":0},{"id":"phone","type":"SelectorElementAttribute","parentSelectors":["_root"],"selector":"span.call-text","multiple":true,"extractAttribute":"phone-content","delay":0}]}

Hi @Uzhipius

I would suggest using an Element selector with the "multiple" option checked so that it matches each company (div.box-content), and setting it as the parent of both the name selector (.company-name a) and the phone selector (an Element attribute selector on div.call--phone a with the attribute name data-phone).

Sitemap example:

{"_id":"pkt_geodezja","startUrl":["https://www.pkt.pl/szukaj/geodezja/warszawa"],"selectors":[{"delay":0,"id":"name","multiple":false,"parentSelectors":["wrapper"],"regex":"","selector":".company-name a","type":"SelectorText"},{"delay":0,"extractAttribute":"data-phone","id":"phone","multiple":false,"parentSelectors":["wrapper"],"selector":"div.call--phone a","type":"SelectorElementAttribute"},{"delay":0,"id":"wrapper","multiple":true,"parentSelectors":["_root"],"selector":"div.box-content","type":"SelectorElement"}]}
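
To illustrate what the wrapper/child structure does, here is a minimal Python sketch of the same idea using only the standard library. The sample HTML is an assumption that mirrors the selectors above (the real page markup may differ), and CompanyParser and extract_companies are hypothetical names used only for this illustration, not part of Web Scraper.

```python
from html.parser import HTMLParser

# Hypothetical markup shaped like the selectors in the sitemap; not copied from the site.
SAMPLE_HTML = """
<div class="box-content">
  <h2 class="company-name"><a href="/firma/a">Geo Alpha</a></h2>
  <div class="call-cell call--phone"><a data-phone="22 111 22 33"><span class="call-text">22 111 ...</span></a></div>
</div>
<div class="box-content">
  <h2 class="company-name"><a href="/firma/b">Geo Beta</a></h2>
  <div class="call-cell call--phone"><a data-phone="22 444 55 66"><span class="call-text">22 444 ...</span></a></div>
</div>
"""


class CompanyParser(HTMLParser):
    """Collects one record per div.box-content wrapper."""

    def __init__(self):
        super().__init__()
        self.records = []
        self.stack = []          # (tag, set of classes) for every open element
        self.current = None      # record for the wrapper currently being read
        self.wrapper_depth = 0   # stack depth at which the wrapper was opened
        self.in_name = False     # True while inside ".company-name a"

    def _has_ancestor_class(self, name):
        return any(name in classes for _, classes in self.stack)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if tag == "div" and "box-content" in classes and self.current is None:
            # Equivalent of the "wrapper" Element selector.
            self.current = {"name": "", "phone": None}
            self.wrapper_depth = len(self.stack)
        elif tag == "a" and self.current is not None:
            if self._has_ancestor_class("company-name"):
                self.in_name = True                              # .company-name a
            if self._has_ancestor_class("call--phone"):
                self.current["phone"] = attrs.get("data-phone")  # div.call--phone a
        self.stack.append((tag, classes))

    def handle_data(self, data):
        if self.in_name:
            self.current["name"] += data

    def handle_endtag(self, tag):
        if tag not in [t for t, _ in self.stack]:
            return  # stray closing tag, ignore it
        # Pop back to the matching open tag (tolerates unclosed tags such as <br>).
        while self.stack.pop()[0] != tag:
            pass
        if tag == "a":
            self.in_name = False
        if self.current is not None and len(self.stack) <= self.wrapper_depth:
            # The wrapper just closed, so its record is complete.
            self.current["name"] = self.current["name"].strip()
            self.records.append(self.current)
            self.current = None


def extract_companies(html):
    parser = CompanyParser()
    parser.feed(html)
    return parser.records


for row in extract_companies(SAMPLE_HTML):
    print(row)
```

Because each name and phone is read inside its own wrapper, the two values stay paired per company instead of coming out as two unrelated lists.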

Hope it helps!

Why div.call--phone a and not div.call-cell a?

When an element has two classes, e.g. (call-cell call--phone), does it matter which one you use?