Scraping the YouTube link from an embedded YouTube video iframe

Describe the problem.
I want to get the YouTube link from a song (bhajan) page. Example page: Ram ka hua mai mere Ram ho Gye bhajan lyrics

Every such page has a song title, the lyrics, and further down an iframe embedding the song's YouTube video, such as:
<iframe width="640" height="360" src="https://www.youtube.com/embed/4FDVkG3aZMY?rel=0&fs=1&hl=en_US&showinfo=0&iv_load_policy=3" allowfullscreen></iframe>

I just want the URL of the song, i.e. https://www.youtube.com/embed/4FDVkG3aZMY, not the title or any other detail from the iframe.

If that is not possible, can I get the whole parent element? I will regex the YouTube URL out of it later.
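
For the regex fallback mentioned above, here is a minimal sketch in Python. It assumes the scraped value is the raw iframe HTML as shown in the example, and that the video ID consists only of letters, digits, `_` and `-`. The names `EMBED_RE` and `extract_embed_url` are made up for illustration and are not part of Web Scraper.

```python
import re

# Matches the embed URL up to (but not including) the query string.
# Assumption: the video ID contains only letters, digits, "_" and "-".
EMBED_RE = re.compile(r"https://www\.youtube\.com/embed/[\w-]+")


def extract_embed_url(html):
    """Return the first YouTube embed URL found in html, or None."""
    match = EMBED_RE.search(html)
    return match.group(0) if match else None


iframe_html = (
    '<iframe width="640" height="360" '
    'src="https://www.youtube.com/embed/4FDVkG3aZMY?rel=0&fs=1&hl=en_US'
    '&showinfo=0&iv_load_policy=3" allowfullscreen></iframe>'
)

print(extract_embed_url(iframe_html))
# prints: https://www.youtube.com/embed/4FDVkG3aZMY
```

The same pattern could probably be adapted to the `regex` field of a text selector, but I have not verified that here.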

I checked several older threads about iframes but didn't find a solution that I could get to work. As I am new to scraping, I would like to confirm whether the YouTube URL can be scraped along with the other data. Thanks in advance.

Url: https://bhajanganga.com

Sitemap:
{"_id":"bhajanganga","startUrl":["https://bhajanganga.com/"],"selectors":[{"id":"god-link","parentSelectors":["_root"],"type":"SelectorLink","selector":"a.left_menu","multiple":true,"linkType":"linkFromHref"},{"id":"bhajan-link","parentSelectors":["god-link"],"type":"SelectorLink","selector":"div:nth-of-type(3) a.bhajan_list","multiple":true,"linkType":"linkFromHref"},{"id":"hindi-title","parentSelectors":["bhajan-link"],"type":"SelectorText","selector":"div.head","multiple":false,"regex":""},{"id":"download-lyrics-link","parentSelectors":["bhajan-link"],"type":"SelectorLink","selector":".div_download a","multiple":false,"linkType":"linkFromHref"},{"id":"lyrics-content","parentSelectors":["bhajan-link"],"type":"SelectorText","selector":"p","multiple":false,"regex":""}]}

Hi,

Please check the sitemap below:

{"_id":"bhajanganga","startUrl":["https://bhajanganga.com/"],"selectors":[{"id":"god-link","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":"a.left_menu","type":"SelectorLink"},{"id":"bhajan-link","linkType":"linkFromHref","multiple":true,"parentSelectors":["god-link"],"selector":"div:nth-of-type(3) a.bhajan_list","type":"SelectorLink"},{"id":"hindi-title","multiple":false,"parentSelectors":["bhajan-link"],"regex":"","selector":"div.head","type":"SelectorText"},{"id":"download-lyrics-link","linkType":"linkFromHref","multiple":false,"parentSelectors":["bhajan-link"],"selector":".div_download a","type":"SelectorLink"},{"id":"lyrics-content","multiple":false,"parentSelectors":["bhajan-link"],"regex":"","selector":"p","type":"SelectorText"},{"extractAttribute":"src","id":"youtube","multiple":false,"parentSelectors":["bhajan-link"],"selector":"iframe[src*=\"https://www.youtube.com/\"]","type":"SelectorElementAttribute"}]}

Thanks a tonne!

I was not able to click on the iframe, because clicking played the video instead of selecting the iframe or its parent <div>.

The selector that you added has the type SelectorElementAttribute. Does that mean it can be created through the plugin's UI, or did you edit the JSON yourself? Thanks again.

{
    "extractAttribute": "src",
    "id": "youtube",
    "multiple": false,
    "parentSelectors": [
        "bhajan-link"
    ],
    "selector": "iframe[src*=\"https://www.youtube.com/\"]",
    "type": "SelectorElementAttribute"
}

Glad I could help! If you have a moment, I'd appreciate you leaving a review on the Web Scraper extension page! Your feedback helps us improve!

Point-and-click selection will not work for elements within an iframe. In that case, you have to right-click the element, click Inspect, and check the HTML to work out the correct selector.
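
Note that the `youtube` selector extracts the full `src` attribute, so the exported value would still include the query string (`?rel=0&fs=1...`). If only the bare embed URL is needed, it can be cleaned up after export. Below is a rough sketch in Python using the standard library; the function name `strip_query` is hypothetical, and it assumes the exported value is the unmodified `src` string.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(src):
    """Drop the query string and fragment from a URL, keeping scheme, host and path."""
    parts = urlsplit(src)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Example value taken from the iframe in the question.
src = (
    "https://www.youtube.com/embed/4FDVkG3aZMY"
    "?rel=0&fs=1&hl=en_US&showinfo=0&iv_load_policy=3"
)

print(strip_query(src))
# prints: https://www.youtube.com/embed/4FDVkG3aZMY
```

Parsing the URL rather than splitting on `?` keeps the cleanup working even if a value has no query string at all.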