Scraping text 'behind' image

Hello everyone,

I'm pretty new to the webscraper.io tool and I have a small question about its use.

I want to scrape data from the following website: https://www.transfermarkt.com/statistik/letztevertragsverlaengerungen

The data on this website is presented as a table with the following columns: 'Player', 'Age', 'Nationality', 'Club' and 'New contract until'. Scraping this table works perfectly for all columns except 'Nationality'. Because the content of this column is an image rather than text, the column is left blank after scraping. However, I want to scrape the text that appears when you hover the mouse cursor over the image. For example, the scraper should return the text 'Lithuania' as the nationality in the following case (second row): https://i.imgur.com/5W7yCfy.png

Does anyone know how I can do this? Any help would be really appreciated! :slight_smile:

Use the Element Attribute selector
attribute name = alt
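
In case it helps to see what that selector does under the hood, here is a rough Python sketch of the same idea: find the flag image and read its alt attribute instead of its (empty) text. The HTML snippet is a made-up stand-in for one 'Nationality' cell, and the class name flaggenrahmen is only assumed to be the one used for the flag images; AltCollector is a hypothetical helper, not part of Web Scraper.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collects the alt text of every <img> that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.css_class in classes:
            # The country name lives in the attribute, not in the cell text.
            self.alts.append(attr.get("alt") or "")


# Hypothetical markup for a single 'Nationality' cell (not copied from the site).
SAMPLE = (
    '<td class="zentriert">'
    '<img src="flag.png" class="flaggenrahmen" alt="Lithuania" title="Lithuania">'
    "</td>"
)

collector = AltCollector("flaggenrahmen")
collector.feed(SAMPLE)
nationalities = collector.alts
print(nationalities)
```

A plain text selector sees nothing in that cell because the td contains no text node, which is why the column came out blank.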


Thanks for your reply! That actually works pretty easily :slight_smile:

However, I don't want to scrape the data as an individual element. I want to scrape it as a column in a table, since the data belongs to other elements in the table. Is there a way to get this done?

Even scraping the original table and the element attribute separately (and merging them afterwards in Excel) does not seem to work, because Web Scraper changes the order of the elements in the table.

I hope you can help me out!

Something like this?

{"_id":"tramsfer-markt","startUrl":["https://www.transfermarkt.com/statistik/letztevertragsverlaengerungen"],"selectors":[{"id":"Table-pagin","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"table.items > tbody > tr","multiple":true,"delay":0,"clickElementSelector":"li.naechste-seite a","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"},{"id":"Name","type":"SelectorText","parentSelectors":["Table-pagin"],"selector":"td:nth-of-type(1) td.hauptlink","multiple":false,"regex":"","delay":0},{"id":"Position","type":"SelectorText","parentSelectors":["Table-pagin"],"selector":"td:nth-of-type(1) tr:nth-of-type(2) td","multiple":false,"regex":"","delay":0},{"id":"age","type":"SelectorText","parentSelectors":["Table-pagin"],"selector":"td.zentriert:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"Nationality","type":"SelectorElementAttribute","parentSelectors":["Table-pagin"],"selector":"td.zentriert img.flaggenrahmen","multiple":false,"extractAttribute":"alt","delay":0},{"id":"club","type":"SelectorText","parentSelectors":["Table-pagin"],"selector":"td:nth-of-type(4)","multiple":false,"regex":"","delay":0},{"id":"new contract until","type":"SelectorText","parentSelectors":["Table-pagin"],"selector":"td.zentriert.hauptlink","multiple":false,"regex":"","delay":0}]}
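
The reason this keeps the columns lined up is that every column selector has the row selector (Table-pagin) as its parent, so each value is looked up inside its own row. As a rough illustration only, the same row-first idea can be sketched in Python. The table below is invented sample data with a simplified, flat cell layout, and RowParser is a hypothetical helper, so treat it as a sketch under those assumptions rather than a scraper for the real page.

```python
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Builds one record per <tr>, so values from the same row stay together."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # record being built for the current <tr>
        self._cell = None  # text fragments of the current <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {"cells": [], "nationality": ""}
        elif tag == "td" and self._row is not None:
            self._cell = []
        elif tag == "img" and self._row is not None:
            attr = dict(attrs)
            is_flag = "flaggenrahmen" in (attr.get("class") or "").split()
            # Keep only the first flag per row, like "multiple": false.
            if is_flag and not self._row["nationality"]:
                self._row["nationality"] = attr.get("alt") or ""

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row["cells"].append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented sample rows; the layout is assumed and much simpler than the real table.
SAMPLE = """
<table class="items"><tbody>
<tr><td>Player A</td><td class="zentriert">24</td>
    <td class="zentriert"><img class="flaggenrahmen" alt="Lithuania"></td>
    <td>Club A</td><td class="zentriert hauptlink">2022</td></tr>
<tr><td>Player B</td><td class="zentriert">31</td>
    <td class="zentriert"><img class="flaggenrahmen" alt="Spain"></td>
    <td>Club B</td><td class="zentriert hauptlink">2023</td></tr>
</tbody></table>
"""

parser = RowParser()
parser.feed(SAMPLE)

# Map the positional cells to named columns; the flag cell has no text of its own.
records = [
    {
        "player": row["cells"][0],
        "age": row["cells"][1],
        "nationality": row["nationality"],
        "club": row["cells"][3],
        "contract_until": row["cells"][4],
    }
    for row in parser.rows
]

for record in records:
    print(record)
```

Because each record is assembled while its own row is open, there is nothing to merge afterwards, so the output order of the rows no longer matters.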

Yes, that is indeed exactly what I meant!

Thank you very much, you helped me a lot with this :slight_smile: