Tripadvisor images

DCR · February 24, 2022, 12:38am

Hi guys,

I'm trying to scrape restaurant data from Tripadvisor. Everything works fine except for the images.

I cannot retrieve the images contained in the photo mosaic. However, when I use Data preview, the image URLs appear correctly.

Could someone advise me?

Thanks a lot

{"_id":"trip","startUrl":["https://www.tripadvisor.fr/Restaurants-g187178-zfg9909-Lille_Nord_Hauts_de_France.html"],"selectors":[{"id":"Page Changer ","paginationType":"auto","parentSelectors":["_root","Page Changer "],"selector":"a.nav","type":"SelectorPagination"},{"delay":0,"id":"Restaurang","multiple":true,"parentSelectors":["_root","Page Changer "],"selector":"a.bHGqj","type":"SelectorLink"},{"delay":0,"id":"namn","multiple":false,"parentSelectors":["Restaurang"],"regex":"","selector":"h1.fHibz","type":"SelectorText"},{"delay":0,"id":"Betyg","multiple":false,"parentSelectors":["Restaurang"],"regex":"","selector":"div.cfxpI:nth-of-type(1) div.dbgRU","type":"SelectorText"},{"delay":0,"id":"email","multiple":false,"parentSelectors":["Restaurang"],"selector":"div.bKBJS:nth-of-type(2) a","type":"SelectorLink"},{"delay":0,"id":"tel","multiple":false,"parentSelectors":["Restaurang"],"regex":"","selector":"a span span.brMTW","type":"SelectorText"},{"delay":0,"id":"hemsida","multiple":false,"parentSelectors":["Restaurang"],"selector":".enBrh a.dOGcA","type":"SelectorLink"},{"delay":0,"id":"adress","multiple":false,"parentSelectors":["Restaurang"],"regex":"","selector":".cSPba .dOGcA span.brMTW","type":"SelectorText"},{"delay":0,"id":"tags","multiple":false,"parentSelectors":["Restaurang"],"regex":"","selector":"span.VRlVV","type":"SelectorText"},{"delay":0,"id":"imgs","multiple":true,"parentSelectors":["Restaurang"],"selector":"div.large_photo_wrapper:nth-of-type(n+2) img.basicImg, .mini_photo_wrap img","type":"SelectorImage"}]}

ViestursWS · February 24, 2022, 12:44pm

@DCR Hi, with the current selector setup, each image will appear on a separate row because the selector is set to 'Multiple'. To get all the image URLs in one row, use the 'Grouped' selector with 'Attribute name' set to src.

Example:

{"_id":"tripadvisor-fr","startUrl":["https://www.tripadvisor.fr/Restaurant_Review-g187178-d4242962-Reviews-Aux_Merveilleux_de_Fred-Lille_Nord_Hauts_de_France.html"],"selectors":[{"delay":0,"extractAttribute":"src","id":"images","parentSelectors":["_root"],"selector":"div.large_photo_wrapper:nth-of-type(n+2) img.basicImg, .mini_photo_wrap img","type":"SelectorGroup"}]}

Learn more: Web Scraper "How to" guide: Scrape multiple values in one row
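To apply the same idea inside the original "trip" sitemap, the "imgs" entry could be swapped for a Grouped selector. This is only a sketch: it assumes the Grouped selector behaves the same under the "Restaurang" link selector as it does under "_root" in the example above, and it keeps the original id and CSS selector unchanged. The "multiple" key is dropped because the example above does not use it.

```json
{
  "delay": 0,
  "extractAttribute": "src",
  "id": "imgs",
  "parentSelectors": ["Restaurang"],
  "selector": "div.large_photo_wrapper:nth-of-type(n+2) img.basicImg, .mini_photo_wrap img",
  "type": "SelectorGroup"
}
```

With this in place of the SelectorImage entry, the image URLs for each restaurant should end up together in a single cell of that restaurant's row, rather than one row per image.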

DCR · February 24, 2022, 1:36pm

@ViestursWS thanks a lot, you made my day!