Hi, I'm trying to scrape my restaurant's reviews from TripAdvisor (later I also plan to scrape reviews from Zomato and Google My Business).
For each review I want to scrape the "Title", "Rating", "Date" and "Full review". Everything seems to work except for three things:
- When I export to Excel, I get mismatched columns and rows (e.g. part of the description ends up in the title column, but in a lower row).
- I set up an "Element Click" selector to click "Show more" and reveal the full review, but the scraped reviews still contain the "Show more" text instead of the full review.
- I get columns that I didn't ask for (web-scraper-order, web-scraper-start-url, pagination, pagination-href).
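For the unwanted columns, one possible workaround is to drop them after export. This is only a minimal sketch, assuming the data is exported as CSV: the helper name `drop_extra_columns` and the sample row are hypothetical, and only the four column names are taken from the list above.

```python
import csv
import io

# The four columns from the export that were not part of the requested fields.
EXTRA_COLUMNS = {
    "web-scraper-order",
    "web-scraper-start-url",
    "pagination",
    "pagination-href",
}


def drop_extra_columns(csv_text: str) -> str:
    """Return the CSV text with the extra columns removed (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name not in EXTRA_COLUMNS]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns that are not in `kept`.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row, only to show the shape of the data.
sample = (
    "web-scraper-order,web-scraper-start-url,pagination,pagination-href,titulo,fecha\n"
    "1-1,https://example.com,2,https://example.com/2,Great food,1 May 2019\n"
)
cleaned = drop_extra_columns(sample)
```

Reading and writing through the `csv` module rather than splitting on commas keeps quoted fields, such as review text that itself contains commas or line breaks, in a single cell.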
Thanks for your time and patience.
Antonio
Sitemap:
{"_id":"tripadvisor","startUrl":["https://www.tripadvisor.cl/Restaurant_Review-g294305-d1059717-Reviews-China_Village-Santiago_Santiago_Metropolitan_Region.html"],"selectors":[{"id":"pagination","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":"a.pageNum:nth-of-type(n+2)","multiple":true,"delay":0},{"id":"wrappers","type":"SelectorElement","parentSelectors":["_root","pagination"],"selector":"div.is-9","multiple":true,"delay":0},{"id":"titulo","type":"SelectorText","parentSelectors":["wrappers"],"selector":"span.noQuotes","multiple":false,"regex":"","delay":0},{"id":"rating","type":"SelectorElementAttribute","parentSelectors":["wrappers"],"selector":"span.ui_bubble_rating","multiple":false,"extractAttribute":"class","delay":0},{"id":"descripcion","type":"SelectorText","parentSelectors":["wrappers"],"selector":"p","multiple":false,"regex":"","delay":0},{"id":"fecha","type":"SelectorText","parentSelectors":["wrappers"],"selector":"span.ratingDate","multiple":false,"regex":"","delay":0},{"id":"mostrar-mas","type":"SelectorElementClick","parentSelectors":["wrappers"],"selector":"[data-collapsed='true'] p","multiple":true,"delay":0,"clickElementSelector":"span.ulBlueLinks","clickType":"clickOnce","discardInitialElements":"discard","clickElementUniquenessType":"uniqueText"}]}
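Because the `rating` selector extracts the `class` attribute, the exported value is a class string rather than a number. A minimal sketch of converting it afterwards, assuming the class contains a token such as `bubble_45` for 4.5 out of 5 (that format is an assumption and is not shown in the sitemap); `parse_rating` is a hypothetical helper name.

```python
import re
from typing import Optional


def parse_rating(class_attr: str) -> Optional[float]:
    """Convert a class string to a numeric rating, or None if no token is found."""
    # ASSUMPTION: the class holds a token like "bubble_45", meaning 4.5 out of 5.
    match = re.search(r"bubble_(\d+)", class_attr)
    if match is None:
        return None
    return int(match.group(1)) / 10
```

Returning `None` when the token is missing makes it easier to spot rows where the selector matched something unexpected.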