Booking.com Scrape Hover Price Information?

Hi,

I am trying to scrape the same 2 pieces of information from multiple hover popups.

If you go here for example - Hotel Giorgi, Rome – Updated 2021 Prices

Select check-in/check-out dates and check now; you will get a list of rooms and prices etc. Under "Today's Price", hover over the "?" icon: I would like to scrape the initial price on the first line and the VAT cost on the 2nd line.

I need to copy the same parts from all other hover popups for the different room types on the same page.

How can I achieve this?

Here is my current sitemap, which I can't get to scrape the info above as I need it:

{"_id":"test","startUrl":["https://www.booking.com/hotel/it/hotelcrocedimalta.html","https://www.booking.com/hotel/it/domus-urbis.html","https://www.booking.com/hotel/it/casenadeicollipalermo.html","https://www.booking.com/hotel/it/relais-fontana-di-trevi.html","https://www.booking.com/hotel/it/cardinal-saint-peter.html","https://www.booking.com/hotel/it/hisuiterome.html","https://www.booking.com/hotel/it/giorgi.html"],"selectors":[{"id":"container","type":"SelectorElement","parentSelectors":["_root"],"selector":"tr.js-rt-block-row","multiple":true,"delay":0},{"id":"room","type":"SelectorText","parentSelectors":["container"],"selector":"div.hprt-roomtype-name","multiple":false,"regex":"","delay":0},{"id":"sleeps","type":"SelectorText","parentSelectors":["container"],"selector":".hprt-table-cell-occupancy div.hprt-block","multiple":false,"regex":"","delay":0},{"id":"free-cancellation","type":"SelectorText","parentSelectors":["container"],"selector":".e2e-cancellation span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"no-prepayment","type":"SelectorText","parentSelectors":["container"],"selector":".jq_tooltip span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"price","type":"SelectorElementAttribute","parentSelectors":["_root"],"selector":".bui-popover span","multiple":true,"extractAttribute":"role","delay":0}]}

I can't find today's price.
Please explain your issue with some screenshots.

Hi Asad,

You must select a check in and checkout date to see the room prices before you can get to the section I am referring to.

Here is a screenshot:


I thought it would take forever.
Just hover over the "?" and inspect the price you want to scrape. Select its class and make a new selector with it.
Look at the "per night" selector in the sitemap below and you'll get it.
There is no need to do anything more.

Ask me if you have any questions.

Sitemap:
{"_id":"bookingcom","startUrl":["https://www.booking.com/hotel/it/giorgi.html"],"selectors":[{"id":"container","type":"SelectorElement","parentSelectors":["_root"],"selector":"tr.js-rt-block-row","multiple":false,"delay":0},{"id":"room","type":"SelectorText","parentSelectors":["container"],"selector":"div.hprt-roomtype-name","multiple":false,"regex":"","delay":0},{"id":"sleeps","type":"SelectorText","parentSelectors":["container"],"selector":".hprt-table-cell-occupancy div.hprt-block","multiple":false,"regex":"","delay":0},{"id":"free-cancellation","type":"SelectorText","parentSelectors":["container"],"selector":".e2e-cancellation span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"no-prepayment","type":"SelectorText","parentSelectors":["container"],"selector":".jq_tooltip span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"per night","type":"SelectorText","parentSelectors":["container"],"selector":"span.per-night-tt-table-cell-value-wrapper","multiple":false,"regex":"","delay":0}]}

I've deleted the metadata and unselected "multiple". Sorry for that.

Thank you, but does it grab it from all price popups on the page?

Yes. Just make the parent element ("container") multiple.

Can you please send me the sitemap so I can test it out?

{"_id":"bookingcom","startUrl":["https://www.booking.com/hotel/it/giorgi.html"],"selectors":[{"id":"container","type":"SelectorElement","parentSelectors":["_root"],"selector":"tr.js-rt-block-row","multiple":true,"delay":0},{"id":"room","type":"SelectorText","parentSelectors":["container"],"selector":"div.hprt-roomtype-name","multiple":false,"regex":"","delay":0},{"id":"sleeps","type":"SelectorText","parentSelectors":["container"],"selector":".hprt-table-cell-occupancy div.hprt-block","multiple":false,"regex":"","delay":0},{"id":"free-cancellation","type":"SelectorText","parentSelectors":["container"],"selector":".e2e-cancellation span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"no-prepayment","type":"SelectorText","parentSelectors":["container"],"selector":".jq_tooltip span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"per night","type":"SelectorText","parentSelectors":["container"],"selector":"span.per-night-tt-table-cell-value-wrapper","multiple":false,"regex":"","delay":0}]}


Hi,

Sorry for the late reply!

Thank you so much, that works great.

There is just 1 last part that needs to be scraped, and it's the VAT price; see the screenshot.
Screenshot_3

Are you able to edit the sitemap to scrape that also? I can't seem to get it to work.

Thank you again in advance.

Hi Asad,

When you have time today, please let me know about my final problem above.

Hi, Sorry for the late reply :sweat_smile:

I hope that helps.

Sitemap:
{"_id":"bookingcom","startUrl":["https://www.booking.com/hotel/it/giorgi.html"],"selectors":[{"id":"container","type":"SelectorElement","parentSelectors":["_root"],"selector":"tr.js-rt-block-row","multiple":false,"delay":0},{"id":"room","type":"SelectorText","parentSelectors":["container"],"selector":"div.hprt-roomtype-name","multiple":false,"regex":"","delay":0},{"id":"sleeps","type":"SelectorText","parentSelectors":["container"],"selector":".hprt-table-cell-occupancy div.hprt-block","multiple":false,"regex":"","delay":0},{"id":"free-cancellation","type":"SelectorText","parentSelectors":["container"],"selector":".e2e-cancellation span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"no-prepayment","type":"SelectorText","parentSelectors":["container"],"selector":".jq_tooltip span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"per night","type":"SelectorText","parentSelectors":["container"],"selector":"span.per-night-tt-table-cell-value-wrapper","multiple":false,"regex":"","delay":0},{"id":"VAT","type":"SelectorText","parentSelectors":["container"],"selector":"tr:contains('10 % VAT') td.per-night-tt-table-cell-value","multiple":false,"regex":"","delay":0}]}


You sir are a legend! Thank you so much.

2 final questions, let's see if you are the ultimate scraping master! lol.

  1. How can I grab the hotel name WITHOUT the little badge next to it, which I highlighted in my screenshot? The hotel name appears in other places too; this is just 1 example you could use, but if you find a better reference point then that's fine.
    Screenshot_4

  2. I don't suppose you can do something to add the "per night" and "vat" together? or is that pushing the boundaries of the tool as it can only scrape?

Kind Regards
Jamie

Thanks for your appreciation. Take a look and tell me if it's not working or you need anything else.
Are you Monty or Jamie?

Sitemap:
{"_id":"bookingcom","startUrl":["https://www.booking.com/hotel/it/giorgi.html"],"selectors":[{"id":"container","type":"SelectorElement","parentSelectors":["_root"],"selector":"tr.js-rt-block-row","multiple":true,"delay":0},{"id":"room","type":"SelectorText","parentSelectors":["container"],"selector":"div.hprt-roomtype-name","multiple":false,"regex":"","delay":0},{"id":"sleeps","type":"SelectorText","parentSelectors":["container"],"selector":".hprt-table-cell-occupancy div.hprt-block","multiple":false,"regex":"","delay":0},{"id":"free-cancellation","type":"SelectorText","parentSelectors":["container"],"selector":".e2e-cancellation span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"no-prepayment","type":"SelectorText","parentSelectors":["container"],"selector":".jq_tooltip span.ungreen-condition-green","multiple":false,"regex":"","delay":0},{"id":"per night","type":"SelectorText","parentSelectors":["container"],"selector":"span.per-night-tt-table-cell-value-wrapper","multiple":false,"regex":"","delay":0},{"id":"VAT","type":"SelectorText","parentSelectors":["container"],"selector":"tr:contains('10 % VAT') td.per-night-tt-table-cell-value","multiple":false,"regex":"","delay":0},{"id":"Hotel Name","type":"SelectorText","parentSelectors":["_root"],"selector":"span.bh-photo-modal-name","multiple":false,"regex":"","delay":0},{"id":"All Charges","type":"SelectorText","parentSelectors":["container"],"selector":"table.per-night-tt-table","multiple":false,"regex":"","delay":0}]}
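
On question 2 (adding "per night" and "VAT" together), one option is to do the sum after exporting the scrape rather than in the sitemap. Here is a minimal Python sketch of that idea. It assumes the scraped values look like "€ 1,234.50" (comma as thousands separator, dot as decimal point) and that the columns are named "per night" and "VAT" as in the sitemap above; the function names `parse_price` and `total_with_vat` are made up for illustration.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first number found in a price string as a Decimal.

    Assumes a format like '€ 1,234.50'. Returns None for empty values,
    'null', or any text that contains no digits.
    """
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return Decimal(match.group(0).replace(",", ""))


def total_with_vat(row, price_col="per night", vat_col="VAT"):
    """Add the per-night price and the VAT amount of one scraped row.

    `row` is a dict such as the ones produced by csv.DictReader.
    Returns None if either value is missing or cannot be parsed.
    """
    price = parse_price(row.get(price_col))
    vat = parse_price(row.get(vat_col))
    if price is None or vat is None:
        return None
    return price + vat
```

To use it, read the exported CSV with `csv.DictReader` and call `total_with_vat(row)` for each row. If your prices use a comma as the decimal separator, the parsing would need to be adjusted.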

LOL hi again!

OK, you can delete the "All Charges" column as I don't need this.

It's great that it's now pulling the hotel name!!

But I have a final challenge for you! :smiley:

The room types are only mentioned once, but you get different prices for the same room type, as shown in my screenshot below. The scraper fills the missing room type in with null, so when it scrapes several room types and some prices don't have the room type next to them, how can I identify which price is for which room?

So the room type must always be next to the relevant price, if you understand me?

That's where my superpowers get limited.
I don't think it can be done by my gauntlet.

:frowning: I am actually sad to have defeated Asad Thanos! lol

What about the "order" in which it grabs the details?

It would be good if it could be like:

Double room - price - etc
null - price - etc
null - price - etc

Triple room - price - etc
null - price - etc

That way, after the scrape, I will know that the nulls listed up until the next room type belong to that category.

The scraper doesn't work that way, or I don't know how to do it. :nerd_face:
We need the real experts here. @leemeng @viesturs @KristapsWS

The strange thing is, when I do a data preview on the "container" you created, it shows the exact layout/order that I am looking for. However, when I run the scrape it jumbles up the order. Is there something we can do to make the scrape match the data preview order?

Actually, if I export the data to CSV and sort it by "web-scraper-order", it shows the data in the correct order, so that's good at least: I know all the "null" room fields relate to the previously mentioned room type, up until the next room type heading :slight_smile:

It would be great if someone could work out a way of filling in the room types instead of null and having the scraped data sorted by "web-scraper-order" Z-A (oldest to newest).
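
Until someone finds a way to do this inside the sitemap, the sort-then-fill idea described above can be applied to the exported CSV. Here is a minimal Python sketch. It assumes "web-scraper-order" values look like "1612345678-12" (numbers joined by a hyphen), that a missing room type is exported as either an empty cell or the text "null", and that ascending order matches the page order. The function names and the file name in the usage note are made up for illustration.

```python
def order_key(value):
    """Sort key for 'web-scraper-order' values.

    Assumes values such as '1612345678-12'. Each hyphen-separated part is
    compared as a number, so '...-2' sorts before '...-10'.
    """
    return tuple(int(part) if part.isdigit() else 0 for part in value.split("-"))


def fill_room_types(rows, room_col="room", order_col="web-scraper-order"):
    """Sort rows by scrape order, then copy the last seen room type into
    rows where the room type is empty or 'null'.

    `rows` is a list of dicts (for example from csv.DictReader).
    Returns a new list and leaves the input rows unchanged.
    """
    ordered = sorted(rows, key=lambda row: order_key(row[order_col]))
    last_room = None
    filled = []
    for row in ordered:
        room = (row.get(room_col) or "").strip()
        if room and room.lower() != "null":
            last_room = room  # a new room type heading starts here
        new_row = dict(row)
        new_row[room_col] = last_room if last_room is not None else room
        filled.append(new_row)
    return filled
```

For example, with a hypothetical export named bookingcom.csv: open the file, build `rows = list(csv.DictReader(f))`, and call `fill_room_types(rows)`. If the sorted order comes out reversed for your export, pass the rows through `sorted(..., reverse=True)` instead.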
