Get menu names from dropdown

Describe the problem. I am scraping menu data from a menu provider. The different menus (e.g. Pub Menu, Dinner Menu, Desserts) are in a dropdown. Because this dropdown only controls which items are shown on the page, I can scrape all the menu items (from Pub Menu, Dinner Menu and Desserts) without using the dropdown selector. However, my data doesn't show which menu each item came from, so I don't know whether a menu item is from the Pub Menu or the Dinner Menu.

Url: http://places.singleplatform.com/quinns-3/menu

Sitemap:
{"_id":"singleplatformmenu","startUrl":["http://places.singleplatform.com/simply-soulful-0/menu"],"selectors":[{"id":"Restaurant Name","type":"SelectorText","parentSelectors":["_root"],"selector":"div.location-title-row","multiple":false,"regex":"","delay":0},{"id":"Entire Menu","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.menus","multiple":true,"delay":0},{"id":"Menu","type":"SelectorElement","parentSelectors":["Entire Menu"],"selector":"div.menu","multiple":true,"delay":0},{"id":"Menu Item Group","type":"SelectorElement","parentSelectors":["Menu Section"],"selector":"div.item","multiple":true,"delay":0},{"id":"Menu Item Name","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"div.item-title-row","multiple":false,"regex":"","delay":0},{"id":"Menu Item Price","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"span.price","multiple":false,"regex":"","delay":0},{"id":"Menu Item Description","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"div.description","multiple":false,"regex":"","delay":0},{"id":"Menu Item Allergens","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"div.allergens-group","multiple":false,"regex":"","delay":0},{"id":"Add Ons Group","type":"SelectorElement","parentSelectors":["Menu Item Group"],"selector":"div.addon","multiple":true,"delay":0},{"id":"Add On Description","type":"SelectorText","parentSelectors":["Add Ons Group"],"selector":"li.text","multiple":false,"regex":"","delay":0},{"id":"Add On Price","type":"SelectorText","parentSelectors":["Add Ons Group"],"selector":".price li","multiple":false,"regex":"","delay":0},{"id":"Footer","type":"SelectorText","parentSelectors":["Entire Menu"],"selector":"div.footnote","multiple":false,"regex":"","delay":0},{"id":"Menu Section","type":"SelectorElement","parentSelectors":["Menu"],"selector":"div.section","multiple":true,"delay":0},{"id":"Menu Section Name","type":"SelectorText","parentSelectors":["Menu Section"],"selector":"div.title","multiple":false,"regex":"","delay":0},{"id":"Menu Section Description","type":"SelectorText","parentSelectors":["Menu Section"],"selector":".items > div.description","multiple":false,"regex":"","delay":0}]}

The menu name seems to be updated to the right of the dropdown after every selection, so you could create a text selector to scrape it.

@leemeng I tried this as a Text Selector under the Entire Menu parent selector.

It just gives me Pub Menu as the text output. I am struggling to get the other menu names and to link them with the actual menu items.

I think it might be because the page actually has all the menus on one page and the dropdown selector just affects the UI.
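
If the hunch above is right and every menu is already in the page, one thing to try is a text selector parented on Menu instead of Entire Menu, so that each div.menu contributes its own name to its own rows. A rough sketch of such a selector entry follows. The id "Menu Name" is made up for illustration, and the h2 selector assumes each div.menu contains its own heading, which has not been confirmed on this page.

```json
{
  "id": "Menu Name",
  "type": "SelectorText",
  "parentSelectors": ["Menu"],
  "selector": "h2",
  "multiple": false,
  "regex": "",
  "delay": 0
}
```

If the heading turns out to live outside div.menu, this selector would come back empty and a different parent would be needed.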

I ran a quick test of your scraper and realized it does not click through all the dropdown options. I added a clicker for those, restructured your map, and added a couple more text scrapers. Please modify as needed:

{"_id":"singleplatformmenu_v2","startUrl":["http://places.singleplatform.com/quinns-3/menu"],"selectors":[{"id":"Restaurant Name","type":"SelectorText","parentSelectors":["_root"],"selector":"div.location-title-row","multiple":false,"regex":"","delay":0},{"id":"Entire Menu","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.menus","multiple":true,"delay":0},{"id":"Menu","type":"SelectorElement","parentSelectors":["click_drop_down_menus"],"selector":"div.menu","multiple":true,"delay":0},{"id":"Menu Item Group","type":"SelectorElement","parentSelectors":["Menu Section"],"selector":"div.item","multiple":true,"delay":0},{"id":"Which_menu","type":"SelectorText","parentSelectors":["click_drop_down_menus"],"selector":"h2","multiple":false,"regex":"","delay":0},{"id":"time_slot","type":"SelectorText","parentSelectors":["click_drop_down_menus"],"selector":".header div.description","multiple":false,"regex":"","delay":0},{"id":"Menu Item Name","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"div.item-title-row","multiple":false,"regex":"","delay":0},{"id":"Menu Item Price","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"span.price","multiple":false,"regex":"","delay":0},{"id":"Menu Item Description","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"div.description","multiple":false,"regex":"","delay":0},{"id":"Menu Item Allergens","type":"SelectorText","parentSelectors":["Menu Item Group"],"selector":"div.allergens-group","multiple":false,"regex":"","delay":0},{"id":"Add Ons Group","type":"SelectorElement","parentSelectors":["Menu Item Group"],"selector":"div.addon","multiple":true,"delay":0},{"id":"Add On Description","type":"SelectorText","parentSelectors":["Add Ons Group"],"selector":"li.text","multiple":false,"regex":"","delay":0},{"id":"Add On Price","type":"SelectorText","parentSelectors":["Add Ons Group"],"selector":".price li","multiple":false,"regex":"","delay":0},{"id":"Footer","type":"SelectorText","parentSelectors":["click_drop_down_menus"],"selector":"div.footnote","multiple":false,"regex":"","delay":0},{"id":"Menu Section","type":"SelectorElement","parentSelectors":["Menu"],"selector":"div.section","multiple":true,"delay":0},{"id":"Menu Section Name","type":"SelectorText","parentSelectors":["Menu Section"],"selector":"div.title","multiple":false,"regex":"","delay":0},{"id":"Menu Section Description","type":"SelectorText","parentSelectors":["Menu Section"],"selector":".items > div.description","multiple":false,"regex":"","delay":0},{"id":"click_drop_down_menus","type":"SelectorElementClick","parentSelectors":["Entire Menu"],"selector":"_parent_","multiple":true,"delay":"3000","clickElementSelector":"div.nav-row > div > div:nth-child(n+1)","clickType":"clickOnce","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueHTMLText"}]}

@leemeng thank you for taking a look at this.

I ran the web scraper above. Unfortunately, this is what happens:

- It cycles through the different menus in the dropdown.
- Because the data from all 5 menus is available to scrape no matter which dropdown option is selected, you end up with 5 instances of each menu line item.
- For example, the menu URL in the scraper has a Brunch menu with a section called Brunch Drinks. In the scraped output, the Brunch Drinks section appears under the Brunch menu but also under the Dinner, Lunch, Desserts and Happy Hour menus.

If you have any other suggestions I would appreciate it as I have hit a wall.

I think it may have something to do with this code which seems to control which menu is shown and which are hidden.

Heh, I should have taken a closer look at the data. You can change the Discard option in click_drop_down_menus to "Discard when click element exists" and that should fix it.
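
In the exported sitemap, that option is stored in the discardInitialElements field of the click selector. Here is a sketch of the changed entry, assuming the stored value for "Discard when click element exists" is discard-when-click-element-exists. Every other field is copied unchanged from the singleplatformmenu_v2 sitemap.

```json
{
  "id": "click_drop_down_menus",
  "type": "SelectorElementClick",
  "parentSelectors": ["Entire Menu"],
  "selector": "_parent_",
  "multiple": true,
  "delay": "3000",
  "clickElementSelector": "div.nav-row > div > div:nth-child(n+1)",
  "clickType": "clickOnce",
  "discardInitialElements": "discard-when-click-element-exists",
  "clickElementUniquenessType": "uniqueHTMLText"
}
```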

@leemeng sadly this results in only one menu (i.e. the Dinner menu) being scraped, with all the menu items attached to it, including items that belong to other menus (e.g. the Breakfast drinks menu section).