I am trying to scrape a website that has two different kinds of layout.
The most common layout looks like this: https://www.chiquito.co.uk/restaurants/london/croydon/croydon and there we just have to put an element selector around section.Nap.
But sometimes we get this other kind of layout: https://www.chiquito.co.uk/restaurants/london/london where section.Nap no longer exists, so we have to work on Teaser instead.
Start URL: https://www.chiquito.co.uk/restaurants
I tried working only on Teaser for every store and then deduplicating the results, but I only get 77 stores out of 85.
Does anybody know a clean solution to get all 85 stores, please?
Sitemap:
{"startUrl":"https://www.chiquito.co.uk/restaurants/","selectors":[{"parentSelectors":["_root"],"type":"SelectorLink","multiple":true,"id":"first-href","selector":"a.Directory-listLink","delay":""},{"parentSelectors":["first-href"],"type":"SelectorLink","multiple":true,"id":"second-href","selector":"a.Directory-listLink","delay":""},{"parentSelectors":["second-href"],"type":"SelectorElement","multiple":true,"id":"element","selector":"article.Teaser","delay":""},{"parentSelectors":["element"],"type":"SelectorText","multiple":false,"id":"name","selector":"a.Teaser-titleLink","regex":"","delay":""},{"parentSelectors":["element"],"type":"SelectorText","multiple":false,"id":"address","selector":"div.Teaser-address div.c-AddressRow:nth-of-type(1)","regex":"","delay":""},{"parentSelectors":["element"],"type":"SelectorText","id":"city","selector":"span.c-address-city","delay":"","multiple":false,"regex":""},{"parentSelectors":["element"],"type":"SelectorText","multiple":false,"id":"zip_code","selector":"span.c-address-postal-code","regex":"(GIR|[A-Z]\\d[A-Z\\d]??|[A-Z]{2}\\d[A-Z\\d]??)[ ]??(\\d[A-Z]{2})","delay":""},{"parentSelectors":["element"],"type":"SelectorText","multiple":false,"id":"address2","selector":"span.c-address-street-2","regex":"","delay":""},{"parentSelectors":["element"],"type":"SelectorElementAttribute","multiple":false,"id":"cid","selector":"a.c-get-directions-button","extractAttribute":"href","delay":""},{"parentSelectors":["element2"],"type":"SelectorText","multiple":false,"id":"address_element2","selector":"span.c-address-street-1","regex":"","delay":""},{"parentSelectors":["element2"],"type":"SelectorText","multiple":false,"id":"city_element2","selector":"span.c-address-city","regex":"","delay":""}],"_id":"chiquito_gbr"}
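For context, here is a minimal sketch of the kind of deduplication step meant here, written in Python. It assumes the scraped rows are available as dictionaries keyed by the selector ids from the sitemap (`name`, `zip_code`). The function name `dedupe_stores` and the sample rows are hypothetical placeholders, not real scraped data.

```python
def dedupe_stores(rows, key_fields=("name", "zip_code")):
    """Keep the first row seen for each normalized key (hypothetical helper)."""
    seen = set()
    unique = []
    for row in rows:
        # Normalize: treat missing/None as empty, ignore case and outer spaces.
        key = tuple((row.get(field) or "").strip().lower() for field in key_fields)
        if key in seen:
            continue  # same store already kept from another listing page
        seen.add(key)
        unique.append(row)
    return unique


# Placeholder rows for illustration only (not actual results from the site).
rows = [
    {"name": "Store A", "zip_code": "AB1 2CD"},
    {"name": "store a ", "zip_code": "AB1 2CD"},
    {"name": "Store B", "zip_code": "EF3 4GH"},
]

unique_rows = dedupe_stores(rows)
print(len(unique_rows))
```

One caveat with this approach: rows where both key fields are empty all share the same key and collapse into a single row, so such rows would need to be checked separately.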
Thanks in advance,
Nicolas.


