The trickiest website I've had to scrape yet... Please help me :)

Hi Webscrapers,

I'm trying to scrape this website:
https://eurosatory-prod.mobile-spot.com/?/map&lang=fr&profile=webapp-exh

The element click works well, but the "text" and "link" selectors are not scraped between clicks, so the output is empty.

What do you think I could do?

Thanks!

Quentin

Sitemap:
{"_id":"eurosatory","startUrl":["eurosatory exposant","parentSelectors":["_root"],"type":"SelectorElementClick","clickElementSelector":"li[data-has-places='true']:nth-of-type(n+2)","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":"2000","discardInitialElements":"do-not-discard","multiple":true,"selector":"li[data-has-places='true']:nth-of-type(n+2)"},{"id":"Nom ","parentSelectors":["_root"],"type":"SelectorText","selector":"div.prop-desc","multiple":false,"delay":0,"regex":""},{"id":"URL","parentSelectors":["_root"],"type":"SelectorLink","selector":"div:nth-of-type(5) .prop-right a","multiple":false,"delay":0},{"id":"Email","parentSelectors":["_root"],"type":"SelectorText","selector":"div:nth-of-type(6) a","multiple":false,"delay":0,"regex":""},{"id":"Adresse","parentSelectors":["_root"],"type":"SelectorText","selector":"div:nth-of-type(7) .prop-right div","multiple":false,"delay":0,"regex":""}]}

Hi @Quentin

You can use an "Element click" selector with div.all-informations as its selector and li[data-has-places='true'] span as its "click selector", then set it as the 'parent' of all the remaining selectors you are trying to extract from the exposants' main page.

Example:

{"_id":"eurosatory-prod-mobile-spot-com","startUrl":["https://eurosatory-prod.mobile-spot.com/?/list&locateAll=false&inputs=%5B%7B%22dataType%22:%22exhibitors%22%7D%5D&lang=fr&profile=webapp-exh"],"selectors":[{"id":"exposants-click","parentSelectors":["_root"],"type":"SelectorElementClick","clickElementSelector":"li[data-has-places='true']:nth-of-type(-n+10) span","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","multiple":true,"selector":"div.all-informations"},{"id":"description","parentSelectors":["exposants-click"],"type":"SelectorText","selector":"p","multiple":false,"delay":0,"regex":""},{"id":"phone","parentSelectors":["exposants-click"],"type":"SelectorText","selector":"div.prop-left:has(span.fa-phone) + div","multiple":false,"delay":0,"regex":""}]}

Hope that helps.

P.S. When pasting your sitemap, make sure it is valid and use the 'Preformatted' text option.
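
One way to check a sitemap before pasting it is to run it through a JSON parser. The sketch below is only an illustration: `check_sitemap` is a hypothetical helper, and the two checks it performs (the text parses as JSON, and every entry in `parentSelectors` is either `_root` or the `id` of another selector) are an assumption about what "valid" means here. Web Scraper's own import check may differ.

```python
import json


def check_sitemap(text):
    """Return a list of problems found in a pasted sitemap string (hypothetical helper)."""
    try:
        sitemap = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} at position {err.pos}"]

    if not isinstance(sitemap, dict) or not isinstance(sitemap.get("selectors"), list):
        return ["missing 'selectors' list"]

    selectors = sitemap["selectors"]
    # "_root" is the top-level parent used in the sitemaps in this thread.
    known_ids = {"_root"} | {s.get("id") for s in selectors}

    problems = []
    for s in selectors:
        for parent in s.get("parentSelectors", []):
            if parent not in known_ids:
                problems.append(
                    f"selector {s.get('id')!r} has unknown parent {parent!r}"
                )
    return problems


# Shortened demo sitemaps, not the full ones from this thread.
good = (
    '{"_id":"demo","startUrl":["https://example.com/"],"selectors":['
    '{"id":"exposants-click","parentSelectors":["_root"],"type":"SelectorElementClick"},'
    '{"id":"description","parentSelectors":["exposants-click"],"type":"SelectorText"}]}'
)
broken = '{"_id":"demo","startUrl":["demo","parentSelectors":["_root"]}'

print(check_sitemap(good))    # an empty list means no problems were found
print(check_sitemap(broken))  # reports where the JSON parser gave up
```

An empty list only means the text parsed and the parent references line up; it does not show that the CSS selectors match anything on the page.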

Hi @ViestursWS,

Thanks a lot for your answer!

However, I tested your sitemap and it doesn't export any data from the exposants pages.
(I copy-pasted it without making any modification.)
It browses through the different exposants pages, but it doesn't scrape any data.

Did it manage to export data on your side?

Hi again @ViestursWS,

My mistake, your sitemap did work. However it only scraped the first 9 exposants.

I tried to modify it to scrape all 1757 exposants, but I have the same problem again: no data is scraped at all.

Do you have any idea what I'm doing wrong?

Thanks a lot!

Sitemap:

{"_id":"eurosatory3","startUrl":["https://eurosatory-prod.mobile-spot.com/?/list&locateAll=false&inputs=%5B%7B%22dataType%22:%22exhibitors%22%7D%5D&lang=fr&profile=webapp-exh"],"selectors":[{"clickElementSelector":"li[data-has-places='true']:nth-of-type(n+2) span","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":"2000","discardInitialElements":"do-not-discard","id":"exposants-click","multiple":true,"parentSelectors":["_root"],"selector":"div.all-informations","type":"SelectorElementClick"},{"delay":0,"id":"Website URL","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"div.prop-left:has(span.fa-link) + div","type":"SelectorText"},{"delay":0,"id":"Email","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"div.prop-left:has(span.fa-envelope) + div","type":"SelectorText"},{"delay":0,"id":"Adresse","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"div.prop-left:has(span.fa-university) + div","type":"SelectorText"},{"delay":0,"id":"Description","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"p","type":"SelectorText"}]}

Hi @ViestursWS,

It worked.
I used nth-of-type(-n+1750) in the click selector.

Thanks for your help!

Sitemap:

{"_id":"eurosatory","startUrl":["https://eurosatory-prod.mobile-spot.com/?/list&locateAll=false&inputs=%5B%7B%22dataType%22:%22exhibitors%22%7D%5D&lang=fr&profile=webapp-exh"],"selectors":[{"clickElementSelector":"li[data-has-places='true']:nth-of-type(-n+1750) span","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","id":"exposants-click","multiple":true,"parentSelectors":["_root"],"selector":"div.all-informations","type":"SelectorElementClick"},{"delay":0,"id":"Email","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"div.prop-left:has(span.fa-envelope) + div","type":"SelectorText"},{"delay":0,"id":"Website URL","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"div.prop-left:has(span.fa-link) + div","type":"SelectorText"},{"delay":0,"id":"Adresse","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"div.prop-left:has(span.fa-university) + div","type":"SelectorText"},{"delay":0,"id":"Description","multiple":false,"parentSelectors":["exposants-click"],"regex":"","selector":"p","type":"SelectorText"}]}
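
For anyone reading later, the two nth-of-type formulas used in this thread select different ranges: `-n+10` matches positions 1 to 10, while `n+2` matches position 2 onwards. The position is counted among all sibling `li` elements, not only those that also match `[data-has-places='true']`. The small simulation below illustrates this. The 12-item list and its gap at position 4 are made up, the function names are hypothetical, and it assumes all `li` elements share one parent. It shows one possible reason a `-n+10` selector could return 9 results; nothing in this thread confirms that this is what happened on the real page.

```python
def matches_an_plus_b(a, b, position):
    """True if position == a*n + b for some integer n >= 0 (positions start at 1)."""
    if a == 0:
        return position == b
    diff = position - b
    return diff % a == 0 and diff // a >= 0


def select(a, b, has_places):
    """Positions matched by li[data-has-places='true']:nth-of-type(an+b).

    has_places holds one boolean per sibling <li>, in document order.
    """
    return [
        pos
        for pos, flag in enumerate(has_places, start=1)
        if flag and matches_an_plus_b(a, b, pos)
    ]


# Hypothetical list of 12 sibling <li> elements; the 4th lacks data-has-places='true'.
items = [True] * 12
items[3] = False

first_ten = select(-1, 10, items)   # :nth-of-type(-n+10)
from_second = select(1, 2, items)   # :nth-of-type(n+2)

print(first_ten)     # positions among the first ten that carry the attribute
print(from_second)   # every position from the second onwards that carries it
```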