How to get the info for each product on each page?

Hi, I'm trying to grab the info for each car on this site, which has many pages, each with links to many cars. Basically, I want to open each car and get the price, model info, and telephone number, from page 1 to the last page. The telephone number only appears after clicking on "Ver Teléfono".

I show it in this video

My current sitemap only opens the links of 3 cars and then stops. Thanks in advance.

{"_id":"coches","startUrl":["https://www.coches.net/segunda-mano/"],"selectors":[{"id":"cars","parentSelectors":["_root"],"type":"SelectorLink","selector":"a.mt-CardBasic-titleLink","multiple":true,"linkType":"linkFromHref"},{"id":"next","parentSelectors":["cars","next"],"paginationType":"auto","type":"SelectorPagination","selector":".sui-AtomButton-rightIcon .sui-AtomIcon--small svg"}]}

Scrolling is needed so that Web Scraper can find all the links on the page.
Check this sitemap, which collects data from the first 3 pages:

{"_id":"TEST_coches","startUrl":["https://www.coches.net/segunda-mano/?pg=[1-3]"],"selectors":[{"delay":2000,"elementLimit":50,"id":"scroll","multiple":false,"parentSelectors":["_root"],"selector":"div.mt-AdsList-pagination","type":"SelectorElementScroll"},{"id":"cars","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":"div[data-ad-position] a:has(img)","type":"SelectorLink"},{"id":"name","multiple":false,"parentSelectors":["cars"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"price","multiple":false,"parentSelectors":["cars"],"regex":"","selector":"div[data-testid=\"card-adPrice-price\"]","type":"SelectorText"},{"clickActionType":"real","clickElementSelector":".mt-FormContactSeller-callButtonsPhone button:contains(\"Ver teléfono\")","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard","id":"click","multiple":false,"parentSelectors":["cars"],"selector":".mt-FormContactSeller-callButtonsPhone button:contains(\"Ver teléfono\")","type":"SelectorElementClick"},{"id":"phone","multiple":false,"parentSelectors":["cars"],"regex":"","selector":"div.mt-FormContactSeller-callButtonsPhone","type":"SelectorText"}]}
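
For reference, the `[1-3]` in `startUrl` is Web Scraper's page-range notation: it expands into one start URL per number in the range. As a sketch, the start URL above should be equivalent to listing the three pages explicitly (only the `startUrl` key is shown here; the selectors stay the same):

```json
{
  "startUrl": [
    "https://www.coches.net/segunda-mano/?pg=1",
    "https://www.coches.net/segunda-mano/?pg=2",
    "https://www.coches.net/segunda-mano/?pg=3"
  ]
}
```

To cover more pages, raise the upper bound of the range, for example `?pg=[1-20]` (20 is an arbitrary value used for illustration).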

Thanks so much for your help. It's as simple as defining the page interval [1-N]!
I'm trying it and it works. What I see is that some car pages show the phone higher up and others lower down, and the sitemap currently only retrieves the phone number for those where it appears lower down.

This is an example where the phone appears higher up; I found the XPath to be //SPAN[@class='mt-LeadPhoneCall-linkText mt-LeadPhoneCall-linkText--medium'][text()='Ver teléfono']

And this is a case where the phone number appears lower down; I found that the XPath could be either of these two:

(//BUTTON[@shape='circular'])[10]
//div[@class='mt-FormContactSeller-callButtonsPhone']//button[1]

Maybe you can help me handle both cases so that I get the phone number either way. Regards

Check this:

{"_id":"TEST_coches","startUrl":["https://www.coches.net/segunda-mano/?pg=[1-3]"],"selectors":[{"delay":2000,"elementLimit":50,"id":"scroll","multiple":false,"parentSelectors":["_root"],"selector":"div.mt-AdsList-pagination","type":"SelectorElementScroll"},{"id":"cars","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":"div[data-ad-position] a:has(img)","type":"SelectorLink"},{"id":"name","multiple":false,"parentSelectors":["cars"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"price","multiple":false,"parentSelectors":["cars"],"regex":"","selector":"div[data-testid=\"card-adPrice-price\"]","type":"SelectorText"},{"clickActionType":"real","clickElementSelector":".mt-FormContactSeller-callButtonsPhone button:contains(\"Ver teléfono\")","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard","id":"click","multiple":false,"parentSelectors":["cars"],"selector":".mt-FormContactSeller-callButtonsPhone button:contains(\"Ver teléfono\")","type":"SelectorElementClick"},{"id":"phone1","multiple":false,"parentSelectors":["cars"],"regex":"","selector":"div.mt-FormContactSeller-callButtonsPhone","type":"SelectorText"},{"clickActionType":"real","clickElementSelector":"p[data-testid=\"lead-phone-call-link\"]","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":1500,"discardInitialElements":"discard","id":"click_phone2","multiple":false,"parentSelectors":["cars"],"selector":"p[data-testid=\"lead-phone-call-link\"]","type":"SelectorElementClick"},{"extractAttribute":"arial-label","id":"phone2","multiple":false,"parentSelectors":["cars"],"selector":"p[data-testid=\"lead-phone-call-link\"]","type":"SelectorElementAttribute"}]}

Thanks, it seems it gets both numbers now. Just 2 last questions.

I see that it is very slow, since it opens the URL of each car. For 2 pages it took more than 10 minutes. Is there a way to speed up the process? For example, getting the data without loading each URL, or something similar?

And finally, if I want to extract more data from each car's URL, do I only need to add another selector inside the "cars" ID in your sitemap?

  1. There is no way to speed up the scraping process...
  2. Correct, if you add another selector under "cars", it will work.
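
As a sketch of point 2, an extra field is one more object appended to the `selectors` array with `cars` as its parent, following the same shape as the `name` and `price` selectors in the sitemap above. The `id` (`extra_field`) and the `selector` value are placeholders, not taken from the site; replace the selector with the CSS selector of the element you want on the car page:

```json
{
  "id": "extra_field",
  "multiple": false,
  "parentSelectors": ["cars"],
  "regex": "",
  "selector": "REPLACE_WITH_CSS_SELECTOR",
  "type": "SelectorText"
}
```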