HELP! Scraper does find correct links but only checks the last one?

Hi all,

I am trying to scrape some info from the following site: Search Results

I managed to get the scraper to expand the list and find the desired links (namely the clickable company names). Now I want to scrape some info from the page behind every company link (for simplicity, let's say I want to scrape the company name from each company page), but the scraper returns the name of the last company for all company links! Does anyone know how to solve this?

Sitemap:
{"_id":"sapcompaniesdenmark","startUrl":["Search Results p","clickType":"clickMore","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueCSSSelector"},{"id":"SelectCompanyLinkInsideWrapper","type":"SelectorLink","parentSelectors":["ExpandListAndSelectCompanyWrappers"],"selector":".search-result__head a","multiple":false,"delay":0},{"id":"SaveName","type":"SelectorText","parentSelectors":["SelectCompanyLinkInsideWrapper"],"selector":".partner-details section:nth-of-type(1) header","multiple":false,"regex":"","delay":0}]}

Hi @qwerty, your sitemap did not work when I tried to copy it, but I hope I managed to figure out what you were after! :wink:

{"_id":"partneredge-sap-com","startUrl":["https://partneredge.sap.com/content/partnerfinder/search.html#/search/results?country=scm_v_country061&itemsPerPage=10&sortBy=shortname&sortOrder=asc"],"selectors":[{"id":"company-wrapper","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"li.search-results__item","multiple":true,"delay":"1200","clickElementSelector":"button.btn__show-more","clickType":"clickMore","discardInitialElements":"discard-when-click-element-exists","clickElementUniquenessType":"uniqueHTML"},{"id":"company-name","type":"SelectorText","parentSelectors":["company-wrapper"],"selector":".search-result__head a","multiple":false,"regex":"","delay":0}]}

He doesn't need that. He wants to click the link and scrape the data which is on the company page like contact name, email, etc.

List> Show more + Click On Company Name> Get Data from the linked page.

I've tried, but I couldn't get it to work.

@Asad Oh, I see. I tested it with two of the company pages as start URLs, and it refuses to go on to the next link.

@qwerty It seems the issue lies in the link itself: because of the "#" symbol, the extension has difficulties proceeding further.

After trying different combinations in the Cloud Scraper environment, I managed to get the result you were probably after.

Thanks, both, for your effort! @ViestursWS, do the things you did in Cloud Scraper translate into the final sitemap? If not, how can I replicate your findings? It seems you did indeed find what I am looking for!

Thanks for your help in advance!

It is probably easier to scrape this site in two stages: in stage 1, you get all the URLs of the company pages, and in stage 2, you use a different sitemap that takes all those stage 1 URLs as its start URLs.

The company links look like this:

#/partner/details/0000608336

so you need to prefix them with the site URL to turn them into proper URLs:

https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000608336

A straightforward search and replace.
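If the stage 1 list is long, the prefixing can also be scripted. The following is only a minimal sketch in Python: the function names `to_full_url` and `build_start_urls` are made up for illustration, and it assumes the stage 1 export gives you the links as plain strings like the one above. It prints a JSON array that can be pasted into the `startUrl` field of the stage 2 sitemap.

```python
import json

# Page that hosts the hash-routed company views (taken from the URL above).
BASE_URL = "https://partneredge.sap.com/content/partnerfinder/search.html"


def to_full_url(link: str, base: str = BASE_URL) -> str:
    """Turn a link like '#/partner/details/0000608336' into a full URL."""
    link = link.strip()
    if link.startswith(("http://", "https://")):
        return link  # already a full URL, leave it alone
    if not link.startswith("#"):
        link = "#" + link  # tolerate links exported without the leading '#'
    return base + link


def build_start_urls(links):
    """Prefix every link and drop duplicates while keeping the original order."""
    seen = set()
    urls = []
    for link in links:
        url = to_full_url(link)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Hypothetical stage 1 output; replace with your own exported links.
    stage_one_links = [
        "#/partner/details/0000608336",
        "#/partner/details/0000922309",
    ]
    print(json.dumps(build_start_urls(stage_one_links)))
```

Dropping duplicates is optional, but it avoids visiting the same company page twice if the stage 1 export contains repeated links.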

Thanks @leemeng for your suggestion, that makes sense. However, the extension still doesn't work properly, probably because of the "#", as @ViestursWS said?

With multiple start URLs, the output still contains the data of only a single link:

Sitemap:
{"_id":"finalfinal","startUrl":["https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000147974","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000353434","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000922309","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001377841","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001015680","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001408938","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001849128","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000874377","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000349803","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000969711","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001399373","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001578553","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001524434","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000381240","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001139113","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001135412","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000906154","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001086789","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000867573","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001570772","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001260100","https://partneredge.sap.com/content
/partnerfinder/search.html#/partner/details/0001217394","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000229025","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001809830","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000022164","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001278328","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000635318","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001612823","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000881176","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001538749","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001768183","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000435600","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000398962","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0002474285","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000039257","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001849128","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000908633","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001779144","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0002296266","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000695306","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000219650","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001571635","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000038998","https://par
tneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001175596","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001192229","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001548874","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001052828","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000011883","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000022209","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001086789","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000837167","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001101676","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001098721","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000881617","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001418558","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001860823","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000913899","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000762267","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001736804","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000148277","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000011837","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001330946","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001187957","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000026470","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/
0000506579","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001683926","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000219223","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001114078","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001378860","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001810800","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001299079","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001205088","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000658199","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001307261","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001297619","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001495496","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001823407","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000940933","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000979331","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000668894","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000643077","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000489243","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001761033","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0002050202","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001557539","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001060055","https://partneredge.sap.com/content/partnerfinder/searc
h.html#/partner/details/0001314977","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000147871","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000454027","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000667100","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000871208","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001223440","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000353785","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000832330","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000650037","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000665463","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001330946","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000880860","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001188654","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0001853963","https://partneredge.sap.com/content/partnerfinder/search.html#/partner/details/0000856053"],"selectors":[{"id":"Companyname","type":"SelectorText","parentSelectors":["_root"],"selector":".partner-details section:nth-of-type(1) header","multiple":false,"regex":"","delay":0},{"id":"PartnerLevel","type":"SelectorText","parentSelectors":["_root"],"selector":"section:nth-of-type(2) div:nth-of-type(3) .col-md-9 p","multiple":false,"regex":"","delay":0},{"id":"Authorizations","type":"SelectorText","parentSelectors":["_root"],"selector":"div:nth-of-type(4) .col-md-9 p","multiple":true,"regex":"","delay":0}]}

Hmm, interesting. The server seems to be reading from a database via URLs like:
https://partneredge.sap.com/bin/partnerfinder/partnerdetails?partnerId=0000608336
https://partneredge.sap.com/bin/partnerfinder/partnerdetails?partnerId=0000922309

The company numbers at the end of the URLs match the ones you've been getting. However, the data is in JSON format, so you probably can't parse it with Web Scraper (WS) alone. JSON parsing is a common task for languages like Python, Java, C#, etc.
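As a rough illustration of the Python route, here is a minimal sketch using only the standard library. The helper names (`partner_id_from_url`, `details_url`, `fetch_details`) are made up for this example, and I have not checked what the JSON looks like, so the sketch just prints the top-level keys instead of assuming field names. It also assumes the endpoint answers a plain GET request without cookies or extra headers, which may not be the case.

```python
import json
import re
import urllib.request

# Endpoint pattern copied from the two URLs above.
DETAILS_ENDPOINT = (
    "https://partneredge.sap.com/bin/partnerfinder/partnerdetails"
    "?partnerId={partner_id}"
)


def partner_id_from_url(url: str) -> str:
    """Extract the company number from a '#/partner/details/<id>' link."""
    match = re.search(r"/partner/details/(\d+)", url)
    if match is None:
        raise ValueError(f"no partner id found in {url!r}")
    return match.group(1)


def details_url(partner_id: str) -> str:
    """Build the JSON endpoint URL for one company number."""
    return DETAILS_ENDPOINT.format(partner_id=partner_id)


def fetch_details(partner_id: str, timeout: float = 30.0):
    """Download and parse the JSON for one company (assumes a plain GET works)."""
    with urllib.request.urlopen(details_url(partner_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    page_url = (
        "https://partneredge.sap.com/content/partnerfinder/"
        "search.html#/partner/details/0000608336"
    )
    data = fetch_details(partner_id_from_url(page_url))
    # The response layout is unknown here, so inspect it before picking fields.
    print(list(data.keys()) if isinstance(data, dict) else type(data))
```

Once you know which keys hold the company name, partner level and so on, you can loop over all the company numbers from stage 1 and write the fields you need to a CSV file.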