Infinite scroll

Hi,
I'm trying to scrape this page:

https://www.fintastico.com/fintech-uk/

I manually scroll down until I see all the results,
then I try to run this sitemap,
but I only get the "first page" results.
How can I tell the scraper to scroll down automatically?
I also tried "Element scroll down" but I'm not sure how to set it up.
Thanks!

Sitemap:
{"_id":"fintastico","startUrl":["https://www.fintastico.com/fintech-uk/"],"selectors":[{"id":"cards","type":"SelectorElement","parentSelectors":["_root"],"selector":"ul.archive-list li","multiple":true,"delay":0},{"id":"companylink","type":"SelectorLink","parentSelectors":["cards"],"selector":"h4","multiple":true,"delay":0},{"id":"elements","type":"SelectorElement","parentSelectors":["companylink"],"selector":"div.col-md-8","multiple":true,"delay":0},{"id":"website","type":"SelectorText","parentSelectors":["elements"],"selector":"li:nth-of-type(1) a","multiple":false,"regex":"","delay":0},{"id":"linkedin","type":"SelectorText","parentSelectors":["elements"],"selector":"a.in","multiple":false,"regex":"","delay":0}]}

Simple: just change the type of your "cards" selector from Element to Element scroll down (SelectorElementScroll in the sitemap JSON) and add a delay of 2000-3000 milliseconds to it.

{"_id":"fintastico","startUrl":["https://www.fintastico.com/fintech-uk/"],"selectors":[{"id":"cards","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"ul.archive-list li","multiple":true,"delay":"3000"},{"id":"companylink","type":"SelectorLink","parentSelectors":["cards"],"selector":"h4","multiple":true,"delay":0},{"id":"elements","type":"SelectorElement","parentSelectors":["companylink"],"selector":"div.col-md-8","multiple":true,"delay":0},{"id":"website","type":"SelectorText","parentSelectors":["elements"],"selector":"li:nth-of-type(1) a","multiple":false,"regex":"","delay":0},{"id":"linkedin","type":"SelectorText","parentSelectors":["elements"],"selector":"a.in","multiple":false,"regex":"","delay":0}]}
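
If you would rather patch an exported sitemap with a script than click through the UI, here is a minimal Python sketch of the same change. The helper name `enable_scroll` is made up for illustration and is not part of Web Scraper; it only touches the two JSON fields that differ between the sitemaps above (`type` and `delay`). It writes the delay as a number, while the sitemap above stores it as the string "3000"; I'm assuming either form imports fine.

```python
import json


def enable_scroll(sitemap: dict, selector_id: str, delay_ms: int = 3000) -> dict:
    """Return a copy of the sitemap with one selector switched to scroll mode.

    Hypothetical helper: it only edits the exported JSON structure.
    """
    patched = json.loads(json.dumps(sitemap))  # deep copy via a JSON round trip
    for selector in patched["selectors"]:
        if selector["id"] == selector_id:
            selector["type"] = "SelectorElementScroll"
            selector["delay"] = delay_ms
            return patched
    raise KeyError(f"no selector with id {selector_id!r}")


# Trimmed-down version of the original sitemap (only the "cards" selector).
original = {
    "_id": "fintastico",
    "startUrl": ["https://www.fintastico.com/fintech-uk/"],
    "selectors": [
        {
            "id": "cards",
            "type": "SelectorElement",
            "parentSelectors": ["_root"],
            "selector": "ul.archive-list li",
            "multiple": True,
            "delay": 0,
        }
    ],
}

patched = enable_scroll(original, "cards", delay_ms=3000)
print(json.dumps(patched))  # paste the printed JSON into "Import Sitemap"
```

The input dict is left unchanged, so you can keep the non-scrolling version around for comparison.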

Thanks for this solution!
It seems to be scraping now, but I get zero results. I can also see that it can't grab the company name.

How can I fix this?
Thanks!

I just tried this sitemap:
{"_id":"fintastico3","startUrl":["https://www.fintastico.com/fintech-uk/"],"selectors":[{"id":"cards","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"ul.archive-list li","multiple":true,"delay":"5000"},{"id":"cardblock","type":"SelectorElement","parentSelectors":["cards"],"selector":"div.card-block","multiple":true,"delay":0},{"id":"companylink","type":"SelectorLink","parentSelectors":["cardblock"],"selector":"h4","multiple":true,"delay":0},{"id":"website","type":"SelectorText","parentSelectors":["companylink"],"selector":"li:nth-of-type(1) a","multiple":false,"regex":"","delay":0},{"id":"linkedin","type":"SelectorText","parentSelectors":["companylink"],"selector":"a.in","multiple":false,"regex":"","delay":0}]}

but I still get zero data.

Alright, so I had only fixed the scroll part before and didn't check whether it worked past that. I did notice something else wrong: you were using Text selectors to extract the LinkedIn and website URLs, where you should have been using Element attribute selectors (SelectorElementAttribute) with href as the attribute. That's not what's keeping you from getting any results, though. I tried turning off the scroll part and it works just fine, so I think the issue might be that there isn't an end to the scrolling. I don't think it will start following the 'cards' links until it gets to the bottom of the page, and if there isn't one it will just keep scrolling forever.

I forgot to add the current version I have for your scraper, but again, if it is truly an infinite scroll then it just might not be possible.

{"_id":"fintastico","startUrl":["https://www.fintastico.com/fintech-uk/"],"selectors":[{"id":"cards","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"ul.archive-list li","multiple":true,"delay":"3000"},{"id":"companylink","type":"SelectorLink","parentSelectors":["cards"],"selector":"a","multiple":false,"delay":0},{"id":"elements","type":"SelectorElement","parentSelectors":["companylink"],"selector":"div.col-md-8","multiple":true,"delay":0},{"id":"website","type":"SelectorElementAttribute","parentSelectors":["elements"],"selector":"li:nth-of-type(1) a","multiple":false,"extractAttribute":"href","delay":0},{"id":"linkedin","type":"SelectorElementAttribute","parentSelectors":["elements"],"selector":"a.in","multiple":false,"extractAttribute":"href","delay":0},{"id":"companyname","type":"SelectorText","parentSelectors":["cards"],"selector":"h4","multiple":false,"regex":"","delay":0}]}
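
For anyone who wants to apply the Text-to-attribute change to their own exported sitemap, here is a rough Python sketch. The function name `text_to_href` is hypothetical and not part of Web Scraper; it just rewrites the JSON so the chosen selectors look like the "website" and "linkedin" entries above (`SelectorElementAttribute` with `extractAttribute` set to `href`, and no `regex` field).

```python
import json


def text_to_href(sitemap: dict, selector_ids: set) -> dict:
    """Return a copy where the named Text selectors extract the href instead.

    Hypothetical helper: it only edits the exported JSON structure.
    """
    patched = json.loads(json.dumps(sitemap))  # deep copy via a JSON round trip
    for selector in patched["selectors"]:
        if selector["id"] in selector_ids and selector["type"] == "SelectorText":
            selector["type"] = "SelectorElementAttribute"
            selector["extractAttribute"] = "href"
            selector.pop("regex", None)  # the attribute selectors above carry no regex
    return patched


# Trimmed-down sitemap containing only the selectors relevant here.
original = {
    "_id": "fintastico",
    "startUrl": ["https://www.fintastico.com/fintech-uk/"],
    "selectors": [
        {"id": "website", "type": "SelectorText", "parentSelectors": ["elements"],
         "selector": "li:nth-of-type(1) a", "multiple": False, "regex": "", "delay": 0},
        {"id": "linkedin", "type": "SelectorText", "parentSelectors": ["elements"],
         "selector": "a.in", "multiple": False, "regex": "", "delay": 0},
        {"id": "companyname", "type": "SelectorText", "parentSelectors": ["cards"],
         "selector": "h4", "multiple": False, "regex": "", "delay": 0},
    ],
}

patched = text_to_href(original, {"website", "linkedin"})
for selector in patched["selectors"]:
    print(selector["id"], selector["type"])
```

The "companyname" selector is deliberately left as a Text selector, since there you want the visible text and not a link.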

It's working. I'm running it right now and it's currently scraping the individual card pages.

Cool, thanks, it is working great! 🚀

Is that the latest version of the working scraper? It's not working for me now. I imported it verbatim, and the scraper scrolls up and down the website for a while, but after a short wait it finishes with zero results.

Hi,

I have the same problem as @eldoland had.

It is scrolling down, but I get no data and can't get any further.

How can I fix this?
Thanks

{"_id":"hackerone","startUrl":["https://hackerone.com/directory/programs"],"selectors":[{"id":"liste_programme","type":"SelectorElementScroll","parentSelectors":["_root"],"selector":"a.daisy-link--major","multiple":true,"delay":"3000"},{"id":"source_code","type":"Text","parentSelectors":["liste_programme"],"selector":".vertical-spacing div.grid--has-outside-gutter:nth-of-type(1) div.grid__column:nth-of-type(1)","multiple":true,"regex":"(?=[Ss]ource code analysis|[Ss]ource code review|[Ss]ource code|[Cc]ode analysis|[Cc]ode review)","delay":0},{"id":"inscope","type":"Text","parentSelectors":["liste_programme"],"selector":".card__content .vertical-spacing .daisy-table__cell > span","multiple":true,"delay":0},