How can I scrape data on the following pages?

There are only Next and Back buttons. How can I scrape the data on the first page, then keep clicking Next and scraping each following page until I have gone through all the pages?

URL: Barrierefreie Suche | Deutsches Krankenhaus Verzeichnis (https://www.deutsches-krankenhaus-verzeichnis.de/app/suche/barrierefrei)

Sitemap:
{"_id":"KrankenhausAllgemein","startUrl":["https://www.deutsches-krankenhaus-verzeichnis.de/app/suche/barrierefrei"],"selectors":[{"id":"Suche","parentSelectors":["_root"],"type":"SelectorElementClick","clickActionType":"real","clickElementSelector":"button#search_send","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard-when-click-element-exists","multiple":false,"selector":"button#search_send"},{"id":"Unternehmen Name","parentSelectors":["Unternehmen"],"type":"SelectorText","selector":"h1","multiple":false,"regex":""},{"id":"E-Mail","parentSelectors":["Unternehmen"],"type":"SelectorElementAttribute","selector":"a[href^='mailto:']","multiple":false,"extractAttribute":"href"},{"id":"Adresse","parentSelectors":["Unternehmen"],"type":"SelectorText","selector":".col-sm-8 p:nth-of-type(1)","multiple":false,"regex":""},{"id":"Telefon","parentSelectors":["Unternehmen"],"type":"SelectorText","selector":".col-sm-8 a:nth-of-type(1)","multiple":false,"regex":""},{"id":"Unternehmen","parentSelectors":["_root","Weiter"],"type":"SelectorLink","selector":"#dkv_result_table_row a","multiple":true,"linkType":"linkFromHref"},{"id":"Weiter","parentSelectors":["_root","Unternehmen"],"type":"SelectorElementClick","clickActionType":"real","clickElementSelector":"a:contains('Die nächsten zehn Krankenhäuser darstellen')","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","multiple":true,"selector":"div#pagination"}]}

Hi,

Please see the sitemap below as a reference:

{"_id":"KrankenhausAllgemein","startUrl":["https://www.deutsches-krankenhaus-verzeichnis.de/app/suche/barrierefrei"],"selectors":[{"clickActionType":"real","clickElementSelector":"button#search_send","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"discard-when-click-element-exists","id":"Suche","multiple":false,"parentSelectors":["_root"],"selector":"button#search_send","type":"SelectorElementClick"},{"id":"Unternehmen Name","multiple":false,"parentSelectors":["Unternehmen"],"regex":"","selector":"h1","type":"SelectorText"},{"extractAttribute":"href","id":"E-Mail","multiple":false,"parentSelectors":["Unternehmen"],"selector":"a[href^='mailto:']","type":"SelectorElementAttribute"},{"id":"Adresse","multiple":false,"parentSelectors":["Unternehmen"],"regex":"","selector":".col-sm-8 p:nth-of-type(1)","type":"SelectorText"},{"id":"Telefon","multiple":false,"parentSelectors":["Unternehmen"],"regex":"","selector":".col-sm-8 a:nth-of-type(1)","type":"SelectorText"},{"id":"Weiter","paginationType":"clickMore","parentSelectors":["_root","Weiter"],"selector":"a:contains('Die nächsten zehn Krankenhäuser darstellen')","type":"SelectorPagination"},{"id":"Unternehmen","linkType":"linkFromHref","multiple":true,"parentSelectors":["Weiter"],"selector":"#dkv_result_table_row a","type":"SelectorLink"}]}

Note that the scraper will start to open the individual pages only after the pagination is finished.
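For anyone comparing the two sitemaps: the reply's version changes "Weiter" from a SelectorElementClick into a SelectorPagination (paginationType "clickMore") whose parents are "_root" and "Weiter" itself, and it nests "Unternehmen" under "Weiter" only. A small Python sketch like the one below can print that parent/child layout so it is easier to check. The sitemap here is a trimmed copy with only the fields needed for the hierarchy, and the helper names children_by_parent and recursive_selectors are made up for this illustration; they are not part of Web Scraper.

```python
import json

# Trimmed copy of the reply's sitemap: only id, type and parentSelectors are kept.
SITEMAP = json.loads("""
{"_id": "KrankenhausAllgemein",
 "selectors": [
  {"id": "Suche", "type": "SelectorElementClick", "parentSelectors": ["_root"]},
  {"id": "Weiter", "type": "SelectorPagination", "parentSelectors": ["_root", "Weiter"]},
  {"id": "Unternehmen", "type": "SelectorLink", "parentSelectors": ["Weiter"]},
  {"id": "Unternehmen Name", "type": "SelectorText", "parentSelectors": ["Unternehmen"]},
  {"id": "E-Mail", "type": "SelectorElementAttribute", "parentSelectors": ["Unternehmen"]}
 ]}
""")


def children_by_parent(sitemap):
    """Map each parent selector id to the ids of the selectors nested under it."""
    tree = {}
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            tree.setdefault(parent, []).append(sel["id"])
    return tree


def recursive_selectors(sitemap):
    """Return the ids of selectors that list themselves as one of their parents."""
    return [s["id"] for s in sitemap["selectors"] if s["id"] in s["parentSelectors"]]


tree = children_by_parent(SITEMAP)
for parent, kids in tree.items():
    print(f"{parent} -> {', '.join(kids)}")
print("recursive:", recursive_selectors(SITEMAP))
```

With the trimmed sitemap above, "Weiter" shows up as the only selector that is its own parent, and "Unternehmen" appears only under "Weiter", not under "_root". To check a full sitemap, paste the exported JSON in place of the trimmed string.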

Hope this helps!

Thank you very much for your help. Your sitemap behaves exactly like mine did: it only collects data from the first page and then the scrape stops. That is, it doesn't move on to the next page, so the further entries that should be scraped in the same way are never loaded.

Hi,

It works fine on my end. Can you try to clear the browser cache and cookies?

Thank you, it works fine now.