How do I set up the correct pagination link for this site:
I also want to get the mail address for each entry, e.g.
but I only get the "E-Mail senden" text.
OK, first problem solved:
Any ideas about the mail addresses?
For the email address you can use this selector: a:contains("E-Mail senden")
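To illustrate why a text selector only returns "E-Mail senden": the visible text of the link is just the label, while the address sits in the link's href attribute. Below is a minimal Python sketch of that idea using only the standard library. The sample markup, the mailto: prefix, and the address info@example.org are assumptions for illustration; the real page may be structured differently.

```python
from html.parser import HTMLParser


class MailLinkParser(HTMLParser):
    """Collect the href of every <a> whose text contains a given label."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.hrefs = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.label in "".join(self._text):
                self.hrefs.append(self._href)
            self._href = None


def strip_mailto(href):
    """Drop a leading 'mailto:' if present (assumed link format)."""
    prefix = "mailto:"
    return href[len(prefix):] if href.startswith(prefix) else href


# Hypothetical markup, not copied from the real site.
sample = '<p class="adresse"><a href="mailto:info@example.org">E-Mail senden</a></p>'

parser = MailLinkParser("E-Mail senden")
parser.feed(sample)
parser.close()

addresses = [strip_mailto(h) for h in parser.hrefs]
```

In Web Scraper terms, this corresponds to selecting the link by its text and then extracting the href attribute instead of the text.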
Thanks for your reply.
I'm a noob, what would the sitemap look like?
In the meantime I've solved it this way, but I have to clean up the results in Excel, which I hate because Microsoft doesn't handle data particularly carefully:
{ …
"selectors":[{"id":"Schullink","parentSelectors":["_root"],"type":"SelectorLink","selector":"a.row","multiple":true,"linkType":"linkFromHref"},{"id":"info","parentSelectors":["Schullink"],"type":"SelectorText","selector":"p.adresse:nth-of-type(1)","multiple":true,"regex":""},{"id":"Mail","parentSelectors":["Schullink"],"type":"SelectorHTML","selector":"p.adresse:nth-of-type(2)","multiple":false,"regex":""}]}
send your complete sitemap by using this button:
{"_id":"page_schulen_de_n","startUrl":["https://schulen.de/geosuche/suchergebnisse/?gymnasiale_oberstufe=true&lat=48.5216364&lng=9.0576448&formatted_address=72+T%C3%BCbingen%2C+Deutschland&rad=250&page=[135-160]"],"selectors":[{"id":"Schullink","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":"a.row","type":"SelectorLink"},{"id":"Einrichtung","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"Adresse","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"p.adresse:nth-of-type(1)","type":"SelectorText"},{"id":"Kontakt","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"p.adresse:nth-of-type(2)","type":"SelectorText"},{"id":"Mail","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"p.adresse:nth-of-type(2)","type":"SelectorHTML"},{"id":"Schüler","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"ul:nth-of-type(3) li:nth-of-type(1)","type":"SelectorText"}]}
like this?
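For reference, the startUrl in the sitemap above handles pagination with a page range, page=[135-160]. As far as I understand the range syntax, it stands for one URL per page number with both bounds included. A small Python sketch of that expansion follows; the function name expand_page_range is made up for illustration, the URL is shortened, and zero-padded or stepped ranges are not handled.

```python
import re

# Matches a simple numeric range such as [135-160].
RANGE_RE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_page_range(url):
    """Expand the first [first-last] range in a URL into one URL per page."""
    match = RANGE_RE.search(url)
    if match is None:
        return [url]  # no range present, nothing to expand
    first, last = int(match.group(1)), int(match.group(2))
    return [
        url[:match.start()] + str(page) + url[match.end():]
        for page in range(first, last + 1)  # both bounds included
    ]


# Shortened version of the start URL; the other query parameters are omitted.
urls = expand_page_range(
    "https://schulen.de/geosuche/suchergebnisse/?rad=250&page=[135-160]"
)
```

Here urls holds 26 entries, from page=135 to page=160.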
check it out:
{"_id":"page_schulen_de_n","startUrl":["https://schulen.de/geosuche/suchergebnisse/?gymnasiale_oberstufe=true&lat=48.5216364&lng=9.0576448&formatted_address=72+T%C3%BCbingen%2C+Deutschland&rad=250&page=[135-160]"],"selectors":[{"id":"Schullink","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":"a.row","type":"SelectorLink"},{"id":"Einrichtung","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"h1","type":"SelectorText"},{"id":"Adresse","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"p.adresse:nth-of-type(1)","type":"SelectorText"},{"id":"Kontakt","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"p.adresse:nth-of-type(2)","type":"SelectorText"},{"extractAttribute":"href","id":"Mail","multiple":false,"parentSelectors":["Schullink"],"selector":"p.adresse a:contains(\"E-Mail senden\")","type":"SelectorElementAttribute"},{"id":"Schüler","multiple":false,"parentSelectors":["Schullink"],"regex":"","selector":"ul li:contains(\"Schüler\")","type":"SelectorText"}]}
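If the exported Mail column still needs tidying, it can be done outside Excel. The sketch below assumes the href values look like mailto:address (possibly followed by a ?subject=... part), which I have not verified against the site. It also copes with raw HTML as produced by the earlier SelectorHTML variant. The function names and sample rows are made up for illustration.

```python
import re

# Loose pattern for the address inside a mailto: link; not a full
# email validator. Stops at quotes, whitespace, '>' or a '?' query part.
MAILTO_RE = re.compile(r"mailto:([^\"'?\s>]+)", re.IGNORECASE)


def extract_mail(value):
    """Return the first mailto address found in the value, or '' if none."""
    match = MAILTO_RE.search(value or "")
    return match.group(1) if match else ""


def clean_rows(rows, column="Mail"):
    """Return copies of the rows with the given column reduced to a bare address."""
    return [{**row, column: extract_mail(row.get(column, ""))} for row in rows]


# Hypothetical rows, shaped like a scraper export read into dictionaries.
rows = [
    {"Einrichtung": "Beispielschule A", "Mail": "mailto:info@example.org?subject=Anfrage"},
    {"Einrichtung": "Beispielschule B", "Mail": '<a href="mailto:kontakt@example.org">E-Mail senden</a>'},
    {"Einrichtung": "Beispielschule C", "Mail": ""},
]

cleaned = clean_rows(rows)
```

The rows could come from csv.DictReader over the exported CSV file and be written back with csv.DictWriter, so the data never has to pass through Excel.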
Thank you so much. This works perfectly for me.
(Is it necessary or possible to mark this topic as solved?)
Don't worry :) Take care :)