Data not collected after website state setup sequence

Web Scraper version: 1.87.6
Chrome version: Version 131.0.6778.86
OS: Windows 11

Link to the site you were scraping: Directorio Hospital Angeles Health System

Sitemap (Please make the sitemap as minimal as possible so it’s easier to replicate the bug. You can export the sitemap by opening it and choosing “Export Sitemap” in the dropdown menu):

{"_id":"angeles","startUrl":["https://hospitalangeles.com/directorio/medicos"],"selectors":[{"id":"carta","parentSelectors":["cuerpo"],"type":"SelectorElement","selector":".blog-card-content","multiple":true},{"id":"nombre","parentSelectors":["carta"],"type":"SelectorText","selector":"h6","multiple":false,"regex":""},{"id":"especialidad","parentSelectors":["carta"],"type":"SelectorText","selector":"div.especialidad","multiple":false,"regex":""},{"id":"cuerpo","parentSelectors":["_root"],"type":"SelectorElement","selector":"body","multiple":false}],"websiteStateSetup":{"enabled":true,"performWhenNotFoundSelector":"#doctors-cards:not(:contains('Busca a tu médico especialista'))","actions":[{"selector":"input.form-control","type":"textInput","value":"nutriólogo"},{"selector":"button.flex-shrink-0","type":"click"}]}}

Explanation:
I need to search for "nutriólogo" on this site: Directorio Hospital Angeles Health System

The text input works and the data is collected, as you can see here:

But when I run the scraper I can't obtain that info after the "Website state setup" runs.

I don't know what else to try, because that page requires the user to type something and then click the search button. I can't do it with a paged link.

This site's URL doesn't change with the search term, so you'd need to manually paste the term into the page and click search right after you click "Start scraping". You can give yourself more time by adjusting the Page load delay. Usually about 8 sec (8000 ms) is enough, as shown here:
image


Thank you, leemeng. I thought the setup part was for sites where you need to type something before the scraping can begin. But from what it seems, this may be a bug.

Hi, currently, the WSS is designed to return to the start URL after completing the setup. It will not be useful in your scenario.


Generalization of Website State Setup (WSS)

Web Scraper 1.96.18
Firefox 140.3.0esr
Debian GNU/Linux 12 (bookworm)

https://global.morningstar.com/fr/outils/screener/fonds/

{"_id":"morningstar12","startUrl":["https://global.morningstar.com/fr/outils/screener/fonds/"],"selectors":[{"id":"pagination","parentSelectors":["_root","pagination"],"paginationType":"auto","type":"SelectorPagination","selector":"button:not([disabled])>span:contains(Suivant)"},{"id":"fiche","parentSelectors":["pagination"],"type":"SelectorText","selector":"th.mdc-data-grid-cell__mdc>a>span","multiple":true,"regex":""}],"websiteStateSetup":{"enabled":true,"performWhenNotFoundSelector":"div[data-value='EUR'][remove-icon-aria-label='Remove']","actions":[{"selector":"h3:contains(Autre)","type":"click"},{"selector":"label:has(div>span:contains('Devise de base'))>div>div>div>input","type":"click"},{"selector":"li[title='EUR']","type":"click"}]}}

Hello,

The simplified sitemap above, which combines Website State Setup (WSS) with a Pagination selector, does not work: pagination stops when WSS is enabled.
According to the thread "Data not collected after website state setup sequence", this use case does not seem to be possible.

1/ Are there plans to support this use case?

Alternatively, I am tuning a sitemap that cascades Element click selectors through _parent_.
This solution is more complicated and less readable than WSS.
Furthermore, I can't find an equivalent selector to the "Text Input" action of the WSS.

2/ What workaround should I consider?

Thank you very much. Best regards.

Hi,

As stated before, the WSS will reload the start URL after finishing the setup. If the website discards the changes after a reload, WSS cannot be used in that scenario.

In this case, use the 'Element click' selector, or simply start the scrape with a longer delay and set the parameters by hand in the pop-up.

Text input is restricted to WSS, as broader usage could potentially expose the extension to fraudulent exploitation.
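
For illustration, the 'Element click' route could look roughly like the sketch below. It is untested against the site and only reuses the three click targets from the WSS actions in the sitemap above. The selector ids (`open_autre`, `open_devise`, `pick_eur`) and the sitemap id are made up for this example, and the click-related fields (`clickElementSelector`, `clickType`, `clickElementUniquenessType`, `discardInitialElements`, `delay`) are written as they usually appear in exported sitemaps, so they are an assumption that may not match your version; creating the selectors in the UI and exporting is the safer way to get the exact fields. Pagination is left out on purpose, since whether it behaves correctly under Element click parents has not been verified here.

```json
{
  "_id": "morningstar12-click",
  "startUrl": ["https://global.morningstar.com/fr/outils/screener/fonds/"],
  "selectors": [
    {
      "id": "open_autre",
      "parentSelectors": ["_root"],
      "type": "SelectorElementClick",
      "selector": "body",
      "multiple": false,
      "clickElementSelector": "h3:contains(Autre)",
      "clickType": "clickOnce",
      "clickElementUniquenessType": "uniqueText",
      "discardInitialElements": "do-not-discard",
      "delay": 2000
    },
    {
      "id": "open_devise",
      "parentSelectors": ["open_autre"],
      "type": "SelectorElementClick",
      "selector": "_parent_",
      "multiple": false,
      "clickElementSelector": "label:has(div>span:contains('Devise de base'))>div>div>div>input",
      "clickType": "clickOnce",
      "clickElementUniquenessType": "uniqueText",
      "discardInitialElements": "do-not-discard",
      "delay": 2000
    },
    {
      "id": "pick_eur",
      "parentSelectors": ["open_devise"],
      "type": "SelectorElementClick",
      "selector": "_parent_",
      "multiple": false,
      "clickElementSelector": "li[title='EUR']",
      "clickType": "clickOnce",
      "clickElementUniquenessType": "uniqueText",
      "discardInitialElements": "do-not-discard",
      "delay": 2000
    },
    {
      "id": "fiche",
      "parentSelectors": ["pick_eur"],
      "type": "SelectorText",
      "selector": "th.mdc-data-grid-cell__mdc>a>span",
      "multiple": true,
      "regex": ""
    }
  ]
}
```

Each click selector passes the same `body` element down through `_parent_`, so the three clicks run in order on the loaded page without the reload that WSS performs. The delays are guesses and may need to be raised if the filter panel or the results grid renders slowly.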