Wellfound filters scraping

Hi there,
I've been struggling to set up proper filters for the Wellfound website. The URL of the site doesn't contain any filter parameters, so I tried to use Element click selectors to set the filters, such as job title, location, and the other criteria I need.
URL: https://wellfound.com/jobs

The problem is that I can't make it work. I used 3 selectors to get as far as opening the dropdown menu, but I struggle with selecting the proper job title, such as "Designer", because I don't really understand which element to choose for the click selector.

Sitemap:
{"_id":"wellfound","startUrl":["https://wellfound.com/jobs"],"selectors":[{"clickActionType":"real","clickElementSelector":"div.pr-4","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","id":"job_title_click","multiple":false,"parentSelectors":["_root"],"selector":"button.styles_inactive__aAc_w","type":"SelectorElementClick"},{"clickActionType":"real","clickElementSelector":"div.select__indicator","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","id":"job_selection","multiple":false,"parentSelectors":["_root"],"selector":"span.styles_label__ikMuI","type":"SelectorElementClick"},{"clickActionType":"real","clickElementSelector":"div.select__option--is-focused","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","id":"selector_designer","multiple":false,"parentSelectors":["_root"],"selector":"button.styles_inactive__aAc_w","type":"SelectorElementClick"}]}

The sitemap contains only the 3 selectors I managed to set up; I didn't get any further. The only thing I can do so far is click on the dropdown, select the first title from the list, and click on it.

I'd appreciate any advice ;)

Hi,

I could not locate the filters you wrote about. Could it be that this section is only available after logging in?

Nevertheless, if you are struggling with the filters, one trick you can use is to ignore the filtering in the sitemap and set up the selectors for the data you would like to scrape.

Then start a scraping job with a page load delay of, let's say, 20000 ms, and set the filters manually in the new pop-up window while the scraper is waiting.

Hi JanAp, yes, the filters are only there for registered users, so I created an account on the platform.

Thanks for the advice, I will most definitely try it out as well. I had no idea this could be done while a sitemap is actively being scraped.

However, I am still looking for a more permanent solution in the long run, since I may need a continuous scraping setup in the cloud.

For now, I thought of adding the postings I need to my starred list and scraping them from a different tab. However, I ran into another issue: the company info is not scraped, even though everything shows up in the data preview without any issues, and I can't find the reason for that either:

{"_id":"wellfound_saved_companies","startUrl":["https://wellfound.com/jobs/starred"],"selectors":[{"id":"Link","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":"a.styles_component__9eYti","type":"SelectorLink"},{"id":"Wrapper","multiple":false,"parentSelectors":["Link"],"selector":"div.styles_motionContainer__0bu1f","type":"SelectorElement"},{"id":"Company name","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_name__qn8jG div.justify-between","type":"SelectorText"},{"id":"Company overview","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"section.styles_component__thTp9","type":"SelectorText"},{"id":"Founders_People","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".flex header","type":"SelectorText"},{"id":"Founder 1","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"div.styles_component__Wb41n:nth-of-type(1) .styles-module_component__3ZI84 a","type":"SelectorText"},{"id":"Founder 2","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"div.styles_component__Wb41n:nth-of-type(2) .styles-module_component__3ZI84 a","type":"SelectorText"},{"id":"Website","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt.flex","type":"SelectorText"},{"id":"Location","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt > ul","type":"SelectorText"},{"id":"Company size","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt:nth-of-type(3)","type":"SelectorText"},{"id":"Company type","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt:nth-of-type(4) span","type":"SelectorText"},{"id":"Markets","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt.styles_tags__y_J8v","type":"SelectorText"}]}

One issue I can see with the sitemap is that the selectors have a lot of random letters. This is a kind of anti-scraping measure some websites utilize, which would require manually checking the HTML and finding a more universal selector value.
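
To see which selectors are affected, an exported sitemap can be scanned for class names that end in a generated suffix. This is only a rough sketch in Python: the find_hashed_selectors helper and its regular expression are assumptions about how these class names look (a readable prefix, a double underscore, then a short random suffix, as in styles_about__6dvji) and are not part of Web Scraper.

```python
import json
import re

# Assumption: hashed class names follow the pattern "<module>_<name>__<hash>",
# for example "styles_about__6dvji" or "styles-module_component__3ZI84".
HASHED_CLASS = re.compile(r"[A-Za-z0-9-]+_[A-Za-z0-9]+__[A-Za-z0-9_-]+")


def find_hashed_selectors(sitemap_json: str) -> dict:
    """Map each selector id to the hashed class names found in it."""
    sitemap = json.loads(sitemap_json)
    flagged = {}
    for sel in sitemap.get("selectors", []):
        hits = []
        # Element click selectors carry a second CSS selector to check.
        for key in ("selector", "clickElementSelector"):
            hits += HASHED_CLASS.findall(sel.get(key, ""))
        if hits:
            flagged[sel["id"]] = hits
    return flagged


# Shortened demo sitemap, not a complete one.
example = json.dumps({
    "_id": "demo",
    "selectors": [
        {"id": "Website", "selector": ".styles_about__6dvji dt.flex"},
        {"id": "Title", "selector": "h1"},
    ],
})
print(find_hashed_selectors(example))
```

Every selector this reports is a candidate for a manual check in the page HTML; the script only finds them, it does not decide what the replacement should be.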

If you could provide a dummy account for the login, I could try to set this up for you; otherwise, it is not possible to inspect the code.

freakzonaleash@gmail.com,
9D3pLVu+=gfiG%Q

I added several jobs to the saved section. If you manage to beat it, could you please share the way to correctly inspect the elements to determine how to pick the right selectors?

This website is frustrating

Yeah, it is a tricky one.

For example, you could improve the selector for the website field by right-clicking the element on the page and choosing Inspect.

You can see in the HTML that the class for the element is styles_websiteLink___Rnfc

It can be made more universal with this syntax: [class*="styles_websiteLink_"]

You can read more about how selectors can be adjusted in the article "A Quick Guide to CSS and jQuery".
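
The same rewrite can be sketched as a small Python helper for anyone who wants to apply it to several selectors at once. The function name loosen_selector and the regular expression are hypothetical and assume the class names follow the "<module>_<name>__<hash>" pattern seen above; it keeps the prefix up to the double underscore and replaces the class with a [class*="..."] attribute selector.

```python
import re

# Assumption: a hashed class is ".<module>_<name>__<hash>"; the prefix up to
# and including the double underscore is captured and kept.
HASHED_CLASS = re.compile(r"\.([A-Za-z][A-Za-z0-9-]*_[A-Za-z0-9]+__)[A-Za-z0-9_-]+")


def loosen_selector(selector: str) -> str:
    """Replace hashed class selectors with substring attribute selectors."""
    return HASHED_CLASS.sub(lambda m: f'[class*="{m.group(1)}"]', selector)


print(loosen_selector(".styles_about__6dvji dt.styles_tags__y_J8v"))
print(loosen_selector("div.styles_component__Wb41n:nth-of-type(1) a"))
print(loosen_selector("div.pr-4"))  # no hashed class, returned unchanged
```

Keep in mind that [class*="..."] is a substring match, so a generic prefix such as styles_component__ may match more elements than the original class did. It is worth checking the result in the element preview before relying on it.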

See the sitemap below for reference:

{"_id":"wellfound_saved_companies","startUrl":["https://wellfound.com/jobs/starred"],"selectors":[{"id":"Link","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":"[data-test=\"StartupResult\"] > div > a","type":"SelectorLink"},{"id":"Wrapper","multiple":true,"parentSelectors":["Link"],"selector":"body","type":"SelectorElement"},{"id":"Company name","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"[class*=\"styles_body__\"] [class*=\"styles_name__\"] a","type":"SelectorText"},{"id":"Company overview","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"section.styles_component__thTp9","type":"SelectorText"},{"id":"Founders_People","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".flex header","type":"SelectorText"},{"id":"Founder 1","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"div.styles_component__Wb41n:nth-of-type(1) .styles-module_component__3ZI84 a","type":"SelectorText"},{"id":"Founder 2","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"div.styles_component__Wb41n:nth-of-type(2) .styles-module_component__3ZI84 a","type":"SelectorText"},{"id":"Website","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"[class*=\"styles_websiteLink_\"]","type":"SelectorText"},{"id":"Location","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt > ul","type":"SelectorText"},{"id":"Company size","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt:nth-of-type(3)","type":"SelectorText"},{"id":"Company type","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":"dd:contains(\"Company type\") +","type":"SelectorText"},{"id":"Markets","multiple":false,"parentSelectors":["Wrapper"],"regex":"","selector":".styles_about__6dvji dt.styles_tags__y_J8v","type":"SelectorText"}]}