[Advice Needed] Scraping sites with pop-ups

Hi,

Apologies in advance for such a n00b question. This is the first time I've done something like this.

I've been trying to scrape Sephora's site, but the pop-up keeps interfering with the scraping process, so the scraper only captures a portion of the website.

I am essentially trying to scrape the ingredients of every product and need to return: product name, product brand, and product ingredients.

Have you experienced similar issues with pop-ups?

Can you share your current sitemap and the URL of the troublesome page?

I'm trying to recreate what I did, because my previous sitemap was deleted. However, now it won't let me dig further than the _root, so I'm unable to get past the category and type in order to get to the product and then the details I need.

URL: https://www.sephora.com

{"_id":"sephora","startUrl":["https://www.sephora.com/"],"selectors":[{"id":"shop","type":"SelectorText","parentSelectors":["_root"],"selector":"div.css-1ma7jam","multiple":false,"regex":"","delay":0},{"id":"categories","type":"SelectorText","parentSelectors":["_root"],"selector":"a.css-1t5gbpr","multiple":true,"regex":"","delay":0},{"id":"type","type":"SelectorText","parentSelectors":["_root"],"selector":"a[data-at='top_level_category']","multiple":true,"regex":"","delay":0}]}

To start with, you would have to use Link selectors to navigate through the site, not Text selectors. A Text selector only extracts text; it does not open the linked page.

Here is a start to a sitemap:

{"_id":"sephora","startUrl":["https://www.sephora.com/"],"selectors":[{"id":"shop","type":"SelectorText","parentSelectors":["_root"],"selector":"div.css-1ma7jam","multiple":false,"regex":"","delay":0},{"id":"categories","type":"SelectorLink","parentSelectors":["_root"],"selector":"a.css-1t5gbpr","multiple":true,"delay":0},{"id":"type","type":"SelectorLink","parentSelectors":["categories"],"selector":"a[data-at='top_level_category']","multiple":true,"delay":0}]}
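To check how the selectors in an exported sitemap are nested, a small script can print each selector's parent chain. This is only a rough sketch: `selector_paths` is a hypothetical helper, not part of Web Scraper, and it assumes the export uses the same `id` and `parentSelectors` fields as the sitemaps pasted in this thread. The example sitemap is trimmed to those fields and follows only the first parent of each selector.

```python
import json


def selector_paths(sitemap_json):
    """Map each selector id to its parent chain, e.g. '_root > categories > type'.

    Hypothetical helper; only the first entry of parentSelectors is followed.
    """
    sitemap = json.loads(sitemap_json)
    by_id = {sel["id"]: sel for sel in sitemap["selectors"]}

    def chain(sel_id, seen=()):
        # Stop at the root, at unknown ids, or if the parents loop back.
        if sel_id == "_root" or sel_id not in by_id or sel_id in seen:
            return [sel_id]
        first_parent = by_id[sel_id]["parentSelectors"][0]
        return chain(first_parent, seen + (sel_id,)) + [sel_id]

    return {sel_id: " > ".join(chain(sel_id)) for sel_id in by_id}


# Trimmed copy of the sitemap above (assumption: extra fields do not matter here).
example = """
{"_id": "sephora", "startUrl": ["https://www.sephora.com/"], "selectors": [
  {"id": "categories", "type": "SelectorLink", "parentSelectors": ["_root"]},
  {"id": "type", "type": "SelectorLink", "parentSelectors": ["categories"]}
]}
"""

paths = selector_paths(example)
for sel_id, path in paths.items():
    print(f"{sel_id}: {path}")
```

If every selector prints as `_root > something`, as it would for the first sitemap posted, nothing is nested and the scraper has no deeper page to go to.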

If you are still struggling, I would suggest going to the documentation and video tutorial section of the site so you at least learn the basics of how the scraper works.

Thank you for getting me started - I'm now trying a different start URL, the brands list. However, there are two issues I'm encountering:

  1. While it is scraping, I notice that the selector does not start from A; it goes to a seemingly random part of the site. Is there a reason why this is happening?

  2. It is not returning the ingredients data I'm trying to scrape.

{"_id":"sephora","startUrl":["https://www.sephora.com/brands-list"],"selectors":[{"id":"brand_name","type":"SelectorLink","parentSelectors":["_root"],"selector":"a.css-d84rnc","multiple":true,"delay":0},{"id":"brand_see_all","type":"SelectorLink","parentSelectors":["brand_name"],"selector":".css-1rgbz9i a:nth-of-type(1)","multiple":false,"delay":0},{"id":"brand_product","type":"SelectorLink","parentSelectors":["brand_see_all"],"selector":"a[data-comp='ProductItem']","multiple":true,"delay":0},{"id":"brand_name_text","type":"SelectorText","parentSelectors":["brand_product"],"selector":"span.css-euydo4","multiple":false,"regex":"","delay":0},{"id":"brand_product_name_text","type":"SelectorText","parentSelectors":["brand_product"],"selector":"span.css-0","multiple":false,"regex":"","delay":0},{"id":"product_ingredients_text","type":"SelectorElementClick","parentSelectors":["brand_product"],"selector":"#tabpanel2 div","multiple":false,"delay":0,"clickElementSelector":"span.css-jpw3l4","clickType":"clickOnce","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueText"}]}