How to scrape a website that requires many pages of navigation

Hi, I'm trying to scrape specific data about a series of plants. The website appears to use AJAX, and I've tried a few different ways to get around it, but I have not been successful.

The data I need to get requires several different navigational steps:

  1. Select a letter A-Z to show a popup
  2. In the popup, open the list of Plant Families
  3. On each Plant Family page, click the "Compare All" button
  4. On the Compare All page, scrape the list of Plant URLs and open the Plant URLs
  5. On the Plant URL, scrape details about the plant
  6. Go to the next letter in step 1.

I've tried to create the sitemap several different ways, but I usually end up with one of the following scenarios:

  1. Scraper opens the A popup, then stops - no data is saved
  2. Scraper opens the A popup, goes to the Plant Family Page, then stops - no data is saved
  3. Scraper opens every A-Z popup, then stops - no data is saved

This is my current sitemap, and its result is scenario #2 above.
{"_id":"gardenia","startUrl":["https://www.gardenia.net/"],"selectors":[{"id":"Hardiness","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Hardiness') td","type":"SelectorText"},{"id":"Heat Zones","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Heat Zones') td","type":"SelectorText"},{"id":"Climate Zones","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Climate Zones') td","type":"SelectorText"},{"id":"Plant Type","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Plant Type') td","type":"SelectorText"},{"id":"Plant Family","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Plant Family') td","type":"SelectorText"},{"id":"Exposure","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Exposure') td","type":"SelectorText"},{"id":"Season of Interest","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Season of Interest') td","type":"SelectorText"},{"id":"Height","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Height') td","type":"SelectorText"},{"id":"Spread","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Spread') td","type":"SelectorText"},{"id":"Water Needs","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Water Needs') td","type":"SelectorText"},{"id":"Maintenance","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Maintenance') td","type":"SelectorText"},{"id":"Soil Type","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Soil Type') td","type":"SelectorText"},{"id":"Soil pH","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Soil pH') td","type":"SelectorText"},{"id":"Soil Drainage","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Soil Drainage') td","type":"SelectorText"},{"id":"Characteristics","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Characteristics') td","type":"SelectorText"},{"id":"Native Plants","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Native Plants') td","type":"SelectorText"},{"id":"Tolerance","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Tolerance') td","type":"SelectorText"},{"id":"Attracts","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Attracts') td","type":"SelectorText"},{"id":"Garden Uses","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Garden Uses') td","type":"SelectorText"},{"id":"Garden Styles","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Garden Styles') td","type":"SelectorText"},{"id":"Spacing","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".d-none tr:contains('Spacing') td","type":"SelectorText"},{"id":"other-names","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":"h2 em","type":"SelectorText"},{"id":"plant-name","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".body-heading h1","type":"SelectorText"},{"id":"description-text","multiple":false,"parentSelectors":["name-plant"],"regex":"","selector":".detail-text-area div","type":"SelectorText"},{"clickElementSelector":"a.alpha-click","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":2000,"discardInitialElements":"do-not-discard","id":"click-letter","multiple":true,"parentSelectors":["_root"],"selector":"body > div.container","type":"SelectorElementClick"},{"id":"click-compare-all","multiple":false,"parentSelectors":["plant-parent-name"],"selector":"a.btn-block","type":"SelectorLink"},{"id":"name-plant","multiple":true,"parentSelectors":["click-compare-all"],"selector":"strong a","type":"SelectorLink"},{"id":"plant-parent-name","multiple":true,"parentSelectors":["click-letter"],"selector":".list-wrapper a","type":"SelectorPopupLink"}]}
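
Since the sitemap is hard to read as a single line, here is a condensed outline of its parent chain, in the order of the navigation steps. It is only a reading aid, not an importable sitemap: each entry keeps just the `id`, `type`, and `parentSelectors` from the sitemap as posted, and `plant-name` stands in for all of the text selectors under `name-plant`.

```json
[
  {"id": "click-letter", "type": "SelectorElementClick", "parentSelectors": ["_root"]},
  {"id": "plant-parent-name", "type": "SelectorPopupLink", "parentSelectors": ["click-letter"]},
  {"id": "click-compare-all", "type": "SelectorLink", "parentSelectors": ["plant-parent-name"]},
  {"id": "name-plant", "type": "SelectorLink", "parentSelectors": ["click-compare-all"]},
  {"id": "plant-name", "type": "SelectorText", "parentSelectors": ["name-plant"]}
]
```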

I've read through several other forum threads, which helped me get rid of the "Parent does not contain selected element" error, but I'm still not able to scrape correctly.

Thank you for any help!

@scaper Hello, the 'plant-parent-name' links do not appear to open a new pop-up window, so you should use a 'Link' selector instead.
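
As a rough sketch of that change, only the `type` of the `plant-parent-name` entry should need to differ. This assumes `SelectorLink` is the right type string, which is how the sitemap's existing Link selectors (`click-compare-all` and `name-plant`) are written; every other field is copied unchanged from the posted sitemap.

```json
{"id":"plant-parent-name","multiple":true,"parentSelectors":["click-letter"],"selector":".list-wrapper a","type":"SelectorLink"}
```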

Hi @ViestursWS

Thank you for taking a look! I changed plant-parent-name to a Link selector instead of a Pop-Up selector, and my scraper started working as I needed. :raised_hands: