Simple Two Link Scraper Not Working

woojoo666 · January 11, 2021, 1:52am

Hi all! I'm having a bit of trouble creating my scraper. It seems that when multiple link selectors are defined for a page, the scraper only follows the last one. I created a test website to show what I mean. All it contains is two links to two different pages that I want scraped. Here is the sitemap graph I created:

[Image: sitemap graph]

And here is the sitemap:

{
    "_id": "github-webscraper-test",
    "startUrl": ["https://woojoo666.github.io/web-scraper-test-site/"],
    "selectors": [
    {
        "id": "info-link",
        "type": "SelectorLink",
        "parentSelectors": ["_root"],
        "selector": "a[href=\"./info.html\"]",
        "multiple": false,
        "delay": 0
    },
    {
        "id": "items-link",
        "type": "SelectorLink",
        "parentSelectors": ["_root"],
        "selector": "a[href=\"./items.html\"]",
        "multiple": false,
        "delay": 0
    },
    {
        "id": "info-heading",
        "type": "SelectorText",
        "parentSelectors": ["info-link"],
        "selector": "h1",
        "multiple": false,
        "regex": "",
        "delay": 0
    },
    {
        "id": "info-text",
        "type": "SelectorText",
        "parentSelectors": ["info-link"],
        "selector": "p",
        "multiple": false,
        "regex": "",
        "delay": 0
    },
    {
        "id": "items-li",
        "type": "SelectorText",
        "parentSelectors": ["items-link"],
        "selector": "li",
        "multiple": true,
        "regex": "",
        "delay": 0
    }]
}
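
For anyone who can't see the graph image, the parent/child tree implied by this sitemap can be printed with a few lines of Python. This is only a sketch: `build_tree` and `render` are made-up helper names, not part of Web Scraper, and the embedded JSON is a trimmed copy of the sitemap that keeps only the fields needed for the tree.

```python
import json
from collections import defaultdict

# Trimmed copy of the sitemap above: only the fields needed to build the tree.
SITEMAP = json.loads("""
{
  "_id": "github-webscraper-test",
  "selectors": [
    {"id": "info-link",    "type": "SelectorLink", "parentSelectors": ["_root"]},
    {"id": "items-link",   "type": "SelectorLink", "parentSelectors": ["_root"]},
    {"id": "info-heading", "type": "SelectorText", "parentSelectors": ["info-link"]},
    {"id": "info-text",    "type": "SelectorText", "parentSelectors": ["info-link"]},
    {"id": "items-li",     "type": "SelectorText", "parentSelectors": ["items-link"]}
  ]
}
""")


def build_tree(sitemap):
    """Map each parent id to its child selector ids (hypothetical helper)."""
    children = defaultdict(list)
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            children[parent].append(sel["id"])
    return dict(children)


def render(tree, node="_root", depth=0):
    """Return the tree as indented lines, depth-first from the given node."""
    lines = ["  " * depth + node]
    for child in tree.get(node, []):
        lines.extend(render(tree, child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(build_tree(SITEMAP))))
```

Both link selectors hang directly off `_root`, and each has its own text selectors underneath, which is the layout the graph is meant to show.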

When I run the scraper, the popup window shows that it only navigates to the items.html page and never to the info.html page. The scraped data confirms this: there is no data for info-heading or info-text in the output.

web-scraper-order	web-scraper-start-url	info-link	info-link-href	items-link	items-link-href	items-li
1610329031-14	https://woojoo666.github.io/web-scraper-test-site/	site info	https://woojoo666.github.io/web-scraper-test-site/info.html	items	https://woojoo666.github.io/web-scraper-test-site/items.html	item 2
1610329031-13	https://woojoo666.github.io/web-scraper-test-site/	site info	https://woojoo666.github.io/web-scraper-test-site/info.html	items	https://woojoo666.github.io/web-scraper-test-site/items.html	item 1
1610329031-15	https://woojoo666.github.io/web-scraper-test-site/	site info	https://woojoo666.github.io/web-scraper-test-site/info.html	items	https://woojoo666.github.io/web-scraper-test-site/items.html	item 3
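
In the export above, info-heading and info-text do not show up as columns at all. A quick way to check which selectors never made it into an export is to compare the header row against the selector ids from the sitemap. The sketch below assumes the export is tab-separated as shown; `missing_columns` is a hypothetical helper name, and only the header and first data row are embedded.

```python
import csv
import io

# Header and first data row of the export above (tab-separated).
EXPORT = (
    "web-scraper-order\tweb-scraper-start-url\tinfo-link\tinfo-link-href\t"
    "items-link\titems-link-href\titems-li\n"
    "1610329031-14\thttps://woojoo666.github.io/web-scraper-test-site/\tsite info\t"
    "https://woojoo666.github.io/web-scraper-test-site/info.html\titems\t"
    "https://woojoo666.github.io/web-scraper-test-site/items.html\titem 2\n"
)

# Selector ids defined in the sitemap.
SELECTOR_IDS = ["info-link", "items-link", "info-heading", "info-text", "items-li"]


def missing_columns(export_text, selector_ids):
    """Return the selector ids that have no column in the export header."""
    reader = csv.DictReader(io.StringIO(export_text), delimiter="\t")
    header = reader.fieldnames or []
    return [sid for sid in selector_ids if sid not in header]


if __name__ == "__main__":
    print(missing_columns(EXPORT, SELECTOR_IDS))
```

For this export the helper reports `info-heading` and `info-text`, the two selectors under info-link.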

Perhaps I am misunderstanding how the scraper works? I went through all the video tutorials on the website but couldn't find a solution to my problem. Any help would be greatly appreciated!