I realize this is probably explained in a guide somewhere, but I've read around and there's something I'm still not understanding. I also have no foreseeable use for the tool beyond this one task. I'm trying to pull the Chinese sentences from the lessons on www.immersivechinese.com. There's a GitHub repo explaining how to do it here, but at the web-scraping step it just says, "scrape the website." The repo has files that seem to be the exact text to put in the sitemap. One looks like
{"_id":"pronounciation","startUrl":["https://console.immersivechinese.com/pronunciation"],"selectors":[{"delay":0,"id":"lesson","multiple":true,"parentSelectors":["_root"],"selector":"a.list-group-item","type":"SelectorLink"},{"delay":0,"id":"exercise","multiple":true,"parentSelectors":["lesson"],"selector":"div.main-swipe-slide","type":"SelectorElement"},{"delay":0,"id":"pinyin","multiple":false,"parentSelectors":["exercise"],"regex":"","selector":"label.show_reveal","type":"SelectorText"},{"delay":0,"id":"description","multiple":false,"parentSelectors":["exercise"],"regex":"","selector":"div.lesson_note_div","type":"SelectorText"},{"delay":0,"extractAttribute":"data-audio-fast","id":"audioFastUrl","multiple":false,"parentSelectors":["exercise"],"selector":"parent","type":"SelectorElementAttribute"},{"delay":0,"extractAttribute":"data-audio-slow","id":"audioSlowUrl","multiple":false,"parentSelectors":["exercise"],"selector":"parent","type":"SelectorElementAttribute"},{"delay":0,"extractAttribute":"id","id":"number","multiple":false,"parentSelectors":["exercise"],"selector":"parent","type":"SelectorElementAttribute"}]}
for example. But no matter how I've tried putting it in, I just can't get it working. I just want to get these cards out to help my Chinese study; Chinese is more than enough of a headache as it is.
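
In case it helps anyone with the same problem, here is a rough Python sketch for checking that a pasted sitemap is still valid JSON and for printing how its selectors nest. It rests on some assumptions: the `_id` / `startUrl` / `selectors` layout looks like the export format of the Web Scraper browser extension, but the post does not name the tool, and the helper names (`load_sitemap`, `selector_tree`, `format_tree`) are made up for this example. The sitemap text is a trimmed copy of the one above, with only the first three selectors.

```python
import json

# Trimmed copy of the sitemap from the post (first three selectors only).
SITEMAP_TEXT = """
{"_id":"pronounciation","startUrl":["https://console.immersivechinese.com/pronunciation"],
 "selectors":[
  {"delay":0,"id":"lesson","multiple":true,"parentSelectors":["_root"],"selector":"a.list-group-item","type":"SelectorLink"},
  {"delay":0,"id":"exercise","multiple":true,"parentSelectors":["lesson"],"selector":"div.main-swipe-slide","type":"SelectorElement"},
  {"delay":0,"id":"pinyin","multiple":false,"parentSelectors":["exercise"],"regex":"","selector":"label.show_reveal","type":"SelectorText"}
 ]}
"""


def load_sitemap(text):
    """Parse the pasted text and check the top-level fields are present."""
    # json.loads raises json.JSONDecodeError (a ValueError) if the paste
    # was damaged, e.g. cut short or with straight quotes turned curly.
    sitemap = json.loads(text)
    for key in ("_id", "startUrl", "selectors"):
        if key not in sitemap:
            raise ValueError(f"sitemap is missing the {key!r} field")
    return sitemap


def selector_tree(sitemap):
    """Map each parent selector id to the ids of its child selectors."""
    tree = {}
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            tree.setdefault(parent, []).append(sel["id"])
    return tree


def format_tree(tree, node="_root", depth=0):
    """Return indented lines showing the selector hierarchy under `node`."""
    lines = []
    for child in tree.get(node, []):
        lines.append("  " * depth + child)
        lines.extend(format_tree(tree, child, depth + 1))
    return lines


if __name__ == "__main__":
    sitemap = load_sitemap(SITEMAP_TEXT)
    print("\n".join(format_tree(selector_tree(sitemap))))
```

If `load_sitemap` raises an error on the text copied from the repo, the copy itself is probably damaged (a common culprit is quotes being converted to curly quotes along the way). If it parses cleanly, the problem is more likely in how the text is being entered into the tool than in the sitemap.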