Objective:
I want to do the following:
- Click each competition (e.g. 2020, Tokyo, JPN) to load the data tables.
- Then, click on each specific event (e.g. K1 men 200m) which will bring me to the specific data table.
- Scrape the event name, ranking, athlete(s), nationality and timings data from each table.
Describe the problem:
The data tables don't have a proper header row: the 'header row' has 2 columns, while the 'body rows' have 4 columns. I can't seem to access the nationality and timings data either. Using the sitemap I have, I've only managed to get the rankings (some with null values?) and the athletes. I would appreciate sitemap suggestions or alternative approaches to tackle this.
Url:
http://www.canoeresults.eu/view-results/sprint
Sitemap:
{"_id":"canoesprintEU","startUrl":["http://www.canoeresults.eu/view-results/sprint"],"selectors":[{"delay":0,"id":"competition_link","multiple":true,"parentSelectors":["_root"],"selector":"div.row:nth-of-type(n+2) a","type":"SelectorLink"},{"delay":0,"id":"event_link","multiple":true,"parentSelectors":["competition_link"],"selector":"#results div a","type":"SelectorLink"},{"columns":[{"extract":true,"header":"K2 men 1.000 m","name":"rank"},{"extract":true,"header":"sprint","name":"sprint"}],"delay":0,"id":"event_table","multiple":true,"parentSelectors":["event_link"],"selector":"table","tableDataRowSelector":"tbody tr","tableHeaderRowSelector":"thead tr","type":"SelectorTable"}]}