First of all THANK you for this brilliant tool and this community. My life has become so much easier after discovering webscraper.io!
I am struggling with some issues around scraping tables:
Let's use this URL as an example:
URL: https://www.specialized.com/de/de/demo-alloy-27-5/p/154304
Sitemap (for scraping the table containing Geometry-data):
{"id":"GeometryTable","type":"SelectorTable","parentSelectors":["ClickIndividualBike"],"selector":"table.geometry-section__table","multiple":true,"columns":[{"header":"XS","name":"XSmall","extract":true},{"header":"S","name":"Small","extract":true},{"header":"M","name":"Medium","extract":true},{"header":"L","name":"Large","extract":true},{"header":"XL","name":"XLarge","extract":true}],"delay":0,"tableDataRowSelector":"tr:nth-of-type(n+2)","tableHeaderRowSelector":"tr.hidden-mobile"}
Issues I am facing:
- The first cell in the header row is empty:
This leads to the first cell in every row not being scraped. How can I fix this?
- On different product detail pages, the tables have different numbers of columns. The number of sizes varies between 4 and 10, i.e. there might be 4 columns or 10 columns. How can I handle this, given that nth-of-type(n+1) or (n+2) only affects the rows?
- Actually I would need to scrape the table turned by 90 degrees, meaning columns should become rows in the CSV and rows should become columns. How can I do this?
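To make clear what I mean by "turned", here is a minimal Python sketch of the transposition applied to an exported CSV after scraping. It is only an illustration under assumptions: the function name `transpose_csv` and the sample values are made up, not taken from the real page or from Web Scraper itself.

```python
import csv
import io


def transpose_csv(text: str) -> str:
    """Swap rows and columns of CSV text (hypothetical helper for illustration)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    # Pad short rows so pages with fewer size columns still line up.
    width = max(len(row) for row in rows)
    padded = [row + [""] * (width - len(row)) for row in rows]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # zip(*rows) turns each original column into one output row.
    writer.writerows(zip(*padded))
    return out.getvalue()


# Made-up sample: sizes as columns, measurements as rows.
sample = "Measure,S,M,L\nReach,400,420,440\nStack,600,610,620\n"
print(transpose_csv(sample))
```

In the result each size (S, M, L) becomes one row and each measurement becomes a column, which is the layout I am after. Ideally I would like to get this directly from the scraper instead of post-processing.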
Thanks in advance for any help!