I am trying to scrape the Doge contract data at the following webpage:
***** FYI, running this with the pagination selector makes previewing sample data take forever (it has to go through every page on the website), so I have been running it both with and without the pagination selector.
When I run my scraper (with or without the pagination selector), I get data for every field except the last three (NAICS, PSC, and Cancelled Date).
-
When I run the data preview for the individual fields (NAICS, PSC, and Cancelled Date), I get data back. Again, this is only for the individual selector previews (for example, when previewing "id":"NAICS_CODE").
-
When I run the data preview on an entire row, all fields are populated except the last three (NAICS, PSC, and Cancelled Date), which are blank.
-
When I run the scraper and export the data, all fields are once again populated except the last three (NAICS, PSC, and Cancelled Date), which are again blank.
What am I missing?
Here is my sitemap.
{"_id":"doge-tracker","startUrl":["http://app.g2xchange.com/doge-tracker"],"selectors":[{"id":"IDPag","paginationType":"auto","parentSelectors":["_root","IDPag"],"selector":".pages button.rt-Button","type":"SelectorPagination"},{"id":"con_wrapper","multiple":true,"parentSelectors":["IDPag"],"selector":"div.rdg-row:nth-of-type(n+2)","type":"SelectorElement"},{"id":"Contract_ID","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='1']","type":"SelectorText"},{"id":"Business_Name","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='2']","type":"SelectorText"},{"id":"Unique_Entity_ID","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='3']","type":"SelectorText"},{"id":"Status","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='4']","type":"SelectorText"},{"id":"Contract_Ceiling","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='5']","type":"SelectorText"},{"id":"Description","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='6']","type":"SelectorText"},{"id":"Award_IDV_Type","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='7']","type":"SelectorText"},{"id":"Awarding_Agency","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='8']","type":"SelectorText"},{"id":"Sub_Contracting_Agency","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='9']","type":"SelectorText"},{"id":"NAICS_CODE","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='10']","type":"SelectorText"},{"id":"PSC","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='11']","type":"SelectorText"},{"id":"Cancelled_Date","multiple":false,"parentSelectors":["con_wrapper"],"regex":"","selector":"div[aria-colindex='12']","type":"SelectorText"}]}
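
To double-check that the three blank fields are configured the same way as the working ones, here is a minimal Python sketch that lists each selector under its parent. It is only a sanity check on the sitemap JSON, not a fix. The `SITEMAP` string is a trimmed copy of the sitemap above (the variant without the pagination selector, so `con_wrapper` hangs off `_root`), and `children_of` is a hypothetical helper name, not part of Web Scraper.

```python
import json

# Trimmed copy of the sitemap above (assumption: the no-pagination variant,
# so con_wrapper is attached to _root). Column 9 is kept as a known-working
# field to compare against the three fields that come back blank.
SITEMAP = """
{"_id": "doge-tracker",
 "startUrl": ["http://app.g2xchange.com/doge-tracker"],
 "selectors": [
  {"id": "con_wrapper", "multiple": true, "parentSelectors": ["_root"],
   "selector": "div.rdg-row:nth-of-type(n+2)", "type": "SelectorElement"},
  {"id": "Sub_Contracting_Agency", "multiple": false, "parentSelectors": ["con_wrapper"],
   "regex": "", "selector": "div[aria-colindex='9']", "type": "SelectorText"},
  {"id": "NAICS_CODE", "multiple": false, "parentSelectors": ["con_wrapper"],
   "regex": "", "selector": "div[aria-colindex='10']", "type": "SelectorText"},
  {"id": "PSC", "multiple": false, "parentSelectors": ["con_wrapper"],
   "regex": "", "selector": "div[aria-colindex='11']", "type": "SelectorText"},
  {"id": "Cancelled_Date", "multiple": false, "parentSelectors": ["con_wrapper"],
   "regex": "", "selector": "div[aria-colindex='12']", "type": "SelectorText"}
 ]}
"""


def children_of(sitemap, parent_id):
    """Return (id, css selector) pairs for every selector attached to parent_id."""
    return [
        (sel["id"], sel["selector"])
        for sel in sitemap["selectors"]
        if parent_id in sel["parentSelectors"]
    ]


sitemap = json.loads(SITEMAP)

# Print the selector tree: the wrapper first, then its child fields.
for parent in ("_root", "con_wrapper"):
    print(parent)
    for sel_id, css in children_of(sitemap, parent):
        print(f"  {sel_id}: {css}")
```

In this listing the three blank fields have the same parent and the same selector type as the working column 9 and differ only in the `aria-colindex` value, so the sitemap structure itself looks consistent.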