Capturing Table, Only First Page Works

I am trying to pull all of the information from the Licensed Contractors page.

I have been able to scrape page 1 of the data successfully. But whenever I try to paginate, it either scrapes only the first page or clicks endlessly through all of the pages without scraping anything.

{"_id":"ChicagoContractors","startUrl":["Licensed Contractors a.page-link"},{"id":"TableSelect","parentSelectors":["NextPage"],"type":"SelectorTable","multiple":true,"selector":"table","tableDataRowSelector":"tbody tr","tableHeaderRowSelector":"thead tr","columns":[{"extract":true,"header":"License Type","name":"License Type"},{"extract":true,"header":"License Number","name":"License Number"},{"extract":true,"header":"Name","name":"Name"},{"extract":true,"header":"Address","name":"Address"},{"extract":true,"header":"Phone","name":"Phone"},{"extract":true,"header":"License Expiration Date","name":"License Expiration Date"},{"extract":true,"header":"Insurance / Bond Expiration Date","name":"Insurance Bond Expiration Date"},{"extract":true,"header":"License Inactive?","name":"License Inactive"}]}]}

Any ideas on how to modify?

Thank you!

-Steve

Here is your data, check it out.

If you are interested, here is a working sitemap:

{"_id":"WEBAPP1","startUrl":["https://webapps1.chicago.gov/licensedcontractors/active"],"selectors":[{"columns":[{"extract":true,"header":"License Type","name":"License Type"},{"extract":true,"header":"License Number","name":"License Number"},{"extract":true,"header":"Name","name":"Name"},{"extract":true,"header":"Address","name":"Address"},{"extract":true,"header":"Phone","name":"Phone"},{"extract":true,"header":"License Expiration Date","name":"License Expiration Date"},{"extract":true,"header":"Insurance / Bond Expiration Date","name":"Expiration Date"},{"extract":true,"header":"License Inactive?","name":"License Inactive"}],"id":"TABLE","multiple":true,"parentSelectors":["NEXT"],"selector":"table","tableDataRowSelector":"tbody tr","tableHeaderRowSelector":"thead tr","type":"SelectorTable"},{"clickActionType":"real","clickElementSelector":".next a","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickMore","delay":500,"discardInitialElements":"discard","id":"NEXT","multiple":true,"parentSelectors":["_root"],"selector":"BODY","type":"SelectorElementClick"}]}
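For anyone trying to see what differs from the original attempt, it can help to print how the selectors are nested. The Python below is only a rough sketch and not part of Web Scraper: it reads an abbreviated copy of the sitemap above (table columns left out) and follows each selector's parentSelectors. The helper names children_by_parent, dangling_parents and render are made up for this example. As I read the sitemap, the point is that TABLE sits under the NEXT click selector, and NEXT sits under _root.

```python
import json

# Abbreviated copy of the sitemap above (table columns omitted for brevity).
SITEMAP = '''
{"_id":"WEBAPP1",
 "startUrl":["https://webapps1.chicago.gov/licensedcontractors/active"],
 "selectors":[
  {"id":"TABLE","type":"SelectorTable","parentSelectors":["NEXT"],
   "selector":"table","multiple":true},
  {"id":"NEXT","type":"SelectorElementClick","parentSelectors":["_root"],
   "selector":"BODY","clickElementSelector":".next a","multiple":true}
 ]}
'''

def children_by_parent(sitemap):
    """Map each parent id to the ids of the selectors nested under it."""
    tree = {}
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            tree.setdefault(parent, []).append(sel["id"])
    return tree

def dangling_parents(sitemap):
    """Return (selector id, parent id) pairs whose parent id is not defined."""
    known = {"_root"} | {sel["id"] for sel in sitemap["selectors"]}
    return [(sel["id"], parent)
            for sel in sitemap["selectors"]
            for parent in sel["parentSelectors"]
            if parent not in known]

def render(tree, node="_root", depth=0, seen=()):
    """Return the selector tree as indented lines, starting at _root."""
    lines = ["  " * depth + node]
    if node in seen:  # a selector may list itself as a parent; stop there
        return lines
    for child in tree.get(node, []):
        lines.extend(render(tree, child, depth + 1, seen + (node,)))
    return lines

sitemap = json.loads(SITEMAP)
tree = children_by_parent(sitemap)
print("\n".join(render(tree)))
print("dangling parents:", dangling_parents(sitemap))
```

If dangling_parents returns anything, a selector points at a parent id that is not defined in the sitemap, which is one thing worth ruling out when a sitemap pages through the site without collecting rows.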

This is phenomenal - thank you for the template and for the nicely formatted data!

-Steve

you are welcome ))))) eeeeeeasy

Hi Don,

Any chance you can re-run and repost the results? :grinning:

Following for the same. I'm new to scraping and not sure how to run that sitemap.

Hi,

You can import a sitemap by clicking 'Create new sitemap -> Import Sitemap', then paste the sitemap code.
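Sitemap code copied from a forum post can get cut off or mangled, so it may be worth checking the text before pasting it. The Python below is a small sketch of such a check and is not a Web Scraper feature. The function name check_sitemap_text is made up, and the list of expected keys (_id, startUrl, selectors) is an assumption based on the sitemaps posted in this thread.

```python
import json

def check_sitemap_text(text):
    """Return a list of problems found in sitemap text; an empty list means none found."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} at line {err.lineno}, column {err.colno}"]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    # Keys assumed from the sitemaps in this thread, not from an official schema.
    return [f"missing key: {key}"
            for key in ("_id", "startUrl", "selectors")
            if key not in data]

# Placeholder sitemaps for illustration only.
good = '{"_id":"demo","startUrl":["https://example.com/"],"selectors":[]}'
truncated = '{"_id":"demo","startUrl":["https://example.com/"'

print(check_sitemap_text(good))
print(check_sitemap_text(truncated))
```

An empty list means the text parsed and has the three top-level keys; it does not say whether the selectors themselves will work on the page.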

I would suggest starting by having a look at the video tutorial and documentation section. There you will find instructions on how to create sitemaps. You can do that here:

Tutorial videos: Extension intro | Web Scraper How To
Documentation: Installation | Web Scraper Documentation
How-tos: Extension intro | Web Scraper How To