Data.com not scraping websites

Hi,
I'm trying to scrape data.com. It works for company names, but it won't scrape the websites.
Or rather, the data preview works fine: https://www.dropbox.com/s/0lqsv1nv5l25xb8/Screenshot%202018-08-10%2014.11.25.png?dl=0
But in the CSV I only get a few websites.
The same goes for the city.
What am I doing wrong?
Thanks!

Url: https://connect.data.com/search#p=searchresult;;t=companies;;ss=advancedsearch;;q=H4sIAAAAAAAAAD3MsQ7CMAwE0H-5OYNbxEDXTkwgVtShpAZVSm2UOEio6r8TWsRmv9PdjPsYjGNCM8NrFosjl-eKAxGhc0j5dpQhp39QVVT9Mp6eQd-8eU1w2BVfHHpvo0o6SashT7Kuf-3FrYqVa5Wk0dBg4OSxlJbPMbLYuX8wmpocTK0PF_Yah7K1-Z6WD4WtLwC1AAAA

Sitemap:
{"_id":"datacom3","startUrl":["https://connect.data.com/search#p=searchresult;;t=companies;;ss=advancedsearch;;q=H4sIAAAAAAAAAD2MsQ4CIRBE_2VqCtBYSHuVlcbWXIGwGhKONbCYmMv9u3gau5n3MjPjFpNQqbAzPLcsJVIvF-y11hgVarsecmj1L4zR5udoeiR-0ZdvNBS2nS8KzkvkXI954NSmvL5_2JMGztLTSioXgUWg6rH0lbC4dCbPJfTpyd0JdqeXN6vhyo-kAAAA"],"selectors":[{"id":"pagination","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"#findCompanies > div.search-result.general-display-none > div.column-right > div.result-table > table > tbody","multiple":true,"delay":0,"clickElementSelector":"div.result-table-panel.result-table-panel-bottom img.table-navigation-button-next.table-navigation-next-image-active","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"},{"id":"company","type":"SelectorText","parentSelectors":["pagination"],"selector":".companyName","multiple":true,"regex":"","delay":0},{"id":"website","type":"SelectorText","parentSelectors":["pagination"],"selector":"div.website a","multiple":true,"regex":"","delay":"2000"},{"id":"city","type":"SelectorText","parentSelectors":["pagination"],"selector":"td.td-city","multiple":true,"regex":"","delay":"2000"}]}

Posted this in the wrong topic first, lol.
Hi!

I had to register on this website in order to help.

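You've picked the wrong selector for the items: it matches the whole tbody rather than the individual result rows. You've also set the inner text selectors to multiple; they should be single, since they are read once per row.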

Try this one:
{"_id":"datacom3","startUrl":["https://connect.data.com/search#p=searchresult;;t=companies;;ss=advancedsearch;;q=H4sIAAAAAAAAAD2MsQ4CIRBE_2VqCtBYSHuVlcbWXIGwGhKONbCYmMv9u3gau5n3MjPjFpNQqbAzPLcsJVIvF-y11hgVarsecmj1L4zR5udoeiR-0ZdvNBS2nS8KzkvkXI954NSmvL5_2JMGztLTSioXgUWg6rH0lbC4dCbPJfTpyd0JdqeXN6vhyo-kAAAA"],"selectors":[{"id":"pagination","type":"SelectorElementClick","selector":"div#findCompanies.ui-tabs-panel tr.general-display-none","parentSelectors":["_root"],"multiple":true,"delay":0,"clickElementSelector":"div.result-table-panel.result-table-panel-bottom img.table-navigation-button-next.table-navigation-next-image-active","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"},{"id":"company","type":"SelectorText","selector":".companyName","parentSelectors":["pagination"],"multiple":false,"regex":"","delay":0},{"id":"website","type":"SelectorText","selector":"div.website a","parentSelectors":["pagination"],"multiple":false,"regex":"","delay":"2000"},{"id":"city","type":"SelectorText","selector":"td.td-city","parentSelectors":["pagination"],"multiple":false,"regex":"","delay":"2000"}]}
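
The second point can be checked mechanically. Below is a minimal Python sketch, not part of Web Scraper itself, that flags child selectors marked multiple while their parent is already a multiple element selector. The function name flag_multiple_children is made up for illustration, the rule is only a heuristic based on the advice in this post, and the sitemap is a trimmed copy of the original one holding just the fields the check reads.

```python
import json


def flag_multiple_children(sitemap_json):
    """Return ids of selectors that are 'multiple' although their parent
    is already a 'multiple' element selector (heuristic, see lead-in)."""
    selectors = json.loads(sitemap_json)["selectors"]
    # Element-type selectors marked multiple: each match becomes a parent scope.
    row_parents = {
        s["id"]
        for s in selectors
        if s["type"].startswith("SelectorElement") and s.get("multiple")
    }
    return [
        s["id"]
        for s in selectors
        if s.get("multiple") and row_parents.intersection(s["parentSelectors"])
    ]


# Trimmed copy of the original sitemap (only the fields the check needs).
original = json.dumps({
    "_id": "datacom3",
    "selectors": [
        {"id": "pagination", "type": "SelectorElementClick",
         "parentSelectors": ["_root"], "multiple": True},
        {"id": "company", "type": "SelectorText",
         "parentSelectors": ["pagination"], "multiple": True},
        {"id": "website", "type": "SelectorText",
         "parentSelectors": ["pagination"], "multiple": True},
        {"id": "city", "type": "SelectorText",
         "parentSelectors": ["pagination"], "multiple": True},
    ],
})

print(flag_multiple_children(original))  # ['company', 'website', 'city']
```

Run against the corrected sitemap, where the three text selectors have multiple set to false, the same check returns an empty list.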

Hey!
Thanks a lot!
It does seem better now, but it gets stuck at page 8.
I'm trying this one:
https://connect.data.com/search#p=searchresult;;t=companies;;ss=advancedsearch;;q=H4sIAAAAAAAAAD3MsQ7CMAwE0H_xnCFhgqydmECsiKEkBkVKbeQ4lVCVfyeUis1-p7sFHikrSgG_QOBKKgn7c4WDtRZuBkq9HynW8g-cs27LcHplfuPmnZqBMWhiKicaONeJ1uGvzTgwab9WKSwKHiKWAK23QhVB0vP4RPB7A8o65gsGltinfryztn0A0mRlMrAAAAA
What am I missing?
Thanks!

I tried scraping the first page and got all the names and websites.

How did you do the pagination?

I've narrowed down the selector for the next button, and it seems to be working OK now.
It doesn't matter how many results you set to view per page: once the scrape starts, it automatically goes back to 50 (at least for me). The results URL is generated randomly as well.

{"_id":"datacom4","startUrl":["https://connect.data.com/search#p=searchresult;;t=companies;;ss=advancedsearch;;q=H4sIAAAAAAAAAD2MsQ4CIRBE_2VqCtBYSHuVlcbWXIGwGhKONbCYmMv9u3gau5n3MjPjFpNQqbAzPLcsJVIvF-y11hgVarsecmj1L4zR5udoeiR-0ZdvNBS2nS8KzkvkXI954NSmvL5_2JMGztLTSioXgUWg6rH0lbC4dCbPJfTpyd0JdqeXN6vhyo-kAAAA"],"selectors":[{"id":"pagination","type":"SelectorElementClick","selector":"div#findCompanies.ui-tabs-panel tr.general-display-none","parentSelectors":["_root"],"multiple":true,"delay":"2000","clickElementSelector":"div.result-table-panel.result-table-panel-top img.table-navigation-button-next.table-navigation-next-image-active","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"},{"id":"Name","type":"SelectorText","selector":"a.break-word","parentSelectors":["pagination"],"multiple":false,"regex":"","delay":0},{"id":"Website","type":"SelectorText","selector":"div.website a","parentSelectors":["pagination"],"multiple":false,"regex":"","delay":0},{"id":"Phone","type":"SelectorText","selector":"div.phone","parentSelectors":["pagination"],"multiple":false,"regex":"","delay":0},{"id":"City","type":"SelectorText","selector":"td:nth-of-type(5)","parentSelectors":["pagination"],"multiple":false,"regex":"","delay":0}]}
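
To see what changed in the pagination selector between the datacom3 and datacom4 sitemaps in this thread, a small field-by-field comparison is enough. This is only an illustrative Python sketch: diff_selectors is a hypothetical helper, selectors are matched by id (so the renamed text selectors are ignored), and the two dictionaries are trimmed copies containing just the pagination fields of interest.

```python
def diff_selectors(old, new):
    """Return (selector id, field, old value, new value) for every field
    that differs between selectors sharing the same id."""
    new_by_id = {s["id"]: s for s in new["selectors"]}
    changes = []
    for old_sel in old["selectors"]:
        new_sel = new_by_id.get(old_sel["id"])
        if new_sel is None:
            continue  # selector was renamed or removed; skip it
        for field in sorted(set(old_sel) | set(new_sel)):
            if old_sel.get(field) != new_sel.get(field):
                changes.append(
                    (old_sel["id"], field, old_sel.get(field), new_sel.get(field))
                )
    return changes


# Trimmed copies of the two sitemaps (pagination selector only).
NEXT_BUTTON = " img.table-navigation-button-next.table-navigation-next-image-active"
datacom3 = {"selectors": [{
    "id": "pagination",
    "delay": 0,
    "clickElementSelector":
        "div.result-table-panel.result-table-panel-bottom" + NEXT_BUTTON,
}]}
datacom4 = {"selectors": [{
    "id": "pagination",
    "delay": "2000",
    "clickElementSelector":
        "div.result-table-panel.result-table-panel-top" + NEXT_BUTTON,
}]}

for sel_id, field, before, after in diff_selectors(datacom3, datacom4):
    print(f"{sel_id}.{field}:\n  old: {before}\n  new: {after}")
```

The comparison reports two differences: the click selector now targets the next button in the top panel instead of the bottom one, and the pagination delay went from 0 to 2000.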

Hey iconoclast!
It's working like a charm!
How did you find the correct selector?
Thanks a lot and have a great day! :smiley:

Somewhat trial and error :slight_smile: