Scraping Google Search

I'm running into odd problems when trying to scrape Google search results. Is Google blocking me?

When I nest the pagination links under the pagination parent, it just keeps scraping search result page 8, then 7, then 8, then 7, and so on.

URL: https://www.google.com/search?q=site:about.me+I&ei=GGbfWt7RG4Od_QbG65OoDw&start=0&sa=N&biw=1280&bih=520

{"_id":"aboutme2","startUrl":["https://www.google.com/search?q=site:about.me+I&ei=GGbfWt7RG4Od_QbG65OoDw&start=0&sa=N&biw=1280&bih=520"],"selectors":[{"id":"urls","type":"SelectorText","selector":"cite.iUh30","parentSelectors":["_root","pagination"],"multiple":true,"regex":"","delay":0},{"id":"pagination","type":"SelectorLink","selector":"td:nth-of-type(n+3) a.fl","parentSelectors":["_root","pagination"],"multiple":true,"delay":0}]}

Hi,
Try changing the selector of your pagination element like this:

{"_id":"test","startUrl":["https://www.google.com/search?q=site:about.me+I&ei=GGbfWt7RG4Od_QbG65OoDw&start=0&sa=N&biw=1280&bih=520"],"selectors":[{"id":"urls","type":"SelectorText","selector":"cite.iUh30","parentSelectors":["_root","pagination"],"multiple":true,"regex":"","delay":0},{"id":"pagination","type":"SelectorLink","selector":"a.pn","parentSelectors":["_root","pagination"],"multiple":true,"delay":0}]}
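Since the start URL carries a `start=0` offset, another way to reach the later result pages is to build the page URLs directly instead of following pagination links. Here is a minimal Python sketch of that idea. It is not part of the sitemap above: `page_urls` is a hypothetical helper, and the step of 10 results per page is an assumption about Google's default page size.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Start URL copied from the sitemap above.
START_URL = (
    "https://www.google.com/search?q=site:about.me+I"
    "&ei=GGbfWt7RG4Od_QbG65OoDw&start=0&sa=N&biw=1280&bih=520"
)


def page_urls(url, pages, step=10):
    """Return `pages` URLs whose `start` parameter is 0, step, 2*step, ...

    Hypothetical helper. step=10 assumes the default of 10 results per page.
    If the URL has no `start` parameter, the URLs come back with the same
    parameters as the input.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    urls = []
    for page in range(pages):
        # Rewrite only the `start` offset; keep every other parameter.
        updated = [
            (key, str(page * step) if key == "start" else value)
            for key, value in params
        ]
        urls.append(urlunsplit(parts._replace(query=urlencode(updated))))
    return urls


if __name__ == "__main__":
    for page_url in page_urls(START_URL, pages=3):
        print(page_url)
```

The generated URLs could then be listed in `startUrl`, which would make the recursive pagination selector unnecessary.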


Thanks for the help! How did you select the pagination so that it returns "a.pn"?

Hi,
have a look