Google unusual traffic

gustdb · January 8, 2020, 8:35pm

Describe the problem.
I want to scrape search results from Google. My URL, https://1240.nl/messina, contains some Google search queries. After approximately 30 Google searches I get the error: 'Unusual traffic detected'.

I understand why Google does this, because I am scraping from Google. However, is there a workaround, or do I have to keep completing the reCAPTCHA?

Would love to hear,
Gust de Backer

leemeng · January 9, 2020, 2:23pm

You can sometimes avoid triggering their CAPTCHA by spacing out your searches (longer delays between each one). The threshold seems to vary depending on time of day and your location, so you will need to experiment.
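To illustrate the spacing idea, here is a minimal Python sketch. The 20-second base delay and the 50% jitter are assumptions picked for the example, not known thresholds, and `fetch` is a hypothetical stand-in for whatever performs one search.

```python
import random
import time


def jittered_delays(n, base_seconds=20.0, jitter=0.5, seed=None):
    """Return n delays drawn uniformly from base_seconds * (1 +/- jitter)."""
    rng = random.Random(seed)
    low = base_seconds * (1 - jitter)
    high = base_seconds * (1 + jitter)
    return [rng.uniform(low, high) for _ in range(n)]


def run_spaced(queries, fetch, delays, sleep=time.sleep):
    """Call fetch(query) for each query, sleeping between consecutive calls.

    `fetch` is a hypothetical callable supplied by the caller; `delays`
    must hold at least len(queries) - 1 values.
    """
    results = []
    for i, query in enumerate(queries):
        if i > 0:
            sleep(delays[i - 1])  # wait before every search except the first
        results.append(fetch(query))
    return results
```

Randomizing the gap avoids a perfectly regular request rhythm; the base value is the knob to experiment with.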

gustdb · January 9, 2020, 7:05pm

Thanks for your answer!

Do you have any idea how to accomplish this with Webscraper.io?

Kind regards.

blakehamilton · January 11, 2020, 5:27pm

Hi,

I can't say that I've ever used webscraper.io, so I can't give you much advice on getting it working properly.

However, to avoid Google's "Unusual traffic detected" message (and the CAPTCHA altogether), here are a few solutions for you:

Slow down your requests dramatically (this may still not help, depending on your request volume over time)
Rotate your IP address every so often through proxy servers (by far the most reliable method)
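
As a rough sketch of the second option, the Python below switches to the next proxy after a fixed number of requests. The proxy addresses and the `requests_per_proxy` value are placeholders assumed for illustration; `ProxyRotator` and `opener_for` are hypothetical helper names, not part of webscraper.io.

```python
import itertools
import urllib.request


class ProxyRotator:
    """Hand out proxies round-robin, switching after a fixed number of uses."""

    def __init__(self, proxies, requests_per_proxy=10):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._per_proxy = requests_per_proxy
        self._used = 0
        self._current = next(self._cycle)

    def next_proxy(self):
        # Move to the next proxy once the current one has hit its quota.
        if self._used >= self._per_proxy:
            self._current = next(self._cycle)
            self._used = 0
        self._used += 1
        return self._current


def opener_for(proxy_url):
    """Build a urllib opener that sends HTTP and HTTPS traffic via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Placeholder addresses; substitute proxies you actually have access to.
rotator = ProxyRotator(
    ["http://proxy1.example:8080", "http://proxy2.example:8080"],
    requests_per_proxy=2,
)
```

Each request would then go through `opener_for(rotator.next_proxy()).open(url)`, so consecutive batches of requests leave from different addresses.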

There are a few services that offer APIs to do exactly this.
(I believe webscraper.io can do this as well; I'm just unsure how to set it up, having never used it.)

Instead of asking Google (or any other website) directly for the search results from your own address, you send a request to the proxy service's API, which routes your request to Google through a proxy. Such a service can also defeat CAPTCHAs automatically and even render JavaScript pages for you (useful as frameworks like React, Vue, EmberJS, and Angular become more prevalent).
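
To make the request flow concrete, here is a small Python sketch that builds the URL you would call. Everything about the service is hypothetical: the endpoint `api.proxy-service.example` and the parameter names `api_key`, `url`, and `render_js` are made up for illustration, so check your provider's documentation for the real ones.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; real services each define their own.
API_ENDPOINT = "https://api.proxy-service.example/v1/fetch"


def build_proxied_request(query, api_key, render_js=False):
    """Return the proxy-API URL that asks the service to fetch a Google search.

    The target Google URL is passed as a parameter, so the request to Google
    originates from the service's proxy rather than from your own address.
    """
    target = "https://www.google.com/search?" + urlencode({"q": query})
    params = {
        "api_key": api_key,  # hypothetical parameter name
        "url": target,  # hypothetical parameter name
        "render_js": "true" if render_js else "false",  # hypothetical flag
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

The only request your machine makes is to the service's endpoint; the search URL travels along as an encoded parameter.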