
So far, Octoparse does not handle CAPTCHAs. The most common workaround is to hook your program up to a solving service, typically an offshore center where people sit in front of a screen all day filling in those little verification prompts.

Is it possible to bypass a CAPTCHA when extracting data from web pages? With some automated techniques, a verification code can sometimes be bypassed. There are many websites that use CAPTCHA to prevent robots from visiting them, and when too many requests come from a particular IP address, the human verification on those websites becomes tougher to solve. When you use a VPN, many computers share the same IP address, which makes websites suspect that you are a robot. You may then be unable to complete human verification at all, so extracting data from these websites becomes very tricky. On the website side, the matching defense is to blacklist IP addresses that continually send spam traffic to your site.
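As a rough illustration of the blacklisting idea, here is a minimal sketch of a per-IP request counter. The class name `IpBlacklist`, the threshold, and the window length are all assumptions made for this example; a real site would more likely rely on its web server, firewall, or a bot management product.

```python
import time
from collections import defaultdict, deque


class IpBlacklist:
    """Hypothetical helper: blocks an IP that exceeds max_requests per window."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        # Both limits are illustrative values, not recommendations.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self._blocked = set()            # ips that have been blacklisted

    def allow(self, ip, now=None):
        """Record one request from ip and return False if it is blacklisted."""
        if ip in self._blocked:
            return False
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        if len(hits) > self.max_requests:
            self._blocked.add(ip)
            return False
        return True
```

A request handler would call `allow(ip)` once per incoming request and reject the request when it returns `False`. Note that this sketch never removes an address from the blacklist, which also shows why shared VPN addresses run into trouble: one noisy user gets everyone on that IP blocked.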

Have you ever been asked to read blurred letters and type them into a box? That's a CAPTCHA. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a method that websites use to tell the difference between robots and humans accessing their pages. CAPTCHAs are there precisely to stop you from automating the login. This is an ongoing struggle between CAPTCHA providers and those who want to beat the system by bypassing them. If you run a website, ensure bots can't skip human verification by using a bot management system, add a honeypot or human verification field to your forms, and install reCAPTCHA anti-spam human verification on your website forms.
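The honeypot idea can be sketched in a few lines. The form includes an extra field that is hidden from human visitors with CSS, so only automated form fillers tend to put a value in it. The field name `website` and the function name `is_probable_bot` are hypothetical choices for this example, and the check is a heuristic, not a guarantee.

```python
# Hypothetical name of the hidden form field; humans never see it,
# so a non-empty value suggests an automated submission.
HONEYPOT_FIELD = "website"


def is_probable_bot(form_data):
    """Return True when the hidden honeypot field was filled in."""
    value = form_data.get(HONEYPOT_FIELD, "")
    # Whitespace-only values are treated as empty.
    return bool(value.strip())
```

A form handler would pass the submitted fields as a dictionary and silently discard the submission when the function returns `True`. A honeypot only catches simple bots, which is why it is usually combined with a CAPTCHA or a bot management system.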
