AI CAPTCHA Solver: New Tool Uses GPT-4o and Gemini to Beat Various Web Security Challenges
AI-Powered CAPTCHA Solver
This project is a Python-based command-line tool that uses large multimodal models (LMMs) such as OpenAI’s GPT-4o and Google’s Gemini to solve several types of CAPTCHAs automatically. It uses Selenium for browser automation, interacting with web pages and solving CAPTCHAs in real time.
Each successful solve is recorded as a GIF in the successful_solves directory.
Key Features
- Multiple AI Providers: Supports both OpenAI (e.g., GPT-4o) and Google Gemini (e.g., Gemini 2.5 Pro) models.
- Multiple CAPTCHA Types: Solves text, reCAPTCHA v2, slider puzzle, and audio challenges.
- Browser Automation: Uses Selenium to simulate human interaction with web pages.
- Extensible: The modular design makes it easy to add support for new CAPTCHA types or AI models.
- Benchmarking: Includes a script to test the performance and success rate of the solvers.
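The extensibility and benchmarking features could be structured roughly as follows. This is a minimal sketch, not the project's actual code: the SOLVERS registry, the register decorator, and the success_rate function are hypothetical names, and the solver body is a placeholder that stands in for real browser and model calls.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a CAPTCHA type name to a solver function.
# Each solver takes a provider name and returns True on a successful solve.
Solver = Callable[[str], bool]
SOLVERS: Dict[str, Solver] = {}


def register(captcha_type: str) -> Callable[[Solver], Solver]:
    """Decorator that adds a solver to the registry under captcha_type."""
    def decorator(func: Solver) -> Solver:
        SOLVERS[captcha_type] = func
        return func
    return decorator


@register("text")
def solve_text(provider: str) -> bool:
    # Placeholder: a real solver would drive the browser and call the model.
    return True


def success_rate(captcha_type: str, provider: str, runs: int) -> float:
    """Run one solver `runs` times and return the fraction that succeeded."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    solver = SOLVERS[captcha_type]
    successes = sum(1 for _ in range(runs) if solver(provider))
    return successes / runs
```

Under this layout, supporting a new CAPTCHA type would mean writing one more decorated function, and a benchmark script would loop over the registry keys and providers, calling success_rate for each pair.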
Supported CAPTCHA Types
The tool can solve the following CAPTCHA types found on the 2captcha.com/demo/ pages:
- Text Captcha: Simple text recognition.
- Complicated Text Captcha: Text with more distortion and noise.
- reCAPTCHA v2: Google’s “I’m not a robot” checkbox with image selection challenges.
- Puzzle Captcha: Slider puzzles where a piece must be moved to the correct location.
- Audio Captcha: Transcribing spoken letters or numbers from an audio file.
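Each CAPTCHA type calls for a different kind of answer from the model: a string for text and audio challenges, a position for the slider puzzle, and a set of images for reCAPTCHA v2. The sketch below shows one way the model's free-text reply might be turned into each of those. The reply formats and function names are assumptions made for illustration; the article does not say how the tool parses responses.

```python
import re
from typing import List


def parse_text(reply: str) -> str:
    """Text and audio CAPTCHAs: keep only letters and digits.

    Assumes the prompt asked the model to reply with the answer alone.
    """
    return re.sub(r"[^A-Za-z0-9]", "", reply)


def parse_slider_offset(reply: str) -> int:
    """Puzzle CAPTCHA: first integer in the reply, read as a pixel offset."""
    match = re.search(r"-?\d+", reply)
    if match is None:
        raise ValueError(f"no offset in reply: {reply!r}")
    return int(match.group())


def parse_tile_selection(reply: str, tile_count: int = 9) -> List[int]:
    """reCAPTCHA v2: 1-based tile numbers, deduplicated and range-checked."""
    tiles = {int(n) for n in re.findall(r"\d+", reply)}
    return sorted(t for t in tiles if 1 <= t <= tile_count)
```

Validating the reply before acting on it matters here because a model may wrap its answer in extra words or return a tile number that does not exist in the grid.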
How It Works
- Launch Browser: The script starts a Firefox browser instance using Selenium.
- Navigate: It goes to the demo page for the specified CAPTCHA type.
- Capture: It takes screenshots of the CAPTCHA challenge (image, instructions, or puzzle).
- AI Analysis: The captured images or audio files are sent to the selected AI provider (OpenAI or Gemini) with a specific prompt tailored to the CAPTCHA type.
- Get Action: The AI returns the solution (text, coordinates, or image selections).
- Perform Action: The script uses Selenium to enter the text, move the slider, or click the correct images.
- Verify: The script checks for a success message to confirm the CAPTCHA was solved.
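The steps above can be sketched as a single loop for the text CAPTCHA case. This is an illustration under stated assumptions rather than the tool's implementation: TextCaptchaSteps, PROMPT, and solve_text_captcha are hypothetical names, the retry limit is an added assumption, and the browser and model calls are passed in as plain functions so that the control flow can be read (and exercised) without Selenium or an API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TextCaptchaSteps:
    """Hypothetical bundle of the side-effecting steps for a text CAPTCHA."""
    navigate: Callable[[], None]            # open the demo page
    capture: Callable[[], bytes]            # screenshot of the challenge image
    ask_model: Callable[[bytes, str], str]  # send image and prompt, get reply
    submit: Callable[[str], None]           # type the answer and submit it
    succeeded: Callable[[], bool]           # look for the success message


# Assumed prompt wording; the real tool uses its own per-type prompts.
PROMPT = "Return only the characters shown in this CAPTCHA image."


def solve_text_captcha(steps: TextCaptchaSteps, max_attempts: int = 3) -> bool:
    """Navigate, capture, ask the model, act, and verify, with retries."""
    for _ in range(max_attempts):
        steps.navigate()
        image = steps.capture()
        answer = steps.ask_model(image, PROMPT).strip()
        if not answer:
            continue  # empty reply: reload the page and try again
        steps.submit(answer)
        if steps.succeeded():
            return True
    return False
```

In a real run, each field would wrap a Selenium action or a provider API call. Keeping those behind plain callables also makes it straightforward to swap OpenAI for Gemini by changing only ask_model.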
Install & Use