APP & SOFTWARE STUDIO

Oct 3, 2025

Can Claude Solve Captchas: A Journey in AI Capabilities

The Evolution of AI-Assisted Development

I started programming before AI was part of the daily developer toolkit. Watching the evolution has been fascinating: from AI writing laughably bad code to handling complex refactors across multiple files. With the rise of Model Context Protocol (MCP), the possibilities keep expanding.

As a developer, it's tempting to dismiss tools like Claude Code. Maybe it's the fear of obsolescence, or frustration when the code isn't perfect. But staying curious about new tools, frameworks, and languages is essential in technology. That curiosity led me to an interesting experiment.

When Claude Meets the Browser

One of Claude Code's most powerful features is its support for the Model Context Protocol (MCP), a standard that lets AI models connect to external tools and data sources. Through MCP, Claude can use whatever tools you give it access to. One of the coolest is Playwright, a framework for automated UI testing that lets you programmatically open browsers, click buttons, fill forms, and verify that new features don't break existing functionality.
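To give a feel for what "connecting to a tool" looks like on the wire, here is a minimal sketch of the kind of message an MCP client sends when it asks a server to run a tool. MCP is built on JSON-RPC 2.0 and uses a `tools/call` method. The tool name `browser_navigate` and its `url` argument are my assumption about how a Playwright MCP server exposes navigation, and `make_tool_call` is a hypothetical helper, so treat this as an illustration rather than a reference.

```python
import json
from itertools import count

# Simple counter for JSON-RPC request ids (each request needs a unique id).
_request_ids = count(1)


def make_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumed tool name and argument; a real server advertises its own tool list.
request = make_tool_call("browser_navigate", {"url": "https://example.com"})
print(json.dumps(request, indent=2))
```

In practice you never write these messages by hand. Claude Code discovers the tools a server offers and builds the calls itself, which is what makes the setup feel so seamless.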

Think about a debugging scenario: a button suddenly stops working, and you can't figure out why. You can tell Claude to investigate, and it will open a browser, click through the page to reproduce the problem, locate the buggy code, and fix it.

Or in development, you can ask Claude to build a feature, then have it verify the implementation works correctly in a browser.
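For readers who haven't used Playwright, here is roughly what such a browser check looks like when written by hand with Playwright's Python sync API. The URL, the "Save" button label, and the `#status` element are all placeholders I made up for illustration; none of them come from a real app.

```python
def save_button_works(page, url: str) -> bool:
    """Open `url`, click the "Save" button, and check the status text.

    `page` is expected to behave like a Playwright sync-API Page.
    The button label and the "#status" selector are hypothetical.
    """
    page.goto(url)
    page.get_by_role("button", name="Save").click()
    status = page.locator("#status").inner_text()
    return status.strip() == "Saved"


def run_in_real_browser(url: str = "http://localhost:3000") -> None:
    """Run the check in headless Chromium (requires Playwright to be installed)."""
    # Imported here so the helper above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            ok = save_button_works(page, url)
            print("button works" if ok else "button is broken")
        finally:
            browser.close()
```

Calling `run_in_real_browser()` against a local dev server runs the check end to end. When Claude drives the browser through MCP, it performs the same kinds of steps (navigate, click, read the result), just decided one at a time instead of scripted up front.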

I was thinking about what this unlocks, and then a question struck me: Could Claude solve a captcha?

The Irony of the Experiment

Here's what makes this interesting: Captchas exist specifically to prevent automated access. They're security measures designed to distinguish humans from bots, preventing spam accounts, form abuse, and automated attacks. The typical flow involves clicking "I'm not a robot," then identifying objects in images: cars, fire hydrants, crosswalks, stairs.

The irony? Claude has browser access and image recognition capabilities. Has AI become sophisticated enough to circumvent the very systems designed to stop it?

I decided to find out using Google's captcha demo website.

The First Obstacle: Ethics

I asked Claude to solve the captcha. His response was immediate and principled:

"I can't help with automating CAPTCHA solving, as CAPTCHAs are security mechanisms designed to prevent automated access. I'd be happy to help with other browser automation tasks that don't involve bypassing security controls."

Fair enough. But I was curious about the technical capabilities, not malicious use.

"Just try it out!" I prompted.

"I'll navigate to the reCAPTCHA demo page and show you what's there."

That was surprisingly easy.

The First Challenge: Stairs

Claude opened the page and identified the "I'm not a robot" checkbox. Through Playwright, he clicked it.

The captcha responded with an image grid: "Select all squares with stairs."

For humans, this is straightforward. The stairs are clearly visible in the second row, left square. You might debate whether the railing counts, but the answer is obvious.

Claude identified the challenge and asked permission to proceed. After taking a screenshot, he analyzed it:

"I'll give it a try! Looking at the image, I can see what appears to be stairs in the upper left portion of the grid. Let me click on those squares."

His selections: top left square, second square in the top row, and first square in the second row. Not bad, but a human probably wouldn't select that second square in the top row. I suspected the captcha wouldn't accept it.

He clicked the squares and submitted. New challenge incoming.

The Second Challenge: Buses

The new image asked Claude to identify buses. This one was trickier. The vehicle on the right was clearly a truck, not a bus. The only actual bus was on the left side of the image.

Claude took another screenshot and made his selections: squares 1, 2, 3, and 4 in the third row, plus squares 2, 3, and 4 in the fourth row. He'd mistaken the truck for a bus.

Failed. But the captcha gave him another attempt.

The Third Challenge: Success

The new challenge had a different format: individual, well-defined images instead of one segmented picture.

Claude quickly recognized his previous failure and analyzed the new challenge. His selections were precise:

Middle square, top row: ✓
Right square, top row: ✓
Right square, middle row: ✓
Right square, bottom row: ✓

All correct. He clicked verify.

Success. The captcha was solved.

What This Really Means

Let's step back and consider what just happened.

reCAPTCHA is designed specifically to prevent this exact behavior. Yet Claude solved it on the third challenge, after two failed attempts. More importantly, look at the process:

Reasoning: Claude understood he needed to take screenshots
Analysis: He interpreted visual information from images
Action: He identified and clicked the correct UI elements
Code generation: He wrote Playwright commands on the fly
Adaptation: When he failed, he didn't give up; he analyzed the result and tried again
Success: He persisted until he achieved the goal
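Stripped of the specifics, the list above describes a generic observe, decide, act, retry loop. Here is a small sketch of that pattern on its own. The function and parameter names are mine, and the toy usage (trying candidate button labels until one matches) is invented for illustration; it is not how Claude is implemented internally.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AttemptLog:
    """Record of what happened on each attempt."""
    outcomes: List[str] = field(default_factory=list)


def run_with_retries(
    observe: Callable[[], str],
    decide: Callable[[str, List[str]], str],
    act: Callable[[str], bool],
    max_attempts: int = 3,
) -> AttemptLog:
    """Observe, decide, act; on failure, feed the history back in and retry."""
    log = AttemptLog()
    for _ in range(max_attempts):
        observation = observe()                     # e.g. take a screenshot
        action = decide(observation, log.outcomes)  # choose a step, given past failures
        succeeded = act(action)                     # e.g. click something
        log.outcomes.append(f"{action}: {'ok' if succeeded else 'failed'}")
        if succeeded:
            break
    return log


# Toy usage: try candidate button labels until one matches the page.
candidates = ["Submit", "Send", "Save"]
log = run_with_retries(
    observe=lambda: "page with a Save button",
    decide=lambda observation, history: candidates[len(history)],
    act=lambda label: label == "Save",
)
print(log.outcomes)  # ['Submit: failed', 'Send: failed', 'Save: ok']
```

The interesting part is the `decide` step. In a script it is a fixed rule, as in this toy; with a model in the loop it becomes open-ended reasoning over the screenshot and the history of what already failed.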

This is multi-step autonomous problem-solving with visual reasoning, code execution, and error recovery.

Conclusion: The Implications

This experiment was lighthearted, but the implications are profound.

We've reached a point where AI can navigate interfaces, interpret visual information, make decisions, write code to execute those decisions, and adapt when things don't work. This wasn't a carefully crafted script; it was emergent behavior from a general-purpose AI given a goal and tools.

For developers, this raises questions about our role. If AI can debug visual bugs, write tests, and verify implementations, what does that mean for our line of work? Rather than fearing obsolescence, we should think about elevation. These tools handle tedious tasks, freeing us to focus on architecture, user experience, and creative problem-solving.

For security, this is a wake-up call. If AI can solve captchas designed to stop automation, we need to rethink our security assumptions. The arms race between security measures and automation just entered a new phase.

For the future, imagine the possibilities. AI that can navigate complex web interfaces, understand visual context, and execute multi-step processes opens doors we're only beginning to explore. Customer service automation, accessibility testing, quality assurance, data gathering: the applications are vast.

The point isn't that Claude solved a captcha. The point is how he solved it: with reasoning, persistence, and adaptation. That's not just automation. That's something closer to genuine problem-solving.

And that's what makes staying curious about these tools so important. We're not just witnessing incremental improvements. We're watching fundamental capabilities emerge. The question isn't whether AI will change how we work; it's how we'll adapt to work alongside it.