LLMs are already pretty good at brute-force security testing. They aren't "polite" pen testers.
I recently used an LLM to win a CTF at work (there were no rules against AI, but I bet there will be next year). I felt a little bad at the end, when they demoed the intended hacks and, for a couple of the challenges, it was the first time I had seen the home page. If the LLM could quickly hack a challenge with just the clue and the URL, I just let it.
For any serious website it needs a lot more direction, but it will help you along nicely.
I saw only two refusals over an entire week, across three different major LLM agents (Codex CLI, Claude Code CLI, and Gemini CLI).
It took time: I spent something like 20 hours guiding them. But if you have the time and some expertise, the tools are extremely workable.