Agentic AI for penetration testing
School of Science | Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Kovalenko, Viktoriia
Language
en
Pages
56
Abstract
Penetration testing is a critical offensive security tactic that involves executing attacks to identify and exploit system vulnerabilities. Recently, large language models (LLMs) and AI agents have gained popularity for automating tasks in various domains, including cybersecurity. However, applying LLMs directly to penetration testing poses challenges, namely limited context windows, inconsistent command outputs, lack of persistent memory, and insufficient human feedback.
To overcome these limitations, we propose an agentic AI architecture designed for multi-stage penetration testing. We present a proof-of-concept (PoC) pipeline and evaluate its initial performance. Our solution employs a supervisor-based framework comprising reconnaissance and enumeration agents that autonomously assess the environment, recommend next steps, and generate corresponding shell commands. By integrating human feedback into the workflow, the system can refine its strategies and perform targeted web searches to improve accuracy.
We conducted a qualitative evaluation with four professional penetration testers. The results indicate that our architecture effectively supports network scanning, enumeration, and exploitation of vulnerable machines. These findings underscore the potential of AI agents in penetration testing, while also revealing key limitations and suggesting directions for future research to further automate and enhance the testing pipeline.
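The supervisor-based workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the names `Proposal`, `State`, `recon_agent`, `enum_agent`, `supervisor`, and `step` are hypothetical, the two agents are hard-coded stand-ins for LLM calls, the routing rule is an assumption, and no command is ever executed. It only shows the shape of the loop: the supervisor picks an agent, the agent recommends a next step with a shell command, and a human may accept or revise that command.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Proposal:
    agent: str      # which agent produced the suggestion
    rationale: str  # recommended next step, in words
    command: str    # suggested shell command (never executed in this sketch)


@dataclass
class State:
    target: str
    findings: List[str] = field(default_factory=list)      # memory kept across steps
    history: List[Proposal] = field(default_factory=list)  # accepted proposals


def recon_agent(state: State) -> Proposal:
    # Stand-in for an LLM call: propose a service scan of the target.
    return Proposal("recon", "Discover open ports", f"nmap -sV {state.target}")


def enum_agent(state: State) -> Proposal:
    # Stand-in for an LLM call: probe a web service found earlier.
    return Proposal("enum", "Inspect the web server", f"curl -I http://{state.target}/")


# Hypothetical registry of available agents.
AGENTS: Dict[str, Callable[[State], Proposal]] = {
    "recon": recon_agent,
    "enum": enum_agent,
}


def supervisor(state: State) -> str:
    # Assumed routing rule: reconnaissance until something is known, then enumeration.
    return "recon" if not state.findings else "enum"


def step(state: State, feedback: Callable[[Proposal], Optional[str]]) -> Proposal:
    """Run one supervised step; feedback returns a revised command or None to accept."""
    proposal = AGENTS[supervisor(state)](state)
    revised = feedback(proposal)
    if revised is not None:
        proposal = Proposal(proposal.agent, proposal.rationale, revised)
    state.history.append(proposal)
    return proposal
```

In this sketch a caller would create `State("192.0.2.10")`, call `step` with a feedback function, record any results in `state.findings`, and call `step` again; the supervisor then routes to the enumeration agent.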
Supervisor
Gunn, Lachlan
Thesis advisor
Balzarotti, Davide
Bergström, Daniel