
Agentic AI for penetration testing

School of Science | Master's thesis

Authors

Kovalenko, Viktoriia

Language

en

Pages

56

Abstract

Penetration testing is a critical offensive security tactic that involves executing attacks to identify and exploit system vulnerabilities. Recently, large language models (LLMs) and AI agents have gained popularity for automating tasks in various domains, including cybersecurity. However, applying LLMs directly to penetration testing poses challenges, namely limited context windows, inconsistent command outputs, lack of persistent memory, and insufficient human feedback. To overcome these limitations, we propose an agentic AI architecture designed for multi-stage penetration testing. We present a proof-of-concept (PoC) pipeline and evaluate its initial performance. Our solution employs a supervisor-based framework comprising reconnaissance and enumeration agents that autonomously assess the environment, recommend next steps, and generate corresponding shell commands. By integrating human feedback into the workflow, the system can refine its strategies and perform targeted web searches to improve accuracy. We conducted a qualitative evaluation with four professional penetration testers. The results indicate that our architecture effectively supports network scanning, enumeration, and exploitation of vulnerable machines. These findings underscore the potential of AI agents in penetration testing, while also revealing key limitations and suggesting directions for future research to further automate and enhance the testing pipeline.

Description

Supervisor

Gunn, Lachlan

Thesis advisor

Balzarotti, Davide
Bergström, Daniel
