TryHackMe is a cybersecurity education platform used by 7 million+ security practitioners worldwide. We build the tools that help security teams learn, practise, and stay sharp - from foundational skills training through to enterprise-grade capability testing.
Live Breach is our newest product: a high-fidelity breach simulation experience for enterprise security teams. We provision real cloud infrastructure, deploy realistic attack scenarios against it, and challenge the blue team to investigate, contain, and eradicate - end to end, in real time. The goal is to make it feel indistinguishable from a real incident.
We're hiring a contractor to own the AI engineering at the core of this product.
You'll build and own the AI systems that make Live Breach feel like a real incident rather than a scripted exercise. That centres on two interconnected components:
The AI attacker agent - an autonomous LLM-powered agent that receives a threat actor profile, a network briefing, and a configured attack chain, then executes it against a live environment. The core engineering challenge is making this agent adaptive: when the defending team takes containment actions in real time, the attacker needs to recognise what has happened and respond — pivoting to new hosts, re-establishing persistence, or changing technique.
The exercise orchestration layer - a parallel system that monitors the network during a live exercise, recognises which attack techniques have executed, listens for correct containment and eradication actions from participants, and surfaces investigation tasks tied to real attacker behaviour. This system needs precise, programmatic knowledge of what forensic artefacts each technique produces and what a valid defensive response looks like.
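As an illustration only (the class names, artefact signatures, and action strings below are assumptions for the sketch, not the product's actual data model), the orchestration layer's core bookkeeping — recognising executed techniques from forensic artefacts and matching participant actions against valid containment — might look like:

```python
from dataclasses import dataclass

# Hypothetical sketch: each technique the attacker can execute is registered
# with the forensic artefacts it produces and the defensive actions that
# count as valid containment for it.
@dataclass
class TechniqueSpec:
    technique_id: str            # e.g. a MITRE ATT&CK technique ID
    artefacts: set[str]          # artefact signatures this technique emits
    valid_containment: set[str]  # participant actions that neutralise it

class ExerciseMonitor:
    """Tracks which techniques have fired and whether they were contained."""

    def __init__(self, specs: list[TechniqueSpec]):
        self.specs = {s.technique_id: s for s in specs}
        self.executed: set[str] = set()
        self.contained: set[str] = set()

    def on_artefact(self, artefact: str) -> list[str]:
        """Called per telemetry event; returns newly recognised techniques."""
        recognised = []
        for spec in self.specs.values():
            if artefact in spec.artefacts and spec.technique_id not in self.executed:
                self.executed.add(spec.technique_id)
                recognised.append(spec.technique_id)
        return recognised

    def on_participant_action(self, action: str) -> list[str]:
        """Checks a blue-team action against open techniques' valid responses."""
        newly_contained = []
        for tid in self.executed - self.contained:
            if action in self.specs[tid].valid_containment:
                self.contained.add(tid)
                newly_contained.append(tid)
        return newly_contained

    def open_tasks(self) -> set[str]:
        """Techniques that executed but have not yet been contained —
        these surface as investigation tasks for participants."""
        return self.executed - self.contained
```

The point of the sketch is the shape of the problem: the system needs a precise, programmatic mapping from technique to artefacts to valid response, so that task surfacing is driven by real attacker behaviour rather than a script.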
Alongside these core systems, you'll also work on:
Prompt engineering for both red and blue team agent components, in close collaboration with our content engineering team
Integration with adversary emulation tooling for realistic technique execution
User emulation and noise generation — simulating realistic background activity so participants must distinguish real attacker behaviour from normal log volume
Documentation and architecture that allows the broader engineering team to operate and debug the AI layer without dependency on any single person
The immediate priority is building the attacker agent from the ground up. You'll scope the agent architecture, choose the right tooling and framework, build the planning and execution loop, and get to a working demo where the agent autonomously compromises a target network end to end — no human intervention required.
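As a rough illustration of what "planning and execution loop" means here (the JSON action format, tool registry, and `llm` callable are assumptions for this sketch, not the system's actual interfaces), the core plan-act-observe cycle might look like:

```python
from typing import Callable

Action = dict  # e.g. {"tool": "scan", "args": "10.0.0.0/24"} or {"done": True}
Tool = Callable[[str], str]

def run_attack_chain(llm: Callable[[str, list], Action],
                     actor_profile: str,
                     network_brief: str,
                     attack_chain: list[str],
                     tools: dict[str, Tool],
                     max_steps: int = 50) -> list[dict]:
    """Plan-act-observe loop: ask the model for the next action, execute it
    via a tool, and feed the observation back into the transcript so the
    agent can adapt when containment disrupts the plan."""
    system_prompt = (
        f"You are emulating threat actor: {actor_profile}\n"
        f"Network briefing: {network_brief}\n"
        f"Configured attack chain: {attack_chain}\n"
        'Reply with {"tool": ..., "args": ...} or {"done": true}.'
    )
    transcript: list[dict] = []
    for _ in range(max_steps):
        action = llm(system_prompt, transcript)
        if action.get("done"):
            break
        # Observations (command failures, unreachable hosts, revoked creds)
        # are how the agent notices blue-team containment and decides to
        # pivot, re-establish persistence, or change technique.
        observation = tools[action["tool"]](action["args"])
        transcript.append({"action": action, "observation": observation})
    return transcript
```

The design question the role owns is everything this sketch glosses over: how state and memory are represented, how tool output is summarised back to the model, and how the loop stays reliable over long multi-stage chains.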
A key part of this work will be using LLMs to analyse an attack chain and automatically configure the target network to be vulnerable in the right ways — introducing the misconfigurations, weak credentials, and exploitable conditions that the attack chain requires, without manual setup for each scenario. Speed matters here — we want to prove the core capability as quickly as possible so we can validate it with real clients.
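One plausible shape for this (the misconfiguration identifiers and the `llm` callable are hypothetical; the allowlist pattern is a design suggestion, not a description of the actual system) is to have the model propose preconditions and validate its output before any provisioning happens:

```python
import json

# Hypothetical allowlist of pre-built, known-safe misconfigurations the
# provisioning layer knows how to apply.
ALLOWED_MISCONFIGS = {
    "weak_local_admin_password",
    "smb_signing_disabled",
    "unconstrained_delegation",
    "writable_share",
}

def plan_vulnerable_config(llm, attack_chain: list[str]) -> list[str]:
    """Asks the model which environment preconditions the chain needs,
    then filters its answer against the allowlist. `llm` is a stand-in
    callable returning a JSON array of misconfiguration identifiers."""
    prompt = (
        "For each technique in this attack chain, list the environment "
        f"misconfigurations it requires: {attack_chain}\n"
        "Reply with a JSON array of misconfiguration identifiers."
    )
    proposed = json.loads(llm(prompt))
    # Never let free-form model output drive infrastructure changes
    # directly: only allowlisted misconfigurations are provisioned.
    return [m for m in proposed if m in ALLOWED_MISCONFIGS]
```

The filter step matters: the LLM does the scenario analysis, but the set of things it can actually change in a live environment stays bounded and auditable.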
Essential
Hands-on experience building LLM-powered agents — planning loops, tool use, memory, state management
Strong Python engineering; able to ship production-quality agentic systems, not just prototypes
Ability to design prompts and agent architectures that are reliable and predictable under adversarial conditions
Comfortable working autonomously on problems that aren't fully defined — you'll need to make good technical decisions with limited hand-holding
Strong async communication; the team is distributed and documentation matters
Strongly preferred
Working familiarity with cybersecurity concepts — attack techniques, MITRE ATT&CK, network fundamentals (Active Directory, lateral movement, persistence). You don't need to be a penetration tester, but you need enough domain fluency to build realistic attack logic
Experience with adversary emulation frameworks (MITRE CALDERA or similar)
Experience building event-driven systems that monitor and react to real-time state changes
Familiarity with cloud infrastructure (we provision VMs and networks dynamically per exercise)
Nice to have
Prior work on cyber ranges, red team tooling, or security simulation products
Experience with multi-agent architectures where agents observe and react to each other