Platform
Capabilities AI Agents Zero-Day Suite Reports & Evidence Integrations
Compare
Why PhantomYerra vs Mythos AI vs GPT-5.4 Cyber
Resources
Help Docs What's New Ask PhantomYerra Methodology Release Notes
 
Contact Request Access Client Login
Exhaustive Technical & Methodology Comparison

PhantomYerra vs
Claude Mythos Preview

The definitive comparison between a shipping penetration testing platform and a restricted frontier AI model. PhantomYerra is a complete, deployable, 150+ engine pentest platform with 8 surface-specific Big-4-grade report engines, SECURA 0-100 scoring, cross-scan institutional memory, a public REST API, and AI-agent guardrails. Claude Mythos Preview is a raw AI model with world-class vulnerability discovery capabilities, restricted to 52 partner organisations and available only via cloud API. Different categories. Same goal: finding and exploiting vulnerabilities. Here is how they compare.

HONESTY RULE: Mythos claims sourced from Anthropic's official publications (red.anthropic.com, anthropic.com/glasswing) and independent third-party analysis only. PhantomYerra claims backed by shipping code in v45.1.29.
150+PhantomYerra Engines
25Attack Surfaces
11Zero-Day Engines
8Big-4 Report Engines
100%Pure Python
Document Integrity Verified

All capability claims validated against PhantomYerra v45.1.29 source code. SHA-256 signed and published to SIGNATURES.json. Every update refreshes the hash, timestamp, and signature.

Section 1

Design Philosophy

Before comparing features, understand the fundamental design difference. These two platforms were built with opposing philosophies.

PhantomYerra

The AI Penetration Tester

PhantomYerra was designed as an autonomous penetration tester: not a scanner. The AI doesn't just find vulnerabilities; it exploits them, chains them into attack paths, generates context-aware payloads, adapts to WAF defences in real-time, and writes professional narrative reports with evidence.

Every finding must pass six evidence gates before reaching a report. The AI is treated as a senior red teamer: it plans the engagement, selects tools, executes attacks, pivots on discoveries, and writes the report, with the human confirming scope once and reviewing results.

  • Desktop-first: runs on your machine, your network, your rules
  • 150+ pure-Python security engines including 11 zero-day detection engines
  • AI orchestrates tools via function-calling (plan → execute → adapt → chain → report)
  • Evidence-gated: no finding without proof, no exception
  • Business-logic native: tests IDOR, BOLA, BFLA, race conditions, JWT confusion on every scan
  • Zero-day capable: interprocedural taint flow, crypto oracle detection, gadget chain discovery, AI adversarial passes
Philosophy: "Break things. Prove it. Chain it. Report it."
Claude Mythos Preview

The Frontier AI Model

Claude Mythos Preview (codenamed "Capybara") is Anthropic's frontier AI model announced April 7, 2026, as part of Project Glasswing. It is NOT a penetration testing product or platform. It is a raw AI model with world-class vulnerability discovery and exploit development capabilities, available only via cloud API to ~52 approved partner organisations.

Mythos excels at source-code-visible analysis: it found a 27-year-old TCP SACK bug in OpenBSD, a 16-year-old FFmpeg codec vulnerability, and constructed multi-vulnerability browser exploit chains. It has not demonstrated black-box web application pentesting against live targets.

  • Raw AI model: no UI, no installer, no dashboard, no project management
  • Restricted access: ~52 organisations (AWS, Apple, Google, Microsoft, etc.)
  • API-only: $25/M input, $125/M output tokens (5x Opus 4.6 pricing)
  • Excels at: source code analysis, binary reverse engineering, exploit development
  • Not demonstrated: black-box pentesting, business-logic testing, live target assessment
  • No reporting engine, compliance mapping, evidence chain-of-custody, or RBAC
Sources: red.anthropic.com, anthropic.com/glasswing, third-party analysis (April 2026)

Core difference: PhantomYerra is a complete, shipping penetration testing platform with 150+ engines (including an 11-engine zero-day detection suite), UI, reporting, compliance, and deployment options. Claude Mythos Preview is a restricted frontier AI model that excels at source-code vulnerability discovery but has no product packaging, no deployment installer, no team features, and no reporting engine. They serve fundamentally different needs: PhantomYerra replaces a pentest team; Mythos augments a vulnerability researcher's workflow.

Section 2

AI Engine Architecture

Both platforms use AI. But how AI is integrated, and what it controls - determines whether the platform is an automated scanner or an autonomous penetration tester.

AI Capability PhantomYerra Mythos Preview
AI drives engagement planning (target analysis, attack plan, tool selection) Autonomous: AI creates full attack plan from target + scope Given a prompt, autonomously decides investigation approach
AI selects and invokes security engines via function-calling Tool-use API: AI calls 76 engines as functionsUses agentic loop with file/shell access in containers
AI adapts mid-engagement (found vuln, pivot to exploitation) Real-time pivoting without human intervention Autonomously chains vulnerabilities into exploit paths
AI generates context-aware payloads (adapted to target tech stack) Live payload generation per target, WAF-aware World-class: 20-gadget ROP chains, JIT heap sprays, multi-vuln exploits
AI chains findings into multi-step attack paths Attack graph maintained live across 25 surfaces, with cross-scan institutional memory recalling prior findings Chains 2-4 vulnerabilities into privilege escalation paths
Professional reporting engine (PDF, DOCX, SARIF, HTML) Executive + technical narratives, multiple formats Raw bug reports and PoC code only. No reporting engine.
Anti-hallucination framework Six evidence gates: AI prose limited to description/narrative onlyUses ASan as crash oracle. No report-level anti-hallucination.
AI provider fallback chain (commercial, open-source, local) 8-provider chain: Anthropic → OpenAI → Google → Groq → Together → Azure Copilot → Ollama → LM Studio (air-gapped) Mythos API only. No fallback. No local model.
Client data anonymised before AI API calls Reference-token substitution: targets never sent to AI endpoints All code/data sent to Anthropic cloud API
Autonomous operation (confirm scope once, AI completes engagement) Confirm once, AI runs all 150+ engines "Minimal human steering" per Anthropic
Source-code-visible vulnerability discovery SAST engines + AI-enhanced code review World-class: 27yr-old OpenBSD bug, 16yr-old FFmpeg bug
Black-box pentesting against live web targets Full DAST + business-logic testing against live apps Not demonstrated. Independent analysis confirms source-visible only.
PhantomYerra AI Modes

Three Execution Modes

Automated AI Mode: User defines target + scope → AI plans the entire engagement → user confirms once → AI runs all 150+ engines autonomously, adapting in real-time, chaining findings, writing the report.

Semi-Automated: AI proposes each step. Human approves or adjusts. Best for compliance-sensitive environments where audit trail requires human approval at each phase.

Manual: Human drives tool selection and execution. AI provides advisory intelligence, payload suggestions, and narrative writing. Full pentester-in-control mode.

Anti-Hallucination

Why AI Quarantine Matters

AI language models hallucinate. This is not a bug - it's a fundamental property of probabilistic text generation. In a penetration test report, a hallucinated finding is worse than a missed finding: it wastes remediation effort, erodes trust, and can cause compliance failures.

PhantomYerra's six evidence gates ensure that AI-generated prose is limited to description and narrative fields. Severity, CVSS, CVE, exploitation status, and affected components are computed from telemetry: never from AI output. This is the difference between "AI-assisted" and "AI-evidence-gated."

Result: Zero hallucinated findings in production reports
Section 3

The 150+ Engine Arsenal

PhantomYerra ships 150+ purpose-built security engines across 25 attack surfaces: all pure Python, zero external binary dependencies. This includes the industry's first 11-engine Zero-Day Detection Suite built into SAST and mobile surfaces, plus 8 Big-4-grade surface-specific report engines (Web, API, Mobile MASVS-mapped, IoT, Cloud, Network, Firmware, SAST, Reverse Engineering) and 4 Big-4 report types (Compliance, Delta, Retest, Attestation letter) added in v45.1.29. Each engine implements a standardised adapter interface: target in, findings out, evidence attached. Mythos Preview's engine count and architecture are not publicly documented.

Dimension PhantomYerra Mythos Preview
Total security engines150+ engines shipping in v45.1.29Not publicly documented
Zero-day detection engines11 dedicated zero-day engines (SAST + Mobile) World-class zero-day discovery in source code/binaries
Engine implementation100% pure Python: no Go/Rust/C binariesNot publicly documented
Installer size~115 MB (76% reduction from binary era)SaaS: no installer
Antivirus false positivesZero, pure Python is never flaggedSaaS: not applicable
Standardised adapter interface BaseToolAdapter: scan(target, context) → FindingListNot publicly documented
Engine crash isolation Per-engine try/except + watchdog rollback; zero-day engines all non-fatalNot publicly documented
Engines updatable independently Each adapter versioned + hot-swappableNot publicly documented
Attack surfaces covered25 distinct surfacesPrimarily source-code and binary analysis
Big-4-grade Web Application PDF report Page-numbered TOC, executive briefing, ASVS + OWASP Top-10 mapping, attack-chain diagram, per-finding evidence + curl reproduction, per-page appendicesNot a product; no report engine
Surface-specific Big-4 report engines8 engines: Web, API, Mobile (MASVS-mapped), IoT, Cloud, Network, Firmware, SAST, Reverse EngineeringNot applicable
SECURA 0-100 scoring with tier bands Elite ≥90 / Strong ≥75 / Moderate ≥55 / Weak ≥35 / Critical <35No scoring system
Cross-scan institutional memory "You've seen this before" signals across every engagement; regressions flagged instantlyPer-session; no persistent memory across engagements
Public REST API with scoped Bearer tokens CI/CD, ticketing, SIEM/SOAR, custom dashboards; rate-limited, admin-issuedRaw model API only; no findings/report/project REST surface
AI agent guardrails (observable tool calls) Execution Monitor + Reflector: every tool call scope-gated, loop-protected, loggedRaw agentic loop; no operator-visible guardrails
Multi-agent orchestration (Planner / Executor / Reviewer) Opt-in split with hallucination + duplicate review before reportSingle-agent loop
New in v45.1.13

Zero-Day Detection Suite

PhantomYerra v45.1.13 introduces an 11-engine zero-day detection suite embedded into SAST and mobile surfaces. These engines find vulnerability classes that have no CVE — business-logic flaws, race conditions, cryptographic design errors, and novel deserialization chains — and prove them with working PoC evidence.

SAST Zero-Day Engine 1

Interprocedural Taint Flow

Builds a cross-file call graph and traces untrusted input from 20 source patterns to 25 dangerous sinks across multiple function call boundaries. Catches injection chains invisible to single-file scanners.

CWE: 89, 78, 79, 22, 94, 502, 601
SAST Zero-Day Engine 2

Race Condition & TOCTOU Detector

Detects TOCTOU patterns (os.path.exists → open/rename), broken double-checked locking, mutex misuse, and predictable temp file races using AST-level analysis. Generates concurrent PoC scripts.

CWE: 362, 367, 833, 820, 377
SAST Zero-Day Engine 3

Crypto Oracle Detector

Finds padding oracles (CBC + distinguishable exception paths), timing oracles (non-constant-time HMAC), ECB mode detection, GCM nonce reuse, weak KDF, and PKCS1v15 RSA across 5 languages.

Languages: Python, Java, JS, Ruby, PHP
SAST Zero-Day Engine 4

Auth Chain Analyzer

Detects JWT alg:none (CVSS 9.8), RS256→HS256 algorithm confusion, session fixation, IDOR without ownership checks, and MFA bypass via client-controlled session state.

CWE: 287, 384, 639, 345
SAST Zero-Day Engine 5

Deserialization Gadget Finder

AST-level detection of unsafe deserialization across Python (pickle/yaml/dill/__reduce__), Java (ObjectInputStream/XStream/Kryo), Ruby, PHP, and .NET. Generates ysoserial/phpggc gadget chain PoC automatically.

CVSS 9.8 for user-controlled deserialization input
SAST Zero-Day Engine 6

Supply Chain Analyzer

Pure-Python typosquatting detection (Levenshtein distance ≤ 2 against 50+ popular packages), known malicious package list (event-stream, ua-parser-js, coa, rc, colors...), postinstall script analysis, and 6 manifest parsers.

CWE: 1104, 1357
SAST Zero-Day Engine 7

AI Adversarial Zero-Day Engine

5 AI adversarial passes per codebase: business_logic, parser_differential, trust_boundary, state_machine, type_confusion. Routes through multi-provider AI chain. Gracefully degrades without AI key.

Finds novel 0-days invisible to pattern matching
Mobile Zero-Day Engine 1

DEX Bytecode Analyzer

Smali + Java file analysis for dynamic class loading (DexClassLoader/Runtime.exec), SSL bypass (onReceivedSslError.proceed), AES/ECB, obfuscated Base64→exec. DEX string table via struct parsing fallback.

CWE: 295, 470, 327, 925
Mobile Zero-Day Engine 2

Intent Fuzzer

Static: parses AndroidManifest.xml for exported activity/service/receiver/provider components. Dynamic: ADB fuzzing with string/integer/path-traversal payloads. ContentProvider SQLi probe. Live device optional.

CWE: 926, 89, 22, 20
Mobile Zero-Day Engine 3

WebView Bridge Analyzer

Detects addJavascriptInterface on API<17 (CVSS 9.8), @JavascriptInterface with file/exec access, setAllowUniversalAccessFromFileURLs sandbox escape (CVSS 8.8), and loadUrl from Intent extras.

CWE: 749, 346, 73, 601
Mobile Zero-Day Engine 4

IPC Violation Detector

Binder/AIDL missing permission checks, ContentProvider SQLi via rawQuery, path traversal in openFile(), mutable PendingIntent (no FLAG_IMMUTABLE), PreferenceActivity fragment injection. Deepest Android IPC coverage available.

CWE: 862, 89, 22, 284, 926, 927
Zero-Day Capability PhantomYerra v45.1.29 Mythos Preview
Interprocedural taint tracking (cross-file) BFS propagation across full source tree AI-native: context window holds entire codebase
Race condition / TOCTOU detection AST-level pattern matching + PoC generatorNot publicly documented
Cryptographic design flaw detection Padding oracle, timing oracle, nonce reuse, ECB — 5 languagesNot publicly documented
JWT / auth confusion attacks alg:none, RS→HS confusion, IDOR, MFA bypassNot publicly documented
Deserialization gadget chain discovery 5 languages, ysoserial/phpggc PoC generation Likely: demonstrated Java/C++ memory exploitation
Supply chain / typosquatting Levenshtein + malicious package DB + postinstall analysisNot publicly documented
AI adversarial scanning passes 5 AI passes: business_logic, parser_differential, trust_boundary, state_machine, type_confusion Core capability: 27-year-old OpenBSD bug, 16-year-old FFmpeg bug
Android DEX-level bytecode analysis Smali + struct parsing fallbackNot publicly documented
Android IPC violation detection Binder/AIDL/ContentProvider/PendingIntentNot publicly documented
WebView bridge exploitation addJavascriptInterface, sandbox escape, Intent extrasNot publicly documented
PoC generation for discovered zero-days Generated for all 11 engine finding types World-class: 20-gadget ROP chains, JIT heap sprays
Available without cloud API Engines 1-6 fully offline; Engine 7 degrades gracefully Cloud API required. No offline operation.
Black-box target (live web/app) Full DAST + business-logic engines against live targets Source-code/binary access required. No live-target black-box.
Commercially purchasable today Per-seat license, available now ~52 Glasswing partners only; not commercially available

Zero-Day Verdict: Claude Mythos Preview is genuinely world-class at finding novel vulnerabilities in source code and binaries — its discovery of a 27-year-old TCP SACK bug in OpenBSD and a 16-year-old FFmpeg vulnerability demonstrates capabilities beyond any automated tool. However, Mythos requires source code or binary access and a cloud API connection. PhantomYerra's zero-day suite brings seven SAST-level zero-day engines and four mobile zero-day engines to any organisation's workflow — offline, on-premise, no cloud dependency — covering race conditions, crypto oracles, deserialization gadget chains, and AI adversarial passes. Different tools. Different access models. Both finding bugs that CVE databases miss.

Surface 1 of 16
🌐

Web Application Security

14 engines · OWASP Top 10 2021 + API Top 10 2023 + Business Logic

PhantomYerra Methodology

PhantomYerra's web testing is not a single scanner - it's 14 specialised engines working in concert, each responsible for a specific attack class. The AI orchestrator decides which engines to invoke based on the target's technology fingerprint.

  • Reconnaissance: Technology fingerprinting, endpoint discovery, parameter mining, JavaScript analysis, API schema detection
  • Injection testing: SQL injection (error, blind, time-based, UNION, second-order), XSS (reflected, stored, DOM), command injection, SSTI, LDAP injection, XPath injection, header injection
  • Authentication: Brute-force resistance, credential stuffing patterns, session fixation, session prediction, cookie security flags, password policy analysis
  • Authorization: IDOR (horizontal + vertical), BOLA, BFLA, privilege escalation via role parameter tampering, JWT algorithm confusion (alg:none, HS/RS swap, kid traversal), OAuth flow abuse
  • Business logic: Race conditions (TOCTOU via concurrent requests), workflow skip, state-machine bypass, mass-assignment (role/is_admin/balance injection), prototype pollution
  • Client-side: DOM clobbering, postMessage abuse, CORS misconfiguration, CSP bypass, clickjacking, open redirects chained to token theft
  • Infrastructure: Directory traversal, file inclusion (LFI/RFI), SSRF (including cloud metadata IMDS exploitation), XXE, insecure deserialization
  • Cryptography: Weak TLS, certificate pinning, insecure random generation, hardcoded secrets in JavaScript

Payload generation: AI generates payloads adapted to the target's WAF, technology stack, and detected encoding. If a payload is blocked, the AI generates WAF-bypass variants (encoding rotation, case mutation, comment injection, Unicode normalisation) and retries automatically.

Evidence: Every finding carries the exact HTTP request, response, payload used, and extraction result. Copy-paste curl PoC command included for every exploitable finding.

Mythos Preview Methodology

Mythos Preview's web application testing methodology uses AI agents to crawl and interact with web applications, identifying vulnerabilities through automated navigation and testing. Their approach appears to focus on common web vulnerabilities including OWASP Top 10 coverage.

The depth of business-logic testing, payload generation sophistication, and WAF bypass capabilities are not extensively documented in their public materials. Evidence architecture and PoC generation approaches are similarly not publicly detailed.

Web CapabilityPhantomYerraMythos Preview
SQL Injection (all types) 6 injection variants + second-order Basic coverage likely
XSS (reflected, stored, DOM) All three + mutation-based bypass Likely covered
SSRF (including IMDS exploitation) Full SSRF + cloud metadata pivotNot publicly documented
IDOR / BOLA / BFLA Native on every authenticated endpointSurface level at best
Race conditions Concurrent request engine
Mass assignmentNot publicly documented
JWT algorithm confusionNot publicly documented
Deserialization (Java, PHP, .NET, Python, Ruby) Multi-language payload libraryNot publicly documented
WAF bypass generation AI-generated encoding/mutation variantsNot publicly documented
GraphQL introspection + injectionVaries
WebSocket security testingNot publicly documented

Verdict: PhantomYerra covers the full web attack surface with 14 specialised engines including deep business-logic testing (IDOR, BOLA, BFLA, race conditions, JWT confusion) that Mythos Preview does not publicly document. The payload generation and WAF bypass capabilities provide real-world exploitation depth beyond signature matching.

Surface 2 of 16
🔒

Network Infrastructure

10 engines · Port scanning, service detection, protocol analysis, vulnerability assessment

PhantomYerra Methodology

  • Discovery: TCP SYN/connect scanning, UDP scanning, service fingerprinting, OS detection, version detection across all 65,535 ports
  • Protocol analysis: SMB enumeration (shares, users, password policy), SNMP community string testing, LDAP anonymous bind, FTP anonymous access, Telnet banner grabbing, RDP NLA testing
  • Vulnerability assessment: Known CVE matching against detected services, default credential testing, SSL/TLS misconfiguration, DNS zone transfer, NTP amplification
  • Exploitation: Automated exploitation of confirmed vulnerabilities with live payload generation, privilege escalation path analysis, lateral movement simulation
  • Wireless integration: If wireless adapter detected, extends to WiFi network discovery and WPA testing

Mythos Preview Methodology

Mythos Preview's network infrastructure testing capabilities are not extensively documented in their public materials. Their platform appears to focus primarily on web application security rather than deep network infrastructure assessment. Internal network testing, protocol-level analysis, and lateral movement simulation capabilities are not publicly described.

Verdict: PhantomYerra provides full network infrastructure testing with 10 engines covering everything from port scanning to protocol analysis to exploitation. Mythos Preview does not publicly document network infrastructure testing capabilities, suggesting web-application focus.

Surface 3 of 16
🔍

Reconnaissance & OSINT

21 engines · Subdomain discovery, DNS intelligence, web crawling, asset mapping, org intelligence

PhantomYerra Methodology

  • Subdomain discovery: Multi-source enumeration (certificate transparency, DNS brute-force, passive databases, search engine dorking, web archive mining), typically discovers 10-50x more subdomains than single-source scanners
  • DNS intelligence: Zone transfer testing, DNS record enumeration (A, AAAA, CNAME, MX, TXT, SRV, NS), DNSSEC validation, subdomain takeover detection, wildcard DNS identification
  • Web crawling: Deep recursive crawling with JavaScript rendering, form discovery, API endpoint extraction, hidden parameter mining, commented-out URL discovery
  • Asset discovery: HTTP probing across all discovered hosts, technology fingerprinting, CDN detection, load balancer identification, WAF detection
  • URL harvesting: Historical URL collection from web archives, search engine results, and passive DNS records: captures endpoints that may have been removed but are still accessible
  • Organisation intelligence: Employee enumeration, email pattern detection, social media profiling, breach data correlation (ethical sources only), document metadata analysis, technology stack inference
  • Certificate intelligence: Certificate transparency log monitoring, certificate chain analysis, SAN extraction for related domains

Mythos Preview Methodology

Mythos Preview's reconnaissance capabilities are not extensively documented. AI-powered platforms may perform some automated discovery as part of their web application testing workflow, but the depth of OSINT, subdomain enumeration, and organisation intelligence is not publicly detailed.

Verdict: PhantomYerra's 21 reconnaissance engines represent the most full OSINT and discovery capability in any automated platform. The combination of multi-source subdomain discovery, DNS intelligence, web crawling, URL harvesting, and organisation intelligence creates a complete picture of the target's attack surface before testing begins. This recon depth is a force multiplier for all subsequent attack phases.

Surface 4 of 16
📄

Static Application Security Testing (SAST)

10 engines · Multi-language analysis + 7 zero-day engines + AI adversarial scanning

PhantomYerra Methodology

  • Languages: Python, JavaScript/TypeScript, Java, Go, C/C++, Ruby, PHP, C#/.NET, Swift, Kotlin, Rust, Scala, COBOL
  • Standard analysis: Pattern-based vulnerability detection, taint analysis (source-to-sink tracing), control-flow analysis, configuration analysis, hardcoded secrets, ReDoS, prototype pollution, buffer overflows
  • Zero-Day Engine 1 — Interprocedural Taint: Cross-file call graph construction, BFS propagation from 20 sources to 25 sinks across multiple function boundaries — catches chains invisible to single-file scanners
  • Zero-Day Engine 2 — Race Condition & TOCTOU: AST-level detection of TOCTOU (os.path.exists→open/rename), broken DCL, mutex misuse, predictable temp files; generates concurrent PoC
  • Zero-Day Engine 3 — Crypto Oracle: Padding oracle (CBC + distinguishable exception), timing oracle (non-constant-time HMAC), ECB mode, GCM nonce reuse, weak KDF, PKCS1v15 RSA — 5 languages
  • Zero-Day Engine 4 — Auth Chain: JWT alg:none (CVSS 9.8), RS256→HS256 confusion, session fixation, IDOR without ownership check, MFA bypass via client session
  • Zero-Day Engine 5 — Deserialization Gadgets: Python pickle/yaml/dill/__reduce__, Java ObjectInputStream/XStream/Kryo, Ruby, PHP, .NET — auto-generates ysoserial/phpggc chain PoC
  • Zero-Day Engine 6 — Supply Chain: Typosquatting (Levenshtein ≤ 2, 50+ popular packages), known malicious packages, postinstall script curl/wget/bash detection, 6 manifest parsers
  • Zero-Day Engine 7 — AI Adversarial: 5 AI passes (business_logic, parser_differential, trust_boundary, state_machine, type_confusion); gracefully degrades without AI key
  • AI enhancement: AI reviews flagged code in context, generates remediation code in target's language/framework, produces SARIF output for IDE integration

Mythos Preview Methodology

Claude Mythos Preview's strength is precisely in source-code analysis — it found a 27-year-old OpenBSD bug and a 16-year-old FFmpeg vulnerability through source code inspection. However, it requires source code access via cloud API, has no standardised SAST adapter interface, no SARIF output, no compliance mapping, and is restricted to ~52 partner organisations. It is a research capability, not a deployable SAST product.

Verdict: PhantomYerra's SAST surface grew from 3 to 10 engines in v45.1.13 with the addition of 7 zero-day detection engines. This is the deepest automated SAST stack available in any commercial platform. Mythos Preview has world-class source-code analysis capabilities in its AI model, but they are not packaged as a deployable SAST tool with compliance mapping, SARIF output, or offline operation.

Surface 5 of 16
📦

Software Composition Analysis (SCA)

3 engines · Dependency analysis, SBOM generation, known-vulnerability matching, license compliance

PhantomYerra Methodology

  • Dependency parsing: package.json, requirements.txt, Pipfile, Gemfile, pom.xml, go.mod, Cargo.toml, composer.json, .csproj, Package.swift, build.gradle
  • SBOM generation: CycloneDX and SPDX format output for compliance requirements
  • Vulnerability matching: Cross-references dependencies against NVD, OSV, GitHub Advisory Database, and Snyk DB
  • Exploitability analysis: Determines whether vulnerable code paths are actually reachable (not just present in dependency tree)
  • License compliance: Identifies GPL, AGPL, and other copyleft licenses that may create legal obligations
  • AI correlation: Correlates SCA findings with SAST findings, if a vulnerable library function is actually called in the source code, severity is elevated

Mythos Preview Methodology

SCA capabilities for Mythos Preview are not publicly documented. Supply chain security analysis is typically a distinct capability from dynamic application testing, and it is not clear whether Mythos Preview's platform includes this functionality.

Verdict: PhantomYerra's SCA engines analyse the complete software supply chain, from dependency trees to SBOM generation to reachability analysis. This is critical for compliance (PCI DSS 4.0, SOC 2) and for understanding whether a vulnerable dependency is actually exploitable in context.

Surface 6 of 16

Infrastructure as Code (IaC) Security

2 engines · Terraform, CloudFormation, Kubernetes, Docker, Ansible misconfiguration detection

PhantomYerra Methodology

  • Supported formats: Terraform (HCL), AWS CloudFormation (JSON/YAML), Kubernetes manifests, Dockerfiles, Docker Compose, Ansible playbooks, Helm charts
  • Checks: Overly permissive IAM policies, public S3 buckets, unencrypted storage volumes, missing logging/monitoring, insecure network ACLs, privileged containers, host network access, missing resource limits, hardcoded secrets in manifests
  • Remediation: AI generates corrected IaC snippets in the same format - copy-paste ready
  • Drift detection: Compares deployed infrastructure against IaC definitions to detect configuration drift

Mythos Preview Methodology

IaC security scanning capabilities for Mythos Preview are not publicly documented. This is typically a separate capability from web application penetration testing.

Surface 7 of 16

Cloud Security

3 engines · AWS, GCP, Azure misconfiguration, SSRF-to-IMDS, cloud-native attack paths

PhantomYerra Methodology

  • Configuration audit: IAM over-permissions, public storage buckets/blobs, unencrypted databases, exposed management consoles, missing MFA on root/admin accounts, security group misconfigurations
  • SSRF-to-IMDS: Automatically tests discovered SSRF vulnerabilities for cloud metadata endpoint access (169.254.169.254 / IMDSv1/v2), extracts temporary credentials, assesses blast radius
  • Kubernetes: Cluster misconfiguration scanning, RBAC policy analysis, pod security policy violations, exposed dashboards, service account token abuse
  • Serverless: Lambda/Cloud Functions permission analysis, event injection testing, cold start timing attacks
  • Multi-cloud: Unified findings across AWS + GCP + Azure in a single engagement

Mythos Preview Methodology

Mythos Preview may offer some cloud security testing capabilities. The depth of cloud-specific testing, including SSRF-to-IMDS attack paths, Kubernetes security, and multi-cloud coverage, is not publicly documented in their materials.

Verdict: PhantomYerra's cloud engines test the full cloud-native attack surface, from IAM misconfiguration to SSRF-to-IMDS exploitation to Kubernetes RBAC abuse. The SSRF-to-cloud-credential-theft chain is one of the most impactful attack paths in modern infrastructure, and PhantomYerra tests it automatically.

Surface 8 of 16

Dynamic Application Security Testing (DAST)

4 engines · Automated crawl + attack, authenticated scanning, API fuzzing

PhantomYerra Methodology

  • Crawling: Deep recursive crawling with JavaScript rendering, form discovery, multi-step form submission, authentication-aware crawling that maintains session state
  • Attack surface: OWASP Top 10, OWASP API Top 10, plus 25+ business-logic vulnerability categories
  • Authentication: Cookie-based, token-based (Bearer), OAuth 2.0, SAML, custom authentication header support
  • API-first: OpenAPI/Swagger import, GraphQL introspection, gRPC reflection: tests every documented and undocumented endpoint
  • Fuzzing: Parameter fuzzing, header fuzzing, body mutation, boundary value testing, encoding variation
  • Rate limiting: Configurable request rate to avoid overwhelming targets, adaptive throttling based on response times

Mythos Preview Methodology

Mythos Preview's core strength appears to be in dynamic web application testing. Their AI agents likely navigate applications, submit forms, and test for common vulnerabilities. The depth of API testing, authenticated scanning, and business-logic coverage is partially documented but not exhaustively detailed.

Surface 9 of 16
🤖

AI & LLM Security

2 engines · OWASP LLM Top 10, prompt injection, tool misuse, jailbreak, system prompt extraction

PhantomYerra Methodology

  • Prompt injection: Direct injection, indirect injection (via retrieved documents), multi-turn injection, context-window manipulation
  • System prompt extraction: Multiple extraction techniques to reveal hidden system prompts - often containing API keys, internal URLs, and business logic
  • Tool/function misuse: Tests whether LLM tool-use capabilities can be abused to access unauthorised data, execute unintended actions, or bypass access controls
  • Jailbreak: Tests model guardrail bypass using known and novel techniques: DAN-style, role-playing, encoding tricks, multi-language attacks
  • Data exfiltration: Tests whether sensitive training data or user data can be extracted through crafted prompts
  • Denial of service: Resource exhaustion through crafted inputs, infinite loop triggers, context window overflow
  • OWASP LLM Top 10 2025: Full coverage including LLM01 (Prompt Injection), LLM02 (Insecure Output Handling), LLM03 (Training Data Poisoning), LLM04 (Model Denial of Service), LLM05 (Supply Chain), LLM06 (Sensitive Information Disclosure), LLM07 (Insecure Plugin Design), LLM08 (Excessive Agency), LLM09 (Overreliance), LLM10 (Model Theft)

Mythos Preview Methodology

Mythos Preview's AI/LLM security testing capabilities are not publicly documented. This is a rapidly evolving attack surface: testing AI systems with AI requires specialised engines that understand prompt injection, tool-use abuse, and jailbreak techniques. It is unclear whether Mythos Preview offers this as a testing surface.

Verdict: AI/LLM security is the fastest-growing attack surface in enterprise technology. PhantomYerra covers the complete OWASP LLM Top 10 with specialised engines for prompt injection, tool misuse, and jailbreak testing. This capability is absent from most platforms, including - based on public information - Mythos Preview.

Surface 10 of 16

OT / ICS / SCADA Security

2 engines · Industrial protocol testing, safety-aware methodology, Modbus/DNP3/OPC-UA

PhantomYerra Methodology

  • Safety-first policy: OT testing uses read-only reconnaissance and passive analysis by default. Active testing requires explicit safety-mode confirmation with documented rollback procedures.
  • Protocol testing: Modbus TCP/RTU (function code enumeration, coil/register reading, unauthorised write testing), DNP3 (outstation enumeration, unsolicited response injection), OPC-UA (authentication bypass, certificate validation, node browsing)
  • Network segmentation: IT/OT boundary analysis, VLAN hopping potential, firewall rule assessment, jump host security
  • PLC security: Firmware version identification, known vulnerability matching, default credential testing, programming port exposure
  • SCADA: HMI web interface testing, historian database access, control system authentication

Mythos Preview Methodology

Mythos Preview does not appear to offer OT/ICS/SCADA security testing. This is a highly specialised domain requiring protocol-specific engines and safety-aware testing methodology that is fundamentally different from web application testing.

Verdict: OT/ICS/SCADA security is critical for manufacturing, energy, utilities, and critical infrastructure. PhantomYerra provides safety-aware testing with protocol-specific engines. Mythos Preview does not cover this surface.

Surface 11 of 16
📡

IoT Security

2 engines · Device discovery, protocol analysis, firmware extraction, BLE/Zigbee, default credentials

PhantomYerra Methodology

  • Device discovery: Network scanning for IoT devices, UPnP/SSDP enumeration, mDNS/DNS-SD discovery, MQTT broker identification
  • Protocol testing: MQTT (anonymous access, topic enumeration, message injection), CoAP, AMQP, BLE (GATT enumeration, pairing bypass), Zigbee (network sniffing, key extraction)
  • Default credentials: Full database of IoT device default credentials - routers, cameras, printers, smart home, industrial sensors
  • Firmware: Firmware extraction, filesystem analysis, hardcoded credential search, embedded key extraction
  • Cloud backend: Tests the cloud APIs that IoT devices communicate with - authentication, authorisation, data exposure

Mythos Preview Methodology

IoT security testing is not publicly documented as a Mythos Preview capability. IoT testing requires protocol-specific engines (MQTT, BLE, Zigbee, CoAP) and firmware analysis capabilities that are distinct from web application testing.

Surface 12 of 16
📱

Mobile Security

6 engines · iOS + Android static analysis + 4 zero-day engines + dynamic instrumentation

PhantomYerra Methodology

  • Android static: APK decompilation, manifest analysis (exported components, debug flags, backup flags), hardcoded secrets, certificate pinning detection, root detection bypass points
  • iOS static: IPA analysis, plist inspection, entitlement review, ATS configuration, Keychain storage analysis
  • Dynamic: Runtime instrumentation (method hooking, SSL pinning bypass, jailbreak/root detection bypass), traffic interception, API call tracing
  • Zero-Day Engine 1 — DEX Bytecode: Smali + Java analysis for dynamic class loading, SSL bypass, AES/ECB, obfuscated Base64→exec; DEX string table struct parsing
  • Zero-Day Engine 2 — Intent Fuzzer: Static exported-component parsing + ADB dynamic fuzzing (string/integer/path-traversal payloads) + ContentProvider SQLi probe; live device optional
  • Zero-Day Engine 3 — WebView Bridge: addJavascriptInterface API<17 (CVSS 9.8), @JavascriptInterface with file/exec access, setAllowUniversalAccessFromFileURLs (CVSS 8.8), loadUrl from Intent extras
  • Zero-Day Engine 4 — IPC Violations: Binder/AIDL missing permission, ContentProvider SQLi, openFile() path traversal, mutable PendingIntent, PreferenceActivity fragment injection
  • Backend API: Full API testing of mobile backend endpoints with the same 14 web engines — most mobile vulnerabilities are API vulnerabilities
  • OWASP MASTG/MASVS: Full coverage of OWASP Mobile Application Security Testing Guide controls

Mythos Preview Methodology

Mythos Preview may offer limited mobile testing capabilities. As a source-code analysis model, it could theoretically analyse Android/iOS code for vulnerabilities, but dedicated DEX bytecode analysis, ADB dynamic fuzzing, Intent fuzzing against live devices, and IPC violation detection are specialised capabilities not publicly documented for Mythos.

Verdict: PhantomYerra's mobile surface grew from 2 to 6 engines in v45.1.13. The 4 new zero-day engines target the Android-specific attack surface at bytecode level (DEX analysis), IPC layer (Intent fuzzing, ContentProvider SQLi, PendingIntent), and WebView bridge — attack classes that standard scanners miss entirely.

Surface 13 of 16
💾

Firmware Analysis

2 engines · Binary extraction, filesystem analysis, vulnerability detection, key extraction

PhantomYerra Methodology

  • Extraction: Firmware image parsing, filesystem extraction (SquashFS, JFFS2, CramFS, UBI), binary identification
  • Analysis: Hardcoded credential search, certificate/key extraction, vulnerable library detection, configuration file analysis, web server root discovery
  • Vulnerability matching: Cross-references embedded binaries against NVD/CVE databases for known vulnerabilities
  • Emulation: Partial firmware emulation for dynamic testing of extracted web interfaces and services

Mythos Preview Methodology

Firmware analysis is not publicly documented as a Mythos Preview capability. This is a specialised discipline requiring binary analysis tools and filesystem extraction capabilities.

Surface 14 of 16
🔬

Reverse Engineering

2 engines · Binary analysis, disassembly, decompilation, vulnerability pattern detection

PhantomYerra Methodology

  • Binary analysis: PE/ELF/Mach-O parsing, section analysis, import/export table inspection, string extraction
  • Disassembly: x86/x64/ARM instruction disassembly, control flow graph generation, function identification
  • Vulnerability patterns: Buffer overflow detection (stack/heap), format string vulnerability identification, integer overflow conditions, use-after-free patterns, race condition indicators
  • AI-assisted: AI analyses decompiled code for security-relevant patterns, identifies crypto implementations, traces data flow through binary functions

Mythos Preview Methodology

Reverse engineering capabilities are not publicly documented for Mythos Preview. Binary analysis requires specialised tooling fundamentally different from web application testing.

Surface 15 of 16
📡

Wireless Security

2 engines · WiFi, Bluetooth, RF analysis, rogue AP detection

PhantomYerra Methodology

  • WiFi: WPA2/WPA3 handshake capture and analysis, PMKID extraction, evil twin AP detection, deauthentication resilience testing, WPS PIN brute-force, enterprise RADIUS testing
  • Bluetooth: BLE device enumeration, GATT service/characteristic discovery, pairing vulnerability testing, BLE relay attack simulation
  • Rogue AP: Detection of unauthorised access points, SSID spoofing detection, karma attack detection
  • RF analysis: Sub-GHz signal analysis for IoT/smart home devices, replay attack testing, rolling code analysis

Mythos Preview Methodology

Wireless security testing requires physical wireless adapters and specialised protocol knowledge. As a cloud-based platform, Mythos Preview does not appear to offer wireless security testing: this requires local hardware access that SaaS platforms cannot provide.

Surface 16 of 16
🔐

Password Auditing

2 engines · Hash cracking, credential testing, policy analysis, breach correlation

PhantomYerra Methodology

  • Hash analysis: Identification of hash types (MD5, SHA-1, SHA-256, bcrypt, scrypt, NTLM, NTLMv2, Kerberos), assessment of hashing strength
  • Dictionary attacks: Multi-strategy dictionary attacks with rule-based mutation (leet speak, append numbers, capitalisation patterns, keyboard walks)
  • Pattern analysis: Analysis of password patterns in extracted hashes to identify organisational password culture weaknesses
  • Credential testing: Default credential testing against discovered services, spray attacks with lockout-awareness
  • Breach correlation: Ethical correlation of discovered email addresses against known breach databases (Have I Been Pwned API) to assess credential reuse risk
  • Policy assessment: Evaluation of password policies (length, complexity, rotation, history) against NIST SP 800-63B guidelines

Mythos Preview Methodology

Password auditing capabilities for Mythos Preview are not publicly documented. Web application credential testing (brute-force, default passwords) may be covered as part of web testing, but dedicated hash analysis and password policy assessment are not described in their public materials.

Deep Methodology

Exploitation Methodology

The difference between a scanner and a penetration tester is exploitation. A scanner reports "port 80 open." A penetration tester pulls unauthenticated data, extracts credentials, chains to privilege escalation, and proves impact.

Exploitation StepPhantomYerraMythos Preview
▶ Discovery → Exploitation Transition
Automatic exploitation of confirmed vulnerabilities AI decides when to exploit based on severity + scopeNot publicly documented
Human approval gate before exploitation (configurable) Autonomous or step-by-step modesNot publicly documented
Exploitation attempt limit (configurable retries) Default: 5 attempts with payload variation per vulnerabilityNot publicly documented
▶ Payload Generation
Context-aware payload generation Adapted to target tech stack, WAF, encodingNot publicly documented
WAF bypass variant invention AI generates encoding/mutation variants per blocked payloadNot publicly documented
Multi-language payload library (SQLi, XSS, CMDi, SSTI, deserialization) Database + PHP + MSSQL + Oracle + MongoDB injection variants; JS + Python + Java + PHP + .NET serialisation payloadsNot publicly documented
▶ Exploitation Confirmation
Four-tier exploitation status EXPLOITED → CONFIRMED → SUSPECTABLE → POTENTIALBinary (vuln/not vuln) likely
PoC round-trip before report entry Real request sent, real response captured, success matchedNot publicly documented
Copy-paste curl/nc PoC for every finding Reproduction commands in every report findingNot publicly documented
▶ Post-Exploitation
Privilege escalation path analysis Automatic pivoting from initial access to highest privilegeNot publicly documented
Lateral movement simulation Credential reuse, token theft, session hijacking across discovered servicesNot publicly documented
Data exfiltration proof Demonstrates data access without extracting real PII (count + sample)Not publicly documented
Deep Methodology

Payload Generation & Delivery

PhantomYerra generates payloads dynamically - adapted to the target's technology stack, WAF vendor, encoding requirements, and detected defences. This section compares the payload generation approach in detail.

SQL Injection Payloads

6 Injection Variants

  • Error-based: triggers database error messages revealing schema
  • UNION-based: appends UNION SELECT to extract data from other tables
  • Blind boolean: infers data one bit at a time via true/false responses
  • Blind time-based: uses SLEEP/WAITFOR/pg_sleep to infer data via response timing
  • Stacked queries: injects additional statements (INSERT, UPDATE, DROP)
  • Second-order: payload stored in database, triggers when retrieved by different query

Database-specific payloads for MySQL, PostgreSQL, MSSQL, Oracle, SQLite, MongoDB.

XSS Payloads

Context-Aware Generation

  • HTML context: <script>, <img onerror>, <svg onload>
  • Attribute context: event handlers, javascript: protocol
  • JavaScript context: breaking out of strings, template literals
  • URL context: javascript: protocol, data: URIs
  • CSS context: expression(), url(), import
  • WAF bypass: Unicode normalisation, HTML entity encoding, double encoding, case mutation, comment injection
Command Injection

Multi-OS Payload Library

  • Linux: ; | || && $() ` backtick, /etc/passwd extraction, reverse shell
  • Windows: & | && , %COMSPEC% abuse, PowerShell encoded commands
  • Blind: DNS exfiltration, out-of-band HTTP callbacks, timing-based
  • Filter bypass: variable expansion, IFS manipulation, wildcard abuse
SSTI Payloads

Per-Engine Templates

  • Jinja2: {{config}}, {{''.__class__.__mro__}}
  • Twig: {{_self.env.registerUndefinedFilterCallback}}
  • Freemarker: ${"freemarker.template.utility.Execute"}
  • Velocity: #set($x = $class.inspect("java.lang.Runtime"))
  • Pebble, Thymeleaf, Smarty, Mako - per-engine payloads
SSRF Payloads

Cloud-Aware Exploitation

  • Cloud metadata: http://169.254.169.254/latest/meta-data/ (AWS), http://metadata.google.internal/ (GCP), http://169.254.169.254/metadata/ (Azure)
  • Internal service discovery: port scanning via SSRF, internal API access
  • Protocol smuggling: gopher://, dict://, file:// protocol abuse
  • Bypass techniques: IP obfuscation (decimal, hex, octal), DNS rebinding, redirect chains
Deserialization

Multi-Language Payloads

  • Java: Commons-Collections, Spring, ROME gadget chains
  • PHP: __wakeup/__destruct chain exploitation
  • .NET: TypeConfuseDelegate, ObjectDataProvider, BinaryFormatter
  • Python: pickle RCE via __reduce__
  • Ruby: Marshal.load exploitation, ERB template injection

Payload depth comparison: PhantomYerra maintains a full, multi-language payload library with WAF-bypass variant generation. Each payload is adapted to the target's detected technology stack and defences. Mythos Preview's payload generation capabilities are not publicly documented: it is unclear whether they generate custom payloads or rely on signature-based matching.

Deep Methodology

Zero-Day Discovery Process

Finding vulnerabilities that have no CVE requires a fundamentally different approach than signature matching. This is where AI-driven penetration testing separates from AI-assisted scanning.

PhantomYerra Zero-Day Methodology

  • Step 1 - Anomaly detection: AI monitors response patterns (timing, size, status codes, error messages) for deviations that suggest unexpected behaviour: the first signal of an unknown vulnerability
  • Step 2 - Hypothesis generation: AI formulates hypotheses about what the anomaly might indicate (buffer overflow? race condition? authentication bypass?) based on the response pattern and target technology
  • Step 3 - Targeted fuzzing: AI generates targeted payloads to test each hypothesis: not random fuzzing, but intelligent mutation based on the anomaly's characteristics
  • Step 4 - Exploitation attempt: If fuzzing triggers a clear vulnerability signal, AI attempts controlled exploitation with increasing payload sophistication (up to 5 attempts with different variants)
  • Step 5 - Impact assessment: Once exploitation is confirmed, AI assesses the full impact - data access, privilege level, lateral movement potential, business impact
  • Step 6 - Evidence packaging: Complete evidence chain: discovery trigger → hypothesis → test payloads → exploitation proof → impact assessment. All timestamped with RFC 3161.
  • Step 7 - Responsible disclosure: AI flags potential zero-days for human review before including in client reports, with recommended disclosure timeline

CVE Exploitation Without Public Exploits

When a CVE exists but no public exploit is available, PhantomYerra's approach:

  • Patch analysis: AI analyses the security patch (if available) to understand the vulnerability root cause - diff analysis reveals the exact code path
  • Advisory mining: Extracts technical details from NVD, vendor advisories, and research papers to understand vulnerability mechanics
  • Exploit authoring: AI writes a proof-of-concept exploit based on the vulnerability description, patch analysis, and target's detected version/configuration
  • Validation: Authored exploit is tested against the target in a controlled manner, if successful, finding is promoted to EXPLOITED status with full evidence
  • If unexploitable: After exhausting all payload variants and injection points, the finding is marked SUSPECTABLE with documentation of all attempted approaches

Mythos Preview Zero-Day Capability

Claude Mythos Preview is genuinely world-class at zero-day discovery — this is its stated core capability. Anthropic documents it finding a 27-year-old TCP SACK vulnerability in OpenBSD (missed by decades of automated scanning), a 16-year-old FFmpeg codec bug, and constructing multi-vulnerability browser exploit chains with JIT heap sprays and 20-gadget ROP chains. It uses ASan crash oracles for memory bugs and SHA-3 hash commitments for responsible disclosure proofs. Its zero-day discovery in source-code-visible scenarios is extraordinary.

What Mythos does not demonstrate: black-box zero-day discovery against live web/API targets without source access, offline zero-day analysis, deployable zero-day results at per-seat license economics, or compliance-mapped zero-day findings with chain-of-custody evidence.

Deep Methodology

Attack Chaining & Graph Correlation

Individual vulnerabilities are data points. Attack chains are intelligence. A scanner finds an open port. A penetration tester chains that open port → default credentials → database access → credential extraction → admin panel → full compromise.

PhantomYerra Attack Graph

Real-Time Multi-Surface Correlation

PhantomYerra maintains a live attack graph during every engagement. Every finding is a node. Every exploitation path is an edge. The AI continuously evaluates:

  • Can this finding be chained with others to increase impact?
  • Does this finding enable access to a new surface? (e.g., SSRF → cloud metadata → IAM credentials → S3 access)
  • What is the shortest path from initial access to critical asset compromise?
  • Which chains cross trust boundaries?

The attack graph is included in the final report with visual representation of all chains.

Output: Graph with nodes (findings) + edges (chains) + critical paths highlighted
Mythos Preview

Chaining Capability

Multi-surface attack chaining capabilities for Mythos Preview are not publicly documented. Without deep coverage across 16+ attack surfaces, the opportunity for cross-surface chaining is inherently limited.

Single-surface platforms can identify chains within web applications (e.g., XSS → session theft → account takeover), but cannot identify chains that cross surfaces (e.g., web SSRF → cloud metadata → infrastructure compromise).

Common Attack Chains PhantomYerra Identifies

  • Web → Cloud: SSRF in web application → cloud metadata endpoint → temporary IAM credentials → S3 bucket data exfiltration
  • Recon → Web → Auth: Subdomain discovery → forgotten staging server → default credentials → production database connection strings
  • OSINT → Phishing → Internal: Employee email patterns → credential stuffing against VPN → internal network access → lateral movement
  • IoT → Network → Data: Default credentials on IoT device → network foothold → VLAN traversal → database server access
  • SCA → Web → RCE: Vulnerable dependency identified → exploit for known CVE → remote code execution on web server
  • Mobile → API → Data: Hardcoded API key in mobile app → unrestricted API access → customer data exfiltration
Critical Differentiator

The Six Evidence Gates

This is PhantomYerra's most important architectural decision. In an AI-powered platform, the AI can hallucinate findings, inflate severity, fabricate CVE references, and generate convincing but false evidence. The six evidence gates prevent all of this.

Gate 1

Evidence Gate

Every finding must carry a non-empty evidence dictionary, real HTTP request, real response, real payload. No evidence = finding rejected. Period.

What it prevents: AI hallucinating vulnerabilities that don't exist. The finding must have been observed, not inferred.

Gate 2

PoC Execution Gate

Before any proof-of-concept appears in a report, it must have been executed against the target. Real request sent. Real response captured. Success condition matched against expected output.

What it prevents: AI generating plausible-looking but untested PoC code. Every PoC in a PhantomYerra report is tested code.

Gate 3

CVE Provenance Gate

Every CVE reference cites its authoritative source: NVD, OSV, CVE-5, GitHub Advisory, or Shodan InternetDB. The raw authoritative response is stored alongside the finding.

What it prevents: AI fabricating CVE numbers. LLMs frequently generate plausible-looking but non-existent CVE IDs. This gate rejects any CVE not verified against an authoritative source.

Gate 4

CVSS Provenance Gate

CVSS vectors come from authoritative sources or are formula-derived from documented finding metadata. The formula inputs are cited. The calculation is deterministic and reproducible.

What it prevents: AI inflating CVSS scores. An AI asked to assess severity will tend toward higher scores (more dramatic = more likely to be generated). This gate ensures CVSS is computed, not guessed.

Gate 5

AI Narrative Quarantine

AI-generated prose is confined to description and attack_story fields only. Severity, affected-component, CVSS, CVE, exploitation-status, and remediation priority are computed from telemetry data: never from AI output.

What it prevents: AI influence on factual fields. The AI can write compelling narrative, but cannot change the facts of a finding.

Gate 6

Exploitation-Status Gate

Four precise statuses with evidence requirements for each tier:

  • EXPLOITED: Payload sent AND target returned exploitation signal (data extracted, command executed, privilege changed)
  • CONFIRMED: Proven by observable server behaviour (error message, timing difference, response variation)
  • SUSPECTABLE: Signature matched but no active exploitation proof
  • POTENTIAL: Discovery-only - surface identified but not tested/exploitable

What it prevents: Status inflation. A finding can only be EXPLOITED if exploitation was actually demonstrated with evidence.

Evidence CapabilityPhantomYerraMythos Preview
Mandatory evidence on every finding Gate 1: no exceptionsASan crash oracle validates memory bugs. Logic bug validation "still hard."
PoC round-trip verification Gate 2: tested before report PoC code generated for confirmed vulns. Human triagers validate before disclosure.
CVE provenance verification Gate 3: NVD/OSV sourcedFinds novel zero-days (no CVE yet). Disclosure uses SHA-3 hash commitments.
CVSS formula-derived (not AI-generated) Gate 4: deterministic89% severity accuracy vs human expert (98% within one level)
AI prose quarantined from factual fields Gate 5: strict separation No report-level quarantine. Raw model outputs.
Four-tier exploitation status Gate 6: evidence-backed5-tier crash severity (ASan-based, not report-level)
RFC 3161 timestamping Legal-grade timestamps Not a product feature. No reporting engine.
Chain-of-custody log Who captured, when, hash verification Not a product feature.
Evidence hash signing (SHA-256) Tamper detection on all evidenceSHA-3 hash commitments for responsible disclosure proofs

Verdict: The six evidence gates are PhantomYerra's most significant architectural differentiator. No other platform, including Mythos Preview - has publicly documented an anti-hallucination framework for AI-generated penetration test findings. In an industry moving toward AI-generated reports, this is the difference between trusted results and expensive noise.

Enterprise

Reporting & Deliverables

Report CapabilityPhantomYerraMythos Preview
Output formatsPDF + DOCX + HTML + SARIF + JSON + CSVPDF likely; other formats undocumented
Executive summary (non-technical) AI-written C-suite narrativeLikely available
Technical detail (per-finding) Full evidence, PoC, remediation per findingLikely available
Attack graph visualisation Visual attack chain graph in reportNot publicly documented
Compliance mapping per finding SOC 2, PCI DSS, HIPAA, ISO 27001, NIST mappingNot publicly documented
Remediation code generation AI generates fix code in target's language/frameworkNot publicly documented
SARIF output (IDE integration)Not publicly documented
Custom report templates Fully customisable templatesNot publicly documented
Trend analysis (multi-scan comparison) Historical vulnerability trendingNot publicly documented
Client-branded reports White-label with client logo/brandingNot publicly documented
Critical for Enterprise

Deployment & Privacy

Where your data lives during a penetration test matters. A lot.

Deployment CapabilityPhantomYerraMythos Preview
▶ Architecture
Desktop application (runs on your machine) Full desktop app: your machine, your network Cloud API only. No desktop application.
Commercially available to enterprises Per-seat perpetual license, buy directly Restricted to ~52 Project Glasswing partners (AWS, Apple, Google, etc.)
On-premise deployment Full on-prem: zero cloud dependency Cloud API only. Anthropic API, Bedrock, Vertex AI, or Foundry.
Air-gapped environment support Local AI model fallback, zero external calls Cloud-only. Cannot operate without internet.
User interface / dashboard Full GUI with scan management, findings, reports No UI. Raw API model only.
▶ Data Privacy
Client targets never sent to external AI Reference-token anonymisation before every AI call All source code and data sent to Anthropic cloud for processing
Scan data stays on your machine All data local: SQLite database Data processed on Anthropic infrastructure
Evidence encrypted at rest AES-256-GCM encryptionNot applicable: no local evidence storage
Data residency compliance (GDPR, data sovereignty) Data never leaves your jurisdiction Data traverses Anthropic cloud (US-based)
▶ Platform Support
Windows Native installer (~115 MB) No installer. API access only.
Linux AppImage / DEBRuns in isolated Linux containers (research workflow)
macOSPlanned No native application
Container (Docker/Podman) Research workflow uses isolated containers
CLI mode Full CLI for CI/CD integrationAPI-only. No CLI product.
Privacy Architecture

Reference-Token Anonymisation

Before any data is sent to an external AI endpoint, PhantomYerra's privacy engine replaces all real targets, IPs, URLs, company names, and PII with reference tokens (e.g., [TARGET_URL_1], [COMPANY_REF]). The AI sees only anonymised references. After the AI response, tokens are restored locally. The reference map never leaves the machine.

This means even if the AI provider's logs were compromised, no client target information would be exposed.

Air-Gapped Mode

Zero External Calls

For the most sensitive environments (defence, classified, critical infrastructure), PhantomYerra can run in fully air-gapped mode. All AI processing uses local models running on the same machine. Zero network calls. Zero cloud dependency. The full 87-engine arsenal remains available — including 10 of 11 zero-day engines (AI adversarial engine degrades gracefully without a provider).

Mythos Preview, as a cloud-based platform, cannot operate in air-gapped environments: a fundamental architectural limitation for defence and classified clients.

Enterprise

Enterprise Features

Enterprise FeaturePhantomYerraMythos Preview
▶ Access Control
Multi-user RBAC 5 roles: super_admin, pentest_lead, tester, reviewer, client Likely available
Project-level access scoping Users scoped to assigned projects onlyNot publicly documented
SSO (SAML 2.0, Okta, Azure AD)Enterprise tier likely
Audit log (append-only, tamper-proof) No delete/update everNot publicly documented
▶ Integrations
Jira (create issues from findings)
ServiceNow CMDB syncNot publicly documented
Splunk / Elastic SIEMNot publicly documented
Slack / Teams notificationsLikely available
GitHub / GitLab CI integrationNot publicly documented
AWS / Azure / GCP cloud integrationNot publicly documented
PagerDuty / OpsGenie alertingNot publicly documented
Linear / Azure DevOps trackingNot publicly documented
Total integrations15+ wired integrationsNot publicly documented
▶ Licensing
Per-seat licensing Per-seat with offline capabilitySaaS subscription likely
Enterprise perpetual license optionNot publicly documented
Kill switch (remote disable for stolen seats)Account disable likely
Offline grace period Works without internet after validation Requires internet
Compliance

Compliance Framework Coverage

FrameworkPhantomYerraMythos Preview
OWASP Top 10 2021 Full mapping Not a product feature. Raw AI model.
OWASP API Top 10 2023 Not a product feature.
OWASP LLM Top 10 2025
OWASP MASVS (Mobile)
PCI DSS 4.0 Findings mapped to PCI requirements No compliance mapping. Not a product.
SOC 2 Type II
HIPAA
ISO 27001:2022
NIST CSF 2.0
NIST SP 800-53
GDPR (data handling compliance) On-prem = data never leaves jurisdiction All data processed on Anthropic US cloud.
FedRAMP / FISMA Air-gapped mode enables classified use
CIS Benchmarks
MITRE ATT&CK mapping Findings mapped to ATT&CK techniques
PTES (Penetration Testing Execution Standard)
Commercial

Pricing & Licensing Model

Pricing DimensionPhantomYerraMythos Preview
Pricing modelPer-seat perpetual licenseAPI token consumption: $25/M input, $125/M output (5x Opus pricing)
License tiersSingle Seat / Team / Enterprise No tiers. Restricted to ~52 Glasswing partners.
Cloud infrastructure cost$0: runs on your hardware$10K-$20K per scanning campaign (e.g., OpenBSD: ~$20K for ~1,000 runs)
Cost per exploit$0 incremental: included in license$50 per simple exploit, $1K-$2K per complex N-day exploit
Scan volume limitsUnlimited: no per-scan chargesUsage-based. API tokens consumed per run.
AI usage limitsYour own AI key: you control costs$100M total Glasswing credit pool (shared among all partners)
Perpetual license option Buy once, own forever No licensing model. Partner-access only.
Offline operation after purchase Works without internet Cloud API only.
Commercially purchasable today Available now. Download and install. "We do not plan to make Mythos Preview generally available" - Anthropic

Cost analysis: PhantomYerra is a product you can buy today. Per-seat perpetual license. No per-scan charges, no usage-based pricing, no cloud infrastructure costs. Your AI key, your compute, your data. Claude Mythos Preview is not commercially available: it is restricted to ~52 partner organisations, costs $25/M input tokens ($20K per campaign), and Anthropic has stated they do not plan to make it generally available. Even for partners with access, the per-campaign cost makes routine security testing expensive. PhantomYerra runs unlimited scans after a one-time license purchase.

Conclusion

The Final Verdict

After comparing every surface, every methodology, every capability: the conclusion is clear.

Engine Count

150+ Engines

PhantomYerra ships 150+ engines including 7 SAST zero-day engines and 4 mobile zero-day engines added in v45.1.13. Full multi-surface coverage from web to IoT to OT/ICS to AI/LLM.

Product vs Model

Platform vs Raw API

PhantomYerra ships as a complete platform with UI, reporting, compliance, RBAC. Mythos is a raw AI model with no UI, no reporting engine, no installer. Powerful capability, zero product packaging.

Zero-Day Detection

11 Engines vs Cloud-Only

PhantomYerra has 11 dedicated zero-day engines operating offline. Mythos has genuinely world-class zero-day capability in source-code analysis — but requires cloud API and source access. Different models, different access.

Availability

Buy Today vs Restricted

PhantomYerra: download and install now. Mythos: restricted to ~52 organisations. Anthropic: "We do not plan to make Mythos Preview generally available."

Cost Per Scan

$0 vs $10K-$20K

PhantomYerra: unlimited scans after one-time license. Mythos: $20K per OpenBSD campaign, $50-$2K per individual exploit. Per-token API at $25/$125M.

AI Provider Chain

8 Providers vs 1

PhantomYerra routes through 8 AI providers: Anthropic → OpenAI → Google → Groq → Together → Azure Copilot → Ollama → LM Studio. Air-gapped local model fallback. Mythos: Anthropic cloud API only, no fallback, no offline.

The Bottom Line

PhantomYerra and Claude Mythos Preview are built for fundamentally different purposes. PhantomYerra is a complete, shipping penetration testing platform: 150+ engines including an 11-engine zero-day detection suite, 25 attack surfaces, 6 evidence gates, 8-provider AI chain, 8 Big-4-grade surface-specific report engines, SECURA 0-100 scoring, cross-scan institutional memory, a public REST API, AI-agent guardrails, optional multi-agent mode, on-premise deployment, air-gapped capability, per-seat licensing. You buy it, install it, and run unlimited assessments against live targets — black-box, without source code access.

Claude Mythos Preview is a world-class vulnerability research model that excels at finding zero-days in source code and constructing multi-vulnerability exploit chains. It found a 27-year-old OpenBSD bug and a 16-year-old FFmpeg vulnerability that millions of automated test runs missed. Its source-code analysis and binary reverse engineering capabilities are genuinely extraordinary. But it is not a penetration testing product. It has no UI, no reporting engine, no compliance mapping, no team features, no RBAC, requires source code access, and is restricted to ~52 partner organisations at $25/M token pricing.

For organisations that need a complete security assessment platform they can deploy today, run against live targets, produce compliant reports, and now detect novel zero-day classes without source code access: PhantomYerra is the only option. For vulnerability researchers at partner organisations who need an AI model to find zero-days in source code and construct advanced exploit chains: Mythos is extraordinary at that specific task.

Integrity Verification Seal

SHA-256: 2e992a83e049d03012aa1118b9bf42548cb7409913c1feb5d0c1e044f3f75fbc
Signed: 2026-04-19
Verify: phantomyerra.com/SIGNATURES.json
Every update refreshes the hash, timestamp, and signature. This is a real cryptographic seal, not a decoration.