PhantomYerra vs
Claude Mythos Preview
The definitive comparison between a shipping penetration testing platform and a restricted frontier AI model. PhantomYerra is a complete, deployable, 150+ engine pentest platform with 8 surface-specific Big-4-grade report engines, SECURA 0-100 scoring, cross-scan institutional memory, a public REST API, and AI-agent guardrails. Claude Mythos Preview is a raw AI model with world-class vulnerability discovery capabilities, restricted to 52 partner organisations and available only via cloud API. Different categories. Same goal: finding and exploiting vulnerabilities. Here is how they compare.
All capability claims validated against PhantomYerra v45.1.29 source code. SHA-256 signed and published to SIGNATURES.json. Every update refreshes the hash, timestamp, and signature.
Design Philosophy
Before comparing features, understand the fundamental design difference. These two platforms were built with opposing philosophies.
The AI Penetration Tester
PhantomYerra was designed as an autonomous penetration tester: not a scanner. The AI doesn't just find vulnerabilities; it exploits them, chains them into attack paths, generates context-aware payloads, adapts to WAF defences in real-time, and writes professional narrative reports with evidence.
Every finding must pass six evidence gates before reaching a report. The AI is treated as a senior red teamer: it plans the engagement, selects tools, executes attacks, pivots on discoveries, and writes the report, with the human confirming scope once and reviewing results.
- Desktop-first: runs on your machine, your network, your rules
- 150+ pure-Python security engines including 11 zero-day detection engines
- AI orchestrates tools via function-calling (plan → execute → adapt → chain → report)
- Evidence-gated: no finding without proof, no exception
- Business-logic native: tests IDOR, BOLA, BFLA, race conditions, JWT confusion on every scan
- Zero-day capable: interprocedural taint flow, crypto oracle detection, gadget chain discovery, AI adversarial passes
The Frontier AI Model
Claude Mythos Preview (codenamed "Capybara") is Anthropic's frontier AI model announced April 7, 2026, as part of Project Glasswing. It is NOT a penetration testing product or platform. It is a raw AI model with world-class vulnerability discovery and exploit development capabilities, available only via cloud API to ~52 approved partner organisations.
Mythos excels at source-code-visible analysis: it found a 27-year-old TCP SACK bug in OpenBSD, a 16-year-old FFmpeg codec vulnerability, and constructed multi-vulnerability browser exploit chains. It has not demonstrated black-box web application pentesting against live targets.
- Raw AI model: no UI, no installer, no dashboard, no project management
- Restricted access: ~52 organisations (AWS, Apple, Google, Microsoft, etc.)
- API-only: $25/M input, $125/M output tokens (5x Opus 4.6 pricing)
- Excels at: source code analysis, binary reverse engineering, exploit development
- Not demonstrated: black-box pentesting, business-logic testing, live target assessment
- No reporting engine, compliance mapping, evidence chain-of-custody, or RBAC
Core difference: PhantomYerra is a complete, shipping penetration testing platform with 150+ engines (including an 11-engine zero-day detection suite), UI, reporting, compliance, and deployment options. Claude Mythos Preview is a restricted frontier AI model that excels at source-code vulnerability discovery but has no product packaging, no deployment installer, no team features, and no reporting engine. They serve fundamentally different needs: PhantomYerra replaces a pentest team; Mythos augments a vulnerability researcher's workflow.
AI Engine Architecture
Both platforms use AI. But how AI is integrated, and what it controls - determines whether the platform is an automated scanner or an autonomous penetration tester.
| AI Capability | PhantomYerra | Mythos Preview |
|---|---|---|
| AI drives engagement planning (target analysis, attack plan, tool selection) | ✓ Autonomous: AI creates full attack plan from target + scope | ✓ Given a prompt, autonomously decides investigation approach |
| AI selects and invokes security engines via function-calling | ✓ Tool-use API: AI calls 76 engines as functions | Uses agentic loop with file/shell access in containers |
| AI adapts mid-engagement (found vuln, pivot to exploitation) | ✓ Real-time pivoting without human intervention | ✓ Autonomously chains vulnerabilities into exploit paths |
| AI generates context-aware payloads (adapted to target tech stack) | ✓ Live payload generation per target, WAF-aware | ✓ World-class: 20-gadget ROP chains, JIT heap sprays, multi-vuln exploits |
| AI chains findings into multi-step attack paths | ✓ Attack graph maintained live across 25 surfaces, with cross-scan institutional memory recalling prior findings | ✓ Chains 2-4 vulnerabilities into privilege escalation paths |
| Professional reporting engine (PDF, DOCX, SARIF, HTML) | ✓ Executive + technical narratives, multiple formats | ✗ Raw bug reports and PoC code only. No reporting engine. |
| Anti-hallucination framework | ✓ Six evidence gates: AI prose limited to description/narrative only | Uses ASan as crash oracle. No report-level anti-hallucination. |
| AI provider fallback chain (commercial, open-source, local) | ✓ 8-provider chain: Anthropic → OpenAI → Google → Groq → Together → Azure Copilot → Ollama → LM Studio (air-gapped) | ✗ Mythos API only. No fallback. No local model. |
| Client data anonymised before AI API calls | ✓ Reference-token substitution: targets never sent to AI endpoints | ✗ All code/data sent to Anthropic cloud API |
| Autonomous operation (confirm scope once, AI completes engagement) | ✓ Confirm once, AI runs all 150+ engines | ✓ "Minimal human steering" per Anthropic |
| Source-code-visible vulnerability discovery | ✓ SAST engines + AI-enhanced code review | ✓ World-class: 27yr-old OpenBSD bug, 16yr-old FFmpeg bug |
| Black-box pentesting against live web targets | ✓ Full DAST + business-logic testing against live apps | ✗ Not demonstrated. Independent analysis confirms source-visible only. |
Three Execution Modes
Automated AI Mode: User defines target + scope → AI plans the entire engagement → user confirms once → AI runs all 150+ engines autonomously, adapting in real-time, chaining findings, writing the report.
Semi-Automated: AI proposes each step. Human approves or adjusts. Best for compliance-sensitive environments where audit trail requires human approval at each phase.
Manual: Human drives tool selection and execution. AI provides advisory intelligence, payload suggestions, and narrative writing. Full pentester-in-control mode.
Why AI Quarantine Matters
AI language models hallucinate. This is not a bug - it's a fundamental property of probabilistic text generation. In a penetration test report, a hallucinated finding is worse than a missed finding: it wastes remediation effort, erodes trust, and can cause compliance failures.
PhantomYerra's six evidence gates ensure that AI-generated prose is limited to description and narrative fields. Severity, CVSS, CVE, exploitation status, and affected components are computed from telemetry: never from AI output. This is the difference between "AI-assisted" and "AI-evidence-gated."
Result: Zero hallucinated findings in production reportsThe 150+ Engine Arsenal
PhantomYerra ships 150+ purpose-built security engines across 25 attack surfaces: all pure Python, zero external binary dependencies. This includes the industry's first 11-engine Zero-Day Detection Suite built into SAST and mobile surfaces, plus 8 Big-4-grade surface-specific report engines (Web, API, Mobile MASVS-mapped, IoT, Cloud, Network, Firmware, SAST, Reverse Engineering) and 4 Big-4 report types (Compliance, Delta, Retest, Attestation letter) added in v45.1.29. Each engine implements a standardised adapter interface: target in, findings out, evidence attached. Mythos Preview's engine count and architecture are not publicly documented.
| Dimension | PhantomYerra | Mythos Preview |
|---|---|---|
| Total security engines | 150+ engines shipping in v45.1.29 | Not publicly documented |
| Zero-day detection engines | 11 dedicated zero-day engines (SAST + Mobile) | ✓ World-class zero-day discovery in source code/binaries |
| Engine implementation | 100% pure Python: no Go/Rust/C binaries | Not publicly documented |
| Installer size | ~115 MB (76% reduction from binary era) | SaaS: no installer |
| Antivirus false positives | Zero, pure Python is never flagged | SaaS: not applicable |
| Standardised adapter interface | ✓ BaseToolAdapter: scan(target, context) → FindingList | Not publicly documented |
| Engine crash isolation | ✓ Per-engine try/except + watchdog rollback; zero-day engines all non-fatal | Not publicly documented |
| Engines updatable independently | ✓ Each adapter versioned + hot-swappable | Not publicly documented |
| Attack surfaces covered | 25 distinct surfaces | Primarily source-code and binary analysis |
| Big-4-grade Web Application PDF report | ✓ Page-numbered TOC, executive briefing, ASVS + OWASP Top-10 mapping, attack-chain diagram, per-finding evidence + curl reproduction, per-page appendices | Not a product; no report engine |
| Surface-specific Big-4 report engines | 8 engines: Web, API, Mobile (MASVS-mapped), IoT, Cloud, Network, Firmware, SAST, Reverse Engineering | Not applicable |
| SECURA 0-100 scoring with tier bands | ✓ Elite ≥90 / Strong ≥75 / Moderate ≥55 / Weak ≥35 / Critical <35 | No scoring system |
| Cross-scan institutional memory | ✓ "You've seen this before" signals across every engagement; regressions flagged instantly | Per-session; no persistent memory across engagements |
| Public REST API with scoped Bearer tokens | ✓ CI/CD, ticketing, SIEM/SOAR, custom dashboards; rate-limited, admin-issued | Raw model API only; no findings/report/project REST surface |
| AI agent guardrails (observable tool calls) | ✓ Execution Monitor + Reflector: every tool call scope-gated, loop-protected, logged | Raw agentic loop; no operator-visible guardrails |
| Multi-agent orchestration (Planner / Executor / Reviewer) | ✓ Opt-in split with hallucination + duplicate review before report | Single-agent loop |
Zero-Day Detection Suite
PhantomYerra v45.1.13 introduces an 11-engine zero-day detection suite embedded into SAST and mobile surfaces. These engines find vulnerability classes that have no CVE — business-logic flaws, race conditions, cryptographic design errors, and novel deserialization chains — and prove them with working PoC evidence.
Interprocedural Taint Flow
Builds a cross-file call graph and traces untrusted input from 20 source patterns to 25 dangerous sinks across multiple function call boundaries. Catches injection chains invisible to single-file scanners.
CWE: 89, 78, 79, 22, 94, 502, 601Race Condition & TOCTOU Detector
Detects TOCTOU patterns (os.path.exists → open/rename), broken double-checked locking, mutex misuse, and predictable temp file races using AST-level analysis. Generates concurrent PoC scripts.
CWE: 362, 367, 833, 820, 377Crypto Oracle Detector
Finds padding oracles (CBC + distinguishable exception paths), timing oracles (non-constant-time HMAC), ECB mode detection, GCM nonce reuse, weak KDF, and PKCS1v15 RSA across 5 languages.
Languages: Python, Java, JS, Ruby, PHPAuth Chain Analyzer
Detects JWT alg:none (CVSS 9.8), RS256→HS256 algorithm confusion, session fixation, IDOR without ownership checks, and MFA bypass via client-controlled session state.
CWE: 287, 384, 639, 345Deserialization Gadget Finder
AST-level detection of unsafe deserialization across Python (pickle/yaml/dill/__reduce__), Java (ObjectInputStream/XStream/Kryo), Ruby, PHP, and .NET. Generates ysoserial/phpggc gadget chain PoC automatically.
CVSS 9.8 for user-controlled deserialization inputSupply Chain Analyzer
Pure-Python typosquatting detection (Levenshtein distance ≤ 2 against 50+ popular packages), known malicious package list (event-stream, ua-parser-js, coa, rc, colors...), postinstall script analysis, and 6 manifest parsers.
CWE: 1104, 1357AI Adversarial Zero-Day Engine
5 AI adversarial passes per codebase: business_logic, parser_differential, trust_boundary, state_machine, type_confusion. Routes through multi-provider AI chain. Gracefully degrades without AI key.
Finds novel 0-days invisible to pattern matchingDEX Bytecode Analyzer
Smali + Java file analysis for dynamic class loading (DexClassLoader/Runtime.exec), SSL bypass (onReceivedSslError.proceed), AES/ECB, obfuscated Base64→exec. DEX string table via struct parsing fallback.
CWE: 295, 470, 327, 925Intent Fuzzer
Static: parses AndroidManifest.xml for exported activity/service/receiver/provider components. Dynamic: ADB fuzzing with string/integer/path-traversal payloads. ContentProvider SQLi probe. Live device optional.
CWE: 926, 89, 22, 20WebView Bridge Analyzer
Detects addJavascriptInterface on API<17 (CVSS 9.8), @JavascriptInterface with file/exec access, setAllowUniversalAccessFromFileURLs sandbox escape (CVSS 8.8), and loadUrl from Intent extras.
CWE: 749, 346, 73, 601IPC Violation Detector
Binder/AIDL missing permission checks, ContentProvider SQLi via rawQuery, path traversal in openFile(), mutable PendingIntent (no FLAG_IMMUTABLE), PreferenceActivity fragment injection. Deepest Android IPC coverage available.
CWE: 862, 89, 22, 284, 926, 927| Zero-Day Capability | PhantomYerra v45.1.29 | Mythos Preview |
|---|---|---|
| Interprocedural taint tracking (cross-file) | ✓ BFS propagation across full source tree | ✓ AI-native: context window holds entire codebase |
| Race condition / TOCTOU detection | ✓ AST-level pattern matching + PoC generator | Not publicly documented |
| Cryptographic design flaw detection | ✓ Padding oracle, timing oracle, nonce reuse, ECB — 5 languages | Not publicly documented |
| JWT / auth confusion attacks | ✓ alg:none, RS→HS confusion, IDOR, MFA bypass | Not publicly documented |
| Deserialization gadget chain discovery | ✓ 5 languages, ysoserial/phpggc PoC generation | ✓ Likely: demonstrated Java/C++ memory exploitation |
| Supply chain / typosquatting | ✓ Levenshtein + malicious package DB + postinstall analysis | Not publicly documented |
| AI adversarial scanning passes | ✓ 5 AI passes: business_logic, parser_differential, trust_boundary, state_machine, type_confusion | ✓ Core capability: 27-year-old OpenBSD bug, 16-year-old FFmpeg bug |
| Android DEX-level bytecode analysis | ✓ Smali + struct parsing fallback | Not publicly documented |
| Android IPC violation detection | ✓ Binder/AIDL/ContentProvider/PendingIntent | Not publicly documented |
| WebView bridge exploitation | ✓ addJavascriptInterface, sandbox escape, Intent extras | Not publicly documented |
| PoC generation for discovered zero-days | ✓ Generated for all 11 engine finding types | ✓ World-class: 20-gadget ROP chains, JIT heap sprays |
| Available without cloud API | ✓ Engines 1-6 fully offline; Engine 7 degrades gracefully | ✗ Cloud API required. No offline operation. |
| Black-box target (live web/app) | ✓ Full DAST + business-logic engines against live targets | ✗ Source-code/binary access required. No live-target black-box. |
| Commercially purchasable today | ✓ Per-seat license, available now | ✗ ~52 Glasswing partners only; not commercially available |
Zero-Day Verdict: Claude Mythos Preview is genuinely world-class at finding novel vulnerabilities in source code and binaries — its discovery of a 27-year-old TCP SACK bug in OpenBSD and a 16-year-old FFmpeg vulnerability demonstrates capabilities beyond any automated tool. However, Mythos requires source code or binary access and a cloud API connection. PhantomYerra's zero-day suite brings seven SAST-level zero-day engines and four mobile zero-day engines to any organisation's workflow — offline, on-premise, no cloud dependency — covering race conditions, crypto oracles, deserialization gadget chains, and AI adversarial passes. Different tools. Different access models. Both finding bugs that CVE databases miss.
PhantomYerra Methodology
PhantomYerra's web testing is not a single scanner - it's 14 specialised engines working in concert, each responsible for a specific attack class. The AI orchestrator decides which engines to invoke based on the target's technology fingerprint.
- Reconnaissance: Technology fingerprinting, endpoint discovery, parameter mining, JavaScript analysis, API schema detection
- Injection testing: SQL injection (error, blind, time-based, UNION, second-order), XSS (reflected, stored, DOM), command injection, SSTI, LDAP injection, XPath injection, header injection
- Authentication: Brute-force resistance, credential stuffing patterns, session fixation, session prediction, cookie security flags, password policy analysis
- Authorization: IDOR (horizontal + vertical), BOLA, BFLA, privilege escalation via role parameter tampering, JWT algorithm confusion (alg:none, HS/RS swap, kid traversal), OAuth flow abuse
- Business logic: Race conditions (TOCTOU via concurrent requests), workflow skip, state-machine bypass, mass-assignment (role/is_admin/balance injection), prototype pollution
- Client-side: DOM clobbering, postMessage abuse, CORS misconfiguration, CSP bypass, clickjacking, open redirects chained to token theft
- Infrastructure: Directory traversal, file inclusion (LFI/RFI), SSRF (including cloud metadata IMDS exploitation), XXE, insecure deserialization
- Cryptography: Weak TLS, certificate pinning, insecure random generation, hardcoded secrets in JavaScript
Payload generation: AI generates payloads adapted to the target's WAF, technology stack, and detected encoding. If a payload is blocked, the AI generates WAF-bypass variants (encoding rotation, case mutation, comment injection, Unicode normalisation) and retries automatically.
Evidence: Every finding carries the exact HTTP request, response, payload used, and extraction result. Copy-paste curl PoC command included for every exploitable finding.
Mythos Preview Methodology
Mythos Preview's web application testing methodology uses AI agents to crawl and interact with web applications, identifying vulnerabilities through automated navigation and testing. Their approach appears to focus on common web vulnerabilities including OWASP Top 10 coverage.
The depth of business-logic testing, payload generation sophistication, and WAF bypass capabilities are not extensively documented in their public materials. Evidence architecture and PoC generation approaches are similarly not publicly detailed.
| Web Capability | PhantomYerra | Mythos Preview |
|---|---|---|
| SQL Injection (all types) | ✓ 6 injection variants + second-order | ✓ Basic coverage likely |
| XSS (reflected, stored, DOM) | ✓ All three + mutation-based bypass | ✓ Likely covered |
| SSRF (including IMDS exploitation) | ✓ Full SSRF + cloud metadata pivot | Not publicly documented |
| IDOR / BOLA / BFLA | ✓ Native on every authenticated endpoint | Surface level at best |
| Race conditions | ✓ Concurrent request engine | ✗ |
| Mass assignment | ✓ | Not publicly documented |
| JWT algorithm confusion | ✓ | Not publicly documented |
| Deserialization (Java, PHP, .NET, Python, Ruby) | ✓ Multi-language payload library | Not publicly documented |
| WAF bypass generation | ✓ AI-generated encoding/mutation variants | Not publicly documented |
| GraphQL introspection + injection | ✓ | Varies |
| WebSocket security testing | ✓ | Not publicly documented |
Verdict: PhantomYerra covers the full web attack surface with 14 specialised engines including deep business-logic testing (IDOR, BOLA, BFLA, race conditions, JWT confusion) that Mythos Preview does not publicly document. The payload generation and WAF bypass capabilities provide real-world exploitation depth beyond signature matching.
PhantomYerra Methodology
- Discovery: TCP SYN/connect scanning, UDP scanning, service fingerprinting, OS detection, version detection across all 65,535 ports
- Protocol analysis: SMB enumeration (shares, users, password policy), SNMP community string testing, LDAP anonymous bind, FTP anonymous access, Telnet banner grabbing, RDP NLA testing
- Vulnerability assessment: Known CVE matching against detected services, default credential testing, SSL/TLS misconfiguration, DNS zone transfer, NTP amplification
- Exploitation: Automated exploitation of confirmed vulnerabilities with live payload generation, privilege escalation path analysis, lateral movement simulation
- Wireless integration: If wireless adapter detected, extends to WiFi network discovery and WPA testing
Mythos Preview Methodology
Mythos Preview's network infrastructure testing capabilities are not extensively documented in their public materials. Their platform appears to focus primarily on web application security rather than deep network infrastructure assessment. Internal network testing, protocol-level analysis, and lateral movement simulation capabilities are not publicly described.
Verdict: PhantomYerra provides full network infrastructure testing with 10 engines covering everything from port scanning to protocol analysis to exploitation. Mythos Preview does not publicly document network infrastructure testing capabilities, suggesting web-application focus.
PhantomYerra Methodology
- Subdomain discovery: Multi-source enumeration (certificate transparency, DNS brute-force, passive databases, search engine dorking, web archive mining), typically discovers 10-50x more subdomains than single-source scanners
- DNS intelligence: Zone transfer testing, DNS record enumeration (A, AAAA, CNAME, MX, TXT, SRV, NS), DNSSEC validation, subdomain takeover detection, wildcard DNS identification
- Web crawling: Deep recursive crawling with JavaScript rendering, form discovery, API endpoint extraction, hidden parameter mining, commented-out URL discovery
- Asset discovery: HTTP probing across all discovered hosts, technology fingerprinting, CDN detection, load balancer identification, WAF detection
- URL harvesting: Historical URL collection from web archives, search engine results, and passive DNS records: captures endpoints that may have been removed but are still accessible
- Organisation intelligence: Employee enumeration, email pattern detection, social media profiling, breach data correlation (ethical sources only), document metadata analysis, technology stack inference
- Certificate intelligence: Certificate transparency log monitoring, certificate chain analysis, SAN extraction for related domains
Mythos Preview Methodology
Mythos Preview's reconnaissance capabilities are not extensively documented. AI-powered platforms may perform some automated discovery as part of their web application testing workflow, but the depth of OSINT, subdomain enumeration, and organisation intelligence is not publicly detailed.
Verdict: PhantomYerra's 21 reconnaissance engines represent the most full OSINT and discovery capability in any automated platform. The combination of multi-source subdomain discovery, DNS intelligence, web crawling, URL harvesting, and organisation intelligence creates a complete picture of the target's attack surface before testing begins. This recon depth is a force multiplier for all subsequent attack phases.
PhantomYerra Methodology
- Languages: Python, JavaScript/TypeScript, Java, Go, C/C++, Ruby, PHP, C#/.NET, Swift, Kotlin, Rust, Scala, COBOL
- Standard analysis: Pattern-based vulnerability detection, taint analysis (source-to-sink tracing), control-flow analysis, configuration analysis, hardcoded secrets, ReDoS, prototype pollution, buffer overflows
- Zero-Day Engine 1 — Interprocedural Taint: Cross-file call graph construction, BFS propagation from 20 sources to 25 sinks across multiple function boundaries — catches chains invisible to single-file scanners
- Zero-Day Engine 2 — Race Condition & TOCTOU: AST-level detection of TOCTOU (os.path.exists→open/rename), broken DCL, mutex misuse, predictable temp files; generates concurrent PoC
- Zero-Day Engine 3 — Crypto Oracle: Padding oracle (CBC + distinguishable exception), timing oracle (non-constant-time HMAC), ECB mode, GCM nonce reuse, weak KDF, PKCS1v15 RSA — 5 languages
- Zero-Day Engine 4 — Auth Chain: JWT alg:none (CVSS 9.8), RS256→HS256 confusion, session fixation, IDOR without ownership check, MFA bypass via client session
- Zero-Day Engine 5 — Deserialization Gadgets: Python pickle/yaml/dill/__reduce__, Java ObjectInputStream/XStream/Kryo, Ruby, PHP, .NET — auto-generates ysoserial/phpggc chain PoC
- Zero-Day Engine 6 — Supply Chain: Typosquatting (Levenshtein ≤ 2, 50+ popular packages), known malicious packages, postinstall script curl/wget/bash detection, 6 manifest parsers
- Zero-Day Engine 7 — AI Adversarial: 5 AI passes (business_logic, parser_differential, trust_boundary, state_machine, type_confusion); gracefully degrades without AI key
- AI enhancement: AI reviews flagged code in context, generates remediation code in target's language/framework, produces SARIF output for IDE integration
Mythos Preview Methodology
Claude Mythos Preview's strength is precisely in source-code analysis — it found a 27-year-old OpenBSD bug and a 16-year-old FFmpeg vulnerability through source code inspection. However, it requires source code access via cloud API, has no standardised SAST adapter interface, no SARIF output, no compliance mapping, and is restricted to ~52 partner organisations. It is a research capability, not a deployable SAST product.
Verdict: PhantomYerra's SAST surface grew from 3 to 10 engines in v45.1.13 with the addition of 7 zero-day detection engines. This is the deepest automated SAST stack available in any commercial platform. Mythos Preview has world-class source-code analysis capabilities in its AI model, but they are not packaged as a deployable SAST tool with compliance mapping, SARIF output, or offline operation.
PhantomYerra Methodology
- Dependency parsing: package.json, requirements.txt, Pipfile, Gemfile, pom.xml, go.mod, Cargo.toml, composer.json, .csproj, Package.swift, build.gradle
- SBOM generation: CycloneDX and SPDX format output for compliance requirements
- Vulnerability matching: Cross-references dependencies against NVD, OSV, GitHub Advisory Database, and Snyk DB
- Exploitability analysis: Determines whether vulnerable code paths are actually reachable (not just present in dependency tree)
- License compliance: Identifies GPL, AGPL, and other copyleft licenses that may create legal obligations
- AI correlation: Correlates SCA findings with SAST findings, if a vulnerable library function is actually called in the source code, severity is elevated
Mythos Preview Methodology
SCA capabilities for Mythos Preview are not publicly documented. Supply chain security analysis is typically a distinct capability from dynamic application testing, and it is not clear whether Mythos Preview's platform includes this functionality.
Verdict: PhantomYerra's SCA engines analyse the complete software supply chain, from dependency trees to SBOM generation to reachability analysis. This is critical for compliance (PCI DSS 4.0, SOC 2) and for understanding whether a vulnerable dependency is actually exploitable in context.
PhantomYerra Methodology
- Supported formats: Terraform (HCL), AWS CloudFormation (JSON/YAML), Kubernetes manifests, Dockerfiles, Docker Compose, Ansible playbooks, Helm charts
- Checks: Overly permissive IAM policies, public S3 buckets, unencrypted storage volumes, missing logging/monitoring, insecure network ACLs, privileged containers, host network access, missing resource limits, hardcoded secrets in manifests
- Remediation: AI generates corrected IaC snippets in the same format - copy-paste ready
- Drift detection: Compares deployed infrastructure against IaC definitions to detect configuration drift
Mythos Preview Methodology
IaC security scanning capabilities for Mythos Preview are not publicly documented. This is typically a separate capability from web application penetration testing.
PhantomYerra Methodology
- Configuration audit: IAM over-permissions, public storage buckets/blobs, unencrypted databases, exposed management consoles, missing MFA on root/admin accounts, security group misconfigurations
- SSRF-to-IMDS: Automatically tests discovered SSRF vulnerabilities for cloud metadata endpoint access (169.254.169.254 / IMDSv1/v2), extracts temporary credentials, assesses blast radius
- Kubernetes: Cluster misconfiguration scanning, RBAC policy analysis, pod security policy violations, exposed dashboards, service account token abuse
- Serverless: Lambda/Cloud Functions permission analysis, event injection testing, cold start timing attacks
- Multi-cloud: Unified findings across AWS + GCP + Azure in a single engagement
Mythos Preview Methodology
Mythos Preview may offer some cloud security testing capabilities. The depth of cloud-specific testing, including SSRF-to-IMDS attack paths, Kubernetes security, and multi-cloud coverage, is not publicly documented in their materials.
Verdict: PhantomYerra's cloud engines test the full cloud-native attack surface, from IAM misconfiguration to SSRF-to-IMDS exploitation to Kubernetes RBAC abuse. The SSRF-to-cloud-credential-theft chain is one of the most impactful attack paths in modern infrastructure, and PhantomYerra tests it automatically.
PhantomYerra Methodology
- Crawling: Deep recursive crawling with JavaScript rendering, form discovery, multi-step form submission, authentication-aware crawling that maintains session state
- Attack surface: OWASP Top 10, OWASP API Top 10, plus 25+ business-logic vulnerability categories
- Authentication: Cookie-based, token-based (Bearer), OAuth 2.0, SAML, custom authentication header support
- API-first: OpenAPI/Swagger import, GraphQL introspection, gRPC reflection: tests every documented and undocumented endpoint
- Fuzzing: Parameter fuzzing, header fuzzing, body mutation, boundary value testing, encoding variation
- Rate limiting: Configurable request rate to avoid overwhelming targets, adaptive throttling based on response times
Mythos Preview Methodology
Mythos Preview's core strength appears to be in dynamic web application testing. Their AI agents likely navigate applications, submit forms, and test for common vulnerabilities. The depth of API testing, authenticated scanning, and business-logic coverage is partially documented but not exhaustively detailed.
PhantomYerra Methodology
- Prompt injection: Direct injection, indirect injection (via retrieved documents), multi-turn injection, context-window manipulation
- System prompt extraction: Multiple extraction techniques to reveal hidden system prompts - often containing API keys, internal URLs, and business logic
- Tool/function misuse: Tests whether LLM tool-use capabilities can be abused to access unauthorised data, execute unintended actions, or bypass access controls
- Jailbreak: Tests model guardrail bypass using known and novel techniques: DAN-style, role-playing, encoding tricks, multi-language attacks
- Data exfiltration: Tests whether sensitive training data or user data can be extracted through crafted prompts
- Denial of service: Resource exhaustion through crafted inputs, infinite loop triggers, context window overflow
- OWASP LLM Top 10 2025: Full coverage including LLM01 (Prompt Injection), LLM02 (Insecure Output Handling), LLM03 (Training Data Poisoning), LLM04 (Model Denial of Service), LLM05 (Supply Chain), LLM06 (Sensitive Information Disclosure), LLM07 (Insecure Plugin Design), LLM08 (Excessive Agency), LLM09 (Overreliance), LLM10 (Model Theft)
Mythos Preview Methodology
Mythos Preview's AI/LLM security testing capabilities are not publicly documented. This is a rapidly evolving attack surface: testing AI systems with AI requires specialised engines that understand prompt injection, tool-use abuse, and jailbreak techniques. It is unclear whether Mythos Preview offers this as a testing surface.
Verdict: AI/LLM security is the fastest-growing attack surface in enterprise technology. PhantomYerra covers the complete OWASP LLM Top 10 with specialised engines for prompt injection, tool misuse, and jailbreak testing. This capability is absent from most platforms, including - based on public information - Mythos Preview.
PhantomYerra Methodology
- Safety-first policy: OT testing uses read-only reconnaissance and passive analysis by default. Active testing requires explicit safety-mode confirmation with documented rollback procedures.
- Protocol testing: Modbus TCP/RTU (function code enumeration, coil/register reading, unauthorised write testing), DNP3 (outstation enumeration, unsolicited response injection), OPC-UA (authentication bypass, certificate validation, node browsing)
- Network segmentation: IT/OT boundary analysis, VLAN hopping potential, firewall rule assessment, jump host security
- PLC security: Firmware version identification, known vulnerability matching, default credential testing, programming port exposure
- SCADA: HMI web interface testing, historian database access, control system authentication
Mythos Preview Methodology
Mythos Preview does not appear to offer OT/ICS/SCADA security testing. This is a highly specialised domain requiring protocol-specific engines and safety-aware testing methodology that is fundamentally different from web application testing.
Verdict: OT/ICS/SCADA security is critical for manufacturing, energy, utilities, and critical infrastructure. PhantomYerra provides safety-aware testing with protocol-specific engines. Mythos Preview does not cover this surface.
PhantomYerra Methodology
- Device discovery: Network scanning for IoT devices, UPnP/SSDP enumeration, mDNS/DNS-SD discovery, MQTT broker identification
- Protocol testing: MQTT (anonymous access, topic enumeration, message injection), CoAP, AMQP, BLE (GATT enumeration, pairing bypass), Zigbee (network sniffing, key extraction)
- Default credentials: Full database of IoT device default credentials - routers, cameras, printers, smart home, industrial sensors
- Firmware: Firmware extraction, filesystem analysis, hardcoded credential search, embedded key extraction
- Cloud backend: Tests the cloud APIs that IoT devices communicate with - authentication, authorisation, data exposure
Mythos Preview Methodology
IoT security testing is not publicly documented as a Mythos Preview capability. IoT testing requires protocol-specific engines (MQTT, BLE, Zigbee, CoAP) and firmware analysis capabilities that are distinct from web application testing.
PhantomYerra Methodology
- Android static: APK decompilation, manifest analysis (exported components, debug flags, backup flags), hardcoded secrets, certificate pinning detection, root detection bypass points
- iOS static: IPA analysis, plist inspection, entitlement review, ATS configuration, Keychain storage analysis
- Dynamic: Runtime instrumentation (method hooking, SSL pinning bypass, jailbreak/root detection bypass), traffic interception, API call tracing
- Zero-Day Engine 1 — DEX Bytecode: Smali + Java analysis for dynamic class loading, SSL bypass, AES/ECB, obfuscated Base64→exec; DEX string table struct parsing
- Zero-Day Engine 2 — Intent Fuzzer: Static exported-component parsing + ADB dynamic fuzzing (string/integer/path-traversal payloads) + ContentProvider SQLi probe; live device optional
- Zero-Day Engine 3 — WebView Bridge: addJavascriptInterface API<17 (CVSS 9.8), @JavascriptInterface with file/exec access, setAllowUniversalAccessFromFileURLs (CVSS 8.8), loadUrl from Intent extras
- Zero-Day Engine 4 — IPC Violations: Binder/AIDL missing permission, ContentProvider SQLi, openFile() path traversal, mutable PendingIntent, PreferenceActivity fragment injection
- Backend API: Full API testing of mobile backend endpoints with the same 14 web engines — most mobile vulnerabilities are API vulnerabilities
- OWASP MASTG/MASVS: Full coverage of OWASP Mobile Application Security Testing Guide controls
Mythos Preview Methodology
Mythos Preview may offer limited mobile testing capabilities. As a source-code analysis model, it could theoretically analyse Android/iOS code for vulnerabilities, but dedicated DEX bytecode analysis, ADB dynamic fuzzing, Intent fuzzing against live devices, and IPC violation detection are specialised capabilities not publicly documented for Mythos.
Verdict: PhantomYerra's mobile surface grew from 2 to 6 engines in v45.1.13. The 4 new zero-day engines target the Android-specific attack surface at bytecode level (DEX analysis), IPC layer (Intent fuzzing, ContentProvider SQLi, PendingIntent), and WebView bridge — attack classes that standard scanners miss entirely.
PhantomYerra Methodology
- Extraction: Firmware image parsing, filesystem extraction (SquashFS, JFFS2, CramFS, UBI), binary identification
- Analysis: Hardcoded credential search, certificate/key extraction, vulnerable library detection, configuration file analysis, web server root discovery
- Vulnerability matching: Cross-references embedded binaries against NVD/CVE databases for known vulnerabilities
- Emulation: Partial firmware emulation for dynamic testing of extracted web interfaces and services
Mythos Preview Methodology
Firmware analysis is not publicly documented as a Mythos Preview capability. This is a specialised discipline requiring binary analysis tools and filesystem extraction capabilities.
PhantomYerra Methodology
- Binary analysis: PE/ELF/Mach-O parsing, section analysis, import/export table inspection, string extraction
- Disassembly: x86/x64/ARM instruction disassembly, control flow graph generation, function identification
- Vulnerability patterns: Buffer overflow detection (stack/heap), format string vulnerability identification, integer overflow conditions, use-after-free patterns, race condition indicators
- AI-assisted: AI analyses decompiled code for security-relevant patterns, identifies crypto implementations, traces data flow through binary functions
Mythos Preview Methodology
Reverse engineering capabilities are not publicly documented for Mythos Preview. Binary analysis requires specialised tooling fundamentally different from web application testing.
PhantomYerra Methodology
- WiFi: WPA2/WPA3 handshake capture and analysis, PMKID extraction, evil twin AP detection, deauthentication resilience testing, WPS PIN brute-force, enterprise RADIUS testing
- Bluetooth: BLE device enumeration, GATT service/characteristic discovery, pairing vulnerability testing, BLE relay attack simulation
- Rogue AP: Detection of unauthorised access points, SSID spoofing detection, karma attack detection
- RF analysis: Sub-GHz signal analysis for IoT/smart home devices, replay attack testing, rolling code analysis
Mythos Preview Methodology
Wireless security testing requires physical wireless adapters and specialised protocol knowledge. As a cloud-based platform, Mythos Preview does not appear to offer wireless security testing: this requires local hardware access that SaaS platforms cannot provide.
PhantomYerra Methodology
- Hash analysis: Identification of hash types (MD5, SHA-1, SHA-256, bcrypt, scrypt, NTLM, NTLMv2, Kerberos), assessment of hashing strength
- Dictionary attacks: Multi-strategy dictionary attacks with rule-based mutation (leet speak, append numbers, capitalisation patterns, keyboard walks)
- Pattern analysis: Analysis of password patterns in extracted hashes to identify organisational password culture weaknesses
- Credential testing: Default credential testing against discovered services, spray attacks with lockout-awareness
- Breach correlation: Ethical correlation of discovered email addresses against known breach databases (Have I Been Pwned API) to assess credential reuse risk
- Policy assessment: Evaluation of password policies (length, complexity, rotation, history) against NIST SP 800-63B guidelines
Mythos Preview Methodology
Password auditing capabilities for Mythos Preview are not publicly documented. Web application credential testing (brute-force, default passwords) may be covered as part of web testing, but dedicated hash analysis and password policy assessment are not described in their public materials.
Exploitation Methodology
The difference between a scanner and a penetration tester is exploitation. A scanner reports "port 80 open." A penetration tester pulls unauthenticated data, extracts credentials, chains to privilege escalation, and proves impact.
| Exploitation Step | PhantomYerra | Mythos Preview |
|---|---|---|
| ▶ Discovery → Exploitation Transition | ||
| Automatic exploitation of confirmed vulnerabilities | ✓ AI decides when to exploit based on severity + scope | Not publicly documented |
| Human approval gate before exploitation (configurable) | ✓ Autonomous or step-by-step modes | Not publicly documented |
| Exploitation attempt limit (configurable retries) | ✓ Default: 5 attempts with payload variation per vulnerability | Not publicly documented |
| ▶ Payload Generation | ||
| Context-aware payload generation | ✓ Adapted to target tech stack, WAF, encoding | Not publicly documented |
| WAF bypass variant invention | ✓ AI generates encoding/mutation variants per blocked payload | Not publicly documented |
| Multi-language payload library (SQLi, XSS, CMDi, SSTI, deserialization) | ✓ Database + PHP + MSSQL + Oracle + MongoDB injection variants; JS + Python + Java + PHP + .NET serialisation payloads | Not publicly documented |
| ▶ Exploitation Confirmation | ||
| Four-tier exploitation status | ✓ EXPLOITED → CONFIRMED → SUSPECTABLE → POTENTIAL | Binary (vuln/not vuln) likely |
| PoC round-trip before report entry | ✓ Real request sent, real response captured, success matched | Not publicly documented |
| Copy-paste curl/nc PoC for every finding | ✓ Reproduction commands in every report finding | Not publicly documented |
| ▶ Post-Exploitation | ||
| Privilege escalation path analysis | ✓ Automatic pivoting from initial access to highest privilege | Not publicly documented |
| Lateral movement simulation | ✓ Credential reuse, token theft, session hijacking across discovered services | Not publicly documented |
| Data exfiltration proof | ✓ Demonstrates data access without extracting real PII (count + sample) | Not publicly documented |
Payload Generation & Delivery
PhantomYerra generates payloads dynamically - adapted to the target's technology stack, WAF vendor, encoding requirements, and detected defences. This section compares the payload generation approach in detail.
6 Injection Variants
- Error-based: triggers database error messages revealing schema
- UNION-based: appends UNION SELECT to extract data from other tables
- Blind boolean: infers data one bit at a time via true/false responses
- Blind time-based: uses SLEEP/WAITFOR/pg_sleep to infer data via response timing
- Stacked queries: injects additional statements (INSERT, UPDATE, DROP)
- Second-order: payload stored in database, triggers when retrieved by different query
Database-specific payloads for MySQL, PostgreSQL, MSSQL, Oracle, SQLite, MongoDB.
Context-Aware Generation
- HTML context: <script>, <img onerror>, <svg onload>
- Attribute context: event handlers, javascript: protocol
- JavaScript context: breaking out of strings, template literals
- URL context: javascript: protocol, data: URIs
- CSS context: expression(), url(), import
- WAF bypass: Unicode normalisation, HTML entity encoding, double encoding, case mutation, comment injection
Multi-OS Payload Library
- Linux: ; | || && $() ` backtick, /etc/passwd extraction, reverse shell
- Windows: & | && , %COMSPEC% abuse, PowerShell encoded commands
- Blind: DNS exfiltration, out-of-band HTTP callbacks, timing-based
- Filter bypass: variable expansion, IFS manipulation, wildcard abuse
Per-Engine Templates
- Jinja2: {{config}}, {{''.__class__.__mro__}}
- Twig: {{_self.env.registerUndefinedFilterCallback}}
- Freemarker: ${"freemarker.template.utility.Execute"}
- Velocity: #set($x = $class.inspect("java.lang.Runtime"))
- Pebble, Thymeleaf, Smarty, Mako - per-engine payloads
Cloud-Aware Exploitation
- Cloud metadata: http://169.254.169.254/latest/meta-data/ (AWS), http://metadata.google.internal/ (GCP), http://169.254.169.254/metadata/ (Azure)
- Internal service discovery: port scanning via SSRF, internal API access
- Protocol smuggling: gopher://, dict://, file:// protocol abuse
- Bypass techniques: IP obfuscation (decimal, hex, octal), DNS rebinding, redirect chains
Multi-Language Payloads
- Java: Commons-Collections, Spring, ROME gadget chains
- PHP: __wakeup/__destruct chain exploitation
- .NET: TypeConfuseDelegate, ObjectDataProvider, BinaryFormatter
- Python: pickle RCE via __reduce__
- Ruby: Marshal.load exploitation, ERB template injection
Payload depth comparison: PhantomYerra maintains a full, multi-language payload library with WAF-bypass variant generation. Each payload is adapted to the target's detected technology stack and defences. Mythos Preview's payload generation capabilities are not publicly documented: it is unclear whether they generate custom payloads or rely on signature-based matching.
Zero-Day Discovery Process
Finding vulnerabilities that have no CVE requires a fundamentally different approach than signature matching. This is where AI-driven penetration testing separates from AI-assisted scanning.
PhantomYerra Zero-Day Methodology
- Step 1 - Anomaly detection: AI monitors response patterns (timing, size, status codes, error messages) for deviations that suggest unexpected behaviour: the first signal of an unknown vulnerability
- Step 2 - Hypothesis generation: AI formulates hypotheses about what the anomaly might indicate (buffer overflow? race condition? authentication bypass?) based on the response pattern and target technology
- Step 3 - Targeted fuzzing: AI generates targeted payloads to test each hypothesis: not random fuzzing, but intelligent mutation based on the anomaly's characteristics
- Step 4 - Exploitation attempt: If fuzzing triggers a clear vulnerability signal, AI attempts controlled exploitation with increasing payload sophistication (up to 5 attempts with different variants)
- Step 5 - Impact assessment: Once exploitation is confirmed, AI assesses the full impact - data access, privilege level, lateral movement potential, business impact
- Step 6 - Evidence packaging: Complete evidence chain: discovery trigger → hypothesis → test payloads → exploitation proof → impact assessment. All timestamped with RFC 3161.
- Step 7 - Responsible disclosure: AI flags potential zero-days for human review before including in client reports, with recommended disclosure timeline
CVE Exploitation Without Public Exploits
When a CVE exists but no public exploit is available, PhantomYerra's approach:
- Patch analysis: AI analyses the security patch (if available) to understand the vulnerability root cause - diff analysis reveals the exact code path
- Advisory mining: Extracts technical details from NVD, vendor advisories, and research papers to understand vulnerability mechanics
- Exploit authoring: AI writes a proof-of-concept exploit based on the vulnerability description, patch analysis, and target's detected version/configuration
- Validation: Authored exploit is tested against the target in a controlled manner, if successful, finding is promoted to EXPLOITED status with full evidence
- If unexploitable: After exhausting all payload variants and injection points, the finding is marked SUSPECTABLE with documentation of all attempted approaches
Mythos Preview Zero-Day Capability
Claude Mythos Preview is genuinely world-class at zero-day discovery — this is its stated core capability. Anthropic documents it finding a 27-year-old TCP SACK vulnerability in OpenBSD (missed by decades of automated scanning), a 16-year-old FFmpeg codec bug, and constructing multi-vulnerability browser exploit chains with JIT heap sprays and 20-gadget ROP chains. It uses ASan crash oracles for memory bugs and SHA-3 hash commitments for responsible disclosure proofs. Its zero-day discovery in source-code-visible scenarios is extraordinary.
What Mythos does not demonstrate: black-box zero-day discovery against live web/API targets without source access, offline zero-day analysis, deployable zero-day results at per-seat license economics, or compliance-mapped zero-day findings with chain-of-custody evidence.
Attack Chaining & Graph Correlation
Individual vulnerabilities are data points. Attack chains are intelligence. A scanner finds an open port. A penetration tester chains that open port → default credentials → database access → credential extraction → admin panel → full compromise.
Real-Time Multi-Surface Correlation
PhantomYerra maintains a live attack graph during every engagement. Every finding is a node. Every exploitation path is an edge. The AI continuously evaluates:
- Can this finding be chained with others to increase impact?
- Does this finding enable access to a new surface? (e.g., SSRF → cloud metadata → IAM credentials → S3 access)
- What is the shortest path from initial access to critical asset compromise?
- Which chains cross trust boundaries?
The attack graph is included in the final report with visual representation of all chains.
Output: Graph with nodes (findings) + edges (chains) + critical paths highlightedChaining Capability
Multi-surface attack chaining capabilities for Mythos Preview are not publicly documented. Without deep coverage across 16+ attack surfaces, the opportunity for cross-surface chaining is inherently limited.
Single-surface platforms can identify chains within web applications (e.g., XSS → session theft → account takeover), but cannot identify chains that cross surfaces (e.g., web SSRF → cloud metadata → infrastructure compromise).
Common Attack Chains PhantomYerra Identifies
- Web → Cloud: SSRF in web application → cloud metadata endpoint → temporary IAM credentials → S3 bucket data exfiltration
- Recon → Web → Auth: Subdomain discovery → forgotten staging server → default credentials → production database connection strings
- OSINT → Phishing → Internal: Employee email patterns → credential stuffing against VPN → internal network access → lateral movement
- IoT → Network → Data: Default credentials on IoT device → network foothold → VLAN traversal → database server access
- SCA → Web → RCE: Vulnerable dependency identified → exploit for known CVE → remote code execution on web server
- Mobile → API → Data: Hardcoded API key in mobile app → unrestricted API access → customer data exfiltration
The Six Evidence Gates
This is PhantomYerra's most important architectural decision. In an AI-powered platform, the AI can hallucinate findings, inflate severity, fabricate CVE references, and generate convincing but false evidence. The six evidence gates prevent all of this.
Evidence Gate
Every finding must carry a non-empty evidence dictionary, real HTTP request, real response, real payload. No evidence = finding rejected. Period.
What it prevents: AI hallucinating vulnerabilities that don't exist. The finding must have been observed, not inferred.
PoC Execution Gate
Before any proof-of-concept appears in a report, it must have been executed against the target. Real request sent. Real response captured. Success condition matched against expected output.
What it prevents: AI generating plausible-looking but untested PoC code. Every PoC in a PhantomYerra report is tested code.
CVE Provenance Gate
Every CVE reference cites its authoritative source: NVD, OSV, CVE-5, GitHub Advisory, or Shodan InternetDB. The raw authoritative response is stored alongside the finding.
What it prevents: AI fabricating CVE numbers. LLMs frequently generate plausible-looking but non-existent CVE IDs. This gate rejects any CVE not verified against an authoritative source.
CVSS Provenance Gate
CVSS vectors come from authoritative sources or are formula-derived from documented finding metadata. The formula inputs are cited. The calculation is deterministic and reproducible.
What it prevents: AI inflating CVSS scores. An AI asked to assess severity will tend toward higher scores (more dramatic = more likely to be generated). This gate ensures CVSS is computed, not guessed.
AI Narrative Quarantine
AI-generated prose is confined to description and attack_story fields only. Severity, affected-component, CVSS, CVE, exploitation-status, and remediation priority are computed from telemetry data: never from AI output.
What it prevents: AI influence on factual fields. The AI can write compelling narrative, but cannot change the facts of a finding.
Exploitation-Status Gate
Four precise statuses with evidence requirements for each tier:
- EXPLOITED: Payload sent AND target returned exploitation signal (data extracted, command executed, privilege changed)
- CONFIRMED: Proven by observable server behaviour (error message, timing difference, response variation)
- SUSPECTABLE: Signature matched but no active exploitation proof
- POTENTIAL: Discovery-only - surface identified but not tested/exploitable
What it prevents: Status inflation. A finding can only be EXPLOITED if exploitation was actually demonstrated with evidence.
| Evidence Capability | PhantomYerra | Mythos Preview |
|---|---|---|
| Mandatory evidence on every finding | ✓ Gate 1: no exceptions | ASan crash oracle validates memory bugs. Logic bug validation "still hard." |
| PoC round-trip verification | ✓ Gate 2: tested before report | ✓ PoC code generated for confirmed vulns. Human triagers validate before disclosure. |
| CVE provenance verification | ✓ Gate 3: NVD/OSV sourced | Finds novel zero-days (no CVE yet). Disclosure uses SHA-3 hash commitments. |
| CVSS formula-derived (not AI-generated) | ✓ Gate 4: deterministic | 89% severity accuracy vs human expert (98% within one level) |
| AI prose quarantined from factual fields | ✓ Gate 5: strict separation | ✗ No report-level quarantine. Raw model outputs. |
| Four-tier exploitation status | ✓ Gate 6: evidence-backed | 5-tier crash severity (ASan-based, not report-level) |
| RFC 3161 timestamping | ✓ Legal-grade timestamps | ✗ Not a product feature. No reporting engine. |
| Chain-of-custody log | ✓ Who captured, when, hash verification | ✗ Not a product feature. |
| Evidence hash signing (SHA-256) | ✓ Tamper detection on all evidence | SHA-3 hash commitments for responsible disclosure proofs |
Verdict: The six evidence gates are PhantomYerra's most significant architectural differentiator. No other platform, including Mythos Preview - has publicly documented an anti-hallucination framework for AI-generated penetration test findings. In an industry moving toward AI-generated reports, this is the difference between trusted results and expensive noise.
Reporting & Deliverables
| Report Capability | PhantomYerra | Mythos Preview |
|---|---|---|
| Output formats | PDF + DOCX + HTML + SARIF + JSON + CSV | PDF likely; other formats undocumented |
| Executive summary (non-technical) | ✓ AI-written C-suite narrative | Likely available |
| Technical detail (per-finding) | ✓ Full evidence, PoC, remediation per finding | Likely available |
| Attack graph visualisation | ✓ Visual attack chain graph in report | Not publicly documented |
| Compliance mapping per finding | ✓ SOC 2, PCI DSS, HIPAA, ISO 27001, NIST mapping | Not publicly documented |
| Remediation code generation | ✓ AI generates fix code in target's language/framework | Not publicly documented |
| SARIF output (IDE integration) | ✓ | Not publicly documented |
| Custom report templates | ✓ Fully customisable templates | Not publicly documented |
| Trend analysis (multi-scan comparison) | ✓ Historical vulnerability trending | Not publicly documented |
| Client-branded reports | ✓ White-label with client logo/branding | Not publicly documented |
Deployment & Privacy
Where your data lives during a penetration test matters. A lot.
| Deployment Capability | PhantomYerra | Mythos Preview |
|---|---|---|
| ▶ Architecture | ||
| Desktop application (runs on your machine) | ✓ Full desktop app: your machine, your network | ✗ Cloud API only. No desktop application. |
| Commercially available to enterprises | ✓ Per-seat perpetual license, buy directly | ✗ Restricted to ~52 Project Glasswing partners (AWS, Apple, Google, etc.) |
| On-premise deployment | ✓ Full on-prem: zero cloud dependency | ✗ Cloud API only. Anthropic API, Bedrock, Vertex AI, or Foundry. |
| Air-gapped environment support | ✓ Local AI model fallback, zero external calls | ✗ Cloud-only. Cannot operate without internet. |
| User interface / dashboard | ✓ Full GUI with scan management, findings, reports | ✗ No UI. Raw API model only. |
| ▶ Data Privacy | ||
| Client targets never sent to external AI | ✓ Reference-token anonymisation before every AI call | ✗ All source code and data sent to Anthropic cloud for processing |
| Scan data stays on your machine | ✓ All data local: SQLite database | ✗ Data processed on Anthropic infrastructure |
| Evidence encrypted at rest | ✓ AES-256-GCM encryption | Not applicable: no local evidence storage |
| Data residency compliance (GDPR, data sovereignty) | ✓ Data never leaves your jurisdiction | ✗ Data traverses Anthropic cloud (US-based) |
| ▶ Platform Support | ||
| Windows | ✓ Native installer (~115 MB) | ✗ No installer. API access only. |
| Linux | ✓ AppImage / DEB | Runs in isolated Linux containers (research workflow) |
| macOS | Planned | ✗ No native application |
| Container (Docker/Podman) | ✓ | ✓ Research workflow uses isolated containers |
| CLI mode | ✓ Full CLI for CI/CD integration | API-only. No CLI product. |
Reference-Token Anonymisation
Before any data is sent to an external AI endpoint, PhantomYerra's privacy engine replaces all real targets, IPs, URLs, company names, and PII with reference tokens (e.g., [TARGET_URL_1], [COMPANY_REF]). The AI sees only anonymised references. After the AI response, tokens are restored locally. The reference map never leaves the machine.
This means even if the AI provider's logs were compromised, no client target information would be exposed.
Zero External Calls
For the most sensitive environments (defence, classified, critical infrastructure), PhantomYerra can run in fully air-gapped mode. All AI processing uses local models running on the same machine. Zero network calls. Zero cloud dependency. The full 87-engine arsenal remains available — including 10 of 11 zero-day engines (AI adversarial engine degrades gracefully without a provider).
Mythos Preview, as a cloud-based platform, cannot operate in air-gapped environments: a fundamental architectural limitation for defence and classified clients.
Enterprise Features
| Enterprise Feature | PhantomYerra | Mythos Preview |
|---|---|---|
| ▶ Access Control | ||
| Multi-user RBAC | ✓ 5 roles: super_admin, pentest_lead, tester, reviewer, client | ✓ Likely available |
| Project-level access scoping | ✓ Users scoped to assigned projects only | Not publicly documented |
| SSO (SAML 2.0, Okta, Azure AD) | ✓ | Enterprise tier likely |
| Audit log (append-only, tamper-proof) | ✓ No delete/update ever | Not publicly documented |
| ▶ Integrations | ||
| Jira (create issues from findings) | ✓ | ✓ |
| ServiceNow CMDB sync | ✓ | Not publicly documented |
| Splunk / Elastic SIEM | ✓ | Not publicly documented |
| Slack / Teams notifications | ✓ | Likely available |
| GitHub / GitLab CI integration | ✓ | Not publicly documented |
| AWS / Azure / GCP cloud integration | ✓ | Not publicly documented |
| PagerDuty / OpsGenie alerting | ✓ | Not publicly documented |
| Linear / Azure DevOps tracking | ✓ | Not publicly documented |
| Total integrations | 15+ wired integrations | Not publicly documented |
| ▶ Licensing | ||
| Per-seat licensing | ✓ Per-seat with offline capability | SaaS subscription likely |
| Enterprise perpetual license option | ✓ | Not publicly documented |
| Kill switch (remote disable for stolen seats) | ✓ | Account disable likely |
| Offline grace period | ✓ Works without internet after validation | ✗ Requires internet |
Compliance Framework Coverage
| Framework | PhantomYerra | Mythos Preview |
|---|---|---|
| OWASP Top 10 2021 | ✓ Full mapping | ✗ Not a product feature. Raw AI model. |
| OWASP API Top 10 2023 | ✓ | ✗ Not a product feature. |
| OWASP LLM Top 10 2025 | ✓ | ✗ |
| OWASP MASVS (Mobile) | ✓ | ✗ |
| PCI DSS 4.0 | ✓ Findings mapped to PCI requirements | ✗ No compliance mapping. Not a product. |
| SOC 2 Type II | ✓ | ✗ |
| HIPAA | ✓ | ✗ |
| ISO 27001:2022 | ✓ | ✗ |
| NIST CSF 2.0 | ✓ | ✗ |
| NIST SP 800-53 | ✓ | ✗ |
| GDPR (data handling compliance) | ✓ On-prem = data never leaves jurisdiction | ✗ All data processed on Anthropic US cloud. |
| FedRAMP / FISMA | ✓ Air-gapped mode enables classified use | ✗ |
| CIS Benchmarks | ✓ | ✗ |
| MITRE ATT&CK mapping | ✓ Findings mapped to ATT&CK techniques | ✗ |
| PTES (Penetration Testing Execution Standard) | ✓ | ✗ |
Pricing & Licensing Model
| Pricing Dimension | PhantomYerra | Mythos Preview |
|---|---|---|
| Pricing model | Per-seat perpetual license | API token consumption: $25/M input, $125/M output (5x Opus pricing) |
| License tiers | Single Seat / Team / Enterprise | ✗ No tiers. Restricted to ~52 Glasswing partners. |
| Cloud infrastructure cost | $0: runs on your hardware | $10K-$20K per scanning campaign (e.g., OpenBSD: ~$20K for ~1,000 runs) |
| Cost per exploit | $0 incremental: included in license | $50 per simple exploit, $1K-$2K per complex N-day exploit |
| Scan volume limits | Unlimited: no per-scan charges | Usage-based. API tokens consumed per run. |
| AI usage limits | Your own AI key: you control costs | $100M total Glasswing credit pool (shared among all partners) |
| Perpetual license option | ✓ Buy once, own forever | ✗ No licensing model. Partner-access only. |
| Offline operation after purchase | ✓ Works without internet | ✗ Cloud API only. |
| Commercially purchasable today | ✓ Available now. Download and install. | ✗ "We do not plan to make Mythos Preview generally available" - Anthropic |
Cost analysis: PhantomYerra is a product you can buy today. Per-seat perpetual license. No per-scan charges, no usage-based pricing, no cloud infrastructure costs. Your AI key, your compute, your data. Claude Mythos Preview is not commercially available: it is restricted to ~52 partner organisations, costs $25/M input tokens ($20K per campaign), and Anthropic has stated they do not plan to make it generally available. Even for partners with access, the per-campaign cost makes routine security testing expensive. PhantomYerra runs unlimited scans after a one-time license purchase.
The Final Verdict
After comparing every surface, every methodology, every capability: the conclusion is clear.
150+ Engines
PhantomYerra ships 150+ engines including 7 SAST zero-day engines and 4 mobile zero-day engines added in v45.1.13. Full multi-surface coverage from web to IoT to OT/ICS to AI/LLM.
Platform vs Raw API
PhantomYerra ships as a complete platform with UI, reporting, compliance, RBAC. Mythos is a raw AI model with no UI, no reporting engine, no installer. Powerful capability, zero product packaging.
11 Engines vs Cloud-Only
PhantomYerra has 11 dedicated zero-day engines operating offline. Mythos has genuinely world-class zero-day capability in source-code analysis — but requires cloud API and source access. Different models, different access.
Buy Today vs Restricted
PhantomYerra: download and install now. Mythos: restricted to ~52 organisations. Anthropic: "We do not plan to make Mythos Preview generally available."
$0 vs $10K-$20K
PhantomYerra: unlimited scans after one-time license. Mythos: $20K per OpenBSD campaign, $50-$2K per individual exploit. Per-token API at $25/$125M.
8 Providers vs 1
PhantomYerra routes through 8 AI providers: Anthropic → OpenAI → Google → Groq → Together → Azure Copilot → Ollama → LM Studio. Air-gapped local model fallback. Mythos: Anthropic cloud API only, no fallback, no offline.
The Bottom Line
PhantomYerra and Claude Mythos Preview are built for fundamentally different purposes. PhantomYerra is a complete, shipping penetration testing platform: 150+ engines including an 11-engine zero-day detection suite, 25 attack surfaces, 6 evidence gates, 8-provider AI chain, 8 Big-4-grade surface-specific report engines, SECURA 0-100 scoring, cross-scan institutional memory, a public REST API, AI-agent guardrails, optional multi-agent mode, on-premise deployment, air-gapped capability, per-seat licensing. You buy it, install it, and run unlimited assessments against live targets — black-box, without source code access.
Claude Mythos Preview is a world-class vulnerability research model that excels at finding zero-days in source code and constructing multi-vulnerability exploit chains. It found a 27-year-old OpenBSD bug and a 16-year-old FFmpeg vulnerability that millions of automated test runs missed. Its source-code analysis and binary reverse engineering capabilities are genuinely extraordinary. But it is not a penetration testing product. It has no UI, no reporting engine, no compliance mapping, no team features, no RBAC, requires source code access, and is restricted to ~52 partner organisations at $25/M token pricing.
For organisations that need a complete security assessment platform they can deploy today, run against live targets, produce compliant reports, and now detect novel zero-day classes without source code access: PhantomYerra is the only option. For vulnerability researchers at partner organisations who need an AI model to find zero-days in source code and construct advanced exploit chains: Mythos is extraordinary at that specific task.
SHA-256: 2e992a83e049d03012aa1118b9bf42548cb7409913c1feb5d0c1e044f3f75fbc
Signed: 2026-04-19
Verify: phantomyerra.com/SIGNATURES.json
Every update refreshes the hash, timestamp, and signature. This is a real cryptographic seal, not a decoration.