Local-Only Mode
PhantomYerra's biggest competitive advantage: 100% local AI inference with zero data transmitted outside your machine. Works in fully air-gapped environments with no internet connection required after initial setup.
Your Data Never Leaves Your Machine
Every competitor, including Shannon, requires an internet connection to an AI API for every scan. Your target URLs, source code snippets, vulnerability descriptions, and findings all travel to an external server for processing. PhantomYerra is the only security tool that gives you a genuine choice.
Shannon (Cloud-Only)
- Requires Anthropic API for every scan
- Target URLs sent to external servers
- Source code analysis leaves your machine
- Findings transmitted to AI for narrative writing
- No air-gapped mode
- No internet = no AI = degraded functionality
- Not suitable for classified/regulated environments
PhantomYerra (Local-First)
- Full AI inference via Ollama: zero API calls
- Target URLs never transmitted
- Source code stays 100% local
- All findings processed on-machine
- True air-gapped mode available
- No internet = full functionality via local models
- Approved for government, healthcare, finance
Three Operating Modes
Cloud Mode (Default)
Uses Anthropic Claude for maximum capability. PrivacyFilter anonymizes all data before transmission - target URLs, IPs, and company names are replaced with tokens. Real values never leave your machine.
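The idea behind PrivacyFilter-style anonymization can be sketched in a few lines. This is an illustrative toy, not PhantomYerra's actual implementation: it replaces URLs and IPv4 addresses with stable placeholder tokens and keeps a local mapping so real values can be restored after the response comes back. URLs are tokenized first so an IP embedded inside a URL is never matched separately.

```python
import re

def anonymize(text, mapping=None):
    """Toy sketch of PrivacyFilter-style tokenization: replace URLs and
    IPv4 addresses with placeholder tokens before transmission.
    (Illustrative only -- not PhantomYerra's actual implementation.)"""
    mapping = {} if mapping is None else mapping
    patterns = [
        ("URL", re.compile(r"https?://[^\s\"']+")),   # URLs first, so embedded IPs are covered
        ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ]

    def token_for(kind, value):
        # Reuse the same token for repeated values so context is preserved
        if value not in mapping:
            mapping[value] = f"<{kind}_{len(mapping) + 1}>"
        return mapping[value]

    for kind, pat in patterns:
        text = pat.sub(lambda m, k=kind: token_for(k, m.group(0)), text)
    return text, mapping

masked, mapping = anonymize("GET https://intranet.example.com/admin from 10.0.0.5")
print(masked)  # -> GET <URL_1> from <IP_2>
```

The local `mapping` dictionary is what allows de-anonymization on your machine: the cloud model only ever sees the tokens.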
Local-Only Mode
All AI inference routes to Ollama running on localhost. Zero data transmitted. Works 100% offline. Choose this for air-gapped networks, classified environments, or maximum confidentiality.
Hybrid Mode
Smart routing: target-sensitive analysis (URLs, source code, finding details) goes to Ollama locally. Non-sensitive tasks (report narrative, remediation advice) go to Claude for maximum quality.
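Hybrid-mode routing can be sketched as a simple classification step. The task names and routing table below are hypothetical, invented for illustration; PhantomYerra's internal routing logic is not exposed. The key design point shown is failing closed: anything not explicitly marked safe for the cloud stays on the local backend.

```python
# Hypothetical task categories -- illustrative only, not PhantomYerra's
# actual internal routing table.
SENSITIVE_TASKS = {"source_code_analysis", "payload_generation", "finding_triage"}
CLOUD_OK_TASKS = {"report_narrative", "remediation_advice"}

def route(task):
    """Hybrid-mode routing sketch: target-sensitive work stays on the
    local Ollama backend; non-sensitive prose generation may use the
    cloud API for maximum quality."""
    if task in SENSITIVE_TASKS:
        return "ollama-local"
    if task in CLOUD_OK_TASKS:
        return "claude-cloud"
    return "ollama-local"  # fail closed: unknown task types stay local

print(route("source_code_analysis"))  # -> ollama-local
print(route("report_narrative"))      # -> claude-cloud
```

Failing closed means a new or misclassified task type can never leak sensitive data by default.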
Air-Gapped Mode
Complete network isolation. All AI is local. License checked locally. No telemetry. No external calls of any kind. Designed for defence, intelligence, and critical infrastructure environments.
How to Enable Local Mode
Step 1: Install Ollama
Windows
```shell
# Download from https://ollama.com/download/windows
# Run the installer - Ollama starts automatically as a system service
# Verify installation:
ollama --version
```
Linux
```shell
curl -fsSL https://ollama.com/install.sh | sh
# Ollama runs as a systemd service on port 11434
ollama --version
```
macOS
```shell
# Download from https://ollama.com/download/mac
# Open the .dmg and drag Ollama to Applications
# Ollama runs in the menu bar - click to start
ollama --version
```
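On any platform, you can also confirm the server is up programmatically. This sketch assumes Ollama's default port 11434; the root endpoint responds with HTTP 200 when the server is running.

```python
import urllib.request
import urllib.error

def ollama_reachable(host="http://localhost:11434", timeout=3):
    """Return True if the local Ollama server responds, False otherwise."""
    try:
        with urllib.request.urlopen(host, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

print("Ollama running:", ollama_reachable())
```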
Step 2: Pull a Model
Choose a model based on your available VRAM. If you have no GPU, Ollama will run on CPU (slower but functional). Contact support@phantomyerra.com for the current recommended model names for your hardware configuration.
```shell
# Minimum (CPU-only, works on any machine):
ollama pull [small-fast-model]      # ~4 GB - fast, general purpose

# Recommended for security work (8-16 GB VRAM):
ollama pull [code-analysis-model]   # best for source code analysis

# Best quality (24-48 GB VRAM or large RAM for CPU):
ollama pull [large-reasoning-model] # complex security reasoning
```
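To confirm a pull succeeded, Ollama exposes a `/api/tags` endpoint listing every locally installed model. The helper below queries it and returns an empty list if the server is unreachable, so it is safe to call in scripts:

```python
import json
import urllib.request
import urllib.error

def installed_models(host="http://localhost:11434"):
    """List models Ollama has pulled locally via its /api/tags endpoint.
    Returns [] when the server is unreachable instead of raising."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=3) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return []

print(installed_models())
```

The same information is available interactively via `ollama list`.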
Step 3: Enable Local Mode in PhantomYerra
Open Settings
Click the gear icon in the bottom-left sidebar or press Ctrl+,
Navigate to AI Configuration
In Settings, select AI Configuration from the left menu.
Select "Local-Only Mode"
Under AI Provider Mode, click Local-Only (Ollama). PhantomYerra will verify Ollama is running before switching.
Verify the Status Indicator
The top bar shows a shield icon with "Local AI" when local mode is active. All subsequent scans use Ollama exclusively.
Recommended Models
Contact support@phantomyerra.com for current recommended Ollama model names. PhantomYerra auto-selects the best available model from what you have installed - larger models provide better security reasoning at the cost of speed and hardware requirements.
| Model Tier | Best For | VRAM Required | Speed |
|---|---|---|---|
| Large (70B class) | Complex security reasoning, attack chain analysis | 48 GB VRAM (or 64 GB RAM for CPU) | Slow - high quality |
| Medium (34B class) | Source code analysis, SAST, exploit generation | 20 GB VRAM (or 32 GB RAM for CPU) | Medium |
| Small (7B class) | Fast general tasks, entry point (any machine) | 6 GB VRAM (or 8 GB RAM for CPU) | Very Fast |
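The auto-selection described above can be sketched as picking the largest tier whose VRAM requirement fits the hardware. The tier names and thresholds below are taken from the table for illustration; PhantomYerra's exact selection logic may differ.

```python
# (tier name, minimum VRAM in GB) -- illustrative thresholds from the
# table above, not PhantomYerra's exact selection logic.
TIERS = [
    ("large-70b", 48),
    ("medium-34b", 20),
    ("small-7b", 6),
]

def pick_tier(vram_gb):
    """Pick the largest model tier that fits the available VRAM; fall
    back to the smallest tier (CPU-capable) when nothing fits."""
    for tier, need in TIERS:
        if vram_gb >= need:
            return tier
    return "small-7b"  # CPU fallback: works on any machine, slower

print(pick_tier(24))  # -> medium-34b
```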
Performance Expectations
| Capability | Local (small model) | Local (large model) | Cloud AI |
|---|---|---|---|
| Payload generation | Good | Excellent | Excellent |
| Code vulnerability analysis | Fair | Excellent | Excellent |
| Attack chain reasoning | Fair | Very Good | Excellent |
| Report narrative writing | Good | Very Good | Excellent |
| Speed (tokens/sec) | ~50 t/s (GPU) | ~8 t/s (GPU) | ~80 t/s (API) |
| Data privacy | 100% local | 100% local | Anonymized via PrivacyFilter |
| Internet required | No | No | Yes |
| Cost per scan | $0 | $0 | ~$0.05–0.50 |
Use Cases
Government and Defence Contractors
Security assessments on classified or FOUO systems cannot transmit data to commercial AI APIs. Local-only mode satisfies ITAR, CMMC, and IL4/IL5 data handling requirements. No data leaves your secure enclave.
Healthcare and HIPAA Environments
Penetration testing of healthcare systems involves PHI and PII in HTTP traffic and scan output. Local-only mode ensures no patient data ever reaches an external server, maintaining HIPAA compliance.
Financial Institutions (SOX, PCI-DSS)
Testing banking applications, trading systems, and payment processors requires strict data sovereignty. Local-only mode keeps all cardholder data, account numbers, and financial records on-premises.
Red Team Engagements - Target Confidentiality
During red team operations, target names, internal network topology, and attack strategies are highly sensitive. Local-only mode ensures the client's identity and vulnerabilities never reach third-party servers.
Frequently Asked Questions
Which model is best for source code analysis?
codellama:34b is specifically trained on code and performs extremely well for vulnerability pattern detection, taint analysis, and exploit generation from source. With local mode, your entire codebase stays on your machine - no source code is ever transmitted.
How do I add or update models on an air-gapped machine?
Temporarily connect the machine to a network, run ollama pull <model>, and disconnect again. The initial model download is the only internet access required.