# Virtual SOC Analyst

The article describes an AI-powered Virtual SOC Analyst tool, hosted at analista.byronlainez.click, that uses Gemma 4 to analyze AWS security logs (such as CloudTrail, WAF, and Nginx) and detect threats like SQL injection in under 30 seconds. It can also identify critical architecture misconfigurations, such as an RDS database exposed in a public subnet, by analyzing uploaded screenshots of AWS architecture diagrams. The system uses a 128K context window to correlate security events across large log files in a single pass, outputting structured threat analysis, MITRE ATT&CK mappings, and deployment-ready AWS WAF block rules.

## What I Built

analista.byronlainez.click is an AI-powered Virtual SOC (Security Operations Center) Analyst that:

- Ingests raw cloud security logs (AWS CloudTrail, WAF, Nginx) with no size limits
- Automatically maps every detected threat to MITRE ATT&CK and OWASP Top 10
- Generates production-ready AWS WAF block rules and Terraform HCL that you can copy and paste to deploy
- Analyzes screenshots of dashboards and architecture diagrams using Gemma 4's multimodal vision
- Runs a fully private local mode where Gemma 4 4B executes inside the browser via WebLLM, so your logs never leave your machine

### The Problem It Solves

If you manage AWS infrastructure, you know the pain: CloudTrail + WAF + Nginx logs grow exponentially. A production environment generates tens of thousands of security events per hour. When an incident happens:

- Junior SOC analysts hit alert fatigue within 30 minutes of a real incident
- Correlating events separated in time (a 2:00 AM port scan and a 5:00 AM SQL injection from the same IP) requires expensive SIEM tooling or hours of manual grep queries
- Writing a WAF block rule takes 10–15 minutes, time that an active attacker is already exploiting

This tool cuts all of that to under 30 seconds.
The real enemy in cybersecurity is often not the attacker; it is the sheer volume of data that buries the signal inside the noise. Gemma 4 solves this.

## Demo

Unfortunately, due to the API limit, I couldn't record the video, but I'm leaving the link so you can try it. In the settings, if you don't see the option to enter the API key, reduce your window resolution.

🌐 Live app: [analista.byronlainez.click](https://analista.byronlainez.click/)

### End-to-End Example

I fed the tool a raw AWS WAF log containing a SQL injection attempt from an automated scanner.

Input (raw WAF log):

```json
{
  "timestamp": "2026-05-21T02:14:33Z",
  "action": "ALLOW",
  "httpRequest": {
    "uri": "/api/users?id=1' OR '1'='1",
    "clientIp": "45.33.32.156",
    "country": "RU",
    "headers": [
      { "name": "User-Agent", "value": "sqlmap/1.7" }
    ]
  }
}
```

Gemma 4 analysis, returned in under 8 seconds:

```
🚨 THREAT DETECTED - HIGH CONFIDENCE
MITRE Tactic: Initial Access
MITRE Technique: T1190 - Exploit Public-Facing Application
OWASP Category: A03:2021 - Injection (SQL Injection)
Attacker IP: 45.33.32.156
Tool Signature: sqlmap/1.7 (automated SQL injection scanner)
Timeline correlation: This IP performed ICMP reconnaissance at 01:47 UTC before switching to active exploitation.
Recommended Action: BLOCK immediately and audit all requests from this source in the last 24 hours for lateral movement.
```

Auto-generated AWS WAF block rule, ready to deploy:

```json
{
  "Name": "BlockMaliciousIPsGemma",
  "Priority": 10,
  "Statement": {
    "IPSetReferenceStatement": {
      "ARN": "arn:aws:wafv2:us-east-1:123456789012:regional/ipset/MaliciousIPs/a1b2c3d4"
    }
  },
  "Action": { "Block": {} },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "BlockMaliciousIPsGemmaMetric"
  }
}
```

No rule editing is needed beyond pointing the ARN at an IP set in your own account. Paste this directly into the AWS Console or deploy it via the CLI.

### Visual Triage: Multimodal in Action

I uploaded a screenshot of an AWS architecture diagram where a database was sitting in a public subnet.
Gemma 4 flagged it without any additional prompting:

```
⚠️ ARCHITECTURE MISCONFIGURATION DETECTED
Finding: RDS instance appears exposed in a public subnet
Risk Level: CRITICAL - direct internet-reachable database
MITRE Ref: T1190, T1078 (Valid Accounts via exposed DB port)
Remediation:
1. Move RDS to a private subnet immediately
2. Configure NAT Gateway for outbound-only connectivity
3. Enable RDS encryption at rest (KMS) if not already active
4. Audit Security Group rules: port 3306/5432 must not be 0.0.0.0/0
```

This second layer of analysis (visual + log correlation) is something no purely text-based model can replicate.

## Code

🔗 Repository: [github.com/Byronsasvin/bals-analyst-v2](https://github.com/Byronsasvin/bals-analyst-v2)

### Core Architecture

The system is built around a structured prompt engineering core that uses Gemma 4's 128K context window to correlate security events across massive log files in a single inference pass: no chunking, no summarization loss.

SOC analyst system prompt (simplified):

```python
import json

SYSTEM_PROMPT = """
You are a senior SOC analyst with expertise in AWS security,
MITRE ATT&CK framework, and OWASP Top 10.

Analyze the provided security logs and return a structured JSON with:
- threat_detected: boolean
- confidence: HIGH | MEDIUM | LOW
- mitre_tactic: string
- mitre_technique: string (include T-number)
- owasp_category: string or null
- attacker_ips: array of strings
- attack_timeline: chronologically ordered events
- waf_rule_json: complete AWS WAF rule object, deployment-ready
- remediation_steps: prioritized action list

If image input is provided, also analyze for:
- Architecture misconfigurations (public subnets, open ports)
- Visual anomalies in traffic/metric charts
"""


def analyze(log_content: str, screenshot=None) -> dict:
    # gemma4_client is not defined in this simplified excerpt.
    messages = [{"role": "user", "content": []}]
    if screenshot:
        # Gemma 4 multimodal: image tokens must precede text tokens
        messages[0]["content"].append({"type": "image", "image": screenshot})
    messages[0]["content"].append({
        "type": "text",
        "text": f"{SYSTEM_PROMPT}\n\nLogs to analyze:\n{log_content}",
    })
    response = gemma4_client.chat(
        messages,
        response_format={"type": "json_object"},
        max_tokens=2048,
        temperature=0.1,  # Low temperature = consistent structured output
    )
    return json.loads(response.choices[0].message.content)
```

Local edge mode, with Gemma 4 4B running 100% in-browser:

```js
import { CreateWebWorkerMLCEngine } from "@mlc-ai/web-llm";

// Runs in a Web Worker: zero server calls, zero data leakage
const engine = await CreateWebWorkerMLCEngine(
  new Worker(new URL('./worker.js', import.meta.url), { type: 'module' }),
  "gemma-4-4b-it-q4f32_1-MLC",
  { initProgressCallback: (p) => updateProgressBar(p.progress) }
);

async function analyzeLocally(logContent) {
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: logContent }
    ],
    temperature: 0.1,
    max_tokens: 2048
  });
  return JSON.parse(reply.choices[0].message.content);
}
```

After the model downloads once (~2.5 GB, cached in IndexedDB), every analysis run is completely offline. Your production logs never touch a server.
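
Both `analyze` and `analyzeLocally` parse the model reply straight into JSON and trust its shape. Before a generated rule is deployed, it may be worth checking that shape first. The following is a minimal sketch, not part of the repository: `validate_report` is a hypothetical helper, and the key names assume underscore-joined versions of the fields listed in the system prompt.

```python
import ipaddress
import json

# Assumption: the model's JSON keys are the underscore-joined field names
# from the system prompt. Only a subset is checked in this sketch.
REQUIRED_FIELDS = {
    "threat_detected": bool,
    "confidence": str,
    "attacker_ips": list,
    "remediation_steps": list,
}
ALLOWED_CONFIDENCE = {"HIGH", "MEDIUM", "LOW"}


def validate_report(raw_reply: str) -> dict:
    """Parse a model reply; raise ValueError if it is not a usable report."""
    try:
        report = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(report, dict):
        raise ValueError("reply must be a JSON object")

    # Every required field must be present with the expected JSON type.
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(report.get(name), expected_type):
            raise ValueError(f"{name!r} is missing or not a {expected_type.__name__}")

    if report["confidence"] not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unexpected confidence: {report['confidence']!r}")

    # Reject anything that is not a well-formed IPv4/IPv6 address string.
    for ip in report["attacker_ips"]:
        if not isinstance(ip, str):
            raise ValueError(f"attacker IP must be a string, got {ip!r}")
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address

    return report
```

A caller could pass the raw reply through `validate_report` instead of calling `json.loads` directly, and route any `ValueError` to a human reviewer rather than deploying the rule.
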
## How I Used Gemma 4

I made a deliberate choice to use two different Gemma 4 variants for two distinct security scenarios. Here is the reasoning behind each decision.

### Gemma 4 27B: Deep cloud forensics via Gemini API

Why 27B? The 128K context window was the decisive factor. I benchmarked the same log analysis task against smaller models and previous-generation LLMs. Every one of them failed in the same way: they either refused files larger than ~30K tokens, or they exhibited the classic "lost in the middle" problem, forgetting events from the beginning of the log by the time they reached the end.

With Gemma 4 27B, I fed a complete 72-hour CloudTrail export (~85K tokens) in a single call. It correctly identified a three-stage attack chain:

| Time | IP | Event |
|---|---|---|
| 02:14 UTC | 45.33.32.156 | ICMP sweep: network reconnaissance |
| 03:47 UTC | 45.33.32.159 | WAF probing: fuzzing for bypasses |
| 05:22 UTC | 45.33.32.157 | Active SQL injection using sqlmap |

That correlation across 3 hours and 3 rotating IPs from the same subnet would have taken a human analyst 45+ minutes to find manually. Gemma 4 found it in one inference pass, in under 90 seconds.

This is what the 128K window actually unlocks in a security context: not just "longer documents," but temporal correlation at scale without losing context.

### Gemma 4 4B: Privacy-first local analysis via WebLLM

Why 4B in the browser? Because compliance is a hard blocker for most enterprises. Uploading production security logs to any external API (even a secure, encrypted one) can violate:

- GDPR Article 28: data processor agreements and data residency requirements
- HIPAA: if HTTP request logs contain PHI embedded in URL parameters
- PCI-DSS: cardholder data potentially visible in WAF request logs

By running Gemma 4 4B locally via WebLLM, the sensitive data never leaves the user's machine. The model runs in a browser Web Worker with no outbound network calls after the initial model download.
This makes the tool usable for banks, hospitals, and any regulated industry that would otherwise be completely blocked from using a cloud API version.

The 4B model handles single-event triage with enough accuracy for real-time alerting. Users who need deep forensic correlation across large log archives can switch to the cloud mode with a single toggle.

### Native multimodal: an unexpected force multiplier

Building the visual triage module revealed something I did not anticipate: in my testing, Gemma 4 read AWS dashboard screenshots about as accurately as a trained human analyst. Feed it a CloudWatch metrics screenshot showing a traffic anomaly, and it correctly identifies:

- The approximate time window of the spike
- Whether the pattern resembles a DDoS, a scraper bot, or a legitimate traffic surge
- Which alarms should have fired but did not, flagging monitoring gaps

This is a second analysis layer that no text-only model can replicate. It required zero extra tooling: I just pass the screenshot as native image input to Gemma 4.

## From MVP to Enterprise: The Closed-Loop SOC Pipeline

This app is a working MVP.
Here is how the same architecture scales to a fully automated, production-grade SecOps pipeline:

```
AWS WAF / CloudTrail
        │
        ▼
Amazon Kinesis Firehose          ← real-time event stream
        │
        ▼
Classifier Lambda                ← fast filter: normal vs suspicious
        │  suspicious events only
        ▼
analista.byronlainez.click API
        │
        ▼
Gemma 4 31B Dense                ← deep reasoning + timeline correlation
        │
        ▼
Generate WAF Rule JSON
        │
        ▼
Lambda → Update WAF IP Set       ← automatic block in ~200ms
        │
        ▼
Slack / Teams webhook            ← SOC team notified with full report
```

What this closed-loop approach delivers:

- Zero-touch threat containment: from a high-confidence verdict to an active block in under 500ms, with no human in the loop
- Automated IAM privilege audits: Gemma 4 31B Dense running nightly scans of all IAM policies, surfacing silent privilege escalation paths before attackers find them
- SIEM enrichment: structured threat reports pushed directly into Splunk, Microsoft Sentinel, or any webhook-compatible platform

That is what open-weights models like Gemma 4 make possible, and why I believe this architecture represents the future of accessible, privacy-respecting enterprise security.

Try analista.byronlainez.click with your own logs. What threats did Gemma 4 find in your infrastructure? Drop your results in the comments 👇