# Building PROTO RECON — From Vague Idea to a Browser-Based Tactical HUD

> Source: <https://dev.to/katsuo-chang/building-proto-recon-from-vague-idea-to-a-browser-based-tactical-hud-2p78>
> Published: 2026-05-24 00:12:32+00:00

## Introduction
This post walks through how PROTO RECON — an experimental app that combines phone sensors with in-browser ML — evolved from pre-coding requirements to its current implementation. I've verified the UI behavior, but I'm still at the stage of reading the source code for the first time. Here I organize the original requirements and an AI-generated source map as a foundation for deeper dives in future posts.
Note: This is still in testing. Performance tuning for heavy workloads hasn't been done yet — use at your own risk.
## Where It Started
It began with a vague question: Can we build a digital user experience on top of the real world? The seed of the idea was whether today's phone sensors could overlay something like the robot-view or cockpit-view perspectives from old sci-fi and games onto live scenery.
Before writing code, I summarized the requirements, turned functional specs into prose through dialogue with AI, and then started development.
## Technology Choices (Before Implementation)
The noise layer was added later. When I had Cursor implement it, three.js was skipped in favor of Canvas 2D and CSS for CRT-style effects. Writing directly without extra modules kept things lighter — a decision that has paid off on real devices.
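To make the Canvas 2D approach concrete, here is a minimal sketch of how a noise and scan-line layer could be drawn without three.js. It is not taken from the PROTO RECON source: the function name `fillCrtNoise`, the alpha value, and the "dim every other row" rule are all assumptions chosen for illustration.

```javascript
// Hypothetical helper (not from the PROTO RECON source): fills an RGBA pixel
// buffer with green-tinted noise and dims every other row as a scan line.
function fillCrtNoise(data, width, height, random = Math.random) {
  for (let y = 0; y < height; y++) {
    // Odd rows are dimmed to imitate CRT scan lines (assumed ratio).
    const rowGain = y % 2 === 1 ? 0.4 : 1.0;
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      const level = Math.floor(random() * 256 * rowGain);
      data[i] = 0;         // red
      data[i + 1] = level; // green carries the noise
      data[i + 2] = 0;     // blue
      data[i + 3] = 40;    // low alpha so the camera feed stays visible
    }
  }
  return data;
}

// Browser usage (assumes a <canvas id="noise"> stacked over the video element):
// const canvas = document.getElementById("noise");
// const ctx = canvas.getContext("2d");
// const frame = ctx.createImageData(canvas.width, canvas.height);
// fillCrtNoise(frame.data, frame.width, frame.height);
// ctx.putImageData(frame, 0, 0);
```

Keeping the pixel math in a plain function means it can be checked outside the browser, while the canvas calls stay in a few lines of glue code.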
Gyro, compass, altitude (on supported devices), GPS, and similar sensors were also in scope, wired to the HUD and minimap.
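As a rough sketch of what wiring orientation sensors to a HUD can look like, the snippet below converts a `deviceorientation` event into a compass heading plus pitch and roll readouts. The helper names `readOrientation` and `formatHud` and the output layout are assumptions for illustration, not the app's actual code.

```javascript
// Hypothetical helper: turns a DeviceOrientationEvent-like object into HUD values.
function readOrientation(event) {
  let heading = null;
  if (typeof event.webkitCompassHeading === "number") {
    // iOS Safari: degrees clockwise from magnetic north.
    heading = event.webkitCompassHeading;
  } else if (event.absolute === true && typeof event.alpha === "number") {
    // alpha grows counter-clockwise, so flip it to get a compass bearing.
    heading = (360 - event.alpha) % 360;
  }
  return {
    heading, // null when no absolute reference is available
    pitch: typeof event.beta === "number" ? event.beta : null,  // front-back tilt
    roll: typeof event.gamma === "number" ? event.gamma : null, // left-right tilt
  };
}

// Formats the values as a single HUD line; missing readings show as "---".
function formatHud({ heading, pitch, roll }) {
  const fmt = (v) => (v === null ? "---" : v.toFixed(1));
  const hdg =
    heading === null ? "---" : String(Math.round(heading) % 360).padStart(3, "0");
  return `HDG ${hdg}  PITCH ${fmt(pitch)}  ROLL ${fmt(roll)}`;
}

// Browser usage (iOS 13+ additionally needs DeviceOrientationEvent.requestPermission()
// called from a user gesture before events arrive):
// window.addEventListener("deviceorientation", (e) => {
//   hudElement.textContent = formatHud(readOrientation(e)); // hudElement is hypothetical
// });
```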
## Requirements at the Time (Summary)
The planning document, built through repeated conversations with AI, was titled Tactical Recon & Guidance Terminal "PROJECT: LOCK-ON (working title)". At that point, concerns about sensor accuracy led me to assume Phase 2 would require a Unity (native) rewrite. After coding with Cursor, however, things went better than expected even without three.js, and the outlook shifted to "Phase 1 might be achievable on the web alone."
### Core Concept
- Visuals: Monochrome green military-terminal UI, scan lines, noise
- AI: Real-time object detection for targeting and lock-on
- Immersion: Gyro, altitude, BEEP sounds (instrument and audio feedback)
- Future: VR-goggle HUD, gaze control (concept stage)
### Plan vs. Current State
The full planning document is collapsed below due to length.
Full planning document (PROJECT: LOCK-ON draft)
Tactical Recon & Guidance Terminal App — "PROJECT: LOCK-ON (working title)" Planning Document
This document summarizes a concept for a smartphone camera-filter app that blends retro-futuristic UI with modern AI.
1. Project Overview
A "play" app that recreates the experience of a military recon terminal from 1980s–90s sci-fi films and games, using live smartphone camera feed and AI.
2. Core Concept
- Visuals: Monochrome green evoking old terminals. Retro feel via scan lines and noise.
- AI interaction: Automatic targeting and lock-on through real-time object detection.
- Immersion: Gyro, barometric sensor (altimeter), BEEP feedback.
- Extensibility: HUD and gaze control with VR goggles (e.g. Hacosco), concept stage.
3. Main Features (excerpt)
3.1 Video filter
- Green overlay (night-vision goggle texture)
- Edge bar graphs and scan indicators
3.2 AI lock-on
- Automatic moving-object detection with green bounding-box tracking
- Sound effect (SE) tempo speeds up while the target stays in the reticle center
- Lock complete: box turns red, warning sound
- Tap to fire a virtual missile (effect)
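The lock-on behavior listed above can be sketched as a small state update that runs once per frame. This is only an illustration under assumptions: the timing constants, the function names `createLockOn` and `updateLockOn`, and the rule "the reticle center lies inside the bounding box" are hypothetical, not read from the actual source.

```javascript
// Hypothetical lock-on tracker; thresholds and names are illustrative only.
const LOCK_TIME_MS = 1500; // dwell time needed for a full lock (assumed)
const BEEP_SLOW_MS = 600;  // beep interval at 0% progress (assumed)
const BEEP_FAST_MS = 100;  // beep interval at full lock (assumed)

function createLockOn() {
  return { dwellMs: 0, locked: false };
}

// box: {x, y, width, height} in pixels, or null when nothing is detected.
// reticle: {x, y} center of the on-screen reticle.
function updateLockOn(state, box, reticle, dtMs) {
  const inside =
    box !== null &&
    reticle.x >= box.x && reticle.x <= box.x + box.width &&
    reticle.y >= box.y && reticle.y <= box.y + box.height;

  // Dwell time accumulates while the reticle sits on the target, else resets.
  const dwellMs = inside ? Math.min(state.dwellMs + dtMs, LOCK_TIME_MS) : 0;
  const progress = dwellMs / LOCK_TIME_MS;
  const locked = progress >= 1;

  return {
    dwellMs,
    locked,
    boxColor: locked ? "red" : "green",
    // Beeps get closer together as the lock progresses; null means silence.
    beepIntervalMs: inside
      ? Math.round(BEEP_SLOW_MS - (BEEP_SLOW_MS - BEEP_FAST_MS) * progress)
      : null,
  };
}
```

A caller would feed each detection result and the elapsed frame time into `updateLockOn`, then use `boxColor` for drawing and `beepIntervalMs` to schedule the next beep.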
3.3 Sensor-linked instruments
- Gyro: pitch and roll numeric display
- Altimeter: relative altitude from barometric sensor (iPhone, etc.)
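For the altimeter item, relative altitude is commonly derived from pressure with the international barometric formula. The sketch below assumes pressure readings in hPa are already available from some source, such as a native layer. As far as I know, mainstream mobile browsers do not expose a barometer, which fits the plan of deferring this feature to a later phase. The function name `relativeAltitudeM` is hypothetical.

```javascript
// International barometric formula constants (standard atmosphere).
const SCALE_M = 44330;
const EXPONENT = 1 / 5.255;

// Returns altitude in metres relative to the point where referenceHpa was captured.
// Positive values mean the device is now higher than the reference point.
function relativeAltitudeM(pressureHpa, referenceHpa) {
  if (!(pressureHpa > 0) || !(referenceHpa > 0)) {
    throw new RangeError("pressures must be positive numbers in hPa");
  }
  return SCALE_M * (1 - Math.pow(pressureHpa / referenceHpa, EXPONENT));
}

// Usage: capture the pressure once at startup as the reference, then call
// relativeAltitudeM(currentHpa, startHpa) on each new reading.
```

Near sea level a drop of 1 hPa corresponds to roughly 8 m of climb, so small pressure noise translates into visible jitter; a real display would likely smooth the readings.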
4. Development Phases (roadmap at the time)
- Phase 1: Web prototype (Next.js + Three.js + TF.js). Shareable via URL.
- Phase 2: Unity port. Barometric altitude, vibration.
- Phase 3: VR/XR. Stereoscopic display, gaze lock-on.
5. Distribution Strategy
Dev logs on Qiita / Zenn, short-form video on social, posts on Reddit, etc.
Created: April 29, 2026 (planning stage)
## How Development Proceeded
Coding was delegated to Cursor, with features added one prompt at a time until the app reached its current state. To write the prompts, I load the requirements document into Gemini and ask what kind of prompt would work best. I've checked the JS behavior through the UI, but reading the source comes next. The file list and dependency diagram below were AI-generated and cleaned up for this article.
## Source Layout
- HTML (root level)
- src/: runtime files
  - Entry, config, UI
  - Video & AI inference
  - Distance & AR
  - Navigation, map, sensors
  - Feedback & monitoring
- src/: tests only (not used on screen)
## Dependencies (overview)
```mermaid
flowchart TB
  index[index.html] --> main[main.js]
  main --> config[appConfig / applyDevConfig / uiLexicon]
  main --> cam[camera + HUD]
  main --> det[personDetection]
  main --> face[facePrivacy]
  det --> tf[tfBackend]
  face --> tf
  main --> map[miniMap]
  map --> leaf[miniMapLeaflet]
  main --> nav[compass / motionHud / motionStabilityLink]
  main --> range[objectRange]
  range --> xr[webxrHitTestRange]
  main --> health[liveHealth]
  main --> sfx[audioFx]
```
`main.js` ties almost everything together; `appConfig.js` settings and `uiLexicon.js` copy drive behavior and display across modules.
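In the diagram, both personDetection and facePrivacy point at tfBackend, which suggests a shared TensorFlow.js setup step. I haven't read that module yet, so the following is only a guess at the general pattern: a memoized initializer that both consumers can await. The name `createBackendInitializer` and the `"webgl"` default are assumptions; `tf.setBackend`, `tf.ready`, and `tf.getBackend` are existing TensorFlow.js calls.

```javascript
// Hypothetical sketch of a shared backend initializer. `tf` is passed in so the
// example does not depend on how the app actually loads TensorFlow.js.
function createBackendInitializer(tf, backendName = "webgl") {
  let pending = null; // cached promise so setup runs only once
  return function ensureBackend() {
    if (pending === null) {
      pending = (async () => {
        // tf.setBackend resolves to false when the backend cannot be initialized.
        const ok = await tf.setBackend(backendName);
        if (!ok) throw new Error(`backend "${backendName}" is unavailable`);
        await tf.ready();
        return tf.getBackend();
      })();
    }
    return pending;
  };
}

// Usage (assumed wiring): both modules await the same promise before inference.
// const ensureBackend = createBackendInitializer(tf);
// await ensureBackend(); // in the person-detection setup
// await ensureBackend(); // in the face-privacy setup, reuses the first call
```

Caching the promise rather than a boolean means a second caller that arrives while setup is still in flight waits for the same work instead of starting it again.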
## Closing: What to Read Next
Future posts will open the source and go deep feature by feature. The study roadmap below is the planned reading order (#01 is covered by this article's file map).
Source reading roadmap (32 parts)
Stage 1 — Config and skeleton (no permissions needed)
Stage 2 ⭐⭐ — Look and boot "theater" (testable without camera)
UI animation and audio control without device permissions.
Stage 3 ⭐⭐⭐ — Media, sensors, tapes (real device required)
Stage 4 ⭐⭐⭐⭐ — Detection, privacy, ML
Stage 5 ⭐⭐⭐⭐⭐ — Integration, ops, errors
The next post will start with roadmap #02 (`appConfig.js`).
