{"slug": "show-hn-browser-native-gpu-sharing", "title": "Show HN: Browser-Native GPU Sharing", "summary": "A new open-source tool allows users to turn any browser with WebGPU support into a cluster node for sharing GPU inference, enabling LLM hosting without Python environments or driver setup. The system lets users host models on a powerful workstation and access them securely from other devices like phones or laptops, or allow others to connect via HTTP. Powered by WebGPU and Transformers.js, the tool processes images locally on the user's hardware, keeping data private and avoiding third-party AI APIs.", "body_md": "### Instant local hosting\n\nOpen a tab, pick a model, and start hosting. RF-DETR and SmolVLM load in a Web Worker on WebGPU — no Python environment or driver setup.\n\nTurn any browser with WebGPU into a cluster node. Share LLM inference: host a model on your powerful workstation and access it securely from your phone or laptop, or let others connect to it.\n\nPowered by **WebGPU** & **Transformers.js** · No GPU drivers to install · Open HTTP API\n\nContribute spare GPU cycles from your workstation. Clients send images over HTTP; your browser runs the model and returns results — privately, on your hardware.\n\nInference runs on your GPU in the browser. Images are processed on your machine; nothing is sent to third-party AI APIs.\n\nConnect from curl, Python, Node, or any HTTP client. Simple JSON endpoints for detection and image description — queue and broker included.\n\nA lightweight Node broker coordinates tasks. Browser hosts stay connected via SSE and pull jobs when idle.\n\n```\nBrowser host\n                WebGPU inference\n```\n\nOpen the host page, choose a host id and model, then click Start hosting. 
Keep the tab open while you share GPU time.\n\nThe broker forwards detection and description tasks to your browser. One job runs at a time per host.\n\nPoint clients at `POST /v1/detect` or `/v1/describe` with your host id. Results return as JSON.\n\nUse the cluster monitor to see online hosts and copy ready-made curl examples.\n\n```\ncurl -X POST 'http://localhost:5180/v1/detect' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\n    \"host\": \"my-gpu-node\",\n    \"image_url\": \"https://example.com/photo.jpg\",\n    \"threshold\": 0.5\n  }'\n```\n\nModels download from Hugging Face on first load. Pick one per host session.\n\nReal-time object detection (COCO) via ONNX on WebGPU. Endpoint: `POST /v1/detect`\n\nDescribe images with a compact VLM on WebGPU. Endpoint: `POST /v1/describe`\n\nShare your GPU or explore nodes already online.", "url": "https://wpnews.pro/news/show-hn-browser-native-gpu-sharing", "canonical_source": "https://apssouza22-webgpu-cluster.hf.space/", "published_at": "2026-06-03 07:14:02+00:00", "updated_at": "2026-06-03 07:15:55.510023+00:00", "lang": "en", "topics": ["ai-infrastructure", "ai-tools", "computer-vision", "large-language-models"], "entities": ["WebGPU", "Transformers.js", "RF-DETR", "SmolVLM", "Node"], "alternates": {"html": "https://wpnews.pro/news/show-hn-browser-native-gpu-sharing", "markdown": "https://wpnews.pro/news/show-hn-browser-native-gpu-sharing.md", "text": "https://wpnews.pro/news/show-hn-browser-native-gpu-sharing.txt", "jsonld": "https://wpnews.pro/news/show-hn-browser-native-gpu-sharing.jsonld"}}