Show HN: Classify mechanical faults using Contrastive Language-Audio Pretraining


Source: github.com | Published: 2026-07-01T16:57Z | Topic: machine-learning

A developer released cardiag, an open-source audio-ML pipeline that uses Contrastive Language-Audio Pretraining (CLAP) to classify mechanical faults from phone recordings. The tool achieves 0.79 AUROC for fault detection and provides calibrated triage, returning 'uncertain' when confidence is low. The project is available as a CLI and web app.

cardiag is an end-to-end audio-ML pipeline. It scrapes fault-sound clips from YouTube/TikTok, cleans the audio (isolating the mechanical sound from speech, music, and noise), embeds it with a frozen CLAP model, and trains small linear heads to triage the fault. It is exposed as a CLI and a live web app.

This is a proof of concept, and honest about what that means. Diagnosing a car fault from a phone recording is genuinely hard, so cardiag is built as a calibrated triage aid rather than a diagnoser: it tells you whether something sounds wrong and roughly where in the car it is, and it gives a ranked shortlist of likely parts. When the audio won't support a call, it says "uncertain" instead of bluffing.
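One way to picture the "uncertain" behaviour is an abstention band around the decision threshold. The sketch below is only an illustration under stated assumptions, not cardiag's actual logic: the function name `triage_verdict` and the thresholds 0.35 and 0.65 are hypothetical values chosen for the example.

```python
def triage_verdict(p_fault: float, low: float = 0.35, high: float = 0.65) -> str:
    """Map a calibrated fault probability to a verdict with an abstention band.

    The thresholds are illustrative assumptions, not values taken from cardiag.
    """
    if not 0.0 <= p_fault <= 1.0:
        raise ValueError("p_fault must be a probability in [0, 1]")
    if p_fault >= high:
        return "FAULT"
    if p_fault <= low:
        return "NORMAL"
    # Anything inside the band is too close to call.
    return "UNCERTAIN"


for p in (0.92, 0.50, 0.10):
    print(p, triage_verdict(p))
```

An abstention band only makes sense when the probabilities are calibrated, which is why the calibration result reported for the model matters for this design.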

The real contribution is the cleaning + honest-training recipe, which is reusable on other audio datasets. The modest accuracy here reflects how hard the problem is from crude phone audio (we hit the literature ceiling); the same method reaches 0.93 AUROC on clean engine audio. See docs/DEFENSE.md.

Two pages visualize the first two stages of the pipeline:

Isolating the engine audio: an interactive look at the clean() cascade pulling a short mechanical span out of noisy YouTube audio (speech, music, road noise).

CLAP, visualized: how the frozen CLAP model turns those spans into the 512-d embedding the linear heads classify.
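To make the last step concrete, here is a minimal sketch of a linear head scoring a 512-d embedding and returning a ranked shortlist. It assumes a plain softmax over linear scores. The weights and the embedding are random stand-ins, `rank_classes` is a hypothetical helper, and every label except `wheel_bearing` is invented for illustration.

```python
import math
import random


def rank_classes(embedding, weights, biases, labels, k=3):
    """Score an embedding with a linear head; return the top-k (label, prob) pairs."""
    logits = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, biases)
    ]
    # Numerically stable softmax: subtract the max logit before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda lp: lp[1], reverse=True)
    return ranked[:k]


# Random stand-ins for a real CLAP embedding and trained head weights.
rng = random.Random(0)
DIM = 512
labels = ["wheel_bearing", "belt", "exhaust", "brakes"]  # mostly hypothetical
embedding = [rng.gauss(0, 1) for _ in range(DIM)]
weights = [[rng.gauss(0, 0.05) for _ in range(DIM)] for _ in labels]
biases = [0.0] * len(labels)

top3 = rank_classes(embedding, weights, biases, labels, k=3)
for label, prob in top3:
    print(f"{label}: {prob:.3f}")
```

Because the encoder is frozen, only the small weight matrix and bias vector would be learned, which keeps training cheap.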

Results are measured out-of-sample and leakage-safe (by-video grouped CV over 1,031 video groups; permutation p = 0.0005). These are honest numbers, not a leaderboard.
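"By-video grouped CV" means that every span cut from the same video lands in the same fold, so the model is never tested on a video it saw during training. The sketch below shows the idea in plain Python. The helper names and the toy video IDs are assumptions for illustration, and the project's own splitting code may differ (scikit-learn's `GroupKFold` provides the same guarantee).

```python
from collections import defaultdict


def grouped_folds(groups, n_splits=5):
    """Assign sample indices to folds so that each group stays in one fold.

    Groups are placed largest-first into the currently smallest fold,
    which keeps fold sizes roughly balanced.
    """
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    folds = [[] for _ in range(n_splits)]
    for g in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_splits), key=lambda i: len(folds[i]))
        folds[smallest].extend(members[g])
    return folds


def train_test_splits(groups, n_splits=5):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = grouped_folds(groups, n_splits)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


# Toy example: one entry per span, labelled with the video it came from.
video_ids = ["vidA", "vidA", "vidB", "vidC", "vidC", "vidC", "vidD", "vidE"]
for train_idx, test_idx in train_test_splits(video_ids, n_splits=3):
    train_videos = {video_ids[i] for i in train_idx}
    test_videos = {video_ids[i] for i in test_idx}
    assert train_videos.isdisjoint(test_videos)  # no video leaks across the split
```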

Capability	Result	vs. chance
Is something wrong? (fault/normal)	AUROC 0.79 [0.76, 0.83]	0.50
Where in the car? (6 zones)	right zone in top-3 ≈ 75%	2×
Which part? (12+ families)	right part in top-3 ≈ 45–65%	3–4×
Knows when it doesn't know	calibrated (ECE ≈ 0.04), returns `UNCERTAIN`	n/a
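For readers unfamiliar with the calibration metric, expected calibration error (ECE) compares how confident a model is with how often it is right. The sketch below uses one common binary formulation with equal-width confidence bins. The source does not say which ECE variant or bin count cardiag uses, so the 10 bins and the function name are assumptions.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: weighted mean |accuracy - confidence| over confidence bins.

    probs  -- predicted probability of the positive class, one per sample
    labels -- true labels (0 or 1)
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p  # confidence in the predicted class
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(a for _, a in b) / len(b)
        ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece


toy_probs = [0.9, 0.8, 0.2, 0.6]
toy_labels = [1, 1, 0, 0]
print(expected_calibration_error(toy_probs, toy_labels))
```

A value near zero means stated confidence tracks observed accuracy, which is what makes a confidence-based "uncertain" verdict trustworthy.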

Full details, and the one head we demoted for failing out-of-sample (knock), are in docs/MODEL_CARD.md.

A fresh clone is immediately usable. A small pre-trained model ships in models/, and a synthetic demo clip is bundled, so nothing needs to be downloaded or scraped.

git clone <this-repo> && cd car-diagnosis
uv venv && source .venv/bin/activate
uv pip install -e ".[scrape,web,dev,viz]"     # Python 3.11

cardiag doctor                 # preflight: what's installed
cardiag train --fixtures       # a working model offline in ~2s (no scrape, no 2 GB download)
cardiag diagnose <clip.wav>    # verdict + where-in-the-car + ranked parts
cardiag serve --model models   # live web app: drop a clip / paste a link, "explain why"

Verify the whole thing end-to-end in an isolated worktree: bash scripts/clone_verify.sh

audio ──► clean() cascade ──► CLAP embedding ──► linear heads ──► Diagnosis
          (isolate spans)     (frozen, 512-d)    (fault/region/    (calibrated,
                                                  part/knock)       UNCERTAIN-aware)

There is one segmentation path. Scraped clips, your own recordings (cardiag ingest, any length), and uploads at inference all flow through the same clean() cascade that isolates short mechanical spans. Spans over ~10 s are split into windows so CLAP never silently truncates them. Training and serving share one embedding contract, so there is no train/serve skew.
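The windowing step can be sketched as follows. This is a minimal illustration that assumes non-overlapping windows and a hard 10-second cap. The source only says "~10 s", so the exact limit, any overlap, and the handling of short trailing windows in cardiag are not known, and `split_span` is a hypothetical name.

```python
def split_span(start: float, end: float, max_len: float = 10.0):
    """Split the span [start, end] (seconds) into consecutive windows <= max_len."""
    if end <= start:
        raise ValueError("end must be greater than start")
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    windows = []
    t = start
    while t < end:
        # The final window is clipped to the end of the span.
        windows.append((t, min(t + max_len, end)))
        t += max_len
    return windows


print(split_span(0.0, 25.0))  # a 25 s span becomes three windows
print(split_span(3.0, 8.0))   # a short span is returned unchanged
```

Each window would then be embedded separately, so no audio is dropped by the encoder's input-length limit.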

cardiag diagnose clip.wav            # full model: verdict + region + ranked parts
cardiag triage   clip.wav            # calibrated engine-vs-running-gear
cardiag clean    clip.wav            # isolate the mechanical sound (no model needed)
cardiag inspect  clip.wav -o r.html  # SEE/HEAR the pipeline: spans, spectrograms, scores
cardiag ingest   ./my_audio --kind fault --cause wheel_bearing   # bring your own audio
cardiag scrape   youtube|tiktok      # build a corpus (Reddit is deprecated — too noisy)
cardiag train                        # train on your corpus

Add --json to any inference command for machine-readable output.
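A script could consume that output roughly as shown below. The sketch is generic: the README does not document the JSON schema, so no field names are assumed, and a stand-in command replaces cardiag so the example runs without it installed. `run_json` is a hypothetical helper.

```python
import json
import subprocess
import sys


def run_json(cmd):
    """Run a command and parse its standard output as JSON."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Stand-in command so the sketch runs anywhere.
# With cardiag on PATH this would be: ["cardiag", "diagnose", "clip.wav", "--json"]
demo_cmd = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
payload = run_json(demo_cmd)
print(payload)
```

Passing `check=True` makes a non-zero exit status raise `subprocess.CalledProcessError` instead of silently producing unparseable output.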

docs/DEFENSE.md: the honest case that a deliberately crude method earns a real triage result.
docs/MODEL_CARD.md: per-head metrics, intended use, limitations.
docs/architecture.md: pipeline diagrams.
docs/scraping-guide.md: start-to-finish corpus building.

Valid for social-style / targeted-upload audio (YouTube, TikTok, or a phone clip a user records deliberately). It is not a safety-critical or standalone diagnostic. It is a triage assistant that narrows where to look and is honest about its uncertainty. Model files are joblib artifacts: load only ones you trust.
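The warning about joblib artifacts exists because joblib files are pickle-based, and unpickling can execute arbitrary code. One possible precaution, sketched below, is to compare a model file's SHA-256 digest against a value obtained from a source you trust before loading it. This is not part of cardiag as far as the README states, and the helper names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path, expected_sha256: str) -> bool:
    """True only if the file's digest matches the expected value."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.lower())
```

A matching digest only shows that the file is the one you expected. It says nothing about whether the original publisher is trustworthy.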

License: see LICENSE.

Original source: github.com/adam-s/car-diagnosis
