[Day 9] A local Japanese sentiment AI (BERT) read 8 years of a LINE chat, and the ups and downs surfaced from numbers alone

A developer used a local Japanese sentiment AI (BERT) to analyze 87,621 messages from an 8-year LINE chat history, running entirely on a DGX Spark to keep the private conversation data on-device. By scoring 66,329 text messages for tone and aggregating monthly message counts, the analysis revealed four distinct phases in the relationship's arc (ramp-up, an 8-month silence, a second peak, and a stable plateau) without the developer re-reading any message content. The tone data showed that the mood had already turned negative in the month before the silence began, suggesting the sentiment shift preceded the drop in communication volume.

Day 9. Today is less about model internals and more of a personal experiment: have a local AI analyze the entire chat history with one LINE friend. LINE is the dominant messaging app in Japan.

When I exported it, 8 years were sitting there, from the very first message to today. It started, we talked a lot, it went quiet for a while, then picked up again. That whole arc is in there. Because the content is what it is, nothing left my machine: everything ran locally on my DGX Spark.

What I used: my home AI box (DGX Spark) + a Japanese sentiment model (for tone) + a bigger local model (to guess events from numbers).

Re-reading 8 years of messages one by one isn't realistic. So instead of reading the content, I looked only at the "shape" of the conversation: when, how much, and in what tone we talked.

From message counts and tone alone, the 8-year arc came out clearly on a chart. Started, went quiet, came back: the flow was visible without me re-reading a thing.

    LINE chat export (text)
        │
        ▼
    1. Parse: split each message into {datetime, who, type, text}
        │   from here on, message text never leaves the machine
        ▼
    2. Aggregate: monthly counts, time-of-day, reply gaps
        │
        ▼
    3. Tone scoring: classify each of 66k messages pos/neu/neg
        │
        ▼
    4. Turning-point detection: from sudden changes in the numbers
        │   + also show ONLY the numbers to a bigger AI and ask it to guess
        ▼
    5. Answer check: compare against the real timeline

You can export a LINE chat as text from the chat screen ("send chat history").

Data size:

| Item | Value |
|---|---|
| Span | ~8 years 2 months |
| Total messages | 87,621 |
| Text messages | 66,329 |
| Stickers | 15,605 |
| Photos | 3,982 |

15,605 stickers… that's a lot.

| Step | Model | What it does | What it sees |
|---|---|---|---|
| 3. Tone | Japanese sentiment model (`koheiduck/bert-japanese-finetuned-sentiment`) | scores each message pos/neu/neg | 66k message texts (scores averaged per month) |
| 4. Turning points | a bigger local model (Qwen2.5 72B) | guesses "what happened to these two?" | only the per-month table of counts + tone scores (no conversation, no words) |

Both run locally on my own machine.

This chart is the highlight. Top: monthly message count. Bottom: tone (up = positive, down = negative). The x-axis is months since the conversation started. Axis labels are in Japanese.

Plotted, it isn't a steady climb or a flat line. It splits cleanly into "chapters": ramp-up → an 8-month silence → a second peak → a stable plateau. Four phases, at a glance.

Tone has two peaks of about +0.6, around the start and around when things resumed (overall mean ≈ 0, slightly negative in the later years). The interesting part: in the month before the silence, tone had already dropped to −0.1. The mood dimmed before the volume did.

There are two dips into negative tone. The one before the silence was an "omen." The other is the recent years: not an omen, but the effect of logistics-y messages ("what time are you home?") piling up.

💡 Mini-note: how is "tone" turned into a number?

The scoring is done by a Japanese sentiment model.
Roughly:

- already trained on lots of Japanese text labeled positive / negative
- judges with context, not just by spotting keywords
- returns a probability of "positive-ness" / "negative-ness" per message
- I used the difference between the two as a per-message score

A few actual judgments (short, name- and place-free one-liners):

| Message | Verdict |
|---|---|
| 「楽しかったね!」 (that was fun) | Positive |
| 「これめちゃうまい」 (this is so good) | Positive |
| 「おはようございます」 (good morning) | Neutral |
| 「もうお家?」 (home already?) | Neutral |
| 「全く集中できない」 (can't focus at all) | Negative |
| 「それは悔しいな、、」 (that's frustrating…) | Negative |
| a long trip-planning message | Neutral |
| a snappy one-liner sent in a huff | Negative |

Plain happy lines score positive; logistics ("good morning", "home already?") score neutral; tiredness or irritation scores negative. Even long, businesslike planning messages lean neutral.

Message density by weekday × hour (brighter = more). A clear concentration at 7–9 a.m.

First, the simple method: mechanically pick the points where message volume jumped or dropped, then check against the real timeline.

| Real event | Auto-detected timing |
|---|---|
| When it started | exact match |
| When it went quiet | exact match |
| When it resumed | exact match |
| When it got lively again | a few months off |
| A big life milestone | hard to detect (barely shows in counts) |

Sharp volume changes were nailed. But "a big life milestone" got missed. So I showed the same numbers to the bigger local model and asked "what happened?"

Rather than hunting for a single spike, it reads the whole sequence of numbers as a "flow," so it could pick up even an event that barely moves the counts.

Counts and tone were enough to see the 8-year shape. Silence marks the quiet stretch; a surge marks the resumption. Both can be read straight off the chart.

Given only monthly numbers, the model inferred even a barely-visible event ("something big around here"), and it lined up with reality.
It connects scattered points into one flow.

The slight negative lean in later years isn't about getting along badly. Logistics messages ("what time are you home?") just don't score high. Low score ≠ trouble. It isn't that sentiment analysis is poor; the scores need to be read together with context.
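For anyone who wants to try the numeric part of this, here is a minimal sketch of the tone score, the monthly aggregation, and the "mechanical" turning-point check described earlier. It is not the code behind the charts. It assumes the sentiment model has already produced a positive and a negative probability for each message, that the score is positive minus negative, and that a turning point is a month whose count changes by some factor relative to the previous month. The function names, the `ratio=3.0` threshold, and the toy data are all made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def tone_score(p_positive: float, p_negative: float) -> float:
    """Per-message tone, assumed here to be P(positive) - P(negative), so -1..+1."""
    return p_positive - p_negative


def monthly_summary(messages):
    """Group (timestamp, p_positive, p_negative) tuples by calendar month.

    Returns a list of (month, message_count, mean_tone), sorted by month.
    """
    buckets = defaultdict(list)
    for ts, p_pos, p_neg in messages:
        buckets[ts.strftime("%Y-%m")].append(tone_score(p_pos, p_neg))
    return [(month, len(scores), mean(scores))
            for month, scores in sorted(buckets.items())]


def volume_turning_points(counts, ratio=3.0):
    """Flag months whose count jumps or drops by at least `ratio` vs. the previous month.

    `ratio` is an illustrative threshold, not a value from the post.
    """
    flagged = []
    for i in range(1, len(counts)):
        prev, cur = counts[i - 1], counts[i]
        # +1 smoothing so a month with zero messages does not divide by zero
        change = (cur + 1) / (prev + 1)
        if change >= ratio:
            flagged.append((i, "jump"))
        elif change <= 1 / ratio:
            flagged.append((i, "drop"))
    return flagged


# Made-up toy data, not taken from the real chat.
toy_messages = [
    (datetime(2020, 1, 5, 8, 0), 0.9, 0.05),
    (datetime(2020, 1, 9, 7, 30), 0.7, 0.1),
    (datetime(2020, 2, 1, 21, 0), 0.2, 0.6),
]
summary = monthly_summary(toy_messages)

toy_counts = [120, 150, 140, 0, 0, 0, 90, 100]
turning_points = volume_turning_points(toy_counts)
print(turning_points)  # [(3, 'drop'), (6, 'jump')]
```

A month-over-month ratio like this only catches sharp changes in volume, which matches the limitation noted above: an event that barely moves the counts will not be flagged.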