[ARTICLE · art-18167] src=dev.to pub= topic=natural-language-processing verified=true sentiment=· neutral

[Day 9] A local Japanese sentiment AI (BERT) read 8 years of a LINE chat, and the ups and downs surfaced from numbers alone

A developer used a local Japanese sentiment AI (BERT) to analyze 87,621 messages from an 8-year LINE chat history, running entirely on a DGX Spark to keep the private conversation data on-device. By scoring 66,329 text messages for tone and aggregating monthly message counts, the analysis revealed four distinct phases in the relationship's arc—ramp-up, an 8-month silence, a second peak, and a stable plateau—without the developer re-reading any message content. The tone data showed that the mood had already turned negative in the month before the silence began, suggesting the sentiment shift preceded the drop in communication volume.

5 min read · published May 29, 2026

Day 9. Today is less about model internals and more of a personal experiment: have a local AI analyze the entire chat history with one LINE friend. (LINE is the dominant messaging app in Japan.)

When I exported it, 8 years were sitting there — from the very first message to today. It started, we talked a lot, it went quiet for a while, then picked up again. That whole arc is in there.

Because the content is what it is, nothing left my machine: everything ran locally on my DGX Spark.

What I used: my home AI box (DGX Spark) + a Japanese sentiment model (for tone) + a bigger local model (to guess events from numbers).

Re-reading 8 years of messages one by one isn't realistic. So instead of reading the content, I looked only at the "shape" of the conversation — when, how much, and in what tone we talked.

Concretely:

From message counts and tone alone, the 8-year arc came out clearly on a chart. Started, went quiet, came back — the flow was visible without me re-reading a thing.

LINE chat export (text)
        │
        ▼
 1. Parse: split each message into {datetime, who, type, text}
        │   (from here on, message text never leaves the machine)
        ▼
 2. Aggregate: monthly counts, time-of-day, reply gaps
        │
        ▼
 3. Tone scoring: classify each of 66k messages pos/neu/neg
        │
        ▼
 4. Turning-point detection: from sudden changes in the numbers
        │   + also show ONLY the numbers to a bigger AI and ask it to guess
        ▼
 5. Answer check: compare against the real timeline

You can export a LINE chat as text from the chat screen ("send chat history").
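Step 1 (parse) can be sketched roughly like this. It is not my exact script, just a minimal version under some assumptions: the export has date header lines such as `2018/04/01(Sun)` followed by `time<TAB>name<TAB>text` message lines, and stickers and photos appear as placeholder strings. The header pattern and the `[Sticker]` / `[Photo]` markers are assumptions and vary with the app language, so adjust them to your own export.

```python
import re
from datetime import datetime

# Assumed layout: a date header line, then "HH:MM<TAB>name<TAB>text" lines.
DATE_RE = re.compile(r"^(\d{4})/(\d{1,2})/(\d{1,2})\(.*\)$")
MSG_RE = re.compile(r"^(\d{1,2}):(\d{2})\t([^\t]*)\t(.*)$")

# Hypothetical placeholder strings; the real ones depend on the app language.
TYPE_MARKERS = {"[Sticker]": "sticker", "[Photo]": "photo"}


def parse_export(lines):
    """Turn raw export lines into dicts with datetime, who, type, text."""
    messages = []
    current_date = None
    for raw in lines:
        line = raw.rstrip("\n")
        msg = MSG_RE.match(line)
        if msg and current_date is not None:
            hour, minute, who, text = msg.groups()
            year, month, day = current_date
            messages.append({
                "datetime": datetime(year, month, day, int(hour), int(minute)),
                "who": who,
                "type": TYPE_MARKERS.get(text.strip(), "text"),
                "text": text,
            })
            continue
        header = DATE_RE.match(line)
        if header:
            current_date = tuple(int(part) for part in header.groups())
            continue
        # Anything else is treated as the continuation of a multi-line
        # message and merged back into the previous one.
        if messages and line.strip():
            messages[-1]["text"] += "\n" + line
    return messages
```

Reading the file with `open(path, encoding="utf-8")` and passing the file object to `parse_export` is enough, and nothing here touches the network.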

Data size:

Item | Value
Span | ~8 years 2 months
Total messages | 87,621
Text messages | 66,329
Stickers | 15,605
Photos | 3,982

15,605 stickers… that's a lot.

Step | Model | What it does | What it sees
3. Tone | Japanese sentiment model (koheiduck/bert-japanese-finetuned-sentiment) | scores each message pos/neu/neg | 66k message texts (scores averaged per month)
4. Turning points | a bigger local model (Qwen2.5 72B) | guesses "what happened to these two?" | only the per-month table of counts + tone scores (no conversation, no words)

Both run locally on my own machine.
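The per-month table of counts and tone scores could be built along these lines. This is a sketch, not my actual code: the function names and the row layout (`month`, `count`, `tone`) are made up for illustration, and I assume each message has already been reduced to a timestamp plus an optional tone score (`None` for stickers and photos).

```python
from collections import defaultdict
from datetime import datetime


def months_since(start, when):
    """Whole calendar months between two datetimes (0 = starting month)."""
    return (when.year - start.year) * 12 + (when.month - start.month)


def monthly_table(scored):
    """scored: iterable of (datetime, tone) pairs; tone is a float or None."""
    scored = sorted(scored, key=lambda pair: pair[0])
    if not scored:
        return []
    start = scored[0][0]
    counts = defaultdict(int)
    tones = defaultdict(list)
    for when, tone in scored:
        idx = months_since(start, when)
        counts[idx] += 1
        if tone is not None:
            tones[idx].append(tone)
    rows = []
    # Walk every month in the span so silent months show up as count 0.
    for idx in range(max(counts) + 1):
        values = tones.get(idx, [])
        rows.append({
            "month": idx,
            "count": counts.get(idx, 0),
            "tone": sum(values) / len(values) if values else None,
        })
    return rows
```

Emitting a row for empty months matters here: a quiet stretch only becomes visible if the zeros are kept rather than skipped.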

This chart is the highlight. Top: monthly message count. Bottom: tone (up = positive, down = negative). The x-axis is months since the conversation started. (Axis labels are in Japanese.)

Plotted, it isn't a steady climb or a flat line — it splits cleanly into "chapters": ramp-up → an 8-month silence → a second peak → a stable plateau. Four phases, at a glance.

Tone has two peaks of about +0.6, around the start and around when things resumed (overall mean ≈ 0, slightly negative in the later years). The interesting part: in the month before the silence, tone had already dropped to −0.1. The mood dimmed before the volume did.

There are two dips into negative tone. The one before the silence was an "omen." The other is the recent years — not an omen, but the effect of logistics-y messages ("what time are you home?") piling up.

💡 Mini-note: how is "tone" turned into a number?

The scoring is done by a Japanese sentiment model. Roughly:

  • pre-trained on lots of Japanese text labeled positive / negative
  • judges with context, not just by spotting keywords
  • returns a probability of "positive-ness" / "negative-ness" per message
  • I used the difference as a per-message score
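The last bullet, in code form. This is a minimal sketch that assumes the classifier hands back three raw logits and that the score is P(positive) minus P(negative); the label order in `LABELS` is an assumption, so check the model's own label mapping before reusing it.

```python
import math

# Assumed label order; verify against the model's configuration.
LABELS = ("positive", "neutral", "negative")


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tone_score(probs):
    """Score in [-1, 1]: positive-ness minus negative-ness."""
    return probs["positive"] - probs["negative"]


def score_from_logits(logits, labels=LABELS):
    """Map one message's logits to a single tone score."""
    probs = dict(zip(labels, softmax(logits)))
    return tone_score(probs)
```

A confidently neutral message lands near 0 because both ends of the difference are small, which is why logistics-type messages pull a monthly average toward zero rather than upward.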

A few actual judgments (short, name- and place-free one-liners):

Message | Verdict
「楽しかったね!」 (that was fun!) | Positive
「これめちゃうまい」 (this is so good) | Positive
「おはようございます」 (good morning) | Neutral
「もうお家?」 (home already?) | Neutral
「全く集中できない」 (can't focus at all) | Negative
「それは悔しいな、、」 (that's frustrating…) | Negative
(a long trip-planning message) | Neutral
(a snappy one-liner sent in a huff) | Negative

Plain happy lines score positive; logistics ("good morning", "home already?") score neutral; tiredness or irritation scores negative. Even long, businesslike planning messages lean neutral.

Message density by weekday × hour (brighter = more).

A clear concentration at 7–9 a.m.!
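The grid behind a weekday × hour heatmap is just a 7 × 24 counter. A small sketch, with hypothetical function names, assuming the timestamps are already parsed into `datetime` objects:

```python
from datetime import datetime


def weekday_hour_grid(timestamps):
    """Count messages per (weekday, hour); weekday 0 is Monday."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in timestamps:
        grid[ts.weekday()][ts.hour] += 1
    return grid


def busiest_hours(grid, top=3):
    """Hours of the day ranked by total messages across all weekdays."""
    per_hour = [sum(grid[day][hour] for day in range(7)) for hour in range(24)]
    return sorted(range(24), key=lambda hour: per_hour[hour], reverse=True)[:top]
```

The grid can be handed to any plotting library as a 2D array; `busiest_hours` is only a quick numeric check of what the picture shows.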

First, the simple method: mechanically pick the points where message volume jumped or dropped, then check against the real timeline.

Real event | Auto-detected timing
When it started | exact match
When it went quiet | exact match
When it resumed | exact match
When it got lively again | a few months off
A big life milestone | hard to detect (barely shows in counts)
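One way the "simple method" could look: compare each month with the average of the few months before it and flag big ratios. This is an illustrative sketch, not the exact rule I ran. The window size and the `jump`, `drop`, and `min_volume` thresholds are hypothetical values to tune.

```python
def turning_points(counts, window=3, jump=3.0, drop=0.2, min_volume=30):
    """Flag months where volume jumps or collapses versus the recent average.

    counts: monthly message counts, index 0 = first month.
    Returns (month_index, "jump" | "drop") tuples, keeping only the first
    month of each consecutive run of the same kind.
    """
    # Treat the months before the chat existed as zero volume,
    # so the very first month can register as a jump.
    padded = [0] * window + list(counts)
    points = []
    last_kind = None
    for i in range(len(counts)):
        baseline = sum(padded[i:i + window]) / window
        now = padded[i + window]
        kind = None
        if now >= min_volume and now >= max(baseline, 1.0) * jump:
            kind = "jump"
        elif baseline >= min_volume and now <= baseline * drop:
            kind = "drop"
        if kind is not None and kind != last_kind:
            points.append((i, kind))
        last_kind = kind
    return points


monthly = [100, 120, 110, 5, 0, 0, 0, 90, 100]  # made-up counts
print(turning_points(monthly))
```

A rule like this only reacts to volume, which matches the table: sharp starts and stops are easy, while an event that barely changes the counts never crosses a threshold.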

Sharp volume changes were nailed. But "a big life milestone" got missed. So I showed the same numbers to the bigger local model and asked "what happened?" — and got back:

Rather than hunting for a single spike, it reads the whole sequence of numbers as a "flow," so it could pick up even an event that barely moves the counts.
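Handing "only the numbers" to the bigger model comes down to formatting the monthly table as text. A rough sketch: the prompt wording and function names are invented for illustration, and the row layout (`month`, `count`, `tone`) is an assumed format. How the text is sent to the local model depends on whatever runtime serves it, so that part is left out.

```python
def format_monthly_table(rows):
    """Render monthly rows as tab-separated text; no message content involved."""
    lines = ["month\tcount\ttone"]
    for row in rows:
        tone = "n/a" if row["tone"] is None else f"{row['tone']:+.2f}"
        lines.append(f"{row['month']}\t{row['count']}\t{tone}")
    return "\n".join(lines)


def build_prompt(rows):
    """Hypothetical prompt: numbers only, then an open question."""
    return (
        "Below is a per-month table for a chat between two people: "
        "months since the first message, message count, and mean tone "
        "(-1 = negative, +1 = positive). You cannot see any messages.\n"
        "What do you think happened to these two, and when?\n\n"
        + format_monthly_table(rows)
    )
```

Because the prompt is built from the aggregated rows alone, the model never receives a single word of the conversation.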

Counts and tone were enough to see the 8-year shape. Silence marks the quiet stretch; a surge marks the resumption — straight off the chart.

Given only monthly numbers, the model inferred even a barely-visible event ("something big around here"), and it lined up with reality. It connects scattered points into one flow.

The slight negative lean in later years isn't about getting along badly. Logistics messages ("what time are you home?") just don't score high. Low score ≠ trouble. It isn't that sentiment analysis is poor — the scores need to be read together with context.

  • Export format: each message line is time<TAB>name<TAB>text. Multi-line messages (4,987 of them) are merged back into the previous message.
  • Tone model: koheiduck/bert-japanese-finetuned-sentiment, a 3-class (pos / neu / neg) Japanese model.

Weather forecasts say one temperature, but everyone feels it differently. Same degrees, different "do I need a coat?" So next I'm building my own personal "weather officer" AI: from past weather data, it'll tell me each morning something like "coat + beanie today." Over the next 100 days I'll teach it my own sense of cold. That's the start of a longer project.
