# I Built an "Amazon-Style" AI Review Summarizer for Any Dataset (NLP, Transformers, Streamlit)

> Source: <https://dev.to/srihari_p_v/i-built-an-amazon-style-ai-review-summarizer-for-any-dataset-nlp-transformers-streamlit-1h7d>
> Published: 2026-06-18 01:30:00+00:00

Have you seen those new AI-generated review summaries on Amazon? They are incredibly useful for buyers, but there’s a catch: they are completely locked inside Amazon’s ecosystem.

If you are a developer, PM, or data scientist trying to analyze 5,000 scattered App Store reviews, Shopify comments, or Zendesk tickets, you are still stuck doing it manually or relying on basic word clouds.

I wanted to fix that. So, I built NEXUS 🧠—a production-grade Review Intelligence Engine that brings that exact "Amazon-style" AI analysis to any dataset.

Here is a deep dive into the architecture and how I put it together. 👇

## 🏗️ 1. The Deep Learning Baseline

Before jumping into massive pre-trained models, I wanted to establish a strong, custom baseline.

The Data: Trained on the Sentiment140 dataset (1.6 million labeled tweets).

The Architecture: I built a custom deep Bidirectional LSTM using TensorFlow/Keras. I used a 128-dim Embedding layer and stacked Bi-LSTMs so each token is read with context from both directions of the sequence.

Optimization: Used aggressive Dropout(0.5) layers and EarlyStopping on validation loss to halt training dynamically and restore the best weights, preventing overfitting.
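For readers who want to try a similar baseline, here is a minimal Keras sketch of the setup described above. Only the 128-dim embedding, `Dropout(0.5)`, and `EarlyStopping` on validation loss with restored best weights come from the post. The vocabulary size, LSTM unit counts, patience, and the names `BiLSTMConfig`, `build_model`, and `early_stopping` are assumptions for illustration, not the actual NEXUS code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiLSTMConfig:
    vocab_size: int = 20_000      # assumption: tokenizer vocabulary size
    embed_dim: int = 128          # from the post: 128-dim Embedding layer
    lstm_units: tuple = (64, 32)  # assumption: two stacked Bi-LSTM layers
    dropout: float = 0.5          # from the post: Dropout(0.5)


def build_model(cfg: BiLSTMConfig):
    # Imported lazily so the config can be inspected without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(layers.Embedding(input_dim=cfg.vocab_size, output_dim=cfg.embed_dim))
    for i, units in enumerate(cfg.lstm_units):
        is_last = i == len(cfg.lstm_units) - 1
        # Every Bi-LSTM except the last returns full sequences for the next layer.
        model.add(layers.Bidirectional(layers.LSTM(units, return_sequences=not is_last)))
        model.add(layers.Dropout(cfg.dropout))
    # Binary output, assuming Sentiment140 labels are remapped from 0/4 to 0/1.
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


def early_stopping(patience: int = 2):
    # Stops when validation loss stalls and rolls back to the best epoch's weights.
    from tensorflow import keras

    return keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=patience, restore_best_weights=True
    )
```

Training would then look roughly like `build_model(BiLSTMConfig()).fit(x_train, y_train, validation_split=0.1, callbacks=[early_stopping()])`, where `x_train` and `y_train` are placeholders for padded token IDs and 0/1 labels.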

## 🤖 2. The Transformer Inference Pipelines

To achieve zero-shot classification and granular emotional analysis in the live app, I loaded lightweight Hugging Face pipelines directly into memory:

Sentiment: DeBERTa-v3 for highly accurate Zero-Shot classification (Positive, Neutral, Negative).

Emotional Topography: RoBERTa-go_emotions to extract 28 micro-emotions, which I mapped to heuristic scores (Joy, Frustration, Urgency, Resolve).
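As a rough illustration of the two pipelines, the sketch below loads a zero-shot classifier and a GoEmotions classifier, then folds the per-label emotion scores into the four heuristic scores. The checkpoint names (`MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`, `SamLowe/roberta-base-go_emotions`) are public models picked as assumptions, since the post does not name its exact checkpoints. The label-to-bucket mapping and the helper names are also hypothetical.

```python
# Hypothetical grouping of GoEmotions labels into the four heuristic scores.
EMOTION_BUCKETS = {
    "Joy": {"joy", "amusement", "excitement", "love", "admiration", "gratitude"},
    "Frustration": {"anger", "annoyance", "disappointment", "disapproval", "disgust"},
    "Urgency": {"fear", "nervousness", "confusion", "desire"},
    "Resolve": {"approval", "optimism", "relief", "caring", "realization"},
}


def bucket_scores(label_scores):
    """Sum per-label probabilities into each bucket, capped at 1.0."""
    totals = {bucket: 0.0 for bucket in EMOTION_BUCKETS}
    for item in label_scores:
        for bucket, labels in EMOTION_BUCKETS.items():
            if item["label"] in labels:
                totals[bucket] += item["score"]
    return {bucket: min(1.0, round(total, 4)) for bucket, total in totals.items()}


def load_pipelines():
    # Imported lazily: transformers is heavy and downloads weights on first use.
    from transformers import pipeline

    sentiment = pipeline(
        "zero-shot-classification",
        model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",  # assumed checkpoint
    )
    emotions = pipeline(
        "text-classification",
        model="SamLowe/roberta-base-go_emotions",  # assumed checkpoint
    )
    return sentiment, emotions


def analyze(text, sentiment, emotions):
    # Zero-shot: the candidate labels are supplied at call time, not at training time.
    zs = sentiment(text, candidate_labels=["Positive", "Neutral", "Negative"])
    # top_k=None asks for scores on every emotion label instead of only the top one.
    per_label = emotions([text], top_k=None)[0]
    return {"sentiment": zs["labels"][0], "emotions": bucket_scores(per_label)}
```

Keeping `bucket_scores` separate from the model calls means the heuristic mapping can be tuned and unit-tested without loading any weights.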

## ⚙️ 3. The "Amazon-Style" Intelligence Engine

Here was the biggest challenge: generative summarization models (like DistilBART) consume a lot of RAM and are prone to hallucination.

Instead of relying purely on an LLM to write the summary, I wrote a deterministic Component-Impact Engine. It uses Regex and Pandas to chunk sentences, extract hardware/software components (battery, screen, software, ports), calculate the failure/praise rates of each, and dynamically synthesize a natural language summary.

The output? Exactly what engineering needs to see: "Customers heavily praise the screen and UI, but express deep frustration with the battery life."
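To make the idea concrete, here is a small sketch of a deterministic component-impact pass: split reviews into sentences, match component keywords, count praise and failure cues per component, and build the summary from a template. It uses plain dictionaries rather than Pandas to stay dependency-free, and every keyword list, threshold, and function name in it is an assumption for illustration, not the actual NEXUS rule set.

```python
import re
from collections import defaultdict

# Hypothetical keyword patterns; a real engine would need far broader coverage.
COMPONENTS = {
    "battery": re.compile(r"\b(battery|charge|charging)\b", re.I),
    "screen": re.compile(r"\b(screen|display)\b", re.I),
    "software": re.compile(r"\b(software|app|ui|update)\b", re.I),
    "ports": re.compile(r"\b(ports?|usb|hdmi)\b", re.I),
}
PRAISE = re.compile(r"\b(love|great|excellent|amazing|gorgeous)\b", re.I)
FAILURE = re.compile(r"\b(dies|broken|terrible|slow|crash(?:es|ed)?|awful|drains)\b", re.I)
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def component_impact(reviews):
    """Count mentions, praise cues, and failure cues per component."""
    stats = defaultdict(lambda: {"mentions": 0, "praise": 0, "failure": 0})
    for review in reviews:
        for sentence in SENTENCE_SPLIT.split(review.strip()):
            for name, pattern in COMPONENTS.items():
                if pattern.search(sentence):
                    entry = stats[name]
                    entry["mentions"] += 1
                    entry["praise"] += bool(PRAISE.search(sentence))
                    entry["failure"] += bool(FAILURE.search(sentence))
    return dict(stats)


def summarize(stats, threshold=0.5):
    """Fill a fixed template from praise and failure rates (no text generation)."""
    praised = sorted(n for n, s in stats.items() if s["praise"] / s["mentions"] >= threshold)
    failing = sorted(n for n, s in stats.items() if s["failure"] / s["mentions"] >= threshold)
    parts = []
    if praised:
        parts.append("Customers praise the " + " and ".join(praised))
    if failing:
        lead = "but express" if praised else "Customers express"
        parts.append(lead + " frustration with the " + " and ".join(failing))
    return ", ".join(parts) + "." if parts else "No dominant component feedback found."


reviews = [
    "The screen is gorgeous. Battery dies by noon.",
    "Great display! The battery drains quickly.",
]
print(summarize(component_impact(reviews)))
```

Because every sentence in the summary is traced back to counted matches, the output cannot mention a component that never appeared in the data, which is the main appeal over a free-form generated summary.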

## ✨ 4. The Frontend UX/UI

Streamlit is fantastic for Python devs, but out-of-the-box, it can look a bit generic. I wanted a premium, glossy feel. I injected hundreds of lines of custom CSS to override the default DOM, creating a "glassmorphism" aesthetic with animated micro-interactions, gradient borders, and custom Plotly charts.
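The usual way to do this kind of styling in Streamlit is to pass a `<style>` block to `st.markdown` with `unsafe_allow_html=True`. The sketch below shows that pattern with a tiny glassmorphism rule. The `.glass-card` class, the specific CSS values, and the helper names are assumptions for illustration; the real app uses far more CSS than this.

```python
def glass_css(blur_px: int = 12, radius_px: int = 16) -> str:
    """Build a small stylesheet for a frosted-glass card with a hover lift."""
    return f"""
<style>
.glass-card {{
    background: rgba(255, 255, 255, 0.08);
    backdrop-filter: blur({blur_px}px);
    border: 1px solid rgba(255, 255, 255, 0.18);
    border-radius: {radius_px}px;
    padding: 1rem;
    transition: transform 0.2s ease;
}}
.glass-card:hover {{
    transform: translateY(-2px);
}}
</style>
"""


def inject_css(css: str) -> None:
    # Imported lazily so the CSS builder can be used and tested without Streamlit.
    import streamlit as st

    # unsafe_allow_html is required, otherwise Streamlit escapes the <style> tag.
    st.markdown(css, unsafe_allow_html=True)
```

In an app script, calling `inject_css(glass_css())` once near the top and then rendering `st.markdown('<div class="glass-card">Battery: 62% failure rate</div>', unsafe_allow_html=True)` would show a styled card (the card text here is made up). Selectors that target Streamlit's own generated DOM can break between Streamlit releases, so custom classes like this one tend to be the more stable choice.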

NEXUS doesn't just say a review is "negative"—it tells the engineering team exactly what is breaking so they can push a fix faster.

I'd love to hear your thoughts! Have you experimented with DeBERTa vs. custom Bi-LSTMs for your own sentiment projects? Let's chat in the comments! 💬

Link: [https://sentimentanalyser-ucccl9ut869ugpmqid2ttg.streamlit.app/](https://sentimentanalyser-ucccl9ut869ugpmqid2ttg.streamlit.app/)
