What I Learned Building an Autonomous Deal-Hunting Agent System

wpnews.pro

Over the past week I built a multi-agent AI system that autonomously scans the internet for bargains, estimates the true value of products using three different pricing techniques, and pushes a notification straight to my phone the moment it finds a deal worth acting on. Along the way I picked up a ton of practical, transferable lessons about agentic AI architecture, RAG, fine-tuning vs. prompting, tool calling, and shipping a real (if scrappy) product with a Gradio front end.

This post is my write-up of the whole journey, with the code that made it work.

The system, nicknamed "The Price Is Right", is built around the idea that no single model is the best at everything. Instead of one giant prompt, the architecture splits the problem into focused agents that each do one job well, coordinated by a planning agent.

Here's how the five days of building this broke down.

The first lesson was about infrastructure: how do you run a model (especially a fine-tuned open-source LLM) without managing your own GPU server? The answer here was Modal, a serverless platform for running Python functions in the cloud — including on GPUs.

The "hello world" of Modal is refreshingly simple. You define an App, an Image (essentially a container spec with pip dependencies), and decorate a normal Python function with @app.function:

import modal
from modal import Image


app = modal.App("hello")
image = Image.debian_slim().pip_install("requests")


@app.function(image=image)
def hello() -> str:
    import requests

    response = requests.get("https://ipinfo.io/json")
    data = response.json()
    city, region, country = data["city"], data["region"], data["country"]
    return f"Hello from {city}, {region}, {country}!!"


@app.function(image=image, region="eu")
def hello_europe() -> str:
    import requests

    response = requests.get("https://ipinfo.io/json")
    data = response.json()
    city, region, country = data["city"], data["region"], data["country"]
    return f"Hello from {city}, {region}, {country}!!"

What I loved here is the region="eu" parameter: with one keyword argument you can pin where in the world your function actually executes, which matters for latency, data residency, and sometimes cost.

Calling it locally vs. remotely is just as simple from a notebook:

from hello import app, hello, hello_europe

with app.run():
    reply = hello.local()   # runs on your machine

with app.run():
    reply = hello.remote()  # runs on Modal's cloud

The next step up is running an actual language model. This is where Modal's GPU support and Secrets come in: you don't want to hardcode your Hugging Face token, so you register it once in Modal's dashboard under a name (e.g. huggingface-secret) and reference it in code:

import modal
from modal import Image


app = modal.App("llama")
image = Image.debian_slim().pip_install("torch", "transformers", "accelerate")
secrets = [modal.Secret.from_name("huggingface-secret")]
GPU = "T4"
MODEL_NAME = "meta-llama/Llama-3.2-3B"

@app.function(image=image, secrets=secrets, gpu=GPU, timeout=1800)
def generate(prompt: str) -> str:
    from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "right"

    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

    set_seed(42)
    inputs = tokenizer.encode(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(inputs, max_new_tokens=5)
    return tokenizer.decode(outputs[0])

A few things clicked for me here:

- gpu="T4" is all it takes to request a GPU for the function.
- timeout=1800 matters because the first call has to download and load model weights, and that cold start can take minutes.
- Everything heavy (transformers imports, tokenizer, model) is imported inside the function, so it loads in the remote container rather than on my machine.

The really important conceptual jump on Day 1 was going from an ephemeral app (with app.run(): ..., which spins up and tears down for a single call) to a deployed app:

uv run modal deploy -m pricer_service

Once deployed, the service runs independently of my notebook, and I can call it from anywhere just by referencing it by name:

import modal
Pricer = modal.Cls.from_name("pricer-service", "Pricer")
pricer = Pricer()
reply = pricer.price.remote("Quadcast HyperX condenser mic, connects via usb-c to your computer for crystal clear audio")
print(reply)

This is essentially how you'd put a fine-tuned model "behind an API" for a production system — and it's the foundation for the Specialist Agent, which wraps this exact deployed pricer.

There's also a nice optimization here: by default a Modal container scales down to zero when idle, so the first call after inactivity can take ~30 seconds to wake up. If you're willing to spend a few extra credits, you can keep a container warm:

import modal
Pricer = modal.Cls.from_name("pricer-service", "Pricer")
pricer = Pricer()
pricer.update_autoscaler(scaledown_window=1200)  # stay warm for 20 minutes

Takeaway: Modal turns "deploy a fine-tuned model as a microservice" into a one-line decorator and a one-line CLI command. The mental model — write a normal Python function, decorate it, deploy it, call it like a remote object — is something I'll reuse for any future "specialist model as a service" project.

Day 2 was about a different way to make a frontier model (GPT-5.1) better at a narrow task — Retrieval Augmented Generation (RAG) — and then about combining multiple pricing strategies into one.

The first ingredient is a local, open-source sentence embedding model, which turns text into a 384-dimensional vector capturing its meaning:

from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

vector = encoder.encode(["A proficient AI engineer who has almost reached the finale of AI Engineering Core Track!"])[0]
print(vector.shape)  # (384,)

These vectors get stored — along with the product description and metadata (category, price) — in a Chroma vector database, batched 1,000 items at a time across hundreds of thousands of products:

collection_name = "products"
existing_collection_names = [collection.name for collection in client.list_collections()]

if collection_name not in existing_collection_names:
    collection = client.create_collection(collection_name)
    for i in tqdm(range(0, len(train), 1000)):
        documents = [item.summary for item in train[i: i+1000]]
        vectors = encoder.encode(documents).astype(float).tolist()
        metadatas = [{"category": item.category, "price": item.price} for item in train[i: i+1000]]
        ids = [f"doc_{j}" for j in range(i, i+1000)]
        ids = ids[:len(documents)]
        collection.add(ids=ids, documents=documents, embeddings=vectors, metadatas=metadatas)

collection = client.get_or_create_collection(collection_name)
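The batching arithmetic above is easy to get subtly wrong on the final, short batch, which is why the ids list is trimmed to the number of documents. Here is a minimal sketch of the same slicing, using a hypothetical batched_ids helper and plain strings in place of real items (no Chroma or encoder involved):

```python
from typing import Iterator, List, Tuple


def batched_ids(items: List[str], batch_size: int = 1000) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (ids, documents) pairs, one pair per batch.

    Hypothetical helper: it mirrors the loop above but derives the ids from
    the actual slice length, so the last batch never gets surplus ids.
    """
    for start in range(0, len(items), batch_size):
        documents = items[start:start + batch_size]
        ids = [f"doc_{j}" for j in range(start, start + len(documents))]
        yield ids, documents


# 2,500 fake summaries split into batches of 1,000 / 1,000 / 500
fake_items = [f"summary {n}" for n in range(2500)]
batches = list(batched_ids(fake_items))
batch_sizes = [len(docs) for _, docs in batches]
```

Deriving the id range from the slice itself removes the need for the separate trimming step, at the cost of nothing but a len() call.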

One of the most satisfying moments was reducing those 384-dimensional vectors down to 3D with t-SNE and seeing the products cluster by category — electronics in one corner, musical instruments in another:

from sklearn.manifold import TSNE
import plotly.graph_objects as go

tsne = TSNE(n_components=3, random_state=42)
reduced_vectors = tsne.fit_transform(vectors)

fig = go.Figure(data=[go.Scatter3d(
    x=reduced_vectors[:, 0],
    y=reduced_vectors[:, 1],
    z=reduced_vectors[:, 2],
    mode='markers',
    marker=dict(size=2, color=colors, opacity=0.7),
    text=[f"Category: {c}<br>Text: {d[:50]}..." for c, d in zip(categories, documents)],
    hoverinfo='text'
)])

fig.update_layout(
    title='3D Chroma Vector Store Visualization',
    scene=dict(xaxis_title='x', yaxis_title='y', zaxis_title='z'),
    width=1200, height=800,
    margin=dict(r=20, b=10, l=10, t=40)
)
fig.show()

It's one thing to be told "embeddings capture semantic similarity" — it's another to literally see the same product categories form tight clusters in 3D space.

The actual RAG technique is simple once the vector store exists: for a new product, find its 5 nearest neighbours, and stuff their descriptions and prices into the prompt as context before asking GPT-5.1 to estimate the price:

def find_similars(item):
    vec = vector(item)
    results = collection.query(query_embeddings=vec.astype(float).tolist(), n_results=5)
    documents = results['documents'][0][:]
    prices = [m['price'] for m in results['metadatas'][0][:]]
    return documents, prices

def make_context(similars, prices):
    message = "For context, here are some other items that might be similar to the item you need to estimate.\n\n"
    for similar, price in zip(similars, prices):
        message += f"Potentially related product:\n{similar}\nPrice is ${price:.2f}\n\n"
    return message

def messages_for(item, similars, prices):
    message = f"Estimate the price of this product. Respond with the price, no explanation\n\n{item.summary}\n\n"
    message += make_context(similars, prices)
    return [{"role": "user", "content": message}]

def gpt_5__1_rag(item):
    documents, prices = find_similars(item)
    response = completion(model="gpt-5.1", messages=messages_for(item, documents, prices), reasoning_effort="none", seed=42)
    return response.choices[0].message.content

This became the heart of the Frontier Agent.
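The retrieval step that collection.query performs can be illustrated without Chroma. The sketch below is a brute-force stand-in, not Chroma's actual index: it ranks a handful of made-up 3-dimensional vectors by cosine similarity (an assumed metric for this illustration) and returns the closest matches. The helper names and the toy catalogue are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query: Sequence[float],
            catalogue: List[Tuple[str, float, Sequence[float]]],
            k: int = 2) -> List[Tuple[str, float]]:
    """Return the k (description, price) pairs most similar to the query."""
    ranked = sorted(catalogue,
                    key=lambda row: cosine_similarity(query, row[2]),
                    reverse=True)
    return [(description, price) for description, price, _ in ranked[:k]]


# Toy catalogue: (description, price, made-up embedding)
catalogue = [
    ("USB condenser microphone", 89.99, [0.9, 0.1, 0.0]),
    ("Studio XLR microphone", 129.00, [0.8, 0.2, 0.1]),
    ("Cordless power drill", 59.50, [0.0, 0.1, 0.9]),
]

similar = nearest([1.0, 0.1, 0.0], catalogue, k=2)
```

The two microphones rank ahead of the drill because their vectors point in nearly the same direction as the query; a real vector store does the same ranking over hundreds of thousands of 384-dimensional vectors with an index instead of a full sort.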

The biggest "aha" of Day 2 was realizing that three completely different approaches to the same problem — estimate a product's price — could be blended into something better than any one of them alone:

- gpt_5__1_rag: frontier model with retrieved context
- specialist: small model, fine-tuned specifically on this task
- deep_neural_network: the deep neural network pricer

import re

def get_price(reply):
    reply = reply.replace("$", "").replace(",", "")
    match = re.search(r"[-+]?\d*\.\d+|\d+", reply)
    return float(match.group()) if match else 0

def specialist(item):
    return pricer.price.remote(item.summary)

def ensemble(item):
    price1 = get_price(gpt_5__1_rag(item))
    price2 = specialist(item)
    price3 = deep_neural_network(item)
    return price1 * 0.8 + price2 * 0.1 + price3 * 0.1

The weighting (0.8 / 0.1 / 0.1) was chosen because, when evaluated against held-out test data, the RAG-based frontier model was the strongest individual predictor, but the other two still nudged the final estimate in a useful direction. This is essentially a tiny, hand-tuned weighted ensemble (loosely speaking, a mixture of experts with fixed weights), and it generalizes to most "estimate a number from text" problems: get a few independent estimators, then blend.
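To make the arithmetic concrete, here is a small worked sketch of the parse-then-blend step. The three input values are made up for illustration, and parse_price and blend are hypothetical helper names; only the 0.8 / 0.1 / 0.1 weights come from the project.

```python
import re
from typing import Sequence


def parse_price(reply: str) -> float:
    """Pull the first number out of a model reply such as '$1,299.99'."""
    cleaned = reply.replace("$", "").replace(",", "")
    match = re.search(r"[-+]?\d*\.\d+|\d+", cleaned)
    return float(match.group()) if match else 0.0


def blend(estimates: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of several estimates; the weights must sum to 1."""
    if len(estimates) != len(weights):
        raise ValueError("need one weight per estimate")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")
    return sum(e * w for e, w in zip(estimates, weights))


# Made-up outputs from the three estimators, for illustration only
frontier = parse_price("Price is $100.00")
specialist_estimate = 120.0
neural_network = 90.0

blended = blend([frontier, specialist_estimate, neural_network], [0.8, 0.1, 0.1])
```

With these inputs the blend works out to 0.8 × 100 + 0.1 × 120 + 0.1 × 90 = 101, so the two minor estimators move the frontier estimate by just one dollar.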

By the end of Day 2, all three of these had been wrapped into proper agent classes (FrontierAgent, NeuralNetworkAgent, and EnsembleAgent), each exposing a simple .price(description) method, ready to be called by higher-level orchestration.

Day 2 answered "given a deal, how much is it really worth?". Day 3 answered the question that has to come first: "where do the deals come from in the first place, and how do I find out about a good one without staring at a screen?"

The Scanner Agent subscribes to deal RSS feeds, scrapes the raw listings, and then asks a cheap LLM (openai/gpt-oss-20b:free via OpenRouter) to pick the 5 best-described deals, specifically ones where the price is unambiguous, since deal sites love phrases like "$50 off" which describe the discount, not the price.

The prompt design here was a small lesson in itself — being explicit about edge cases massively improves reliability:

SYSTEM_PROMPT = """You identify and summarize the 5 most detailed deals from a list, by selecting deals that have the most detailed, high quality description and the most clear price.
Respond strictly in JSON with no explanation, using this format. You should provide the price as a number derived from the description. If the price of a deal isn't clear, do not include that deal in your response.
Most important is that you respond with the 5 deals that have the most detailed product description with price. It's not important to mention the terms of the deal; most important is a thorough description of the product.
Be careful with products that are described as "$XXX off" or "reduced by $XXX" - this isn't the actual price of the product. Only respond with products when you are highly confident about the price. 
"""

USER_PROMPT_PREFIX = """Respond with the most promising 5 deals from this list, selecting those which have the most detailed, high quality product description and a clear price that is greater than 0.
You should rephrase the description to be a summary of the product itself, not the terms of the deal.
Remember to respond with a short paragraph of text in the product_description field for each of the 5 items that you select.
Be careful with products that are described as "$XXX off" or "reduced by $XXX" - this isn't the actual price of the product. Only respond with products when you are highly confident about the price. 

Deals:

"""

USER_PROMPT_SUFFIX = "\n\nInclude exactly 5 deals, no more."

Combined with structured output (Pydantic models via .chat.completions.parse(... response_format=DealSelection ...)), this guarantees the agent returns exactly the shape of data the rest of the pipeline expects, with no brittle JSON-parsing of free text required.
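The project relies on Pydantic models for this, and the DealSelection definition itself is not shown here. As a rough illustration of the shape being enforced, the sketch below is a standard-library stand-in built on dataclasses rather than Pydantic. The Deal fields match the ones used later in the post (product_description, price, url); the deals field name, the parse_selection helper, and the validation rules are assumptions.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Deal:
    product_description: str
    price: float
    url: str


@dataclass
class DealSelection:
    deals: List[Deal]  # assumed field name


def parse_selection(raw_json: str) -> DealSelection:
    """Parse a model reply and reject anything that is not the expected shape."""
    payload = json.loads(raw_json)
    deals = []
    for entry in payload["deals"]:
        deal = Deal(
            product_description=str(entry["product_description"]),
            price=float(entry["price"]),
            url=str(entry["url"]),
        )
        if deal.price <= 0:
            raise ValueError("price must be greater than 0")
        deals.append(deal)
    return DealSelection(deals=deals)


reply = ('{"deals": [{"product_description": "Example headphones", '
         '"price": 49.99, "url": "https://example.com/deal"}]}')
selection = parse_selection(reply)
```

A missing key raises KeyError and a non-positive price raises ValueError, so malformed replies fail loudly at the boundary instead of leaking into the pricing agents.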

The last piece of Day 3 was closing the loop with the outside world: push notifications. Pushover makes this almost embarrassingly easy — register an app, get a user key and an API token, and send a notification with a single HTTP POST:

import os
import requests

pushover_user = os.getenv('PUSHOVER_USER')
pushover_token = os.getenv('PUSHOVER_TOKEN')
pushover_url = "https://api.pushover.net/1/messages.json"

def push(message):
    print(f"Push: {message}")
    payload = {"user": pushover_user, "token": pushover_token, "message": message}
    requests.post(pushover_url, data=payload)

This got wrapped into a MessagingAgent with a .notify(description, deal_price, estimated_value, url) method, turning "we found a great deal" into "your phone buzzes."
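The body of notify is not shown, so the following is only a guess at what such a method might assemble before posting. The build_push_payload name, the message wording, and the 40-character truncation are all hypothetical; the user, token, and message fields are the ones used in the push function above, and the discount is assumed to be the estimate minus the deal price.

```python
def build_push_payload(user: str, token: str, description: str,
                       deal_price: float, estimated_value: float, url: str) -> dict:
    """Compose the form fields for one notification (no network call here)."""
    discount = estimated_value - deal_price  # assumed definition of discount
    message = (
        f"Deal! Price=${deal_price:.2f}, Estimate=${estimated_value:.2f}, "
        f"Discount=${discount:.2f}: {description[:40]}... {url}"
    )
    return {"user": user, "token": token, "message": message}


payload = build_push_payload(
    user="demo-user",
    token="demo-token",
    description="Example headphones with active noise cancelling and a carry case",
    deal_price=100.0,
    estimated_value=200.0,
    url="https://example.com/deal",
)
```

Keeping payload construction separate from the HTTP call makes the formatting easy to check without sending a real notification.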

Takeaway: Agentic systems feel magical, but a lot of the magic is just plumbing — RSS feeds in, structured LLM output, push notifications out. Getting the plumbing rock-solid (and the prompts very explicit about edge cases) is what makes the "intelligent" part trustworthy.

This was, for me, the most conceptually important day. Up to this point, every agent was called explicitly by my code: "now run the scanner," "now run the ensemble," "now send a notification." Day 4 flips that around — the LLM itself decides what to do and in what order, by calling tools.

Before wiring up the real agents, the notebook builds three fake functions just to understand the tool-calling loop:

def scan_the_internet_for_bargains() -> str:
    """ This tool scans the internet for great deals and gets a curated list of promising deals """
    print("Fake function to scan the internet - this returns a hardcoded set of deals")
    return test_results.model_dump_json()

def estimate_true_value(description: str) -> str:
    """
    This tool estimates the true value of a product based on a text description of it
    """
    print(f"Fake function to estimate true value of {description[:20]}... - this always returns $300")
    return f"Product {description} has an estimated true value of $300"

def notify_user_of_deal(description: str, deal_price: float, estimated_true_value: float, url: str) -> str:
    """
    This tool notifies the user of a great deal, given a description of it, the price of the deal, and the estimated true value
    """
    print(f"Fake function to notify user of {description} which costs {deal_price} and estimate is {estimated_true_value}")
    return "notification sent ok"

Each tool also needs a JSON Schema describing its name, description, and parameters — this is what actually gets sent to the LLM so it knows what's available and how to call it:

scan_function = {
    "name": "scan_the_internet_for_bargains",
    "description": "Returns top bargains scraped from the internet along with the price each item is being offered for",
    "parameters": {
        "type": "object",
        "properties": {},
        "required": [],
        "additionalProperties": False
    }
}

notify_function = {
    "name": "notify_user_of_deal",
    "description": "Send the user a push notification about the single most compelling deal; only call this one time",
    "parameters": {
        "type": "object",
        "properties": {
            "description": {"type": "string", "description": "The description of the item itself scraped from the internet"},
            "deal_price": {"type": "number", "description": "The price offered by this deal scraped from the internet"},
            "estimated_true_value": {"type": "number", "description": "The estimated actual value that this is worth"},
            "url": {"type": "string", "description": "The URL of this deal as scraped from the internet"}
        },
        "required": ["description", "deal_price", "estimated_true_value", "url"],
        "additionalProperties": False
    }
}

tools = [{"type": "function", "function": scan_function},
         {"type": "function", "function": estimate_function},
         {"type": "function", "function": notify_function}]
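The tools list references estimate_function, which is not shown above. A plausible definition, following the pattern of the other two schemas and the signature of estimate_true_value, might look like this; the description strings are my own wording, not taken from the project.

```python
import json

# Assumed schema for the estimate tool, modelled on scan_function and notify_function
estimate_function = {
    "name": "estimate_true_value",
    "description": "Given the description of an item, estimate how much it is actually worth",
    "parameters": {
        "type": "object",
        "properties": {
            "description": {
                "type": "string",
                "description": "The description of the item to be estimated",
            },
        },
        "required": ["description"],
        "additionalProperties": False,
    },
}

# The schema has to survive JSON serialization, since that is how it reaches the model
serialized = json.dumps({"type": "function", "function": estimate_function})
```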

The real magic is this loop. The LLM is given the tools and a goal; if it decides to call a tool, the code executes the real Python function and feeds the result back in — and this repeats until the model is satisfied:

import inspect
import json

def handle_tool_call(message):
    """
    Actually call the tools associated with this message
    """
    results = []
    for tool_call in message.tool_calls:
        tool_name = tool_call.function.name
        raw_args = json.loads(tool_call.function.arguments)
        tool = globals().get(tool_name)

        if tool:
            valid_params = set(inspect.signature(tool).parameters.keys())
            arguments = {k: v for k, v in raw_args.items() if k in valid_params}
            result = tool(**arguments)
        else:
            result = {}

        results.append({"role": "tool", "content": json.dumps(result), "tool_call_id": tool_call.id})
    return results

system_message = "You find great deals on bargain products using your tools, and notify the user of the best bargain."
user_message = """
First, use your tool to scan the internet for bargain deals. Then for each deal, use your tool to estimate its true value.
Then pick the single most compelling deal where the price is much lower than the estimated true value, and use your tool to notify the user.
Then just reply OK to indicate success.
"""
messages = [{"role": "system", "content": system_message}, {"role": "user", "content": user_message}]

done = False
while not done:
    response = openai.chat.completions.create(model=MODEL, messages=messages, tools=tools)
    if response.choices[0].finish_reason == "tool_calls":
        message = response.choices[0].message
        results = handle_tool_call(message)
        messages.append(message)
        messages.extend(results)
    else:
        done = True
response.choices[0].message.content
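To see the control flow without calling a real model, the loop can be exercised against a scripted stand-in. Everything in this sketch is hypothetical scaffolding: fake_model plays the role of the LLM, plain dicts replace the SDK's response objects, and an explicit TOOLS registry is used instead of globals(). The max_turns cap is an addition of mine, not part of the original loop.

```python
import json
from typing import Callable, Dict, List


def fake_model(messages: List[dict]) -> dict:
    """Scripted stand-in for the LLM: request one tool call, then finish."""
    already_called = any(m["role"] == "tool" for m in messages)
    if not already_called:
        return {
            "finish_reason": "tool_calls",
            "tool_calls": [{
                "id": "call_1",
                "name": "estimate_true_value",
                "arguments": json.dumps({"description": "USB microphone"}),
            }],
        }
    return {"finish_reason": "stop", "content": "OK"}


def estimate_true_value(description: str) -> str:
    return f"Product {description} has an estimated true value of $300"


TOOLS: Dict[str, Callable[..., str]] = {"estimate_true_value": estimate_true_value}


def run_loop(max_turns: int = 5) -> str:
    messages = [{"role": "user", "content": "Estimate the value of a USB microphone."}]
    for _ in range(max_turns):  # hard cap so a confused model cannot loop forever
        reply = fake_model(messages)
        if reply["finish_reason"] != "tool_calls":
            return reply["content"]
        for call in reply["tool_calls"]:
            result = TOOLS[call["name"]](**json.loads(call["arguments"]))
            messages.append({"role": "tool", "content": result, "tool_call_id": call["id"]})
    raise RuntimeError("model did not finish within max_turns")


final_answer = run_loop()
```

The first turn triggers the tool, the tool result is appended to the conversation, and the second turn ends the loop. The registry also means the model can only ever reach functions that were deliberately exposed.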

A subtlety that's easy to miss but really matters in practice: smaller, free-tier models sometimes hallucinate extra arguments in their tool calls (like an empty-string key "" for a function that takes no parameters at all). The fix, filtering raw_args down to only the parameters the function's signature actually accepts via inspect.signature, is the kind of defensive coding that's invisible until you're debugging a mysterious TypeError at 11pm.
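The filtering step is small enough to try in isolation. In this sketch the two tool functions are trimmed-down stand-ins and the malformed argument payloads are made up; filter_arguments is a hypothetical helper that wraps the same inspect.signature check used in handle_tool_call.

```python
import inspect
import json
from typing import Any, Callable, Dict


def filter_arguments(tool: Callable[..., Any], raw_args: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the keyword arguments that the tool's signature declares."""
    valid_params = set(inspect.signature(tool).parameters)
    return {name: value for name, value in raw_args.items() if name in valid_params}


def scan_the_internet_for_bargains() -> str:
    return "[]"


def estimate_true_value(description: str) -> str:
    return f"estimate for {description}"


# A made-up malformed call: an empty-string key for a function with no parameters
hallucinated = json.loads('{"": ""}')
safe_scan_args = filter_arguments(scan_the_internet_for_bargains, hallucinated)

# An extra, unexpected key alongside a valid one
noisy = {"description": "USB microphone", "currency": "USD"}
safe_estimate_args = filter_arguments(estimate_true_value, noisy)
result = estimate_true_value(**safe_estimate_args)
```

Without the filter, both calls would raise TypeError for an unexpected keyword argument. Note that the filter only drops extras; a call that is missing a required argument would still fail.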

Once the loop works with fake functions, the swap to the real AutonomousPlanningAgent is almost anticlimactic: same loop, same tool schemas, but scan_the_internet_for_bargains now really calls the ScannerAgent, estimate_true_value really calls the EnsembleAgent, and notify_user_of_deal really calls the MessagingAgent:

DB = "products_vectorstore"
client = chromadb.PersistentClient(path=DB)
collection = client.get_or_create_collection('products')

from agents.autonomous_planning_agent import AutonomousPlanningAgent
agent = AutonomousPlanningAgent(collection)
agent.plan()

Takeaway: Tool/function calling turns an LLM from "a thing that writes text" into "a thing that orchestrates other systems." The hard part isn't the API call — it's (a) writing tight descriptions so the model picks the right tool, and (b) writing tolerant glue code, because the model will occasionally send malformed arguments.

The final day was about productionizing: wrapping everything in a reusable framework with persistent memory, colored logs, and a Gradio dashboard that updates in real time.

DealAgentFramework is the top-level orchestrator. It owns the Chroma client, lazily creates the PlanningAgent, and, critically, persists discovered deals to memory.json so the system remembers what it's already found across restarts:

import os
import sys
import logging
import json
from typing import List
from dotenv import load_dotenv
import chromadb
from agents.planning_agent import PlanningAgent
from agents.deals import Opportunity
from sklearn.manifold import TSNE
import numpy as np

load_dotenv(override=True)

BG_BLUE = "\033[44m"
WHITE = "\033[37m"
RESET = "\033[0m"

CATEGORIES = [
    "Appliances",
    "Automotive",
    "Cell_Phones_and_Accessories",
    "Electronics",
    "Musical_Instruments",
    "Office_Products",
    "Tools_and_Home_Improvement",
    "Toys_and_Games",
]
COLORS = ["red", "blue", "brown", "orange", "yellow", "green", "purple", "cyan"]

def init_logging():
    root = logging.getLogger()
    root.setLevel(logging.INFO)

    handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(logging.INFO)
    formatter = logging.Formatter(
        "[%(asctime)s] [Agents] [%(levelname)s] %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S %z",
    )
    handler.setFormatter(formatter)
    root.addHandler(handler)

class DealAgentFramework:
    DB = "products_vectorstore"
    MEMORY_FILENAME = "memory.json"

    def __init__(self):
        init_logging()
        client = chromadb.PersistentClient(path=self.DB)
        self.memory = self.read_memory()
        self.collection = client.get_or_create_collection("products")
        self.planner = None

    def init_agents_as_needed(self):
        if not self.planner:
            self.log("Initializing Agent Framework")
            self.planner = PlanningAgent(self.collection)
            self.log("Agent Framework is ready")

    def read_memory(self) -> List[Opportunity]:
        if os.path.exists(self.MEMORY_FILENAME):
            with open(self.MEMORY_FILENAME, "r") as file:
                data = json.load(file)
            opportunities = [Opportunity(**item) for item in data]
            return opportunities
        return []

    def write_memory(self) -> None:
        data = [opportunity.model_dump() for opportunity in self.memory]
        with open(self.MEMORY_FILENAME, "w") as file:
            json.dump(data, file, indent=2)

    @classmethod
    def reset_memory(cls) -> None:
        data = []
        if os.path.exists(cls.MEMORY_FILENAME):
            with open(cls.MEMORY_FILENAME, "r") as file:
                data = json.load(file)
        truncated = data[:2]
        with open(cls.MEMORY_FILENAME, "w") as file:
            json.dump(truncated, file, indent=2)

    def log(self, message: str):
        text = BG_BLUE + WHITE + "[Agent Framework] " + message + RESET
        logging.info(text)

    def run(self) -> List[Opportunity]:
        self.init_agents_as_needed()
        logging.info("Kicking off Planning Agent")
        result = self.planner.plan(memory=self.memory)
        logging.info(f"Planning Agent has completed and returned: {result}")
        if result:
            self.memory.append(result)
            self.write_memory()
        return self.memory

    @classmethod
    def get_plot_data(cls, max_datapoints=2000):
        client = chromadb.PersistentClient(path=cls.DB)
        collection = client.get_or_create_collection("products")
        result = collection.get(
            include=["embeddings", "documents", "metadatas"], limit=max_datapoints
        )
        vectors = np.array(result["embeddings"])
        documents = result["documents"]
        categories = [metadata["category"] for metadata in result["metadatas"]]
        colors = [COLORS[CATEGORIES.index(c)] for c in categories]
        tsne = TSNE(n_components=3, random_state=42, n_jobs=-1)
        reduced_vectors = tsne.fit_transform(vectors)
        return documents, reduced_vectors, colors

if __name__ == "__main__":
    DealAgentFramework().run()

A few patterns I want to remember from this file:

- Lazy initialization (init_agents_as_needed): spinning up the full agent stack (which includes loading models and connecting to vector stores) is expensive, so it only happens once, on first use.
- memory.json is literally a list of Opportunity objects (a deal + an estimated value + a discount), serialized via Pydantic's model_dump().
- reset_memory as a classmethod: it trims memory.json back to its first two entries without needing a framework instance.
- ANSI color constants (BG_BLUE, WHITE, RESET): a small touch, but it makes the live log stream from multiple agents much easier to follow.

Speaking of colors, the terminal uses ANSI escape codes, but the Gradio UI renders HTML. log_utils.py is a tiny but clever bridge between the two: it maps each ANSI color combination to a CSS hex color and swaps the escape codes for <span style="color: ..."> tags:

RED = '\033[31m'
GREEN = '\033[32m'
YELLOW = '\033[33m'
BLUE = '\033[34m'
MAGENTA = '\033[35m'
CYAN = '\033[36m'
WHITE = '\033[37m'

BG_BLACK = '\033[40m'
BG_BLUE = '\033[44m'

RESET = '\033[0m'

mapper = {
    BG_BLACK+RED: "#dd0000",
    BG_BLACK+GREEN: "#00dd00",
    BG_BLACK+YELLOW: "#dddd00",
    BG_BLACK+BLUE: "#0000ee",
    BG_BLACK+MAGENTA: "#aa00dd",
    BG_BLACK+CYAN: "#00dddd",
    BG_BLACK+WHITE: "#87CEEB",
    BG_BLUE+WHITE: "#ff7800"
}

def reformat(message):
    for key, value in mapper.items():
        message = message.replace(key, f'<span style="color: {value}">')
    message = message.replace(RESET, '</span>')
    return message

Every agent in the system logs its activity with a different color (set in its own __init__), so when this gets rendered in the browser, you can visually tell at a glance which agent is talking (the planner, the scanner, the frontier agent, etc.) without reading a single word.
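One thing reformat does not do is escape the message text itself, so a log line containing < or & would be interpreted as markup by the browser. A possible hardening, sketched here as a hypothetical reformat_escaped variant with a cut-down mapper, is to escape first and convert the color codes afterwards:

```python
import html

BG_BLUE = "\033[44m"
WHITE = "\033[37m"
RESET = "\033[0m"

# Cut-down mapper with a single colour pair, enough for the demonstration
mapper = {BG_BLUE + WHITE: "#ff7800"}


def reformat_escaped(message: str) -> str:
    """Escape HTML-sensitive characters, then swap ANSI codes for span tags."""
    # html.escape only touches &, < and > here, so the ANSI codes pass through intact
    message = html.escape(message, quote=False)
    for code, colour in mapper.items():
        message = message.replace(code, f'<span style="color: {colour}">')
    return message.replace(RESET, "</span>")


line = BG_BLUE + WHITE + "[Agent Framework] price < estimate" + RESET
rendered = reformat_escaped(line)
```

The order matters: escaping after the replacement would mangle the span tags that were just inserted.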

The UI was built up in layers, which I think is a great way to learn Gradio:

Layer 1 — just get something on screen:

with gr.Blocks(title="The Price is Right", fill_width=True) as ui:
    with gr.Row():
        gr.Markdown('<div style="text-align: center;font-size:24px">The Price is Right - Deal Hunting Agentic AI</div>')
    with gr.Row():
        gr.Markdown('<div style="text-align: center;font-size:14px">Autonomous agent framework that finds online deals, collaborating with a proprietary fine-tuned LLM deployed on Modal, and a RAG pipeline with a frontier model and Chroma.</div>')

ui.launch(inbrowser=True)

Layer 2 — add a live data table backed by application state:

with gr.Blocks(title="The Price is Right", fill_width=True) as ui:

    initial_deal = Deal(product_description="Example description", price=100.0, url="https://cnn.com")
    initial_opportunity = Opportunity(deal=initial_deal, estimate=200.0, discount=100.0)
    opportunities = gr.State([initial_opportunity])

    def get_table(opps):
        return [[opp.deal.product_description, opp.deal.price, opp.estimate, opp.discount, opp.deal.url] for opp in opps]

    with gr.Row():
        opportunities_dataframe = gr.Dataframe(
            headers=["Description", "Price", "Estimate", "Discount", "URL"],
            wrap=True,
            column_widths=[4, 1, 1, 1, 2],
            row_count=10,
            col_count=5,
            max_height=400,
        )

    ui.load(get_table, inputs=[opportunities], outputs=[opportunities_dataframe])

ui.launch(inbrowser=True)

A small but important Gradio version note from this layer: in Gradio v5, the height parameter for Dataframe was renamed to max_height, exactly the kind of breaking change that's easy to lose an hour to if you don't know to look for it.

Layer 3 — wire up real agents and make rows clickable:

agent_framework = DealAgentFramework()
agent_framework.init_agents_as_needed()

with gr.Blocks(title="The Price is Right", fill_width=True) as ui:
    ...
    def do_select(opportunities, selected_index: gr.SelectData):
        row = selected_index.index[0]
        opportunity = opportunities[row]
        agent_framework.planner.messenger.alert(opportunity)
    ...
    opportunities_dataframe.select(do_select, inputs=[opportunities], outputs=[])

ui.launch(inbrowser=True)

The fully assembled price_is_right.py brings everything together: a background thread runs the agent framework's run() loop, a queue.Queue-based logging handler streams log lines into the UI in (near) real time, and a 3D Plotly visualization of the product vector store sits alongside the deal table:

import logging
import queue
import threading
import time
import gradio as gr
from deal_agent_framework import DealAgentFramework
from log_utils import reformat
import plotly.graph_objects as go
from dotenv import load_dotenv

load_dotenv(override=True)

class QueueHandler(logging.Handler):
    def __init__(self, log_queue):
        super().__init__()
        self.log_queue = log_queue

    def emit(self, record):
        self.log_queue.put(self.format(record))

def html_for(log_data):
    output = "<br>".join(log_data[-18:])
    return f"""
    <div id="scrollContent" style="height: 400px; overflow-y: auto; border: 1px solid #ccc; background-color: #222229; padding: 10px;">
    {output}
    </div>
    """

def setup_logging(log_queue):
    handler = QueueHandler(log_queue)
    formatter = logging.Formatter(
        "[%(asctime)s] %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S %z",
    )
    handler.setFormatter(formatter)
    logger = logging.getLogger()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

class App:
    def __init__(self):
        self.agent_framework = None

    def get_agent_framework(self):
        if not self.agent_framework:
            self.agent_framework = DealAgentFramework()
        return self.agent_framework

    def run(self):
        with gr.Blocks(title="The Price is Right", fill_width=True) as ui:
            log_data = gr.State([])

            def table_for(opps):
                return [
                    [
                        opp.deal.product_description,
                        f"${opp.deal.price:.2f}",
                        f"${opp.estimate:.2f}",
                        f"${opp.discount:.2f}",
                        opp.deal.url,
                    ]
                    for opp in opps
                ]

            def update_output(log_data, log_queue, result_queue):
                initial_result = table_for(self.get_agent_framework().memory)
                final_result = None
                while True:
                    try:
                        message = log_queue.get_nowait()
                        log_data.append(reformat(message))
                        yield log_data, html_for(log_data), final_result or initial_result
                    except queue.Empty:
                        try:
                            final_result = result_queue.get_nowait()
                            yield log_data, html_for(log_data), final_result or initial_result
                        except queue.Empty:
                            if final_result is not None:
                                break
                            time.sleep(0.1)

            def get_plot():
                documents, vectors, colors = DealAgentFramework.get_plot_data(max_datapoints=800)
                fig = go.Figure(
                    data=[
                        go.Scatter3d(
                            x=vectors[:, 0],
                            y=vectors[:, 1],
                            z=vectors[:, 2],
                            mode="markers",
                            marker=dict(size=2, color=colors, opacity=0.7),
                        )
                    ]
                )
                fig.update_layout(
                    scene=dict(
                        xaxis_title="x", yaxis_title="y", zaxis_title="z",
                        aspectmode="manual",
                        aspectratio=dict(x=2.2, y=2.2, z=1),
                        camera=dict(eye=dict(x=1.6, y=1.6, z=0.8)),
                    ),
                    height=400,
                    margin=dict(r=5, b=1, l=5, t=2),
                )
                return fig

            def do_run():
                new_opportunities = self.get_agent_framework().run()
                return table_for(new_opportunities)

            def run_with_logging(initial_log_data):
                log_queue = queue.Queue()
                result_queue = queue.Queue()
                setup_logging(log_queue)

                def worker():
                    result_queue.put(do_run())

                thread = threading.Thread(target=worker)
                thread.start()

                for log_data, output, final_result in update_output(initial_log_data, log_queue, result_queue):
                    yield log_data, output, final_result

            def do_select(selected_index: gr.SelectData):
                opportunities = self.get_agent_framework().memory
                row = selected_index.index[0]
                opportunity = opportunities[row]
                self.get_agent_framework().planner.messenger.alert(opportunity)

            with gr.Row():
                opportunities_dataframe = gr.Dataframe(
                    headers=["Deals found so far", "Price", "Estimate", "Discount", "URL"],
                    wrap=True, column_widths=[6, 1, 1, 1, 3],
                    row_count=10, col_count=5, max_height=400,
                )
            with gr.Row():
                with gr.Column(scale=1):
                    logs = gr.HTML()
                with gr.Column(scale=1):
                    plot = gr.Plot(value=get_plot(), show_label=False)

            ui.load(run_with_logging, inputs=[log_data], outputs=[log_data, logs, opportunities_dataframe])

            timer = gr.Timer(value=300, active=True)
            timer.tick(run_with_logging, inputs=[log_data], outputs=[log_data, logs, opportunities_dataframe])

            opportunities_dataframe.select(do_select)

        ui.launch(share=False, inbrowser=True)

if __name__ == "__main__":
    App().run()
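Stripped of Gradio, the streaming mechanism in that file comes down to three parts: a logging handler that writes to a queue, a worker thread, and a generator that drains the queue. The sketch below shows that skeleton under my own assumptions: stream_logs and fake_agent_run are hypothetical names, and removing the handler at the end is an addition that the original setup_logging does not make.

```python
import logging
import queue
import threading
from typing import Callable, Iterator, List


class QueueHandler(logging.Handler):
    """Logging handler that pushes formatted records onto a queue."""

    def __init__(self, log_queue: queue.Queue) -> None:
        super().__init__()
        self.log_queue = log_queue

    def emit(self, record: logging.LogRecord) -> None:
        self.log_queue.put(self.format(record))


def stream_logs(work: Callable[[], None], logger: logging.Logger) -> Iterator[List[str]]:
    """Run work() in a thread and yield the growing list of log lines."""
    log_queue: queue.Queue = queue.Queue()
    handler = QueueHandler(log_queue)
    logger.addHandler(handler)
    thread = threading.Thread(target=work)
    thread.start()
    lines: List[str] = []
    try:
        while thread.is_alive() or not log_queue.empty():
            try:
                lines.append(log_queue.get(timeout=0.1))
                yield list(lines)  # each yield would be one UI refresh
            except queue.Empty:
                continue
    finally:
        thread.join()
        logger.removeHandler(handler)  # avoid stacking handlers across runs


demo_logger = logging.getLogger("stream_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo quiet on the console


def fake_agent_run() -> None:
    for step in ("scanning", "estimating", "notifying"):
        demo_logger.info(step)


snapshots = list(stream_logs(fake_agent_run, demo_logger))
```

Each snapshot is the full log so far, which is what a Gradio generator callback would hand back to the HTML component on every yield.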

The two patterns I most want to carry forward from this file:

- run_with_logging is a Python generator that yields updated state, so the UI refreshes live while a slow agentic process runs, instead of freezing for the whole duration.
- gr.Timer for autonomous operation: a Timer set to 300 seconds means the whole "scan → estimate → notify" cycle re-runs automatically every 5 minutes, turning a notebook experiment into something that genuinely behaves like a background agent.

A few cross-cutting lessons that apply far beyond this specific project:

- Blending the three estimators (0.8 / 0.1 / 0.1) outperformed any single one on the held-out test set.
- Persisting a List[Opportunity] to memory.json was enough to give the system continuity across restarts.

Putting it all together, DealAgentFramework().run() now quietly scans deal feeds, filters to the 5 best-described deals, estimates each one's true value via an ensemble of three models, picks the single best opportunity, saves it to memory, and (if it's a great deal) buzzes my phone. All while a live dashboard shows exactly what's happening and why.

source & further reading

dev.to (original article)
