{"slug": "what-i-learned-building-an-autonomous-deal-hunting-agent-system", "title": "What I Learned Building an Autonomous Deal-Hunting Agent System", "summary": "A developer built a multi-agent AI system called 'The Price Is Right' that autonomously scans the internet for bargains, estimates product value using three pricing techniques, and sends notifications. The system uses Modal for serverless GPU infrastructure and splits tasks among specialized agents coordinated by a planning agent.", "body_md": "Over the past week I built a multi-agent AI system that autonomously scans the internet for bargains, estimates the *true* value of products using three different pricing techniques, and pushes a notification straight to my phone the moment it finds a deal worth acting on. Along the way I picked up a ton of practical, transferable lessons about agentic AI architecture, RAG, fine-tuning vs. prompting, tool calling, and shipping a real (if scrappy) product with a Gradio front end.\n\nThis post is my write-up of the whole journey, with the code that made it work.\n\nThe system, nicknamed **\"The Price Is Right\"**, is built around the idea that no single model is the best at everything. Instead of one giant prompt, the architecture splits the problem into focused agents that each do one job well, coordinated by a planning agent:\n\nHere's how the five days of building this broke down.\n\nThe first lesson was about **infrastructure**: how do you run a model (especially a fine-tuned open-source LLM) without managing your own GPU server? The answer here was [Modal](https://modal.com), a serverless platform for running Python functions in the cloud — including on GPUs.\n\nThe \"hello world\" of Modal is refreshingly simple. 
You define an `App`, an `Image` (essentially a container spec with pip dependencies), and decorate a normal Python function with `@app.function`:\n\n``` python\n# hello.py\nimport modal\nfrom modal import Image\n\n# Setup\n\napp = modal.App(\"hello\")\nimage = Image.debian_slim().pip_install(\"requests\")\n\n# Hello!\n\n@app.function(image=image)\ndef hello() -> str:\n    import requests\n\n    response = requests.get(\"https://ipinfo.io/json\")\n    data = response.json()\n    city, region, country = data[\"city\"], data[\"region\"], data[\"country\"]\n    return f\"Hello from {city}, {region}, {country}!!\"\n\n# New - added thanks to student Tue H.!\n\n@app.function(image=image, region=\"eu\")\ndef hello_europe() -> str:\n    import requests\n\n    response = requests.get(\"https://ipinfo.io/json\")\n    data = response.json()\n    city, region, country = data[\"city\"], data[\"region\"], data[\"country\"]\n    return f\"Hello from {city}, {region}, {country}!!\"\n```\n\nWhat I loved here is the `region=\"eu\"` parameter: with one keyword argument you can pin where in the world your function actually executes, which matters for latency, data residency, and sometimes cost.\n\nCalling it locally vs. remotely is just as simple from a notebook:\n\n``` python\nfrom hello import app, hello, hello_europe\n\nwith app.run():\n    reply = hello.local()   # runs on your machine\n\nwith app.run():\n    reply = hello.remote()  # runs on Modal's cloud\n```\n\nThe next step up is running an actual language model. This is where Modal's GPU support and **Secrets** come in: you don't want to hardcode your Hugging Face token, so you register it once in Modal's dashboard under a name (e.g. 
`huggingface-secret`) and reference it in code:\n\n``` python\n# llama.py\nimport modal\nfrom modal import Image\n\n# Setup\n\napp = modal.App(\"llama\")\nimage = Image.debian_slim().pip_install(\"torch\", \"transformers\", \"accelerate\")\nsecrets = [modal.Secret.from_name(\"huggingface-secret\")]\nGPU = \"T4\"\nMODEL_NAME = \"meta-llama/Llama-3.2-3B\"\n\n@app.function(image=image, secrets=secrets, gpu=GPU, timeout=1800)\ndef generate(prompt: str) -> str:\n    from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed\n\n    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)\n    tokenizer.pad_token = tokenizer.eos_token\n    tokenizer.padding_side = \"right\"\n\n    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map=\"auto\")\n\n    set_seed(42)\n    inputs = tokenizer.encode(prompt, return_tensors=\"pt\").to(\"cuda\")\n    outputs = model.generate(inputs, max_new_tokens=5)\n    return tokenizer.decode(outputs[0])\n```\n\nA few things clicked for me here:\n\n- `gpu=\"T4\"` is all it takes to request a T4 GPU for the function.\n- `timeout=1800` matters because the first call has to download and load model weights, and that cold start can take minutes.\n- Everything heavy (`transformers` imports, tokenizer, model) is imported and loaded inside the function body, so it runs in the remote container rather than on my machine.\n\nThe really important conceptual jump on Day 1 was going from an **ephemeral app** (`with app.run(): ...`, which spins up and tears down for a single call) to a **deployed app**:\n\n```\nuv run modal deploy -m pricer_service\n```\n\nOnce deployed, the service runs independently of my notebook, and I can call it from anywhere just by referencing it by name:\n\n``` python\nimport modal\nPricer = modal.Cls.from_name(\"pricer-service\", \"Pricer\")\npricer = Pricer()\nreply = pricer.price.remote(\"Quadcast HyperX condenser mic, connects via usb-c to your computer for crystal clear audio\")\nprint(reply)\n```\n\nThis is essentially how you'd put a fine-tuned model \"behind an API\" for a production system, and it's the foundation for the **Specialist Agent**, which wraps this 
exact deployed pricer.\n\nThere's also a nice optimization here: by default a Modal container scales down to zero when idle, so the *first* call after inactivity can take ~30 seconds to wake up. If you're willing to spend a few extra credits, you can keep a container warm:\n\n``` python\nimport modal\nPricer = modal.Cls.from_name(\"pricer-service\", \"Pricer\")\npricer = Pricer()\npricer.update_autoscaler(scaledown_window=1200)  # stay warm for 20 minutes\n```\n\n**Takeaway:** Modal turns \"deploy a fine-tuned model as a microservice\" into a one-line decorator and a one-line CLI command. The mental model — *write a normal Python function, decorate it, deploy it, call it like a remote object* — is something I'll reuse for any future \"specialist model as a service\" project.\n\nDay 2 was about a different way to make a frontier model (GPT-5.1) better at a narrow task — **Retrieval Augmented Generation (RAG)** — and then about combining *multiple* pricing strategies into one.\n\nThe first ingredient is a local, open-source **sentence embedding model**, which turns text into a 384-dimensional vector capturing its meaning:\n\n``` python\nfrom sentence_transformers import SentenceTransformer\n\nencoder = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')\n\n# Pass in a list of texts, get back a numpy array of vectors\nvector = encoder.encode([\"A proficient AI engineer who has almost reached the finale of AI Engineering Core Track!\"])[0]\nprint(vector.shape)  # (384,)\n```\n\nThese vectors get stored — along with the product description and metadata (category, price) — in a **Chroma** vector database, batched 1,000 items at a time across hundreds of thousands of products:\n\n```\ncollection_name = \"products\"\nexisting_collection_names = [collection.name for collection in client.list_collections()]\n\nif collection_name not in existing_collection_names:\n    collection = client.create_collection(collection_name)\n    for i in tqdm(range(0, len(train), 
1000)):\n        documents = [item.summary for item in train[i: i+1000]]\n        vectors = encoder.encode(documents).astype(float).tolist()\n        metadatas = [{\"category\": item.category, \"price\": item.price} for item in train[i: i+1000]]\n        ids = [f\"doc_{j}\" for j in range(i, i+1000)]\n        ids = ids[:len(documents)]\n        collection.add(ids=ids, documents=documents, embeddings=vectors, metadatas=metadatas)\n\ncollection = client.get_or_create_collection(collection_name)\n```\n\nOne of the most satisfying moments was reducing those 384-dimensional vectors down to 3D with **t-SNE** and seeing the products cluster by category — electronics in one corner, musical instruments in another:\n\n``` python\nfrom sklearn.manifold import TSNE\nimport plotly.graph_objects as go\n\ntsne = TSNE(n_components=3, random_state=42)\nreduced_vectors = tsne.fit_transform(vectors)\n\nfig = go.Figure(data=[go.Scatter3d(\n    x=reduced_vectors[:, 0],\n    y=reduced_vectors[:, 1],\n    z=reduced_vectors[:, 2],\n    mode='markers',\n    marker=dict(size=2, color=colors, opacity=0.7),\n    text=[f\"Category: {c}<br>Text: {d[:50]}...\" for c, d in zip(categories, documents)],\n    hoverinfo='text'\n)])\n\nfig.update_layout(\n    title='3D Chroma Vector Store Visualization',\n    scene=dict(xaxis_title='x', yaxis_title='y', zaxis_title='z'),\n    width=1200, height=800,\n    margin=dict(r=20, b=10, l=10, t=40)\n)\nfig.show()\n```\n\nIt's one thing to be told \"embeddings capture semantic similarity\" — it's another to literally *see* the same product categories form tight clusters in 3D space.\n\nThe actual RAG technique is simple once the vector store exists: for a new product, find its 5 nearest neighbours, and stuff their descriptions *and prices* into the prompt as context before asking GPT-5.1 to estimate the price:\n\n``` python\ndef find_similars(item):\n    vec = vector(item)\n    results = collection.query(query_embeddings=vec.astype(float).tolist(), 
n_results=5)\n    documents = results['documents'][0][:]\n    prices = [m['price'] for m in results['metadatas'][0][:]]\n    return documents, prices\n\ndef make_context(similars, prices):\n    message = \"For context, here are some other items that might be similar to the item you need to estimate.\\n\\n\"\n    for similar, price in zip(similars, prices):\n        message += f\"Potentially related product:\\n{similar}\\nPrice is ${price:.2f}\\n\\n\"\n    return message\n\ndef messages_for(item, similars, prices):\n    message = f\"Estimate the price of this product. Respond with the price, no explanation\\n\\n{item.summary}\\n\\n\"\n    message += make_context(similars, prices)\n    return [{\"role\": \"user\", \"content\": message}]\n\ndef gpt_5__1_rag(item):\n    documents, prices = find_similars(item)\n    response = completion(model=\"gpt-5.1\", messages=messages_for(item, documents, prices), reasoning_effort=\"none\", seed=42)\n    return response.choices[0].message.content\n```\n\nThis became the heart of the **Frontier Agent**.\n\nThe biggest \"aha\" of Day 2 was realizing that three completely different approaches to the *same* problem (estimate a product's price) could be **blended** into something better than any one of them alone:\n\n- RAG (`gpt_5__1_rag`): frontier model with retrieved context\n- Specialist (`specialist`): small model, fine-tuned specifically on this task\n- Deep neural network (`deep_neural_network`)\n\n``` python\nimport re\n\ndef get_price(reply):\n    # Strip currency formatting, then pull out the first number in the reply\n    reply = reply.replace(\"$\", \"\").replace(\",\", \"\")\n    match = re.search(r\"[-+]?\\d*\\.\\d+|\\d+\", reply)\n    return float(match.group()) if match else 0\n\ndef specialist(item):\n    return pricer.price.remote(item.summary)\n\ndef ensemble(item):\n    price1 = get_price(gpt_5__1_rag(item))\n    price2 = specialist(item)\n    price3 = deep_neural_network(item)\n    return price1 * 0.8 + price2 * 0.1 + price3 * 0.1\n```\n\nThe weighting (`0.8 / 0.1 / 0.1`) was chosen because, when evaluated against held-out test 
data, the RAG-based frontier model was the strongest individual predictor, but the other two still nudged the final estimate in a useful direction. This is essentially a tiny, hand-tuned **weighted ensemble** (a fixed-weight cousin of a mixture-of-experts), and it generalizes to most \"estimate a number from text\" problems: get a few independent estimators, then blend.\n\nBy the end of Day 2, all three of these had been wrapped into proper agent classes (`FrontierAgent`, `NeuralNetworkAgent`, and `EnsembleAgent`), each exposing a simple `.price(description)` method, ready to be called by higher-level orchestration.\n\nDay 2 answered *\"given a deal, how much is it really worth?\"*. Day 3 answered the question that has to come first: *\"where do the deals come from in the first place, and how do I find out about a good one without staring at a screen?\"*\n\nThe **Scanner Agent** subscribes to deal RSS feeds, scrapes the raw listings, and then asks a cheap LLM (`openai/gpt-oss-20b:free` via OpenRouter) to pick the 5 *best-described* deals, specifically ones where the price is unambiguous, since deal sites love phrases like \"$50 off\" which describe the *discount*, not the *price*.\n\nThe prompt design here was a small lesson in itself: being explicit about edge cases makes the output far more reliable:\n\n```\nSYSTEM_PROMPT = \"\"\"You identify and summarize the 5 most detailed deals from a list, by selecting deals that have the most detailed, high quality description and the most clear price.\nRespond strictly in JSON with no explanation, using this format. You should provide the price as a number derived from the description. If the price of a deal isn't clear, do not include that deal in your response.\nMost important is that you respond with the 5 deals that have the most detailed product description with price. 
It's not important to mention the terms of the deal; most important is a thorough description of the product.\nBe careful with products that are described as \"$XXX off\" or \"reduced by $XXX\" - this isn't the actual price of the product. Only respond with products when you are highly confident about the price. \n\"\"\"\n\nUSER_PROMPT_PREFIX = \"\"\"Respond with the most promising 5 deals from this list, selecting those which have the most detailed, high quality product description and a clear price that is greater than 0.\nYou should rephrase the description to be a summary of the product itself, not the terms of the deal.\nRemember to respond with a short paragraph of text in the product_description field for each of the 5 items that you select.\nBe careful with products that are described as \"$XXX off\" or \"reduced by $XXX\" - this isn't the actual price of the product. Only respond with products when you are highly confident about the price. \n\nDeals:\n\n\"\"\"\n\nUSER_PROMPT_SUFFIX = \"\\n\\nInclude exactly 5 deals, no more.\"\n```\n\nCombined with structured output (Pydantic models via `.chat.completions.parse(... response_format=DealSelection ...)`), this guarantees the agent returns *exactly* the shape of data the rest of the pipeline expects, with no brittle JSON-parsing of free text required.\n\nThe last piece of Day 3 was closing the loop with the outside world: **push notifications**. 
[Pushover](https://pushover.net/) makes this almost embarrassingly easy: register an app, get a user key and an API token, and send a notification with a single HTTP POST:\n\n```\nimport os\nimport requests\n\npushover_user = os.getenv('PUSHOVER_USER')\npushover_token = os.getenv('PUSHOVER_TOKEN')\npushover_url = \"https://api.pushover.net/1/messages.json\"\n\ndef push(message):\n    print(f\"Push: {message}\")\n    payload = {\"user\": pushover_user, \"token\": pushover_token, \"message\": message}\n    requests.post(pushover_url, data=payload)\n```\n\nThis got wrapped into a `MessagingAgent` with a `.notify(description, deal_price, estimated_value, url)` method, turning \"we found a great deal\" into \"your phone buzzes.\"\n\n**Takeaway:** Agentic systems feel magical, but a lot of the magic is just *plumbing*: RSS feeds in, structured LLM output, push notifications out. Getting the plumbing rock-solid (and the prompts very explicit about edge cases) is what makes the \"intelligent\" part trustworthy.\n\nThis was, for me, the most conceptually important day. Up to this point, every agent was called *explicitly* by my code: \"now run the scanner,\" \"now run the ensemble,\" \"now send a notification.\" Day 4 flips that around: the LLM itself decides *what to do and in what order*, by calling **tools**.\n\nBefore wiring up the real agents, the notebook builds three *fake* functions just to understand the tool-calling loop:\n\n``` python\ndef scan_the_internet_for_bargains() -> str:\n    \"\"\" This tool scans the internet for great deals and gets a curated list of promising deals \"\"\"\n    print(\"Fake function to scan the internet - this returns a hardcoded set of deals\")\n    return test_results.model_dump_json()\n\ndef estimate_true_value(description: str) -> str:\n    \"\"\"\n    This tool estimates the true value of a product based on a text description of it\n    \"\"\"\n    print(f\"Fake function to estimate the true value of {description[:20]}... 
 - this always returns $300\")\n    return f\"Product {description} has an estimated true value of $300\"\n\ndef notify_user_of_deal(description: str, deal_price: float, estimated_true_value: float, url: str) -> str:\n    \"\"\"\n    This tool notifies the user of a great deal, given a description of it, the price of the deal, and the estimated true value\n    \"\"\"\n    print(f\"Fake function to notify user of {description} which costs {deal_price} and estimate is {estimated_true_value}\")\n    return \"notification sent ok\"\n```\n\nEach tool also needs a JSON Schema describing its name, description, and parameters. This is what actually gets sent to the LLM so it knows what's available and how to call it:\n\n```\nscan_function = {\n    \"name\": \"scan_the_internet_for_bargains\",\n    \"description\": \"Returns top bargains scraped from the internet along with the price each item is being offered for\",\n    \"parameters\": {\n        \"type\": \"object\",\n        \"properties\": {},\n        \"required\": [],\n        \"additionalProperties\": False\n    }\n}\n\n# Assumption: this schema was not shown in the write-up; it is reconstructed\n# from the signature of estimate_true_value(description: str) above.\nestimate_function = {\n    \"name\": \"estimate_true_value\",\n    \"description\": \"Given the description of an item, estimate how much it is actually worth\",\n    \"parameters\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"description\": {\"type\": \"string\", \"description\": \"The description of the item to be estimated\"}\n        },\n        \"required\": [\"description\"],\n        \"additionalProperties\": False\n    }\n}\n\nnotify_function = {\n    \"name\": \"notify_user_of_deal\",\n    \"description\": \"Send the user a push notification about the single most compelling deal; only call this one time\",\n    \"parameters\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"description\": {\"type\": \"string\", \"description\": \"The description of the item itself scraped from the internet\"},\n            \"deal_price\": {\"type\": \"number\", \"description\": \"The price offered by this deal scraped from the internet\"},\n            \"estimated_true_value\": {\"type\": \"number\", \"description\": \"The estimated actual value that this is worth\"},\n            \"url\": {\"type\": \"string\", \"description\": \"The URL of this deal as scraped from the internet\"}\n        },\n        \"required\": [\"description\", \"deal_price\", \"estimated_true_value\", \"url\"],\n        
\"additionalProperties\": False\n    }\n}\n\ntools = [{\"type\": \"function\", \"function\": scan_function},\n         {\"type\": \"function\", \"function\": estimate_function},\n         {\"type\": \"function\", \"function\": notify_function}]\n```\n\nThe real magic is this loop. The LLM is given the tools and a goal; if it decides to call a tool, the code executes the *real* Python function and feeds the result back in — and this repeats until the model is satisfied:\n\n``` python\ndef handle_tool_call(message):\n    \"\"\"\n    Actually call the tools associated with this message\n    \"\"\"\n    results = []\n    for tool_call in message.tool_calls:\n        tool_name = tool_call.function.name\n        raw_args = json.loads(tool_call.function.arguments)\n        tool = globals().get(tool_name)\n\n        if tool:\n            # Some models (especially smaller free ones) sometimes return\n            # stray/invalid keys (like \"\") in the arguments JSON, even for\n            # functions that take no parameters. Filter to only the keys\n            # the function actually accepts.\n            valid_params = set(inspect.signature(tool).parameters.keys())\n            arguments = {k: v for k, v in raw_args.items() if k in valid_params}\n            result = tool(**arguments)\n        else:\n            result = {}\n\n        results.append({\"role\": \"tool\", \"content\": json.dumps(result), \"tool_call_id\": tool_call.id})\n    return results\n\nsystem_message = \"You find great deals on bargain products using your tools, and notify the user of the best bargain.\"\nuser_message = \"\"\"\nFirst, use your tool to scan the internet for bargain deals. 
Then for each deal, use your tool to estimate its true value.\nThen pick the single most compelling deal where the price is much lower than the estimated true value, and use your tool to notify the user.\nThen just reply OK to indicate success.\n\"\"\"\nmessages = [{\"role\": \"system\", \"content\": system_message}, {\"role\": \"user\", \"content\": user_message}]\n\n# Keep calling the model until it stops asking for tools\ndone = False\nwhile not done:\n    response = openai.chat.completions.create(model=MODEL, messages=messages, tools=tools)\n    if response.choices[0].finish_reason == \"tool_calls\":\n        message = response.choices[0].message\n        results = handle_tool_call(message)\n        # Append the assistant's tool request first, then the tool results\n        messages.append(message)\n        messages.extend(results)\n    else:\n        done = True\nresponse.choices[0].message.content\n```\n\nA subtlety that's easy to miss but really matters in practice: smaller, free-tier models sometimes hallucinate extra arguments in their tool calls (like an empty-string key `\"\"` for a function that takes no parameters at all). 
The fix (filtering `raw_args` down to only the parameters the function's signature actually accepts, via `inspect.signature`) is the kind of defensive coding that's invisible until you're debugging a mysterious `TypeError` at 11pm.\n\nOnce the loop works with fake functions, the swap to the **real** `AutonomousPlanningAgent` is almost anticlimactic: same loop, same tool schemas, but `scan_the_internet_for_bargains` now really calls the `ScannerAgent`, `estimate_true_value` really calls the `EnsembleAgent`, and `notify_user_of_deal` really calls the `MessagingAgent`:\n\n```\nimport chromadb\n\nDB = \"products_vectorstore\"\nclient = chromadb.PersistentClient(path=DB)\ncollection = client.get_or_create_collection('products')\n\nfrom agents.autonomous_planning_agent import AutonomousPlanningAgent\nagent = AutonomousPlanningAgent(collection)\nagent.plan()\n```\n\n**Takeaway:** Tool/function calling turns an LLM from \"a thing that writes text\" into \"a thing that orchestrates other systems.\" The hard part isn't the API call. It's (a) writing tight descriptions so the model picks the right tool, and (b) writing tolerant glue code, because the model *will* occasionally send malformed arguments.\n\nThe final day was about **productionizing**: wrapping everything in a reusable framework with persistent memory, colored logs, and a Gradio dashboard that updates in real time.\n\n`DealAgentFramework` is the top-level orchestrator. 
It owns the Chroma client, lazily creates the `PlanningAgent`\n\n, and — critically — persists discovered deals to `memory.json`\n\nso the system remembers what it's already found across restarts:\n\n``` python\nimport os\nimport sys\nimport logging\nimport json\nfrom typing import List\nfrom dotenv import load_dotenv\nimport chromadb\nfrom agents.planning_agent import PlanningAgent\nfrom agents.deals import Opportunity\nfrom sklearn.manifold import TSNE\nimport numpy as np\n\nload_dotenv(override=True)\n\n# Colors for logging\nBG_BLUE = \"\\033[44m\"\nWHITE = \"\\033[37m\"\nRESET = \"\\033[0m\"\n\n# Colors for plot\nCATEGORIES = [\n    \"Appliances\",\n    \"Automotive\",\n    \"Cell_Phones_and_Accessories\",\n    \"Electronics\",\n    \"Musical_Instruments\",\n    \"Office_Products\",\n    \"Tools_and_Home_Improvement\",\n    \"Toys_and_Games\",\n]\nCOLORS = [\"red\", \"blue\", \"brown\", \"orange\", \"yellow\", \"green\", \"purple\", \"cyan\"]\n\ndef init_logging():\n    root = logging.getLogger()\n    root.setLevel(logging.INFO)\n\n    handler = logging.StreamHandler(sys.stdout)\n    handler.setLevel(logging.INFO)\n    formatter = logging.Formatter(\n        \"[%(asctime)s] [Agents] [%(levelname)s] %(message)s\",\n        datefmt=\"%Y-%m-%d %H:%M:%S %z\",\n    )\n    handler.setFormatter(formatter)\n    root.addHandler(handler)\n\nclass DealAgentFramework:\n    DB = \"products_vectorstore\"\n    MEMORY_FILENAME = \"memory.json\"\n\n    def __init__(self):\n        init_logging()\n        client = chromadb.PersistentClient(path=self.DB)\n        self.memory = self.read_memory()\n        self.collection = client.get_or_create_collection(\"products\")\n        self.planner = None\n\n    def init_agents_as_needed(self):\n        if not self.planner:\n            self.log(\"Initializing Agent Framework\")\n            self.planner = PlanningAgent(self.collection)\n            self.log(\"Agent Framework is ready\")\n\n    def read_memory(self) -> List[Opportunity]:\n 
       if os.path.exists(self.MEMORY_FILENAME):\n            with open(self.MEMORY_FILENAME, \"r\") as file:\n                data = json.load(file)\n            opportunities = [Opportunity(**item) for item in data]\n            return opportunities\n        return []\n\n    def write_memory(self) -> None:\n        data = [opportunity.model_dump() for opportunity in self.memory]\n        with open(self.MEMORY_FILENAME, \"w\") as file:\n            json.dump(data, file, indent=2)\n\n    @classmethod\n    def reset_memory(cls) -> None:\n        data = []\n        if os.path.exists(cls.MEMORY_FILENAME):\n            with open(cls.MEMORY_FILENAME, \"r\") as file:\n                data = json.load(file)\n        truncated = data[:2]\n        with open(cls.MEMORY_FILENAME, \"w\") as file:\n            json.dump(truncated, file, indent=2)\n\n    def log(self, message: str):\n        text = BG_BLUE + WHITE + \"[Agent Framework] \" + message + RESET\n        logging.info(text)\n\n    def run(self) -> List[Opportunity]:\n        self.init_agents_as_needed()\n        logging.info(\"Kicking off Planning Agent\")\n        result = self.planner.plan(memory=self.memory)\n        logging.info(f\"Planning Agent has completed and returned: {result}\")\n        if result:\n            self.memory.append(result)\n            self.write_memory()\n        return self.memory\n\n    @classmethod\n    def get_plot_data(cls, max_datapoints=2000):\n        client = chromadb.PersistentClient(path=cls.DB)\n        collection = client.get_or_create_collection(\"products\")\n        result = collection.get(\n            include=[\"embeddings\", \"documents\", \"metadatas\"], limit=max_datapoints\n        )\n        vectors = np.array(result[\"embeddings\"])\n        documents = result[\"documents\"]\n        categories = [metadata[\"category\"] for metadata in result[\"metadatas\"]]\n        colors = [COLORS[CATEGORIES.index(c)] for c in categories]\n        tsne = TSNE(n_components=3, 
random_state=42, n_jobs=-1)\n        reduced_vectors = tsne.fit_transform(vectors)\n        return documents, reduced_vectors, colors\n\nif __name__ == \"__main__\":\n    DealAgentFramework().run()\n```\n\nA few patterns I want to remember from this file:\n\n- Lazy initialization (`init_agents_as_needed`): spinning up the full agent stack (which includes loading models and connecting to vector stores) is expensive, so it only happens once, on first use.\n- `memory.json` is literally a list of `Opportunity` objects (a deal + an estimated value + a discount), serialized via Pydantic's `model_dump()`.\n- `reset_memory` as a classmethod: it trims the memory file back to its first two entries without needing a framework instance.\n- ANSI color codes (`BG_BLUE`, `WHITE`, `RESET`): a small touch, but it makes the live log stream from multiple agents much easier to read.\n\nSpeaking of colors: the terminal uses ANSI escape codes, but the Gradio UI renders HTML. `log_utils.py` is a tiny but clever bridge between the two: it maps each ANSI color combination to a CSS hex color and swaps the escape codes for `<span style=\"color: ...\">` tags:\n\n```\n# Foreground colors\nRED = '\\033[31m'\nGREEN = '\\033[32m'\nYELLOW = '\\033[33m'\nBLUE = '\\033[34m'\nMAGENTA = '\\033[35m'\nCYAN = '\\033[36m'\nWHITE = '\\033[37m'\n\n# Background color\nBG_BLACK = '\\033[40m'\nBG_BLUE = '\\033[44m'\n\n# Reset code to return to default color\nRESET = '\\033[0m'\n\nmapper = {\n    BG_BLACK+RED: \"#dd0000\",\n    BG_BLACK+GREEN: \"#00dd00\",\n    BG_BLACK+YELLOW: \"#dddd00\",\n    BG_BLACK+BLUE: \"#0000ee\",\n    BG_BLACK+MAGENTA: \"#aa00dd\",\n    BG_BLACK+CYAN: \"#00dddd\",\n    BG_BLACK+WHITE: \"#87CEEB\",\n    BG_BLUE+WHITE: \"#ff7800\"\n}\n\ndef reformat(message):\n    for key, value in mapper.items():\n        message = message.replace(key, f'<span style=\"color: {value}\">')\n    message = message.replace(RESET, '</span>')\n    return message\n```\n\nEvery agent in the system logs its activity with a different color (set in its own `__init__`), so when this gets rendered in the browser, you can visually tell *at a 
glance* which agent is talking (the planner, the scanner, the frontier agent, etc.) without reading a single word.\n\nThe UI was built up in layers, which I think is a great way to learn Gradio:\n\n**Layer 1: just get something on screen:**\n\n```\nwith gr.Blocks(title=\"The Price is Right\", fill_width=True) as ui:\n    with gr.Row():\n        gr.Markdown('<div style=\"text-align: center;font-size:24px\">The Price is Right - Deal Hunting Agentic AI</div>')\n    with gr.Row():\n        gr.Markdown('<div style=\"text-align: center;font-size:14px\">Autonomous agent framework that finds online deals, collaborating with a proprietary fine-tuned LLM deployed on Modal, and a RAG pipeline with a frontier model and Chroma.</div>')\n\nui.launch(inbrowser=True)\n```\n\n**Layer 2: add a live data table backed by application state:**\n\n```\nwith gr.Blocks(title=\"The Price is Right\", fill_width=True) as ui:\n\n    initial_deal = Deal(product_description=\"Example description\", price=100.0, url=\"https://cnn.com\")\n    initial_opportunity = Opportunity(deal=initial_deal, estimate=200.0, discount=100.0)\n    opportunities = gr.State([initial_opportunity])\n\n    def get_table(opps):\n        return [[opp.deal.product_description, opp.deal.price, opp.estimate, opp.discount, opp.deal.url] for opp in opps]\n\n    with gr.Row():\n        opportunities_dataframe = gr.Dataframe(\n            headers=[\"Description\", \"Price\", \"Estimate\", \"Discount\", \"URL\"],\n            wrap=True,\n            column_widths=[4, 1, 1, 1, 2],\n            row_count=10,\n            col_count=5,\n            max_height=400,\n        )\n\n    ui.load(get_table, inputs=[opportunities], outputs=[opportunities_dataframe])\n\nui.launch(inbrowser=True)\n```\n\nA small but important Gradio version note from this layer: in Gradio v5, the `height` parameter for `Dataframe` was renamed to `max_height`, exactly the kind of breaking change that's easy to lose an hour to if you don't know to 
look for it.\n\n**Layer 3 — wire up real agents and make rows clickable:**\n\n```\nagent_framework = DealAgentFramework()\nagent_framework.init_agents_as_needed()\n\nwith gr.Blocks(title=\"The Price is Right\", fill_width=True) as ui:\n    ...\n    def do_select(opportunities, selected_index: gr.SelectData):\n        row = selected_index.index[0]\n        opportunity = opportunities[row]\n        agent_framework.planner.messenger.alert(opportunity)\n    ...\n    opportunities_dataframe.select(do_select, inputs=[opportunities], outputs=[])\n\nui.launch(inbrowser=True)\n```\n\nThe fully assembled `price_is_right.py`\n\nbrings everything together: a background thread runs the agent framework's `run()`\n\nloop, a `queue.Queue`\n\n-based logging handler streams log lines into the UI in (near) real time, and a 3D Plotly visualization of the product vector store sits alongside the deal table:\n\n``` python\nimport logging\nimport queue\nimport threading\nimport time\nimport gradio as gr\nfrom deal_agent_framework import DealAgentFramework\nfrom log_utils import reformat\nimport plotly.graph_objects as go\nfrom dotenv import load_dotenv\n\nload_dotenv(override=True)\n\nclass QueueHandler(logging.Handler):\n    def __init__(self, log_queue):\n        super().__init__()\n        self.log_queue = log_queue\n\n    def emit(self, record):\n        self.log_queue.put(self.format(record))\n\ndef html_for(log_data):\n    output = \"<br>\".join(log_data[-18:])\n    return f\"\"\"\n    <div id=\"scrollContent\" style=\"height: 400px; overflow-y: auto; border: 1px solid #ccc; background-color: #222229; padding: 10px;\">\n    {output}\n    </div>\n    \"\"\"\n\ndef setup_logging(log_queue):\n    handler = QueueHandler(log_queue)\n    formatter = logging.Formatter(\n        \"[%(asctime)s] %(message)s\",\n        datefmt=\"%Y-%m-%d %H:%M:%S %z\",\n    )\n    handler.setFormatter(formatter)\n    logger = logging.getLogger()\n    logger.addHandler(handler)\n    
logger.setLevel(logging.INFO)\n\nclass App:\n    def __init__(self):\n        self.agent_framework = None\n\n    def get_agent_framework(self):\n        if not self.agent_framework:\n            self.agent_framework = DealAgentFramework()\n        return self.agent_framework\n\n    def run(self):\n        with gr.Blocks(title=\"The Price is Right\", fill_width=True) as ui:\n            log_data = gr.State([])\n\n            def table_for(opps):\n                return [\n                    [\n                        opp.deal.product_description,\n                        f\"${opp.deal.price:.2f}\",\n                        f\"${opp.estimate:.2f}\",\n                        f\"${opp.discount:.2f}\",\n                        opp.deal.url,\n                    ]\n                    for opp in opps\n                ]\n\n            def update_output(log_data, log_queue, result_queue):\n                initial_result = table_for(self.get_agent_framework().memory)\n                final_result = None\n                while True:\n                    try:\n                        message = log_queue.get_nowait()\n                        log_data.append(reformat(message))\n                        yield log_data, html_for(log_data), final_result or initial_result\n                    except queue.Empty:\n                        try:\n                            final_result = result_queue.get_nowait()\n                            yield log_data, html_for(log_data), final_result or initial_result\n                        except queue.Empty:\n                            if final_result is not None:\n                                break\n                            time.sleep(0.1)\n\n            def get_plot():\n                documents, vectors, colors = DealAgentFramework.get_plot_data(max_datapoints=800)\n                fig = go.Figure(\n                    data=[\n                        go.Scatter3d(\n                            x=vectors[:, 0],\n                      
      y=vectors[:, 1],\n                            z=vectors[:, 2],\n                            mode=\"markers\",\n                            marker=dict(size=2, color=colors, opacity=0.7),\n                        )\n                    ]\n                )\n                fig.update_layout(\n                    scene=dict(\n                        xaxis_title=\"x\", yaxis_title=\"y\", zaxis_title=\"z\",\n                        aspectmode=\"manual\",\n                        aspectratio=dict(x=2.2, y=2.2, z=1),\n                        camera=dict(eye=dict(x=1.6, y=1.6, z=0.8)),\n                    ),\n                    height=400,\n                    margin=dict(r=5, b=1, l=5, t=2),\n                )\n                return fig\n\n            def do_run():\n                new_opportunities = self.get_agent_framework().run()\n                return table_for(new_opportunities)\n\n            def run_with_logging(initial_log_data):\n                log_queue = queue.Queue()\n                result_queue = queue.Queue()\n                setup_logging(log_queue)\n\n                def worker():\n                    result_queue.put(do_run())\n\n                thread = threading.Thread(target=worker)\n                thread.start()\n\n                for log_data, output, final_result in update_output(initial_log_data, log_queue, result_queue):\n                    yield log_data, output, final_result\n\n            def do_select(selected_index: gr.SelectData):\n                opportunities = self.get_agent_framework().memory\n                row = selected_index.index[0]\n                opportunity = opportunities[row]\n                self.get_agent_framework().planner.messenger.alert(opportunity)\n\n            with gr.Row():\n                opportunities_dataframe = gr.Dataframe(\n                    headers=[\"Deals found so far\", \"Price\", \"Estimate\", \"Discount\", \"URL\"],\n                    wrap=True, column_widths=[6, 1, 1, 1, 3],\n      
              row_count=10, col_count=5, max_height=400,\n                )\n            with gr.Row():\n                with gr.Column(scale=1):\n                    logs = gr.HTML()\n                with gr.Column(scale=1):\n                    plot = gr.Plot(value=get_plot(), show_label=False)\n\n            ui.load(run_with_logging, inputs=[log_data], outputs=[log_data, logs, opportunities_dataframe])\n\n            timer = gr.Timer(value=300, active=True)\n            timer.tick(run_with_logging, inputs=[log_data], outputs=[log_data, logs, opportunities_dataframe])\n\n            opportunities_dataframe.select(do_select)\n\n        ui.launch(share=False, inbrowser=True)\n\nif __name__ == \"__main__\":\n    App().run()\n```\n\nThe two patterns I most want to carry forward from this file:\n\n- `run_with_logging` is a Python generator that `yield`s updated state, so the UI refreshes live while a slow agentic process runs instead of freezing for the whole duration.\n- `gr.Timer` for autonomous operation. A `Timer` set to 300 seconds means the whole \"scan → estimate → notify\" cycle re-runs automatically every 5 minutes, turning a notebook experiment into something that genuinely behaves like a background agent.\n\nA few cross-cutting lessons that apply far beyond this specific project:\n\n- A weighted ensemble of the three pricing models (`0.8 / 0.1 / 0.1`) outperformed any single one on the held-out test set.\n- Saving a `List[Opportunity]` to `memory.json` was enough to give the system continuity across restarts.\n\nPutting it all together, `DealAgentFramework().run()` now quietly scans deal feeds, filters to the 5 best-described deals, estimates each one's true value via an ensemble of three models, picks the single best opportunity, saves it to memory, and, if it's a great deal, buzzes my phone. 
All while a live dashboard shows exactly what's happening and why.", "url": "https://wpnews.pro/news/what-i-learned-building-an-autonomous-deal-hunting-agent-system", "canonical_source": "https://dev.to/m_toqeer/what-i-learned-building-an-autonomous-deal-hunting-agent-system-3n6b", "published_at": "2026-06-15 11:41:48+00:00", "updated_at": "2026-06-15 11:44:57.680842+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-agents", "ai-infrastructure", "machine-learning", "large-language-models"], "entities": ["Modal", "Hugging Face", "Meta", "Llama-3.2-3B", "Gradio", "The Price Is Right"], "alternates": {"html": "https://wpnews.pro/news/what-i-learned-building-an-autonomous-deal-hunting-agent-system", "markdown": "https://wpnews.pro/news/what-i-learned-building-an-autonomous-deal-hunting-agent-system.md", "text": "https://wpnews.pro/news/what-i-learned-building-an-autonomous-deal-hunting-agent-system.txt", "jsonld": "https://wpnews.pro/news/what-i-learned-building-an-autonomous-deal-hunting-agent-system.jsonld"}}