# Project Log #10: I'm Ditching Screenshots. Here's Why.

> Source: <https://dev.to/okeke_chukwudubem_5f3bf49/project-log-10-im-ditching-screenshots-heres-why-3o7a>
> Published: 2026-06-25 18:53:09+00:00

Day 10. OCR and template matching hit their limits. UI hierarchy inspection might be the real answer.

Nine days ago, I was proud of my screenshot-based vision system. ML Kit for text. Template matching for icons. A clever fallback chain that worked most of the time.

Today, I'm ripping most of it out.

**The Breaking Point**

Last week, I tested the agent on a friend's phone. Template matching failed. The same icons I cropped on my device didn't match on his—different screen density, different rendering, different pixel arrangement.

I explored building a multi-resolution icon library. Crop every icon at 5 different DPIs? That's tedious. I explored AI-based icon detection. Train a model to recognize buttons by shape? That's heavy for a phone CPU.

Then I remembered something. Android already knows what's on the screen. It has to—it's rendering the UI. And there's a way to read that information directly.

**Enter UI Hierarchy Inspection**

Android has a command you can run over ADB called `uiautomator dump`. It spits out an XML file containing every visible UI element on the screen: buttons, text fields, icons, images, everything. Each element carries attributes, including its class (for example `android.widget.Button` or `android.widget.ImageView`).

This is not a screenshot. This is the app's internal blueprint.
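Here is a minimal sketch of fetching that dump from Python. It assumes `adb` is on the PATH with one device connected, and that the dump is written to `/sdcard/window_dump.xml`, a path chosen here for illustration. The helper names `dump_commands` and `dump_ui_tree` are hypothetical, not part of any library.

```python
import subprocess
import xml.etree.ElementTree as ET

# Assumed on-device location for the dump file (illustrative choice).
REMOTE_PATH = "/sdcard/window_dump.xml"


def dump_commands(remote_path=REMOTE_PATH):
    """Return the two adb invocations: write the dump, then print it."""
    return [
        ["adb", "shell", "uiautomator", "dump", remote_path],
        ["adb", "exec-out", "cat", remote_path],
    ]


def dump_ui_tree(run=subprocess.run):
    """Dump the current screen's UI hierarchy and parse it into an XML tree.

    `run` is injectable so the function can be exercised without a device.
    """
    dump_cmd, read_cmd = dump_commands()
    run(dump_cmd, check=True, capture_output=True)
    result = run(read_cmd, check=True, capture_output=True)
    # result.stdout holds the raw XML bytes of the dump file.
    return ET.fromstring(result.stdout)
```

Calling `dump_ui_tree()` with a phone attached should return the root element of the hierarchy, which can then be walked with `root.iter("node")`.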

**Why This Changes Everything**

| Screenshot-Based (Old Way) | UI Tree (New Way) |
|---|---|
| Run OCR on a screenshot (1.5–2s) | Run one ADB command (0.5–1s) |
| If text not found, try template matching (2–4s) | Not needed. Icons have content descriptions. |
| Accuracy depends on screen resolution and DPI | Accuracy is 100%—the OS tells you exactly where things are |
| Breaks on different devices | Works across all devices. Same XML structure. |
| Can't detect icons without reference images | Icons are in the tree with coordinates |

**The First Experiment**

I ran `adb shell uiautomator dump` on my phone, then pulled the XML file. I searched for "send." Here's a snippet of what I found:

```xml
<node
  class="android.widget.ImageButton"
  content-desc="Send message"
  bounds="[924,1656][1020,1752]"
  clickable="true"
  package="com.whatsapp" />
```
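To turn a node like this into something tappable, the `bounds` attribute has to become a point. The sketch below is one way to do that, assuming the dump has the shape shown above wrapped in a `<hierarchy>` root. The helpers `find_by_description` and `center_of` are hypothetical names for illustration.

```python
import re
import xml.etree.ElementTree as ET

# bounds look like "[left,top][right,bottom]"
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def center_of(bounds):
    """Convert a bounds string into the (x, y) center of the element."""
    match = BOUNDS_RE.fullmatch(bounds)
    if match is None:
        raise ValueError(f"unexpected bounds format: {bounds!r}")
    left, top, right, bottom = map(int, match.groups())
    return (left + right) // 2, (top + bottom) // 2


def find_by_description(root, text):
    """Return the first node whose content-desc contains text (case-insensitive)."""
    needle = text.lower()
    for node in root.iter("node"):
        if needle in node.get("content-desc", "").lower():
            return node
    return None


# Sample built from the snippet above, wrapped in an assumed <hierarchy> root.
SAMPLE = """<hierarchy>
  <node class="android.widget.ImageButton" content-desc="Send message"
        bounds="[924,1656][1020,1752]" clickable="true"
        package="com.whatsapp" />
</hierarchy>"""

root = ET.fromstring(SAMPLE)
node = find_by_description(root, "send")
x, y = center_of(node.get("bounds"))
print(x, y)  # 972 1704
```

With those coordinates, the tap itself could be sent with `adb shell input tap 972 1704`. No reference image and no OCR pass is involved.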


