My agent stack for automating my personal life A user describes using Codex as a personal agent to automate tasks across Gmail, WhatsApp, Telegram, iMessage, Google Drive, and other tools, enabling workflows like drafting emails and updating records with minimal manual effort. My Agent Stack For Automating My Personal Life ยท 10 min read My agent manages my emails, SMS, WhatsApp, Telegram and pretty much everything to automate my personal life. People keep asking me how I use agents in real life. I mean the actual boring things that make a day disappear: reading WhatsApp and Telegram, finding someone's email, searching the web, drafting the intro, updating a document in Google Drive, creating a calendar event, checking who still needs an answer, and doing all of it across the same messy tools I already use. My answer is disappointingly simple. I use Codex as an operator on top of my actual life data. It has tools. It has data connectors. It has skills. It has a source of truth. It has enough permissions to act locally, and enough approval gates that it does not embarrass me in public. That is basically the setup. Tools, data connectors, skills, and taste. I used to do more of this in Claude Code but I have been moving the setup to Codex because GPT-5.5 is currently a better model for this kind of work. The switch from Claude Code to Codex is not really the story. The story is that once a model is good enough, the real leverage comes from wiring it into the world you already live in. MY PERSONAL AGENT STACK ======================= agent: Codex tools: Gmail / Drive / Calendar / Docs / Sheets - gog WhatsApp - wacli Telegram - Telegram connector iMessage / SMS - imsg Browser / Chrome - browser automation macOS apps - AppleScript / UI automation local files - filesystem tools data: Google Drive as source of truth contacts in a Google Sheet mirrored as CSV Notion exported into Drive local memories and instructions skills: inbox-zero contacts google-workspace imessage-cli personal admin workflows The important part is that the agent can move across boundaries. My personal life is not in one app. It is split between Gmail, WhatsApp, Telegram, iMessage, Google Drive, Calendar, Notion, local files, random PDFs, browser sessions, and a contacts spreadsheet that is much more valuable than it looks. A Real Communication Example a-real-communication-example A few days ago a friend sent me a WhatsApp message. She was helping a fast-growing San Francisco AI startup recruit in France and wanted to connect their recruiting manager with a recruiter I know. I did not remember the recruiter's email. I did not know the latest funding news about the startup. I needed to search WhatsApp, search Gmail, find the recruiter's email, search the web, understand why the startup was credible, draft an intro email, include the two job links, show the draft to me, send the email after approval, and then text my friend that it was done. That is normally twenty minutes of annoying app switching. WhatsApp to Gmail to Google search to Gmail again to WhatsApp again. It is not hard work, but it is exactly the kind of work that burns attention because every step is a small context switch. With the agent, I asked for the outcome. It read the WhatsApp thread, searched Gmail for the recruiter's email, researched the startup's funding and recent news on the web, drafted the intro, waited for my approval, sent the email, and then texted my friend that the intro was done. The user-facing part took about ten seconds. The agent did the glue work in seconds This is the killer pattern. The agent is not "answering a question." It is operating across my tools to complete a small real-world workflow aka a "job-to-be-done" The License Plate Example the-license-plate-example Another example is even more boring, which is why I like it. I got a new license plate for my car. I sent photos and context to Codex. It updated the car information Markdown file I keep in Google Drive, changed the license plate, added the registration notes, preserved the existing VIN, insurance, owners, and address, then uploaded the file back to Drive. That alone is useful, but the better version is what happens next. The agent can use browser automation to go update the same information everywhere else: FasTrak, the parking app, insurance portals, DMV-related forms, or any other web app that does not have a clean API. For clean systems, it should use an API or CLI. For messy systems, it can use the browser and it's so good I also now use Computer Use from Codex. This is what personal agents are for. Not dramatic autonomy. Administrative continuity. I was always afraid of Openclaw yolo mode in the background. I appreciate being in control. Google Drive Is My Source Of Truth google-drive-is-my-source-of-truth The most important architectural decision I made was centralizing valuable personal information in Google Drive. For years, a lot of my knowledge lived in Notion. I like Notion as a human workspace, but I do not love it as the primary source of truth for an agent. The API works, but the workspace is too fluid: nested pages, databases, properties, permissions, formatting, backlinks, and a lot of UI-native structure that is pleasant for humans and annoying for models. So I used the Notion API to export the valuable information and move it into Google Drive. I was not trying to perfectly preserve the Notion workspace. I was trying to make the information agent-readable. Most of the useful information in Drive is Markdown or CSV, because those formats are easy for the agent to search, diff, edit, and upload back without ceremony. Google Drive became the source of truth because gogcli gives the agent a simple command line surface for Gmail, Drive, Calendar, Docs, Sheets, Contacts, and Tasks. This is an underrated point. You should not organize your knowledge only for the human UI. You should organize it for the agent's tool path. Agents like stable file IDs, text, tables, Markdown, CSVs, and commands that return JSON. If the agent can search it, download it, edit it, upload it, and cite where it came from, the data is useful. My personal data layer is embarrassingly simple. Google Drive holds the important docs, mostly as Markdown files and CSVs. Contacts live in a Google Sheet mirrored as a CSV. Notion exports land in Drive. Local instructions live in AGENTS.md . Skills live as Markdown files in folders. The source of truth is not elegant. It is legible. DATA LAYER ========== Google Drive: personal docs car docs family docs exported Notion pages PDFs spreadsheets Google Sheets / CSV: contacts phone numbers emails categories locations notes Local files: AGENTS.md memories skills scripts blog repo A lot of personal productivity is just joining across this data. One fact is in WhatsApp. Another is in Gmail. The email address is in Contacts. The date is in Calendar. The document is in Drive. The agent becomes useful when it can cross those boundaries without asking me to be the glue. One of my best investment was to create a contact.csv with the phone number, email, LinkedIn etc. of all the people I know. The Tools the-tools The core tools are boring by design. I use gogcli for Google Workspace, wacli for WhatsApp, imsg for iMessage and SMS, Browser Use or browser automation for web apps, and AppleScript or macOS UI automation when there is no better interface. The hierarchy is simple. APIs and CLIs are best. Local files are great. Browser automation is acceptable. Screen automation is the last resort. BEST TOOL SURFACE ================= API and CLI local file browser automation screen automation This hierarchy matters because agents are only as reliable as their tool surface. Asking a model to click around a website is sometimes necessary, but it is not the happy path. A command like gog gmail messages list or wacli messages list --json is much easier for the model to inspect, retry, and reason about. Here is what the tool layer looks like in practice: Gmail gog gmail messages list '"Firstname Lastname"' --max 10 --include-body gog gmail get