Weekend Build: A Local AI Chief of Staff
Jan 16, 2026
I have 847 voice memos on my iPhone. I've listened back to exactly three of them.
Voice is the highest-bandwidth way to capture ideas. We speak at 150 words per minute but type at 40. It should be the ultimate tool for thinking. Instead, it's a graveyard. Voice memos are "write-only" memory: they go in, but they almost never come out.
Apps like Granola are solving this brilliantly for meetings. But I wanted something different. I didn't just want a transcription tool; I wanted a Chief of Staff.
I wanted a system that:
- Knows my context: It should know what I worked on yesterday.
- Lives locally: My thoughts shouldn't train someone else's model.
- Connects to my brain: The output must live in Obsidian, not a siloed web app.
- Costs $0/month in subscriptions: Just the raw cost of intelligence (API calls).
So last weekend, I built it.
The Vision: Local Granola
The goal was simple: Talk to my phone, see the note in Obsidian. No buttons, no uploading, no "Syncing..." progress bars.
The Workflow:
The Secret Sauce: Context Injection
The problem with most transcription AI is that it's "dumb." It hears audio, but it doesn't know you. If I mention "The Phoenix Project," a generic transcriber might hear "The Feeny Project."
My local agent is smarter because it cheats. Before sending the audio to the AI, it reads my recent Daily Notes from Obsidian.
# The "Context Window"
active_context = read_last_3_daily_notes()
prompt = f"""
You are my Chief of Staff.
Here is what I have been working on recently:
{active_context}
Process this voice memo. Connect these new thoughts to my existing projects.
Fix proper nouns. Extract action items.
"""
Now, the AI isn't just transcribing words; it's connecting dots. It knows that "Project X" refers to the new marketing launch, and it formats the note accordingly.
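For the curious, here is one way the context step could look. This is a minimal sketch, not the actual script: it assumes daily notes are Markdown files named `YYYY-MM-DD.md` in a single folder, and the `daily_dir` parameter and the `build_prompt` helper are names invented for illustration.

```python
from pathlib import Path


def read_last_3_daily_notes(daily_dir: Path, count: int = 3) -> str:
    """Return the text of the most recent daily notes, oldest first.

    Assumes notes are named YYYY-MM-DD.md, so sorting by filename
    is the same as sorting by date.
    """
    notes = sorted(daily_dir.glob("*.md"), key=lambda p: p.stem)[-count:]
    return "\n\n".join(
        f"## {note.stem}\n{note.read_text(encoding='utf-8').strip()}"
        for note in notes
    )


def build_prompt(active_context: str) -> str:
    """Wrap the recent notes in the Chief of Staff instructions."""
    return (
        "You are my Chief of Staff.\n"
        "Here is what I have been working on recently:\n"
        f"{active_context}\n"
        "Process this voice memo. Connect these new thoughts to my existing projects.\n"
        "Fix proper nouns. Extract action items.\n"
    )
```

Labeling each note with its date gives the model a rough timeline, which helps when a memo says "that thing from yesterday."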
The "Chief of Staff" Persona
I don't want a transcript. I want executive notes.
The prompt explicitly instructs the model to act as a Chief of Staff. It filters out filler words ("um," "like"), removes repetition, and structures the output into three specific sections:
- Executive Summary: The 30-second read.
- Key Insights & Decisions: The meat of the discussion.
- Action Items: Assigned tasks.
This turns a rambling 10-minute walk-and-talk into a concise, actionable strategy document.
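The three-section structure can also be pinned down in code rather than left entirely to the model. The sketch below assumes the model's reply has already been parsed into plain fields; `ExecutiveNote` and `render_note` are hypothetical names, and the Markdown layout is just one reasonable format for Obsidian.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutiveNote:
    title: str
    summary: str
    insights: list[str] = field(default_factory=list)
    action_items: list[str] = field(default_factory=list)


def render_note(note: ExecutiveNote) -> str:
    """Render the three sections as Markdown, with tasks as checkboxes."""
    lines = [f"## {note.title}", "", "**Executive Summary**", note.summary, ""]
    lines.append("**Key Insights & Decisions**")
    lines += [f"- {item}" for item in note.insights]
    lines.append("")
    lines.append("**Action Items**")
    lines += [f"- [ ] {item}" for item in note.action_items]
    return "\n".join(lines) + "\n"
```

Writing action items as `- [ ]` checkboxes means Obsidian treats them as real tasks instead of plain bullets.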
Security: The "Flash Protocol"
I am paranoid about voice data. I don't want my raw audio sitting in a cloud storage bucket forever.
To solve this, I implemented what I call the Flash Protocol:
- Upload: Audio goes to Gemini for processing.
- Process: Wait ~15 seconds for analysis.
- Nuke: The script immediately calls client.files.delete().
The audio exists on Google's infrastructure for less than a minute. The structured text lives on my local machine forever. The original audio file is archived locally on my hard drive, never leaving my control again.
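The upload, process, delete sequence can be sketched roughly as follows. This is an illustration under assumptions, not the exact script: `process_and_delete` is a made-up name, the `client` argument is expected to behave like a `google.genai.Client`, and the model name in the usage comment is a placeholder. The one design choice worth copying is the `try`/`finally`, which deletes the upload even if the request fails.

```python
from pathlib import Path


def process_and_delete(client, audio_path: Path, prompt: str, model: str) -> str:
    """Upload audio, get the structured note, and always delete the upload.

    `client` is expected to behave like google.genai.Client.
    """
    uploaded = client.files.upload(file=str(audio_path))
    try:
        response = client.models.generate_content(
            model=model,
            contents=[prompt, uploaded],
        )
        return response.text
    finally:
        # Runs on success and on failure, so a crashed request
        # does not leave the audio behind.
        client.files.delete(name=uploaded.name)


# Example wiring (requires the google-genai package and an API key):
# from google import genai
# client = genai.Client()
# note = process_and_delete(client, Path("memo.m4a"), prompt, model="gemini-2.5-flash")
```

Passing the client in as an argument also makes the delete step easy to check with a fake client, without ever touching the network.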
The Result
The friction is completely gone. I can be on a walk, have an idea, tap Record, and keep moving. By the time I get back to my desk, the idea is filed, formatted, and ready for action in my second brain.
Here is a real example of a "Walk and Talk" note processed by the system:
## 💡 Idea: Sourdough Automation (09:15)
**Executive Summary**
Proposing a temperature sensor system to automate sourdough starter feeding alerts.
Integrates with Home Assistant to track fermentation activity.
**✅ Action Items**
- [ ] Order ESP32 dev board
- [ ] Test DHT22 sensor in fridge temps
- [ ] Draft MQTT logic for Home Assistant
The Return of Personal Software
For a long time, we've been trained to buy solutions. If you have a problem, you subscribe to an app.
But apps are built for the average user. They are a regression to the mean.
The "API Era" gives us our agency back. You don't need a team of engineers to build a Chief of Staff. You just need some Python glue and a Saturday afternoon.
This system is just a few hundred lines of code and costs pennies to run. But because it was built for my specific workflow, it fits my brain better than any generic app could.
The beauty of the "API Era" is that we finally have the agency to shape our own tools. You don't need a team of engineers to solve your own problems anymore. Sometimes, all you need is a weekend and a little bit of curiosity.