A voice in your browser.
ARIA is a live voice layer for your browser. She reads your tabs, delivers your morning, answers questions about whatever page you’re on, and politely yells at you when you drift into Twitter for the fourth time today.
Press ⌘ ⇧ A on any page to summon her.
Three modes.
One voice that actually knows your context.
ARIA follows a day-shaped flow — orient, assist, protect focus. No generic chatbot energy. Each mode has a specific job.
Brief Me
A 90-second start-of-day, spoken.
Tell her what you care about (HN, weather, your calendar webhook). She pulls your open tabs, scans the headlines, and reads you a short, editorial briefing. No slide deck. No dashboard. Just the voice that knows what’s going on.
- Open tabs summary + conversational delivery
- Top 5 from Hacker News, filtered by your interests
- Local weather, calendar hooks, anything you webhook in
“Good morning. 6 tabs open, 3 from GitHub. Y Combinator posted about voice agents — want me to read it?”
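The "top 5, filtered by your interests" step could be sketched roughly as below. This is a minimal illustration, not ARIA's actual code: the `Story` shape, the `pickStories` name, and the keyword-in-title matching rule are all assumptions made for the example.

```typescript
// Hypothetical story shape; the real extension may track different fields.
interface Story {
  id: number;
  title: string;
  score: number;
  url?: string;
}

// Keep stories whose title mentions any interest keyword (case-insensitive),
// rank them by score, and return at most `limit` of them.
// An empty interest list keeps every story.
function pickStories(stories: Story[], interests: string[], limit = 5): Story[] {
  const keywords = interests
    .map((k) => k.trim().toLowerCase())
    .filter((k) => k.length > 0);
  const matches = (s: Story): boolean =>
    keywords.length === 0 ||
    keywords.some((k) => s.title.toLowerCase().includes(k));
  // filter() returns a new array, so sorting does not mutate the input.
  return stories
    .filter(matches)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Matching on titles alone is a deliberately crude choice here; it keeps the sketch small and avoids fetching every linked article just to build a 90-second briefing.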
Talk To This
Open a real conversation with the page you’re on.
Cmd+Shift+A on any page. She reads the article, email, doc, or codebase in front of you, then actually talks with you about it. Ask follow-ups. Draft a reply. Argue with the take. Move on.
- Grabs the visible DOM, not an LLM hallucination of it
- Natural back-and-forth, not prompt-reply ping pong
- Shuts up the moment you close the overlay
Re: Follow-up on your voice-first AI pitch
Hi —
Really enjoyed the demo. Two questions before we loop in the full partnership:
1. How does ARIA handle privacy when extracting page content?
2. What’s your thinking on the on-device vs API tradeoff for latency?
Happy to jump on a call Thursday if that works.
“She’s asking about privacy and latency. Want me to draft a reply citing the on-device fallback?”
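One way to picture "grabs the visible DOM" is a walk over the page tree that skips hidden subtrees and caps the amount of text sent along. The sketch below is an assumption-heavy illustration: `PageNode`, `collectVisibleText`, and the 4000-character budget are invented for the example, and a real content script would first build this structure from live DOM nodes (for instance by checking computed styles).

```typescript
// Hypothetical, simplified stand-in for a DOM node.
interface PageNode {
  text?: string;         // text carried by this node, if any
  hidden?: boolean;      // true when the node is not rendered to the user
  children?: PageNode[];
}

// Collect text from every visible node in document order,
// collapse whitespace, and cap the result at `maxChars`.
function collectVisibleText(root: PageNode, maxChars = 4000): string {
  const parts: string[] = [];
  const walk = (node: PageNode): void => {
    if (node.hidden) return; // skip the whole hidden subtree
    const text = node.text?.replace(/\s+/g, " ").trim();
    if (text) parts.push(text);
    for (const child of node.children ?? []) walk(child);
  };
  walk(root);
  return parts.join("\n").slice(0, maxChars);
}
```

The cap is there because whatever is collected gets sent along with the conversation, so a very long page would need trimming or summarizing first.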
Focus Guard
She notices when you drift. You don’t.
Set a goal, list the usual suspects (Twitter, Reddit, YouTube), pick a sass level. When you wander, she interrupts — not with a dialog, not with a red banner, with her voice.
- Three sass tiers: gentle, normal, unhinged
- Quiet hours, allow-lists, and a 3-minute snooze
- Say “break” for five minutes. Say “back to it” to end.
“Babe. You said no Twitter until the PR is up. It’s not up. What are we doing here?”
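The rules above (usual suspects, allow-lists, quiet hours, a 3-minute snooze) suggest a small decision function. What follows is a sketch under stated assumptions, not the extension's real logic: `FocusConfig`, `shouldInterrupt`, the hour-based quiet window, and the order in which the checks run are all hypothetical. Sass level is left out because it changes what she says, not whether she speaks.

```typescript
// Hypothetical configuration shape for Focus Guard.
interface FocusConfig {
  blocklist: string[];   // distracting domains, e.g. "twitter.com"
  allowlist: string[];   // domains that are always fine
  quietHours?: { startHour: number; endHour: number }; // local hours, 0-23
  snoozedUntil?: number; // epoch milliseconds
}

const SNOOZE_MS = 3 * 60 * 1000; // the 3-minute snooze

// True when `host` is `domain` itself or one of its subdomains.
function hostMatches(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

// Quiet windows may wrap past midnight (e.g. 22 to 7).
function isQuietHour(hour: number, q?: { startHour: number; endHour: number }): boolean {
  if (!q) return false;
  return q.startHour <= q.endHour
    ? hour >= q.startHour && hour < q.endHour
    : hour >= q.startHour || hour < q.endHour;
}

function snoozeUntil(now: Date): number {
  return now.getTime() + SNOOZE_MS;
}

// Decide whether visiting `host` at time `now` should trigger a spoken nudge.
function shouldInterrupt(host: string, now: Date, cfg: FocusConfig): boolean {
  if (cfg.snoozedUntil !== undefined && now.getTime() < cfg.snoozedUntil) return false;
  if (isQuietHour(now.getHours(), cfg.quietHours)) return false;
  if (cfg.allowlist.some((d) => hostMatches(host, d))) return false;
  return cfg.blocklist.some((d) => hostMatches(host, d));
}
```

Checking the allow-list before the blocklist lets a specific subdomain stay reachable even when its parent domain is on the list of usual suspects.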
What she actually sounds like.
No corporate assistant voice. No “As an AI language model” hedging. ARIA is opinionated, direct, and a little mean when you’ve earned it. Powered by ElevenLabs Conversational AI under the hood.
- ARIA
“Morning. You have 6 tabs open, 3 of them are the same Stripe doc. Should I close the dupes?”
07:43 · Brief Me
- YOU
“Yeah. And give me the top HN thread.”
- ARIA
“Top of HN is a write-up on voice agent latency. 4 minutes. Short version or the whole thing?”
- YOU
“Short version.”
- ARIA
“They hit ~480ms round-trip using a streaming TTS and a local wake word. Your setup gets ~620ms — room to improve.”
08:02 · Talk To This · hn.item/41204420
- ARIA
“Hey. This is your fourth Twitter tab in twenty minutes. What were you supposed to be doing?”
10:11 · Focus Guard triggered
- YOU
“Shipping the overlay.”
- ARIA
“Right. Go do that. I'll check back in fifteen.”
Ready in about four minutes.
ARIA is a Chrome extension. You bring an ElevenLabs API key and pick a voice. She does the rest. Nothing runs on our servers; your keys live in chrome.storage.sync and that’s it.
- 01
Install the extension
Unpack the release zip or clone the repo, then load it as an unpacked extension from chrome://extensions (Developer mode must be on).
- 02
Drop in your API key
Open the popup, click Settings, paste your ElevenLabs key. Pick a voice. Hit Test to confirm she speaks back.
- 03
Pick a sass level
Three tiers. Start at normal. Work your way up to unhinged when you’re ready for the truth.
- 04
Set Focus Guard (optional)
Tell her today’s goal. Add the distractions you know about (hi, Twitter). Choose quiet hours. She does the rest.
- 05
Summon her
Press ⌘ ⇧ A on any page. Or click the icon for a briefing.
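The setup notes say the ElevenLabs key lives in chrome.storage.sync. A minimal sketch of saving and reading it might look like the following. The `KeyValueArea` interface, the function names, and the `elevenLabsApiKey` field name are assumptions for illustration; the interface only mirrors the promise-based `get` and `set` calls that `chrome.storage.sync` offers in Manifest V3, so the code can be exercised without a browser.

```typescript
// Minimal structural stand-in for a storage area such as chrome.storage.sync.
interface KeyValueArea {
  get(key: string): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// Hypothetical field name; the real extension may use a different one.
const API_KEY_FIELD = "elevenLabsApiKey";

// Store the key after trimming stray whitespace from a paste.
async function saveApiKey(area: KeyValueArea, key: string): Promise<void> {
  const trimmed = key.trim();
  if (trimmed.length === 0) throw new Error("API key is empty");
  await area.set({ [API_KEY_FIELD]: trimmed });
}

// Read the key back; resolves to undefined when nothing has been saved.
async function loadApiKey(area: KeyValueArea): Promise<string | undefined> {
  const items = await area.get(API_KEY_FIELD);
  const value = items[API_KEY_FIELD];
  return typeof value === "string" ? value : undefined;
}
```

Inside the extension, the settings page would presumably pass `chrome.storage.sync` as the `area` argument. Note that Chrome syncs this storage area across browsers signed in to the same account, so the key follows the profile rather than a single machine.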
Questions people actually ask.
- No. She only listens during an active session — when you trigger Brief Me, Talk To This, or a Focus Guard interruption. Outside of those moments the microphone is completely off.
- Page content and voice are sent directly from your browser to ElevenLabs for the duration of the session. We don't run a server. Your API key stays in chrome.storage.sync, which Chrome keeps in your browser profile and syncs across browsers signed in to your account.
- ARIA itself is free and MIT-licensed. You pay ElevenLabs for their API usage — typical daily use is a few cents.
- Chat requires you to look at a window. Voice works while you're reading, writing, or staring out of the window. It keeps your eyes on the task and her in your ear.
- Yes. Sass level 1 (gentle) is basically a thoughtful assistant. Level 3 (unhinged) is 'brutally honest best friend who just got out of therapy'. Pick your tier.
- Chrome-based browsers (Chrome, Arc, Brave, Edge) work today. Firefox is on the roadmap once MV3 parity settles. Safari is unlikely short-term.