Bring your agent via voice onto your calls — take transcriptions, get real-time help, connect calls as an operator, and anything else you can think of.
A full SIP softphone with an AI agent that listens, speaks, and acts — in real time.
A real-time AI agent joins your calls as a third party — listening, thinking, and speaking with sub-second latency via the OpenAI Realtime API.
Both sides of the conversation are transcribed live with speaker identification. Every word captured and attributed to the right person.
Automatic call logging, recording, and transcript storage. Search through past calls and replay conversations anytime.
Describe who to call in plain English. The AI searches your contacts, builds a call queue, and dials through it automatically.
Send silent commands to the agent mid-call. Adjust its behavior without the other party hearing a thing.
Define reusable agent personas — trivia host, meeting assistant, compliance monitor — and swap instantly mid-session.
Run models on-device for speed and privacy, or route to the cloud for power. Mix providers, swap voices, and wire them together your way.
Every stage is a swappable component. Replace any piece without touching the rest. Your pipeline, your latency budget, your privacy threshold.
Run Whisper, Llama, and other models directly on your hardware. Zero network round-trips, zero data leaving your machine. Sub-100ms latency for transcription when the model runs next to the audio.
Route to OpenAI Realtime, Claude, or GPT-4o when you need frontier reasoning or multilingual support. Switch between providers with a single config change — no code rewrites.
Clone any voice or design one from scratch. Your agent speaks in the voice of your brand, your persona, or your own voice — with ElevenLabs' studio-grade synthesis running in the TTS slot.
Connect your SIP account, configure your agent, and start calling.
Link a SIP provider like Telnyx, Asterisk, or any WebSocket-compatible server. Enter your credentials and register in seconds.
Choose a persona, set custom instructions, and pick a voice. Your agent is ready to join calls as an AI copilot.
Dial out or receive calls — the agent listens, transcribes, and participates in real time. All processing runs locally on your machine.
From sales to support to accessibility — Phonegentic adapts to your workflow.
Handle inbound calls with natural conversation instead of "press 1." Route callers, take messages, and schedule appointments.
The agent silently transcribes both sides, tracks action items, and generates a summary when the call ends.
Score calls on greeting, empathy, and resolution. Flag policy violations in real time with objective, citation-backed feedback.
Live captioning for deaf or hard-of-hearing callers. Real-time transcripts make every call inclusive.
Connect your browser to let the agent read, search, and act on your behalf — with granular privacy controls on every integration.
Your agent reads, summarizes, and drafts emails during calls. Stay in context without switching tabs or losing focus on the conversation.
Check availability, schedule meetings, and resolve conflicts — all by voice while you're on the phone. No more "let me check my calendar and get back to you."
Track live flights, check gate changes, and get arrival times — perfect for travel agents, logistics teams, or anyone coordinating airport pickups.
Your agent can look things up live — verifying facts, finding business info, or pulling up answers during a call so you never have to say "I'll look that up later."
Connect to any SIP provider with WebSocket support. Built primarily for macOS and Linux.
Open source, running locally, and ready to go. Clone the repo and start building.
Get Started on GitHub