
Translate Foreign Stream Audio Live — Follow Any Twitch or YouTube Streamer in Your Language

The streamer you want to watch speaks Japanese, Korean, Spanish, or Russian — and you don't. VoxisLive listens to the live stream playing on your Windows PC and speaks a real-time translation in a natural voice, in your language, while the broadcast keeps rolling. No subtitles to read, no virtual audio cable to install, no bot joining the channel. You stay a viewer, hands and eyes free, and follow the stream as if it were in your own tongue.

Watch a foreign-language streamer without taking your eyes off the action

Live streaming is built around presence. The best moments — the clutch play, the unexpected reaction, the chat banter read aloud — happen in the same instant the streamer talks about them. Subtitle tools force you to choose: watch the gameplay or read the caption bar. You can't do both, and a live broadcast never pauses to let you catch up.

VoxisLive delivers the translation as spoken voice instead of text, so it lands through your ears while your eyes stay on the screen. Because the underlying model is a native simultaneous interpreter, it begins translating as the streamer is still talking and stays only a few seconds behind, the way a live human interpreter would. You hear what they're saying close to when they say it, not minutes later in a recap.

How VoxisLive translates a live stream on Windows

VoxisLive captures the audio your Windows sound card is already playing using WASAPI process-loopback. Whatever produces the sound — a Twitch player in your browser, the YouTube app, a Kick stream, a desktop client — its audio reaches your output device, and that is exactly what VoxisLive reads. It does not need access to the stream itself, your account, or the platform; it works at the operating-system audio layer.

This capture is fully driverless. There is no VB-CABLE or virtual audio device to install, nothing new appears in your sound settings, and there is no bot or browser extension involved. VoxisLive also excludes its own translated output from what it captures, so it never accidentally re-translates the voice it just spoke. Start a session in Video/Game mode and it ducks the original stream audio down while the translation plays, then brings it back — so the game sounds, the music, and the streamer's tone stay present underneath the translated voice rather than being muted away.
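For readers curious what "ducking" means in practice, here is a minimal Python sketch. It is not VoxisLive's code: the function names, the 0.25 duck level, and the per-frame ramp step are all illustrative assumptions. It only shows the general idea of lowering the original audio while translated speech is active and raising it again afterward.

```python
from typing import List, Sequence


def duck_gains(translation_active: Sequence[bool],
               duck_gain: float = 0.25,
               step: float = 0.25) -> List[float]:
    """Return one gain value per audio frame for the ORIGINAL stream.

    While the translated voice is active, the gain ramps down toward
    `duck_gain`; when it stops, the gain ramps back up toward 1.0.
    The gain changes by at most `step` per frame to avoid abrupt jumps.
    Both defaults are hypothetical values chosen for illustration.
    """
    gains = []
    gain = 1.0
    for active in translation_active:
        target = duck_gain if active else 1.0
        if gain > target:
            gain = max(target, gain - step)
        elif gain < target:
            gain = min(target, gain + step)
        gains.append(gain)
    return gains


def apply_gains(frames: Sequence[Sequence[float]],
                gains: Sequence[float]) -> List[List[float]]:
    """Scale every sample in each frame by that frame's gain."""
    return [[sample * g for sample in frame]
            for frame, g in zip(frames, gains)]


if __name__ == "__main__":
    # Translation speaks during frames 1-4, then stops.
    active = [False, True, True, True, True, False, False, False, False]
    print(duck_gains(active))
```

The original audio is never multiplied by zero, which matches the behavior described above: the stream stays audible underneath the translated voice instead of being muted.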

If you want the full architecture (WASAPI capture, the simultaneous Gemini Live model, and the spoken playback path), the "How it works" page walks through every stage.

Spoken translation, not a caption bar

The defining difference is that VoxisLive's output is speech. Caption-based tools — for example StreamVox, which offers subtitles in 49+ languages — render text you have to read, which works for a paused video but fights against the live, fast-moving rhythm of a stream. VoxisLive instead synthesizes a natural-sounding voice in your target language, so following a foreign streamer feels closer to watching a dubbed broadcast than reading a transcript.

You're not limited to one or two languages either: VoxisLive translates into 79 target languages, so you choose the language you want to hear and it speaks the stream back to you in that language. If you ever do want a written record of what was said, every session can be exported as a TXT, SRT, or VTT transcript and searched later from your history — but the live experience is voice-first.
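If you plan to work with exported transcripts, it helps to know how SRT and VTT differ. The Python sketch below is not VoxisLive's exporter; the function names and the `(start, end, text)` segment shape are assumptions made for illustration. It shows the two standard layouts: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period.

```python
from typing import Iterable, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, sep: str = ",") -> str:
    """Format seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(segments: Iterable[Segment]) -> str:
    """Build WebVTT text: 'WEBVTT' header, then unnumbered cues."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        start_ts = format_timestamp(start, ".")
        end_ts = format_timestamp(end, ".")
        lines.extend([f"{start_ts} --> {end_ts}", text, ""])
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [(0.0, 2.5, "Hello, chat!"), (2.5, 5.0, "Let's start the run.")]
    print(to_srt(demo))
    print(to_vtt(demo))
```

Most video players and editors accept either format, so the choice usually depends on where you plan to use the transcript.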

Getting started with live stream translation

Install VoxisLive from the Microsoft Store; no reboot or driver is needed. Developers who prefer the free ($0) bring-your-own-key build can compile the open-source version from GitHub instead and supply their own Gemini API key. Open VoxisLive, choose your playback device as the source, pick the language you want to hear, and start a Video/Game session. Then open the stream and watch normally.

For the cost side — managed minutes versus the BYOK developer plan — see the pricing page. Streams are just one source VoxisLive handles: browse the full use cases overview, and if you also play foreign-language titles yourself, the game audio translation page covers that scenario. When you're ready, head straight to the download page.

Common questions

Can VoxisLive translate a live Twitch or YouTube stream in real time?

A: Yes. VoxisLive captures whatever audio your Windows sound card is playing through WASAPI process-loopback, so a live Twitch broadcast, a YouTube Live channel, or any stream that plays through your speakers is captured the same way. The model is a native simultaneous interpreter: it translates as the streamer talks and speaks the result aloud in your language, staying a few seconds behind the original.

Do I need a virtual audio cable or a bot to join the stream?

A: No. VoxisLive uses driverless WASAPI process-loopback to read the system audio mix directly. There is no VB-CABLE to install, no virtual device added to your audio settings, and no bot that joins anything — you are simply a viewer. It also excludes its own spoken output so it never re-translates itself.

Will I still hear the streamer's original voice?

A: Yes. In Video/Game mode VoxisLive ducks the original audio down while the translated voice speaks, then lets it return, so you keep the ambience, the game sound, and the streamer's tone underneath the translation rather than replacing them entirely.

How many languages can VoxisLive translate streams into?

A: VoxisLive supports 79 target languages. You pick the language you want to hear, and the spoken translation is delivered in that language whichever supported source language the streamer is speaking.

Hear every language, in real time.
