v2.3.2

USER_MANUAL

Complete guide to installing and using AudioBash on Windows and macOS.

AudioBash main window with voice input and quick navigation

01 INTRODUCTION

AudioBash is a voice-controlled terminal application designed to work seamlessly with Claude Code. Instead of typing commands, you can speak them directly into your terminal. The application transcribes your voice input using AI-powered speech recognition and executes commands in a real terminal environment.

Key features

  • Push-to-talk voice input - Hold a hotkey to record, release to transcribe and execute
  • Multi-provider transcription - Choose from Gemini, ElevenLabs Scribe, OpenAI Whisper, Claude, or Groq
  • Real terminal environment - Full PTY (pseudo-terminal) with shell access
  • Agent mode - AI-powered command generation for complex tasks
  • Cross-platform - Works on Windows 10/11 and macOS (Intel & Apple Silicon)

02 INSTALLATION

Windows installation

System requirements

  • Windows 10 or Windows 11
  • 4GB RAM minimum (8GB recommended)
  • 200MB disk space
  • Microphone for voice input

Installation steps

  1. Download the installer

    Visit the releases page and download AudioBash.Setup.2.1.1.exe

  2. Run the installer

    Double-click the .exe file. If Windows SmartScreen appears, click "More info" then "Run anyway"

  3. Launch AudioBash

    Find AudioBash in your Start Menu or use the desktop shortcut

  4. Grant microphone access

    Windows will prompt for permission on first use - click "Yes" to allow

macOS installation

System requirements

  • macOS 11 (Big Sur) or later
  • Apple Silicon (M1/M2/M3/M4) or Intel processor
  • 4GB RAM minimum (8GB recommended)
  • 200MB disk space
  • Microphone for voice input

Installation steps

  1. Download the DMG

    Visit the releases page and download:

    • Apple Silicon (M1/M2/M3/M4): AudioBash-2.1.1-arm64.dmg
    • Intel Macs: AudioBash-2.1.1.dmg

  2. Install the app

    Double-click the DMG, drag AudioBash.app to Applications, then eject the DMG

  3. First launch (important!)

    Because AudioBash is not notarized by Apple, you must bypass Gatekeeper:

    Method 1 - Right-click (recommended)

    Open Finder → Applications → Right-click AudioBash.app → Select "Open" → Click "Open" in the dialog

    Method 2 - Terminal command

    xattr -cr /Applications/AudioBash.app

  4. Grant permissions

    macOS will request:

    • Microphone access: System Settings → Privacy & Security → Microphone
    • Accessibility access: System Settings → Privacy & Security → Accessibility (required for global hotkeys)
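
If you are unsure which DMG from step 1 matches your Mac, the system can report its CPU architecture. The following is a minimal sketch, assuming the two file names listed above for the 2.1.1 release; `dmg_for_arch` is a hypothetical helper name and not part of AudioBash.

```shell
#!/bin/sh
# Hypothetical helper: map a CPU architecture string (as printed by `uname -m`)
# to the DMG file name listed in the installation steps.
dmg_for_arch() {
  case "$1" in
    arm64)  echo "AudioBash-2.1.1-arm64.dmg" ;;  # Apple Silicon
    x86_64) echo "AudioBash-2.1.1.dmg" ;;        # Intel
    *)      echo "unknown architecture: $1" >&2; return 1 ;;
  esac
}

# On a Mac, `uname -m` prints arm64 (Apple Silicon) or x86_64 (Intel).
dmg_for_arch "$(uname -m)" || true
```

Note that a terminal running under Rosetta reports x86_64 even on Apple Silicon, so treat the result as a hint rather than a definitive answer.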

03 FIRST-TIME SETUP

Step 1: Choose your transcription provider

AudioBash supports multiple AI providers for speech-to-text:

Provider               Best for                              Get API key
Gemini (recommended)   Fast, accurate, free tier             aistudio.google.com
ElevenLabs Scribe v2   Real-time (~150ms), VAD auto-commit   elevenlabs.io
OpenAI Whisper         High accuracy                         platform.openai.com
Claude                 Anthropic ecosystem                   console.anthropic.com
Groq                   Very fast                             console.groq.com

Step 2: Enter your API key

  1. Click the gear icon in the top-right corner
  2. Select your transcription provider from the dropdown
  3. Paste your API key in the text field
  4. Click "Save"

Step 3: Test voice recording

  1. Press Option+S (Mac) or Alt+S (Windows) to start recording
  2. Say a simple command: "list files in current directory"
  3. Press the hotkey again to stop
  4. Watch the transcription appear in the terminal

04 USING AUDIOBASH

Keyboard shortcuts

Action                  Windows   macOS
Start/stop recording    Alt+S     Option+S
Cancel recording        Alt+A     Option+A
Toggle raw/agent mode   Alt+M     Option+M
Show/hide window        Alt+H     Option+H
Clear terminal          Alt+C     Option+C
Cycle layouts           Alt+L     Option+L
New tab                 Ctrl+T    Cmd+T
Close tab               Ctrl+W    Cmd+W

Voice recording

AudioBash uses a push-to-talk model:

  1. Press the hotkey to start recording
  2. Speak your command clearly
  3. Release (or press again) to stop and transcribe

Voice input panel with waveform visualization and RAW/AGENT mode toggle

Recording states

  • 🔴 Recording
  • 🟡 Processing
  • 🟢 Ready
  • Disabled

Terminal modes

AudioBash has two modes for handling voice input:

Raw mode (default)

Transcribed text is sent directly to the terminal. What you say is exactly what gets typed.

Best for: Simple commands, when you know exactly what to type

Say: "cd projects"
Types: cd projects

Agent mode

An AI interprets your intent and generates the appropriate command. Handles natural language requests.

Best for: Complex tasks, when you're unsure of exact syntax

Say: "show JavaScript files modified this week"
Generates: find . -name "*.js" -mtime -7
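
To see what the generated command actually matches, it can be run against a scratch directory. This is only an illustration with made-up file names. Keep in mind that `-mtime -7` selects files modified less than seven 24-hour periods ago, which is a rolling window rather than a calendar week.

```shell
#!/bin/sh
# Create a scratch directory with one fresh and one old JavaScript file.
demo_dir="$(mktemp -d)"
touch "$demo_dir/new.js"                   # modified just now
touch -t 200001010000 "$demo_dir/old.js"   # modified on 2000-01-01
touch "$demo_dir/notes.txt"                # fresh, but not a .js file

# Same expression as the agent-mode example, run inside the scratch directory.
recent_js="$(cd "$demo_dir" && find . -name "*.js" -mtime -7)"
echo "$recent_js"   # prints ./new.js

rm -rf "$demo_dir"
```

Only the fresh .js file is listed: old.js fails the age test and notes.txt fails the name test.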

Toggle between modes with Alt+M (Windows) or Option+M (Mac).

Multi-tab interface

AudioBash supports multiple terminal tabs:

  • New tab: Click the + button or press Ctrl/Cmd+T
  • Switch tabs: Click on tabs or use Ctrl+Tab
  • Close tab: Click the × on the tab or press Ctrl/Cmd+W
  • Rename tab: Double-click the tab title

Each tab maintains its own working directory, command history, and shell session.

Quick navigation panel

Quick navigation with favorites and recent directories

05 SETTINGS

Access settings by clicking the gear icon in the top-right corner.

Settings - API keys and visual themes

Settings - Transcription model selection

Settings - Custom instructions and vocabulary

Settings - All keyboard shortcuts

API keys

Setting                 Purpose
Transcription API key   Speech-to-text (Gemini, OpenAI, etc.)
Claude API key          Agent mode intelligence
ElevenLabs API key      Text-to-speech responses

Transcription providers

  • Gemini 2.0 Flash - Google's fast, accurate model (recommended)
  • ElevenLabs Scribe v2 - Real-time WebSocket streaming (~150ms latency)
  • OpenAI Whisper - Industry-standard accuracy
  • Groq Whisper - Ultra-fast processing
  • Claude - Uses Anthropic's model

Mobile remote control

Control AudioBash from your phone using the web-based remote interface:

  1. Open Settings and scroll to "Mobile Remote Control"
  2. Set a password for secure remote access
  3. On your phone, visit remote.audiobash.app
  4. Enter your tunnel URL and password to connect
  5. Use voice commands or type text directly

For local-network-only use, enter the IP-based WebSocket URL shown in Settings instead. Your phone and computer must be on the same Wi-Fi network.

06 TROUBLESHOOTING

Windows issues

"Windows protected your PC" SmartScreen warning

Click "More info" → "Run anyway". The warning appears because the installer is not signed with a paid code-signing certificate.

Global shortcuts not working

Check for conflicts with other apps, try running AudioBash as Administrator, or restart the app.

macOS issues

"AudioBash is damaged and can't be opened"

This is Gatekeeper blocking an unsigned app. Fix with:

xattr -cr /Applications/AudioBash.app
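
The command above clears every extended attribute on the app bundle. A narrower alternative is to remove only the quarantine flag that macOS attaches to downloaded files. This is a sketch, not the documented AudioBash procedure: it assumes the default install path, and `clear_quarantine` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: remove only the quarantine attribute from an app bundle.
clear_quarantine() {
  app="$1"
  if [ "$(uname -s)" != "Darwin" ]; then
    echo "skipped: xattr quarantine handling applies to macOS only"
    return 0
  fi
  if [ ! -e "$app" ]; then
    echo "not found: $app" >&2
    return 1
  fi
  # -d deletes the named attribute, -r recurses into the bundle.
  xattr -dr com.apple.quarantine "$app"
}

# Assumes the default install location from the installation steps.
clear_quarantine "/Applications/AudioBash.app" || true
```

If the attribute is already absent, xattr may print an error for each file. That is harmless here.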

Global shortcuts (Option+S) not working

Go to System Settings → Privacy & Security → Accessibility. Find AudioBash, toggle it off and on, then restart the app.

General issues

"API key is invalid" error

Double-check you copied the entire key, verify it's for the correct provider, and check that billing is set up if required.

Transcription is inaccurate

Speak more slowly and clearly, reduce background noise, try a different provider, or check your internet connection.

07 FAQ

Is AudioBash free?

Yes! AudioBash is free and open source. The AI transcription services require API keys, which may have usage costs depending on the provider. Gemini offers a generous free tier.

Does AudioBash work offline?

No. Voice transcription requires internet connectivity to reach the AI provider APIs.

Can I use AudioBash with any shell?

Yes. On Windows, it defaults to PowerShell. On macOS, it uses your default shell (usually zsh or bash). You can start any shell by typing its name.

Can I use AudioBash with Claude Code CLI?

Absolutely! That's exactly what it's designed for. Just type claude in the terminal to start Claude Code, then use voice commands to interact with it.

Where are settings stored?

Windows: %APPDATA%\AudioBash\
macOS: ~/Library/Application Support/AudioBash/
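
To reach the settings folder from a terminal, the path can be chosen from the platform name. This is a minimal sketch, assuming the two locations listed above. `audiobash_settings_dir` is a hypothetical helper, and the MINGW/MSYS/CYGWIN patterns assume a Unix-style shell on Windows (such as Git Bash) that inherits `APPDATA` from the Windows environment.

```shell
#!/bin/sh
# Hypothetical helper: print the AudioBash settings folder for a platform name
# as reported by `uname -s`.
audiobash_settings_dir() {
  case "$1" in
    Darwin)
      printf '%s\n' "$HOME/Library/Application Support/AudioBash" ;;
    MINGW*|MSYS*|CYGWIN*)
      # Unix-style shells on Windows inherit APPDATA from the environment.
      printf '%s\n' "${APPDATA}\\AudioBash" ;;
    *)
      echo "unsupported platform: $1" >&2
      return 1 ;;
  esac
}

audiobash_settings_dir "$(uname -s)" || true
```

On macOS, for example, `cd "$(audiobash_settings_dir Darwin)"` would change into the folder. The quotes matter because the path contains a space.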

How do I report bugs or request features?

Visit github.com/jamditis/audiobash/issues

How does AudioBash connect to the terminal? Does it link to external terminal windows?

No, AudioBash doesn't link to external terminal windows. It has its own embedded terminal built in. The app uses xterm.js for the terminal display and node-pty to spawn real shell processes (PowerShell on Windows, zsh/bash on macOS). When you speak, the transcribed text is sent directly to the active terminal tab within the app. Each tab runs its own independent shell session - there's no connection to Terminal.app, Windows Terminal, or any other external terminal program.

AudioBash v2.3.2 - Voice-controlled terminal for Claude Code