Frequently Asked Questions

Is a2text free to use?

Yes. a2text is free to use: the project is published as open source on GitHub and distributed as a local Linux utility, so you can inspect the code, build it yourself, and run it without a commercial license fee. The only costs come from optional cloud STT providers such as OpenAI or Deepgram, not from a2text itself.

How does a2text compare to Moonshine Voice?

a2text is broader than Moonshine Voice because it covers hotkey capture, tray lifecycle, clipboard restore, Wayland/X11 fallback, and multiple delivery modes in one daemon. Moonshine Voice is the better fit if you only care about speech-to-text input. Choose a2text when you want the dictation stack to behave like a desktop service rather than a single-purpose STT app.

Does a2text support PipeWire and PulseAudio?

Yes, a2text supports both PipeWire and PulseAudio through `pw-record` and `parec` capture backends. a2text auto-detects the available backend, so you can run it on newer PipeWire desktops and older PulseAudio setups with the same workflow. That makes it practical on mixed Linux environments without changing your transcription config.
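The auto-detection described above can be pictured with a small sketch. This is not a2text's own code: it assumes detection simply means finding the capture binary on PATH, and the preference order (PipeWire first) and the function name are illustrative assumptions.

```python
import shutil

# Assumed preference order: PipeWire first, PulseAudio second.
CAPTURE_BACKENDS = [("pipewire", "pw-record"), ("pulseaudio", "parec")]


def detect_capture_backend(which=shutil.which):
    """Return (name, path) for the first capture tool found on PATH, else None."""
    for name, binary in CAPTURE_BACKENDS:
        path = which(binary)
        if path:
            return name, path
    return None


if __name__ == "__main__":
    # Prints the detected backend for this machine, or None if neither tool exists.
    print(detect_capture_backend())
```

Passing a custom `which` function makes the lookup easy to exercise on a machine that has neither tool installed.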

Can a2text run fully offline?

Yes, a2text can run fully offline when you use the local whisper.cpp backend. a2text will transcribe audio on your machine instead of uploading it, which is the safest option for confidential notes, code comments, and internal prompts. You only need network access if you choose a cloud provider or a remote `go-whisper` service.

What permissions does a2text need?

a2text needs access to the Linux input subsystem for hotkey capture and `/dev/uinput` for autopaste synthesis. In practice, that means your user usually has to be added to the `input` group, and the daemon must be allowed to open the relevant kernel devices. a2text refuses to start the evdev backend if those permissions are missing, which is preferable to silently degrading into unreliable hotkey behavior.
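A quick way to check both requirements before launching is sketched below. It is a hypothetical pre-flight helper, not part of a2text, and it only reports what the current process can see. A group added with usermod does not show up until you log in again.

```python
import grp
import os


def in_group(group_name):
    """True if the current process already carries `group_name` in its group list."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    return gid in set(os.getgroups()) | {os.getegid()}


def permission_report(uinput_path="/dev/uinput"):
    """Summarise the two permissions the text names: input group and uinput access."""
    return {
        "input_group": in_group("input"),
        "uinput_writable": os.access(uinput_path, os.W_OK),
    }


if __name__ == "__main__":
    print(permission_report())
```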

Why does a2text recommend clipboard mode for untrusted same-UID code?

a2text recommends clipboard mode because same-UID processes can inspect memory, temp files, or race the autopaste path on a shared desktop account. Clipboard mode avoids `/dev/uinput` injection entirely, so a2text only copies text instead of synthesizing keystrokes into the active app. That is the safer choice when you run untrusted tooling under the same Linux user.

a2text: Best Linux Dictation Tools for Developers in 2026

a2text turns Linux speech dictation into a hotkey-driven, clipboard-safe daemon with local or cloud STT, making voice input usable on Wayland without giving up control.

What Is a2text?

a2text is a Linux voice dictation daemon built by partyzanex that turns speech into text through a global hotkey, a tray icon, and an autopaste flow for GNOME on Wayland, with an X11 fallback. It is one of the stronger Linux dictation tools for developers and power users. It ships four STT paths, three output modes, and two capture backends, so you can keep everything local with whisper.cpp or route audio to OpenAI and Deepgram when you need cloud transcription.

Quick Overview

Attribute	Details
Type	Linux Dictation Tools
Best For	Linux developers and power users
Language/Stack	Go, Fyne v2, whisper.cpp, evdev/uinput, PipeWire, PulseAudio, Wayland, X11
License	N/A
GitHub Stars	N/A as of Feb 2026
Pricing	Open-Source
Last Release	N/A

Who Should Use a2text?

Wayland-first Linux users who want global dictation without relying on app-specific accessibility APIs or browser extensions.
Developers writing in terminals, IDEs, and chat tools who need a hotkey to inject text into the active window with minimal friction.
Privacy-conscious operators who want local whisper.cpp transcription, optional audio retention, and a clipboard-only mode instead of forced keystroke injection.
Power users on GNOME or mixed Wayland/X11 setups who need a single daemon, tray icon, and settings UI instead of a pile of shell scripts.

Not ideal for:

Locked-down multi-user desktops where you cannot join the input group or trust /dev/uinput access.
Teams that need managed SSO, central policy, or fleet telemetry because a2text is a local desktop daemon, not an admin console.
Users who expect zero configuration on hostile networks because local models, cloud keys, and clipboard behavior still need explicit setup.

Key Features of a2text

Global hotkey capture via evdev — a2text reads raw input_event packets from /dev/input/event*, so the hotkey works outside the focused app and even outside the active session on Linux. The daemon filters devices with EVIOCGBIT(EV_KEY) before reading, so on a typical laptop it holds handles only for the real keyboards instead of all of the dozens of input nodes.
Local and cloud STT backends — a2text supports local whisper.cpp through CGo, a go-whisper HTTP service, OpenAI, and Deepgram. That gives you an offline path for sensitive dictation and a remote path when you want streaming or vendor-hosted transcription.
Clipboard-first delivery pipeline — output modes include stdout, clipboard, and clipboard-autopaste, so you can choose whether a transcript is printed, copied, or injected into the active window. The clipboard mode is the cleanest choice when you do not want /dev/uinput synthesis at all.
Wayland-friendly autopaste backends — a2text can synthesize Ctrl+V with uinput, wtype, ydotool, or xdotool, and auto picks the first backend that probes as ready. On Wayland, uinput is the most native path because it behaves like a kernel keyboard instead of a compositor-specific hack.
Privacy controls that actually matter — the daemon can skip low-volume captures with capture.silence_threshold_dbfs, cap long recordings with capture.max_duration, archive raw audio in WAV or OGG, and optionally log transcripts. The audit trail records cloud STT calls, HTTP status, audio SHA-256, and transcript length in an append-only file under XDG_DATA_HOME.
Single-instance lifecycle with tray and settings UI — a2text uses a flock-based PID lock, ships a stateful system tray menu, and exposes a Fyne v2 settings window with live validation and auto-save. That means you get a desktop app experience without losing the predictability of a daemon.
First-run model bootstrap — the local whisper.cpp provider can auto-fetch ggml-tiny.bin into the XDG data directory on first run, which lowers the barrier to offline dictation. Bigger models can be downloaded from the model dialog when you want better accuracy at the cost of RAM and disk.
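The audit trail mentioned under privacy controls could look roughly like the sketch below. The source names the recorded facts (cloud call, HTTP status, audio SHA-256, transcript length) and the append-only behaviour. The JSON Lines layout, the field names, and both function names are assumptions made for illustration.

```python
import hashlib
import json
import os
import time


def audit_record(provider, http_status, audio, transcript):
    """Build one audit entry; it stores the transcript length, never the text."""
    return {
        "ts": int(time.time()),
        "provider": provider,
        "http_status": http_status,
        "audio_sha256": hashlib.sha256(audio).hexdigest(),
        "transcript_len": len(transcript),
    }


def append_audit(path, record):
    """Append one JSON line; O_APPEND places every write at the end of the file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, (json.dumps(record) + "\n").encode("utf-8"))
    finally:
        os.close(fd)
```

Hashing the audio instead of storing it lets you later confirm which recording went to a cloud provider without keeping the recording in the log.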

a2text vs Alternatives

Tool	Best For	Key Differentiator	Pricing
a2text	Linux-wide voice dictation with hotkey capture and autopaste	Combines evdev hotkeys, clipboard safety, local/cloud STT, and Wayland-focused delivery	Open-Source
Moonshine Voice	Dedicated voice-to-text workflows	Better fit if you want a more focused STT app and do not need a full Linux daemon with tray/UI lifecycle	Open-Source
Claude Code Canvas	AI-assisted coding sessions	Better if speech is just one input channel inside an AI coding workspace rather than a system dictation daemon	Open-Source
MiniVim	Keyboard-first text editing	Better when you want a lean Vim-style editing surface after transcription instead of OS-level audio capture	Open-Source

Pick Moonshine Voice when your main requirement is speech recognition and you do not care about /dev/uinput, tray state, or Linux session integration. Pick Claude Code Canvas when dictation feeds an AI coding loop and you want the editor workflow to be the center of gravity.

Pick MiniVim if your actual bottleneck is editing speed after transcription, not capturing audio in the first place.

How a2text Works

a2text is built as a long-running desktop daemon with a small state machine: idle, recording, transcribing, delivering, and error. The core abstraction is simple, which is why the tool is usable on a messy Linux desktop — one component captures audio, one component sends frames to an STT backend, and one component delivers text through stdout, the clipboard, or virtual keystrokes.
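The five states can be sketched as a tiny transition table. The state names come from the description above. The allowed transitions are an assumption about a sensible dictation cycle, not a2text's documented behaviour.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    TRANSCRIBING = "transcribing"
    DELIVERING = "delivering"
    ERROR = "error"


# Assumed transitions: one forward path per cycle, plus cancel and error exits.
TRANSITIONS = {
    State.IDLE: {State.RECORDING},
    State.RECORDING: {State.TRANSCRIBING, State.IDLE, State.ERROR},
    State.TRANSCRIBING: {State.DELIVERING, State.ERROR},
    State.DELIVERING: {State.IDLE, State.ERROR},
    State.ERROR: {State.IDLE},
}


def advance(current, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Keeping the table explicit means a second hotkey press during transcription, for example, is rejected rather than starting an overlapping recording.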

The capture side supports pw-record for PipeWire and parec for PulseAudio, then hands PCM audio to the selected provider. The transcription side can call local whisper.cpp through CGo, forward the request to a go-whisper HTTP service, or send raw audio to OpenAI or Deepgram when cloud routing is enabled.
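Before PCM reaches a provider, a silence gate such as the one controlled by capture.silence_threshold_dbfs could discard near-empty recordings. The sketch below shows the usual RMS-to-dBFS arithmetic. It assumes 16-bit little-endian signed samples, and the -50 dBFS default is a made-up value, not a2text's.

```python
import math
import struct


def rms_dbfs(pcm):
    """RMS level of 16-bit little-endian PCM in dBFS (0 dBFS is full scale)."""
    count = len(pcm) // 2
    if count == 0:
        return float("-inf")
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return 20 * math.log10(rms / 32768) if rms > 0 else float("-inf")


def is_silent(pcm, threshold_dbfs=-50.0):
    """True when the capture is quieter than the threshold (hypothetical default)."""
    return rms_dbfs(pcm) < threshold_dbfs
```

A buffer whose samples all sit at half of full scale measures about -6 dBFS, so it passes the gate, while a buffer of zeros measures negative infinity and is skipped.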

The hotkey path is the part that makes a2text feel like a native desktop utility. On Wayland, the daemon reads raw evdev packets from /dev/input/event*, which requires membership in the input group, and uses EVIOCGBIT(EV_KEY) to skip non-keyboard devices before it starts reading events.
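To make the raw-packet idea concrete, here is a minimal sketch of decoding input_event records and spotting a Super+R chord. The struct layout and key codes are the standard Linux ones (EV_KEY is 1, KEY_R is 19, KEY_LEFTMETA is 125). The layout assumes a 64-bit system where the timestamp is two native longs. The ChordDetector class is a hypothetical illustration, not a2text's implementation.

```python
import struct

# struct input_event: timeval (two native longs), u16 type, u16 code, s32 value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01
KEY_R = 19
KEY_LEFTMETA = 125


def decode_events(buf):
    """Yield (type, code, value) for every complete input_event in `buf`."""
    for off in range(0, len(buf) - EVENT_SIZE + 1, EVENT_SIZE):
        _sec, _usec, etype, code, value = struct.unpack_from(EVENT_FORMAT, buf, off)
        yield etype, code, value


class ChordDetector:
    """Reports a hit when `key` is pressed while `modifier` is held down."""

    def __init__(self, modifier=KEY_LEFTMETA, key=KEY_R):
        self.modifier, self.key = modifier, key
        self.modifier_down = False

    def feed(self, etype, code, value):
        if etype != EV_KEY:
            return False  # ignore sync, misc, and other event types
        if code == self.modifier:
            self.modifier_down = value != 0  # 1 = press, 2 = repeat, 0 = release
            return False
        return code == self.key and value == 1 and self.modifier_down
```

In a real reader the bytes would come from a keyboard device node opened in binary mode and read in multiples of EVENT_SIZE, which is where the input group permission comes in.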

sudo usermod -aG input "$USER"
make build
make install
a2text

The first command adds your user to the input group, which grants read access to the kernel input devices that evdev needs; log out and back in for the new group membership to take effect. The next two commands compile and install the daemon, and the last one starts the tray app and settings UI. After that, pressing the default Super+R hotkey begins a dictation cycle and sends the transcript to the current focus target.

The design is intentionally conservative. a2text keeps a single-instance lock under $XDG_RUNTIME_DIR, zeroes its input-event buffer after dispatch, and refuses to start if the required kernel handle is unavailable. That means the security model is visible and testable instead of being hidden behind a compositor extension or an opaque accessibility layer.
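A flock-based single-instance lock of the kind described here can be sketched in a few lines. The function name and the choice to write the PID into the file are assumptions, and the lock file name used by a2text is not stated in the source.

```python
import fcntl
import os


def acquire_single_instance_lock(path):
    """Return an fd holding an exclusive lock, or None if another instance has it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    return fd  # keep it open for the life of the process; closing releases the lock
```

Because the kernel drops the lock when the process exits, a crashed daemon cannot leave a stale lock behind the way a bare PID file can. A caller would typically build `path` from the XDG_RUNTIME_DIR environment variable.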

Pros and Cons of a2text

Pros:

Works across the whole desktop session because evdev sees hardware input rather than app-local shortcuts.
Supports offline transcription with local whisper.cpp, which is the right default for sensitive text and unstable networks.
Has real Wayland fallback paths with uinput, wtype, ydotool, and xdotool instead of betting on one compositor API.
Includes clipboard safety controls such as snapshot/restore and a pre-paste race guard.
Exposes operational knobs like silence thresholds, max duration, model selection, audio archives, and transcript logs.
Ships with a settings GUI so you do not need to hand-edit YAML for every workflow change.

Cons:

Requires input group membership for evdev hotkeys, which is a hard permission boundary rather than a soft warning.
Trusts same-UID processes enough that hostile local code can still inspect memory or temp files.
Cloud provider keys are plaintext by default in YAML unless you move them into environment variables.
Model downloads are not yet SHA-pinned against a manifest, so you still need to trust the model source.
Wayland autopaste still depends on backend readiness and can fail if uinput or the chosen injector is blocked by policy.
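The backend-readiness point can be illustrated with a probe chain. The sketch assumes the order listed earlier (uinput, wtype, ydotool, xdotool) and uses deliberately simple probes: write access to /dev/uinput, or the tool being present on PATH. a2text's real probes may check more than this.

```python
import os
import shutil


def default_probes():
    """Ordered (name, probe) pairs; each probe returns True when the backend looks usable."""
    return [
        ("uinput", lambda: os.access("/dev/uinput", os.W_OK)),
        ("wtype", lambda: shutil.which("wtype") is not None),
        ("ydotool", lambda: shutil.which("ydotool") is not None),
        ("xdotool", lambda: shutil.which("xdotool") is not None),
    ]


def pick_autopaste_backend(probes=None):
    """Return the first backend whose probe succeeds, or None if all fail."""
    for name, ready in (probes if probes is not None else default_probes()):
        if ready():
            return name
    return None  # the caller can fall back to clipboard-only delivery
```

Returning None instead of raising leaves room for the safer behaviour: copy the transcript to the clipboard and let the user paste manually.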

Getting Started with a2text

The quickest path is to build from source, install into your user prefix, and let the daemon fetch the default local model on first run. If you need evdev hotkeys, add your account to the input group before launching the app so the backend can open /dev/input/event* without permission errors.

sudo usermod -aG input "$USER"
make build
make install
a2text

After the first launch, a2text creates its XDG config and data directories, downloads ggml-tiny.bin for the local whisper.cpp backend, and opens the tray plus Fyne settings window. If you want cloud transcription, set the provider and key in the UI or through the A2TEXT_*_API_KEY environment variables, then switch output mode to clipboard-autopaste only if you trust same-UID code on the machine.
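Preferring environment variables over plaintext YAML keys might work along these lines. The A2TEXT_*_API_KEY pattern comes from the text above. Treating the wildcard as a provider name, and both function names, are assumptions for illustration only, so check the project's documentation for the exact variable names.

```python
import fnmatch
import os

KEY_PATTERN = "A2TEXT_*_API_KEY"


def api_keys_from_env(environ=None):
    """Map the wildcard part of each matching variable name to its non-empty value."""
    env = os.environ if environ is None else environ
    prefix, suffix = KEY_PATTERN.split("*")
    return {
        name[len(prefix):-len(suffix)].lower(): value
        for name, value in env.items()
        if fnmatch.fnmatchcase(name, KEY_PATTERN) and value
    }


def resolve_api_key(provider, yaml_value, environ=None):
    """Prefer an environment key; fall back to the value from the config file."""
    return api_keys_from_env(environ).get(provider.lower()) or yaml_value
```

With this precedence, the YAML file can carry an empty key while the real secret lives only in the session environment.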

Verdict

a2text is the strongest option for Linux dictation on Wayland when you want local-first speech input with clipboard-safe autopaste. Its biggest win is the combination of evdev hotkeys, multiple STT backends, and a real security model; its main caveat is kernel-level permission and trust complexity. If that trade-off matches your desktop, a2text is worth deploying.
