Speech-to-text app

Hi,
I’m looking for an app to do speech-to-text. There are several posts on the forum about the opposite, but I haven’t found anything for converting voice to text.
I’m looking for an app that will do the transcription locally, without relying on a permanent internet connection.
I tried Handy (https://handy.computer/), but the app crashes. I just found Speech Note (https://flathub.org/en/apps/net.mkiol.SpeechNote) through a search.
Have any of you used these apps, or others?

After a bit of searching, everything seemed too complicated for me, but plenty of people here have more experience.
Speech Note keeps coming up as a good option. It's in the Zorin/GNOME Software app.



:wink:



Have any of you tried it yet?
Is the transcription of good quality? That may depend on the language...

Sorry I have not.

Sorry I have not either.


A search with Perplexity (an AI search engine) in Zen Browser comes back with:

"The best speech-to-text app on Linux currently is BlabbyAI for Linux, a native desktop application that integrates AI-powered speech-to-text across all major Linux distributions and works system-wide across all Linux applications. It offers highly accurate transcription with automatic punctuation, capitalization, and grammar correction. It is praised for seamless integration with applications like LibreOffice, Thunderbird, Slack, and VS Code, making it a versatile tool for any Linux desktop user.

Other noteworthy options include OpenAI's Whisper AI, which is powerful and supports multiple languages with offline capability, especially when configured for real-time transcription, though it may require some setup effort. There is also Handy, a free, open-source, offline speech-to-text app which prioritizes privacy and works cross-platform. Additionally, Speech Note offers offline transcription with AI and respects user privacy by processing everything locally.

In summary, BlabbyAI stands out for its accuracy, system-wide Linux integration, and smart formatting features. Whisper AI is a great open-source tool for those willing to configure it, while Handy and Speech Note offer solid offline, privacy-focused alternatives. These apps collectively represent the best current Linux speech-to-text solutions in terms of functionality, accuracy, and usability.

  1. Speech to Text for Linux | Mint, Ubuntu, X11, Wayland
  2. https://www.reddit.com/r/accessibility/comments/1frhlvv/free_cheap_speechtotext_software_for_linux_or/
  3. https://slashdot.org/software/speech-to-text/linux/
  4. Speech Note Transcribes Voice to Text on Linux - OMG! Linux
  5. 13 Best Free Linux Speech Recognition Tools - LinuxLinks
  6. How to Use Whisper AI for Live Audio Transcription on Linux
  7. https://wisprflow.ai
  8. GitHub - cjpais/Handy: A free, open source, and extensible speech-to-text application that works completely offline.
  9. https://www.reddit.com/r/linuxquestions/comments/xb6nld/voice_dictation_software_recommended_for_linux/
  10. https://www.youtube.com/watch?v=VDMbWUfHsbk"

For many years Debian was the only distribution with a special program:
" Julian is a special version of Julius, a high-performance, small-footprint large vocabulary continuous speech recognition (LVCSR) engine, designed for grammar-based speech recognition. It runs smoothly on Debian and Linux systems and can be used for real-time speech-to-text conversion from microphone input or audio files. To use Julian on Debian, you typically clone the Julius repository, install dependencies, configure settings through .jconf files such as mic.jconf for microphone input, and run the Julius executable with the appropriate configuration to perform speech recognition. Julian differs in focusing on grammar-based recognition rather than open vocabulary, making it well-suited for command and control applications with a limited vocabulary set.

Julius supports real-time decoding using a two-pass strategy for accurate recognition with low memory requirements, and it works on various hardware from microcomputers to servers. You can install and run Julius and Julian by downloading from the official GitHub repository, editing the configuration files for your audio input, and running commands in the terminal. Typical usage involves running Julius with the -C option to specify configuration files and -input mic for live microphone input. This approach allows running a local speech-to-text engine without cloud dependency.

In summary, for speech-to-text on Debian with Julian:

  • Install dependencies and clone the Julius GitHub repo.
  • Configure Julian with proper .jconf files to define grammar and microphone input.
  • Run the executable (e.g., julius -C mic.jconf) for real-time speech recognition.
    This provides a robust, open-source solution for local speech-to-text needs on Debian Linux.

References are based on the official Julius GitHub documentation and Linux usage examples for Julian speech recognition.

  1. GitHub - julius-speech/julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine
  2. Text to Speech With Linux. : 9 Steps - Instructables
  3. https://www.youtube.com/watch?v=A-oNmgJ7qcw
  4. Speech Recogition using Julius in Linux | Achu's TechBlog
  5. GitHub - synesthesiam/voice2json: Command-line tools for speech and intent recognition on Linux
  6. https://www.linux.org/threads/text-to-speech-software-for-debian.29807/
  7. https://www.reddit.com/r/debian/comments/1137xt5/voice_to_text/
  8. Debian Accessibility Speech recognition packages
  9. http://voice2json.org
  10. Ubuntu Community Hub"
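
For anyone who wants to try the Julius route described in that answer, here is a rough sketch of the invocation it mentions. I have not tested this myself. It assumes `julius` is already installed and that you have a working `mic.jconf` (the `JCONF` variable and the `julius_ready` helper are just names I made up for the example; the `.jconf` file also has to point at acoustic and language models, which this does not set up).

```shell
#!/bin/sh
# Sketch: start Julius on live microphone input if everything is in place.
# JCONF is a hypothetical default; point it at your own .jconf file.
JCONF="${JCONF:-mic.jconf}"

julius_ready() {
    # Succeeds only if the julius binary is on PATH and the config file exists.
    command -v julius >/dev/null 2>&1 && [ -f "$1" ]
}

if julius_ready "$JCONF"; then
    # -C loads the configuration file, -input mic reads from the microphone.
    julius -C "$JCONF" -input mic
else
    echo "julius or $JCONF not found; nothing started"
fi
```

If either piece is missing, the script only prints a message and exits, so it is safe to run as a quick check.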

[Personal humorous side topic. Many years ago I bought a cheap standard version of Dragon Dictate in an electronics shop for £4.50. It was aimed at Windows/Office 95. I was not impressed, uninstalled it ... and what did it also uninstall? You guessed it ... Office 95! :rofl: ]


For your information, I tried Speech Note, but I find it less effective than Handy, which I use on my Mac.

I had an error message related to PipeWire when launching Handy, which I was able to resolve by following this advice: https://forum.zorin.com/t/microphone-not-detected-by-pipewire/20496/22

Specifically, I followed this tutorial (in French): https://fr.linux-terminal.com/?p=6316

Commands executed as root:

sudo apt install pipewire-audio-client-libraries libspa-0.2-bluetooth libspa-0.2-jack
sudo apt install wireplumber pipewire-media-session-
sudo cp /usr/share/doc/pipewire/examples/alsa.conf.d/99-pipewire-default.conf /etc/alsa/conf.d/
sudo cp /usr/share/doc/pipewire/examples/ld.so.conf.d/pipewire-jack-*.conf /etc/ld.so.conf.d/
sudo ldconfig
sudo apt remove pulseaudio-module-bluetooth

Commands executed as user:

systemctl --user --now disable pulseaudio.service pulseaudio.socket
systemctl --user --now enable wireplumber.service
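
To check that the switch took effect, something like the following should work. This is only a sketch: it assumes `pactl` is installed (package `pulseaudio-utils` on Debian/Ubuntu-based systems) and that `pipewire-pulse` is providing the PulseAudio interface. The `is_pipewire` helper is a name made up for this example.

```shell
#!/bin/sh
# Check whether the PulseAudio-compatible sound server is actually PipeWire.
is_pipewire() {
    # With pipewire-pulse, the "Server Name" line of `pactl info` mentions PipeWire.
    printf '%s\n' "$1" | grep -q '^Server Name:.*PipeWire'
}

if command -v pactl >/dev/null 2>&1; then
    # LC_ALL=C keeps the field names in English regardless of locale.
    info=$(LC_ALL=C pactl info 2>/dev/null || true)
    if is_pipewire "$info"; then
        echo "PipeWire is handling audio"
    else
        echo "PipeWire does not appear to be the active sound server"
    fi
else
    echo "pactl not found"
fi
```

A log out and back in (or a reboot) may be needed before the result changes.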

PipeWire now works well. Handy recognizes my microphone properly, and I no longer get the error at launch.

However, it is not very stable. The app crashes regularly. For example, I get warning messages like "fast text entry is not possible on X11" or it just crashes outright.

(handy:236411): Gdk-ERROR **: 10:37:07.900: The program 'handy' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadImplementation (server does not implement operation)'.
  (Details: serial 5744 error_code 17 request_code 20 (core protocol) minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the GDK_SYNCHRONIZE environment
   variable to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
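
As the message itself suggests, rerunning with GDK_SYNCHRONIZE set may make the error easier to trace. A rough sketch, assuming the binary is on PATH as `handy` (the name shown in the log); `run_sync` and the log file name are made up for this example:

```shell
#!/bin/sh
# Run any command with GDK_SYNCHRONIZE=1 so X errors are reported synchronously.
run_sync() {
    env GDK_SYNCHRONIZE=1 "$@"
}

if command -v handy >/dev/null 2>&1; then
    # Keep a copy of everything printed, to attach to a bug report.
    run_sync handy 2>&1 | tee handy-x11.log
else
    echo "handy not found on PATH"
fi
```

The resulting `handy-x11.log` can then be attached to a bug report.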

How is the Handy app installed? Did you install it from here?

Are you on Xorg or Wayland?
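
If you are not sure, this is a quick way to check from a terminal. It relies on the `XDG_SESSION_TYPE` variable, which most login managers set; the `describe_session` helper is just a name for this example.

```shell
#!/bin/sh
# Report whether the current desktop session is Xorg (X11) or Wayland.
describe_session() {
    case "$1" in
        x11)     echo "Xorg (X11)" ;;
        wayland) echo "Wayland" ;;
        *)       echo "unknown" ;;
    esac
}

# Falls back to "unknown" if the variable is unset (e.g. over SSH).
describe_session "${XDG_SESSION_TYPE:-}"
```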

I'm on Xorg.


The best org is an Xorg. :smiling_face_with_sunglasses: