I’m setting up a laptop with Mint (Cinnamon) for a person who needs text-to-speech software. It seems like most of the nice-sounding ones are proprietary. Any recommendations for FOSS alternatives? And any ideas why this is an underdeveloped area for open source?

  • Shareni@programming.dev
    10 months ago

    It seems like most of the nice-sounding ones are proprietary.

    That’s pretty standard. Most FOSS projects don’t have corporations feeding them hundreds of thousands of dollars. Even when they do, people still say GIMP is far worse than Photoshop. Blender is one of the rare complex projects that can compete with proprietary alternatives.

    And any ideas why this is an underdeveloped area for open source?

    My best guess is that it’s really expensive and time-consuming. I’d be surprised if those really good proprietary models didn’t cost $100k+ just for training.

  • ninpnin@sopuli.xyz
    10 months ago

    AFAIK all of the state-of-the-art TTS models are openly available on huggingface.co or similar. However, I’m not sure if there are nice front ends/UIs for them.

  • valvin@beehaw.org
    10 months ago

    TTS with Coqui xTTS is fun to run with a known voice (a 10-second WAV file is enough). It requires some resources, but far less than STT like faster-whisper. I think the main issue is not running them but integrating them with the OS and other software.
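
    As a rough sketch of the "known voice" workflow described above: this assumes the Coqui `TTS` Python package is installed and that the `xtts_v2` model name below is still what the package ships. The function names are made up for illustration, and the 10-second minimum is only the rule of thumb from this comment, not a limit enforced by the library.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a WAV file in seconds, using only the stdlib."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def clone_voice_to_file(text, speaker_wav, out_path, language="en", min_seconds=10.0):
    """Synthesize `text` in the voice of `speaker_wav` (hypothetical helper).

    min_seconds is an assumption taken from the comment above, not a
    requirement documented by Coqui.
    """
    duration = wav_duration_seconds(speaker_wav)
    if duration < min_seconds:
        raise ValueError(
            f"reference clip is {duration:.1f}s, expected at least {min_seconds:.0f}s"
        )

    # Imported lazily so the duration check works without the heavy dependency.
    from TTS.api import TTS

    # The model is downloaded on first use; this needs disk space and time.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

    Calling `clone_voice_to_file("Hello there.", "reference.wav", "hello.wav")` would write the spoken sentence to `hello.wav`. It produces a file only; wiring the result into the desktop is the separate problem mentioned above.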

  • h3ndrik@feddit.de
    10 months ago

    It’s been an underdeveloped topic for some time. espeak-ng is available on most distros and has some integrations that tie it into the desktop to some degree. There are more modern solutions that sound much better, for example Coqui’s xtts2, or maybe Piper, which is part of Home Assistant nowadays. If your language is English, you have quite a few more solutions to choose from. But it’s a mixed bag whether they sound nice, are easy to install (which also depends on your Linux distro and whether a package is available), and tie into the rest of the system. I’m not an expert on this, but I’d also like to have TTS and STT available on my Linux desktop without putting too much effort into it.
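
    For the espeak-ng route, a minimal sketch of driving it from a script might look like this. It assumes the `espeak-ng` binary is on the PATH (on Mint, usually from the `espeak-ng` package). The helper names are invented for the example; `-v`, `-s` and `-w` are the voice, speed (words per minute) and write-to-WAV options.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="en-us", words_per_minute=150, wav_path=None):
    """Return the argument list for an espeak-ng call (nothing is executed here)."""
    cmd = ["espeak-ng", "-v", voice, "-s", str(words_per_minute)]
    if wav_path is not None:
        # -w writes the audio to a WAV file instead of playing it.
        cmd += ["-w", wav_path]
    cmd.append(text)
    return cmd


def speak(text, **options):
    """Speak text with espeak-ng if it is installed; return True on success."""
    if shutil.which("espeak-ng") is None:
        return False  # espeak-ng is not on PATH
    result = subprocess.run(build_espeak_command(text, **options), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    speak("Hello from Linux Mint.")
```

    Keeping the command builder separate from the call that runs it makes it easy to check the arguments without producing any sound.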

  • shortwavesurfer@monero.town
    9 months ago

    As a blind person, I think it’s mostly due to the fact that Linux is only on 4% of desktops. So not many blind people are using it, and therefore not many are demanding better software.