local assistant: tweaks (#33136)

* local assistant: tweaks

* Update source/voice_control/voice_remote_local_assistant.markdown

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* specify meaning of local. input by coderabbitai

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
c0ffeeca7 2024-06-07 14:13:09 +02:00 committed by GitHub
parent 62b52ed2a2
commit 9078b2ccd7

@@ -13,11 +13,11 @@ related:
In Home Assistant, the Assist pipelines are made up of various components that together form a voice assistant.
-For each component you can choose from different options. There is a speech-to-text and text-to-speech option that runs entirely local.
+For each component, you can choose from different options. There is a speech-to-text and text-to-speech option that runs entirely local. No data is sent to external servers for processing.
-The speech-to-text option is [Whisper](https://github.com/openai/whisper). It's an open source AI model that supports [various languages](https://github.com/openai/whisper#available-models-and-languages). We use a forked version called [faster-whisper](https://github.com/guillaumekln/faster-whisper). On a Raspberry Pi 4, it takes around 8 seconds to process incoming voice commands. On an Intel NUC it is done in under a second.
+The speech-to-text option is [Whisper](https://github.com/openai/whisper). It's an open source AI model that supports [various languages](https://github.com/openai/whisper#available-models-and-languages). We use a forked version called [faster-whisper](https://github.com/guillaumekln/faster-whisper). On a Raspberry Pi 4, it takes around 8 seconds to process incoming voice commands. On an Intel NUC, it is done in under a second.
-For text-to-speech we have developed [Piper](https://github.com/rhasspy/piper). Piper is a fast, local neural text-to-speech system that sounds great and is optimized for the Raspberry Pi 4. It supports [many languages](https://rhasspy.github.io/piper-samples/). On a Raspberry Pi, using medium quality models, it can generate 1.6s of voice in a second.
+For text-to-speech, we have developed [Piper](https://github.com/rhasspy/piper). Piper is a fast, local neural text-to-speech system that sounds great and is optimized for the Raspberry Pi 4. It supports [many languages](https://rhasspy.github.io/piper-samples/). On a Raspberry Pi, using medium quality models, it can generate 1.6s of voice in a second.
## Prerequisites
@@ -57,9 +57,9 @@ For the quickest way to get your local Assist pipeline started, follow these ste
- Enter a name. You can pick any name that is meaningful to you.
- Select the language that you want to speak.
- Under **Conversation agent**, select **Home Assistant**.
-- Under **Speech-to-text**, select **faster-whisper**.
-- Under **Text-to-speech**, select **piper**.
-- Depending on your language, you may be able to select different language variants.
+- Under **Speech-to-text**, select **faster-whisper**. Select the language.
+- Under **Text-to-speech**, select **piper**. Select the language.
+- Depending on your language, you may be able to select different language variants.
- If you like, pick one of the predefined wake words.
![Select wake word](/images/assist/assist_predefined_wakeword.png)
- You can even [define your own wake word](/voice_control/create_wake_word/). This is not difficult to do, but you will need to set aside a bit of time for this.
@@ -77,4 +77,3 @@ View some of the options in the video below. Explained by Mike Hansen, creator o
<lite-youtube videoid="Tk-pnm7FY7c" videoStartAt="1589" videotitle="Configure your local Assist pipeline for your setup"></lite-youtube>
The options are also documented in the add-on itself. Go to the {% my supervisor_addon addon="core_whisper" title="**Whisper**" %} or the {% my supervisor_addon addon="core_piper" title="**Piper**" %} add-on and open the **Documentation** page.