diff --git a/source/voice_control/voice_remote_local_assistant.markdown b/source/voice_control/voice_remote_local_assistant.markdown
index 05676a7b5b5..7bf74c3677d 100644
--- a/source/voice_control/voice_remote_local_assistant.markdown
+++ b/source/voice_control/voice_remote_local_assistant.markdown
@@ -13,11 +13,11 @@ related:
 
 In Home Assistant, the Assist pipelines are made up of various components that together form a voice assistant.
 
-For each component you can choose from different options. There is a speech-to-text and text-to-speech option that runs entirely local.
+For each component, you can choose from different options. There is a speech-to-text and text-to-speech option that runs entirely local. No data is sent to external servers for processing.
 
-The speech-to-text option is [Whisper](https://github.com/openai/whisper). It's an open source AI model that supports [various languages](https://github.com/openai/whisper#available-models-and-languages). We use a forked version called [faster-whisper](https://github.com/guillaumekln/faster-whisper). On a Raspberry Pi 4, it takes around 8 seconds to process incoming voice commands. On an Intel NUC it is done in under a second.
+The speech-to-text option is [Whisper](https://github.com/openai/whisper). It's an open source AI model that supports [various languages](https://github.com/openai/whisper#available-models-and-languages). We use a forked version called [faster-whisper](https://github.com/guillaumekln/faster-whisper). On a Raspberry Pi 4, it takes around 8 seconds to process incoming voice commands. On an Intel NUC, it is done in under a second.
 
-For text-to-speech we have developed [Piper](https://github.com/rhasspy/piper). Piper is a fast, local neural text-to-speech system that sounds great and is optimized for the Raspberry Pi 4. It supports [many languages](https://rhasspy.github.io/piper-samples/). On a Raspberry Pi, using medium quality models, it can generate 1.6s of voice in a second.
+For text-to-speech, we have developed [Piper](https://github.com/rhasspy/piper). Piper is a fast, local neural text-to-speech system that sounds great and is optimized for the Raspberry Pi 4. It supports [many languages](https://rhasspy.github.io/piper-samples/). On a Raspberry Pi, using medium quality models, it can generate 1.6s of voice in a second.
 
 ## Prerequisites
 
@@ -57,9 +57,9 @@ For the quickest way to get your local Assist pipeline started, follow these ste
 - Enter a name. You can pick any name that is meaningful to you.
 - Select the language that you want to speak.
 - Under **Conversation agent**, select **Home Assistant**.
-- Under **Speech-to-text**, select **faster-whisper**.
-- Under **Text-to-speech**, select **piper**.
-- Depending on your language, you may be able to select different language variants.
+- Under **Speech-to-text**, select **faster-whisper**. Select the language.
+- Under **Text-to-speech**, select **piper**. Select the language.
+- Depending on your language, you may be able to select different language variants.
 - If you like, pick one of the predefined wake words.
 ![Select wake word](/images/assist/assist_predefined_wakeword.png)
 - You can even [define your own wake word](/voice_control/create_wake_word/). This is not difficult to do, but you will need to set aside a bit of time for this.
@@ -77,4 +77,3 @@ View some of the options in the video below. Explained by Mike Hansen, creator o
 
 The options are also documented in the add-on itself. Go to the {% my supervisor_addon addon="core_whisper" title="**Whisper**" %} or the {% my supervisor_addon addon="core_piper" title="**Piper**" %} add-on and open the **Documentation** page.
 
-