home-assistant.io/google_cloud.markdown at 8eb0bbcf1481cf7242d0993444294ecb901b2f3d

mirror of https://github.com/home-assistant/home-assistant.io.git synced 2025-06-28 13:06:48 +00:00

Franck Nijhof 4aa6c87f27

2024.10: Beta release notes (#34923 )

2024-09-25 20:44:17 +02:00


---
title: Google Cloud
description: Google Cloud Platform integration.
ha_category:
  - Speech-to-text
  - Text-to-speech
  - Voice
ha_release: 0.95
ha_config_flow: true
ha_iot_class: Cloud Push
ha_codeowners:
  - '@lufton'
  - '@tronikos'
ha_domain: google_cloud
ha_platforms:
  - stt
  - tts
ha_integration_type: service
---

The Google Cloud integration allows you to use Google Cloud Platform APIs and integrate them into Home Assistant.

{% include integrations/config_flow.md %}

Obtaining service account file

1. Visit Cloud Resource Manager.
2. Click the CREATE PROJECT button at the top.
3. Enter a convenient project name and click the CREATE button.
4. Make sure that billing is enabled for your Google Cloud Platform project.
5. Enable the Cloud APIs you need by visiting one of the links below or the APIs library, selecting your project from the dropdown list, and clicking the Continue button:
   - Text-to-speech
   - Speech-to-text
6. Set up authentication:
   1. Visit this link.
   2. From the toolbar above the Service account list, select Create service account.
   3. In the Service account name field, enter any name.

   If you are requesting a text-to-speech API key:

   4. Don't select a value from the Role list. No role is required to access this service.
   5. Click Create. If a note appears warning that this service account has no role, you can ignore it.
   6. Return to the Service account list page and click the service account you created in step 5 to see its details.
   7. Choose the Keys tab within the details view for this service account.
   8. In the Add Key dropdown, select Create New Key.
   9. Select JSON as the key type and click Create.
   10. Your browser downloads a [serviceaccountname].json file.
   11. Upload this file when asked during the integration setup.

Google Cloud text-to-speech

Google Cloud text-to-speech converts text into human-like speech in 380+ voices across 50+ languages and variants. It applies groundbreaking research in speech synthesis and Google's powerful neural networks to deliver high-fidelity audio. With this easy-to-use API, you can create lifelike interactions with your users that transform customer service, device interaction, and other applications.

Pricing

The Cloud text-to-speech API is billed monthly based on the number of characters sent to the service to be synthesized into audio. For up-to-date pricing, see here.

Text-to-speech configuration

The settings below can be configured in the options of the integration and in the options parameter of the tts.speak service.

{% configuration %}
language:
  description: "Default language of the voice, e.g., en-US. Supported languages, genders, and voices are listed here. Additional undocumented but supported languages also exist (see the dropdown here)."
  required: false
  type: string
  default: en-US
gender:
  description: "Default gender of the voice, e.g., male. Supported languages, genders, and voices are listed here."
  required: false
  type: string
  default: neutral
voice:
  description: "Default voice name, e.g., en-US-Wavenet-F. Supported languages, genders, and voices are listed here. Important! If set, this parameter overrides the language and gender parameters."
  required: false
  type: string
encoding:
  description: "Default audio encoder. Supported encodings are ogg_opus, mp3, and linear16."
  required: false
  type: string
  default: mp3
speed:
  description: "Default rate/speed of the voice, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice. 2.0 is twice as fast, and 0.5 is half as fast. If unset (0.0), defaults to the native 1.0 speed."
  required: false
  type: float
  default: 1.0
pitch:
  description: "Default pitch of the voice, in the range [-20.0, 20.0]. 20 means an increase of 20 semitones from the original pitch. -20 means a decrease of 20 semitones from the original pitch."
  required: false
  type: float
  default: 0.0
gain:
  description: "Default volume gain (in dB) of the voice, in the range [-96.0, 16.0]. If unset, or set to a value of 0.0 (dB), the voice plays at normal native signal amplitude. A value of -6.0 (dB) plays at approximately half the normal native signal amplitude. A value of +6.0 (dB) plays at approximately twice the normal native signal amplitude. Exceeding +10 (dB) is strongly discouraged, as there's usually no effective increase in loudness for any value greater than that."
  required: false
  type: float
  default: 0.0
profiles:
  description: "Identifiers that select 'audio effects' profiles, which are applied to the (post-synthesized) text-to-speech audio. Effects are applied on top of each other in the order they are given. Supported profile IDs are listed here."
  required: false
  type: list
  default: "[]"
text_type:
  description: "Default text type. Supported text types are text and ssml. Read more about what SSML is and how to use it here."
  required: false
  type: string
  default: "text"
{% endconfiguration %}
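
As a small illustration of the override rule noted for the voice option, the sketch below sets only voice in a tts.speak call. The language and gender options are left out because voice would take precedence over them anyway. The entity IDs are placeholders borrowed from the example on this page and will likely differ in your setup.

```yaml
# Sketch only: entity IDs are placeholders taken from this page's example.
service: tts.speak
target:
  entity_id: tts.google_cloud
data:
  media_player_entity_id: media_player.living_room_display
  message: This is a test
  options:
    # When set, voice overrides the language and gender options.
    voice: en-US-Wavenet-F
```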

Full example

A tts.speak service call can look like:

service: tts.speak
target:
  entity_id: tts.google_cloud
data:
  cache: true
  media_player_entity_id: media_player.living_room_display
  message: this is a test
  language: en-US
  options:
    gender: male
    voice: en-US-Wavenet-F
    encoding: linear16
    speed: 0.9
    pitch: -2.5
    gain: -5.0
    text_type: ssml
    profiles:
      - telephony-class-application
      - wearable-class-device
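
When text_type is set to ssml, the message is expected to contain SSML markup rather than plain text. The following is a minimal sketch of such a call, assuming the same placeholder entity IDs as above. The speak and break elements are standard SSML, but check the SSML reference for what the service accepts before relying on other tags.

```yaml
# Sketch only: entity IDs are placeholders; adjust them to your setup.
service: tts.speak
target:
  entity_id: tts.google_cloud
data:
  media_player_entity_id: media_player.living_room_display
  # The whole message is wrapped in a single <speak> root element.
  message: >-
    <speak>This is a test.<break time="500ms"/>This sentence follows a short pause.</speak>
  options:
    text_type: ssml
```

The folded block scalar (>-) keeps the markup on one logical line, so the angle brackets and quotes do not need additional YAML escaping.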

Google Cloud speech-to-text

Google Cloud speech-to-text converts audio into text transcriptions for 125 languages and variants.

Pricing

Speech-to-text is priced based on the amount of audio successfully processed by the service each month, measured in increments of one second. For up-to-date pricing, see here under the Speech-to-text v1 API.

Speech-to-text configuration

{% configuration %}
stt_model:
  description: "One of the transcription models here. Defaults to latest_short."
  required: false
  type: string
{% endconfiguration %}
