---
title: Google Cloud
description: Google Cloud Platform integration.
ha_category:
  - Speech-to-text
  - Text-to-speech
  - Voice
ha_release: 0.95
ha_config_flow: true
ha_iot_class: Cloud Push
ha_codeowners:
  - '@lufton'
  - '@tronikos'
ha_domain: google_cloud
ha_platforms:
  - stt
  - tts
ha_integration_type: service
---

The Google Cloud integration allows you to use [Google Cloud Platform](https://cloud.google.com/) APIs and integrate them into Home Assistant.

{% include integrations/config_flow.md %}

## Obtaining service account file

1. Visit [Cloud Resource Manager](https://console.cloud.google.com/cloud-resource-manager).
2. Click the `CREATE PROJECT` button at the top.
3. Enter a `Project name` of your choice and click the `CREATE` button.
4. [Make sure that billing is enabled for your Google Cloud Platform project](https://cloud.google.com/billing/docs/how-to/modify-project).
5. Enable the Cloud APIs you need by visiting one of the links below or the [APIs library](https://console.cloud.google.com/apis/library), selecting your `Project` from the dropdown list, and clicking the `Continue` button:
    - [Text-to-speech](https://console.cloud.google.com/flows/enableapi?apiid=texttospeech.googleapis.com)
    - [Speech-to-text](https://console.cloud.google.com/flows/enableapi?apiid=speech.googleapis.com)
6. Set up authentication:
    1. Visit [this link](https://console.cloud.google.com/apis/credentials/serviceaccountkey).
    2. From the toolbar above the `Service account` list, select `Create service account`.
    3. In the `Service account name` field, enter any name.
    4. If you are requesting a text-to-speech API key, don't select a value from the `Role` list. **No role is required to access this service**.
    5. Click `Create`. If a note appears warning that this service account has no role, you may ignore it.
    6. Return to the `Service account` list page and click the service account you created in step 5 to see its details.
    7. Choose the `Keys` tab within the details view for this service account.
    8. In the `Add Key` dropdown, select `Create New Key`.
    9. Specify a `JSON` key type and click `Create`.
    10. A `[serviceaccountname].json` file will download to your browser.
    11. Upload this file when asked in the integration setup.

## Google Cloud text-to-speech

[Google Cloud text-to-speech](https://cloud.google.com/text-to-speech/) converts text into human-like speech in [380+ voices across 50+ languages and variants](https://cloud.google.com/text-to-speech/docs/voices). It applies research in speech synthesis and Google's neural networks to deliver high-fidelity audio. With this API, you can create lifelike interactions with your users for customer service, device interaction, and other applications.

### Pricing

The Cloud text-to-speech API is priced monthly based on the number of characters sent to the service to be synthesized into audio. For up-to-date pricing, see [here](https://cloud.google.com/text-to-speech/pricing).

### Text-to-speech configuration

The settings below can be configured in the options of the integration and in the `options` parameter of the `tts.speak` service.

{% configuration %}
language:
  description: "Default language of the voice, e.g., `en-US`. Supported languages, genders, and voices are listed [here](https://cloud.google.com/text-to-speech/docs/voices). Additional languages that are supported but not documented appear in the dropdown [here](https://cloud.google.com/text-to-speech/#streaming_demo_section)."
  required: false
  type: string
  default: en-US
gender:
  description: "Default gender of the voice, e.g., `male`. Supported languages, genders, and voices are listed [here](https://cloud.google.com/text-to-speech/docs/voices)."
  required: false
  type: string
  default: neutral
voice:
  description: "Default voice name, e.g., `en-US-Wavenet-F`. Supported languages, genders, and voices are listed [here](https://cloud.google.com/text-to-speech/docs/voices). **Important! This parameter will override the `language` and `gender` parameters if set**."
  required: false
  type: string
encoding:
  description: "Default audio encoder. Supported encodings are `ogg_opus`, `mp3`, and `linear16`."
  required: false
  type: string
  default: mp3
speed:
  description: "Default rate/speed of the voice, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice. 2.0 is twice as fast, and 0.5 is half as fast. If unset (0.0), defaults to the native 1.0 speed."
  required: false
  type: float
  default: 1.0
pitch:
  description: "Default pitch of the voice, in the range [-20.0, 20.0]. 20 means an increase of 20 semitones from the original pitch. -20 means a decrease of 20 semitones from the original pitch."
  required: false
  type: float
  default: 0.0
gain:
  description: "Default volume gain (in dB) of the voice, in the range [-96.0, 16.0]. If unset, or set to a value of 0.0 (dB), audio plays at the normal native signal amplitude. A value of -6.0 (dB) plays at approximately half the normal native signal amplitude. A value of +6.0 (dB) plays at approximately twice the normal native signal amplitude. It is strongly recommended not to exceed +10 (dB), as there is usually no effective increase in loudness for any value greater than that."
  required: false
  type: float
  default: 0.0
profiles:
  description: "A list of identifiers that select 'audio effects' profiles, which are applied to the synthesized speech. Effects are applied on top of each other in the order they are given. Supported profile IDs are listed [here](https://cloud.google.com/text-to-speech/docs/audio-profiles)."
  required: false
  type: list
  default: "[]"
text_type:
  description: "Default text type. Supported text types are `text` and `ssml`. Read more about SSML and how to use it [here](https://cloud.google.com/text-to-speech/docs/ssml)."
  required: false
  type: string
  default: "text"
{% endconfiguration %}

### Full example

A `tts.speak` service call can look like:

```yaml
service: tts.speak
target:
  entity_id: tts.google_cloud
data:
  cache: true
  media_player_entity_id: media_player.living_room_display
  message: this is a test
  language: en-US
  options:
    gender: male
    # When set, voice overrides language and gender.
    voice: en-US-Wavenet-F
    encoding: linear16
    speed: 0.9
    pitch: -2.5
    gain: -5.0
    text_type: ssml
    # Profiles are applied in the order listed.
    profiles:
      - telephony-class-application
      - wearable-class-device
```

## Google Cloud speech-to-text

[Google Cloud speech-to-text](https://cloud.google.com/speech-to-text) converts audio into text transcriptions for [125 languages and variants](https://cloud.google.com/speech-to-text/docs/speech-to-text-supported-languages).

### Pricing

Speech-to-text is priced based on the amount of audio successfully processed by the service each month, measured in increments of one second. For up-to-date pricing, see [here](https://cloud.google.com/speech-to-text/pricing) under the Speech-to-text v1 API.

### Speech-to-text configuration

{% configuration %}
stt_model:
  description: "One of the transcription models listed [here](https://cloud.google.com/speech-to-text/docs/transcription-model). Defaults to `latest_short`."
  required: false
  type: string
{% endconfiguration %}
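
Returning to the text-to-speech options documented earlier on this page: the numeric ranges and allowed values can be checked before a `tts.speak` call is sent. The following is a minimal Python sketch, not part of the integration. The function names `validate_tts_options` and `gain_to_amplitude_ratio` are hypothetical, and the limits are copied from the option descriptions above. The second helper uses the standard decibel-to-amplitude relation (ratio = 10^(dB/20)), which is an assumption about how the documented "half" and "twice" figures for `gain` are derived.

```python
from __future__ import annotations

# Allowed values and ranges copied from the text-to-speech option table.
ENCODINGS = {"ogg_opus", "mp3", "linear16"}
TEXT_TYPES = {"text", "ssml"}
RANGES = {
    "speed": (0.25, 4.0),
    "pitch": (-20.0, 20.0),
    "gain": (-96.0, 16.0),
}


def validate_tts_options(options: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Check each numeric option against its documented inclusive range.
    for key, (low, high) in RANGES.items():
        if key in options and not low <= float(options[key]) <= high:
            problems.append(f"{key}={options[key]} is outside [{low}, {high}]")
    # Check the enumerated options against their documented values.
    if "encoding" in options and options["encoding"] not in ENCODINGS:
        problems.append(f"encoding={options['encoding']!r} is not supported")
    if "text_type" in options and options["text_type"] not in TEXT_TYPES:
        problems.append(f"text_type={options['text_type']!r} is not supported")
    return problems


def gain_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude ratio (assumed 10 ** (dB / 20))."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    # Options taken from the full tts.speak example on this page.
    example = {
        "gender": "male",
        "voice": "en-US-Wavenet-F",
        "encoding": "linear16",
        "speed": 0.9,
        "pitch": -2.5,
        "gain": -5.0,
        "text_type": "ssml",
    }
    print(validate_tts_options(example))
    print(round(gain_to_amplitude_ratio(-6.0), 2), round(gain_to_amplitude_ratio(6.0), 2))
```

Under that assumption, -6.0 dB and +6.0 dB correspond to ratios of roughly 0.5 and 2.0, which matches the `gain` description. The sketch does not check `language`, `voice`, or `profiles`, since their valid values are defined by the linked Google Cloud lists rather than by this page.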