home-assistant.io/google_generative_ai_conversation.markdown at a5bb5daa70f60e26bd04ccfac9cd14f4cb7ca6b0

jeans/home-assistant.io

Fork 0

mirror of https://github.com/home-assistant/home-assistant.io.git synced 2025-04-27 23:07:34 +00:00

Franck Nijhof 11a2f66ec6

Merge branch 'current' into rc

2024-12-03 18:09:37 +01:00

5.8 KiB

Raw Blame History

title

description

ha_category

ha_release

ha_iot_class

ha_config_flow

ha_codeowners

ha_domain

ha_integration_type

ha_platforms

Google Generative AI

Instructions on how to integrate Google Generative AI as a conversation agent

Voice

2023.6

Cloud Polling

true

@tronikos

google_generative_ai_conversation

service

conversation

diagnostics

docs	title
/voice_control/voice_remote_expose_devices/	Exposing entities to Assist

docs	title
/voice_control/assist_create_open_ai_personality/	Create an AI personality

url	title
https://aistudio.google.com/app/apikey	Google Generative AI API key

url	title
https://ai.google.dev/	Google Generative AI

The Google Generative AI integration adds a conversation agent powered by Google Generative AI in Home Assistant. It can optionally be allowed to control Home Assistant.

Controlling Home Assistant is done by providing the AI access to the Assist API of Home Assistant. You can control what devices and entities it can access from the {% my voice_assistants title="exposed entities page" %}. The AI is able to provide you information about your devices and control them.

This integration does not integrate with sentence triggers.

This integration requires an API key to use, which you can generate here, and to be in one of the available regions.

{% include integrations/config_flow.md %}

Generate an API Key

The Google Generative AI API key is used to authenticate requests to the Google Generative AI API. To generate an API key take the following steps:

Visit the API Keys page to retrieve the API key you'll use to configure the integration.

On the same page, you can see your plan: free of charge if the associated Google Cloud project doesn't have billing, or pay-as-you-go if the associated Google Cloud project has billing enabled. Comparison of the plans is available at this pricing page. The major differences include: the free of charge plan is rate limited, free prompts/responses are used for product improvement, and the free plan is not available in all regions.

{% include integrations/option_flow.md %}

{% configuration_basic %} Instructions: description: Instructions for the AI on how it should respond to your requests. It is written using Home Assistant Templating. Control Home Assistant: description: If the model is allowed to interact with Home Assistant. It can only control or provide information about entities that are exposed to it. Recommended settings: description: If enabled, the recommended model and settings are chosen. {% endconfiguration_basic %}

If you choose to not use the recommended settings, you can configure the following options:

{% configuration_basic %} Model: description: Model used to generate response. Temperature: description: Creativity allowed in the responses. Higher values produce a more random and varied response. A temperature of zero will be deterministic. Top P: description: Probability threshold for top-p sampling. Top K: description: Number of top-scored tokens to consider during generation. Maximum Tokens to Return in Response: description: The maximum number of words or "tokens" that the AI model should generate. Safety settings: description: Thresholds for different harmful categories. {% endconfiguration_basic %}

Talking to Super Mario

You can use an OpenAI Conversation integration to talk to Super Mario and, if you want, have him control devices in your home.

The tutorial is using OpenAI, but this could also be done with the Google Generative AI integration.

Actions

Generate content

{% tip %} This action isn't tied to any integration entry, so it won't use the model, prompt, or any of the other settings in your options. If you only want to pass text, you should use the conversation.process action. {% endtip %}

Allows you to ask Gemini Pro or Gemini Pro Vision to generate content from a prompt consisting of text and optionally images. This action populates response data with the generated content.

Data attribute	Optional	Description	Example
`prompt`	no	The prompt for generating the content.	Describe this image
`image_filename`	yes	File names for images to include in the prompt.	/tmp/image.jpg

{% raw %}

action: google_generative_ai_conversation.generate_content
data:
  prompt: >-
    Very briefly describe what you see in this image from my doorbell camera.
    Your message needs to be short to fit in a phone notification. Don't
    describe stationary objects or buildings.
  image_filename: /tmp/doorbell_snapshot.jpg
response_variable: generated_content

{% endraw %}

The response data field text will contain the generated content.

Another example with multiple images: